CN113806606A - Three-dimensional scene-based electric power big data rapid visual analysis method and system

Info

Publication number
CN113806606A
Authority
CN
China
Prior art keywords
data
computing
calculation
task
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111046249.2A
Other languages
Chinese (zh)
Inventor
高菘
姚明亮
张龙浩
付恩狄
莫理
梁宇柔
刘永辉
李德华
胡道平
陈远政
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Information Communication Branch of Peak Regulation and Frequency Modulation Power Generation of China Southern Power Grid Co Ltd
Original Assignee
Information Communication Branch of Peak Regulation and Frequency Modulation Power Generation of China Southern Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Information Communication Branch of Peak Regulation and Frequency Modulation Power Generation of China Southern Power Grid Co Ltd
Priority to CN202111046249.2A
Publication of CN113806606A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/904 Browsing; Visualisation therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/252 Integrating or interfacing systems involving database management systems between a Database Management System and a front-end application
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/906 Clustering; Classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to the technical field of big data analysis and discloses a method and a system for rapid visual analysis of electric power big data based on a three-dimensional scene. The visual analysis method comprises the following processing flow: S1: collect data, serialize the data into the big data file system HDFS, and at the same time persist the data into the database HBase; S2: determine which scheme is used to mine and analyze the data, and apply the corresponding intelligent algorithm; S3: map the output result set to the visualization module to form graphic information, have the visualization engine integrate the result set with the scene, and output the result to the user as a three-dimensional spatial field visualization. The method and system integrate the big data ecosystem with the visualization module, so that the graphics computation of the three-dimensional scene and the numerical computation of data analysis are integrated in the same framework and share a common storage mode and computing mode, which enables multi-service collaborative visual analysis.

Description

Three-dimensional scene-based electric power big data rapid visual analysis method and system
Technical Field
The invention relates to the technical field of big data analysis, and in particular to a method and a system for rapid visual analysis of electric power big data based on a three-dimensional scene.
Background
Big data is a new discipline that has emerged in recent years. Research in this field began only recently, and the field of electric power big data visual analysis still lacks in-depth research and mature applications.
Applying rapid data analysis and rapid scene rendering methods to electric power big data requires an integrated platform dedicated to the power system. Existing big data application systems have no framework designed specifically for the power system, and the existing national power grid service systems have no application designed specifically for big data, so the various data and services are separated from one another and users cannot perform visual analysis on a unified platform. Processing different data on different platforms and then trying to integrate the results into a visual analysis system is inconvenient for users, and it also makes it impossible to obtain and visually display the correlations among the various data during data mining.
Disclosure of Invention
To address the shortcomings of the prior art, the invention provides a method and a system for rapid visual analysis of electric power big data based on a three-dimensional scene, together with an integrated model for visual analysis of electric power big data. The big data ecosystem is integrated with the visualization module, so that the graphics computation of the three-dimensional scene and the numerical computation of data analysis are integrated in the same framework and share a common storage mode and computing mode. Multi-service collaborative visual analysis is thereby achieved without the traditional approach of first mining and exporting the data and then importing it into a visualization system for analysis.
To achieve the above purpose, the invention provides the following technical solution: a method for rapid visual analysis of electric power big data based on a three-dimensional scene, in which the processing flow of a visual analysis task is as follows:
Step one: collect data, serialize the data into the big data file system HDFS, and at the same time persist the data into the database HBase;
Step two: determine which scheme is used to mine and analyze the data, and apply the corresponding intelligent algorithm;
Step three: map the output result set to the visualization module to form graphic information, have the visualization engine integrate the result set with the scene, and output the result to the user as a three-dimensional spatial field visualization.
The visual analysis system has the following functional modules:
(1) a service module, which provides high-level abstract interfaces to meet the service requirements of the user layer;
(2) a visualization engine, which is the core subsystem of the model; it implements the integration of data and scenes and renders large-scale three-dimensional scenes fast enough to meet the requirements of practical applications;
(3) a computing module, which implements the various intelligent algorithms that carry out data mining;
(4) a control module, which schedules big data jobs and provides reasonable load balancing control;
(5) a storage module, which stores data and must implement a big data file system and a database system.
Preferably, the system is layered, according to the services that may arise and the modules above, into an interface layer, an engine layer, a computing layer, a control layer and a persistence layer:
(1) Persistence layer
The Hadoop file system HDFS and the database system HBase are used to store all types of data, including scene data, numerical data, and log data generated during actual operation;
(2) Control layer
The Hadoop task scheduling module ZooKeeper and the chip-level parallel technology MPI control different types of computing tasks: ZooKeeper controls the data-intensive computing mode STORM, and MPI controls the compute-intensive computing mode CUDA. These control modules perform task scheduling, low-level load balancing and simple fault-tolerant processing;
(3) Computing layer
Computing tasks are classified not as graphics computation versus data computation but, according to their computational characteristics, as data-intensive or compute-intensive; this classification is intended to maximize computational efficiency. All compute-intensive computations are assigned to the CUDA module for execution;
(4) Engine layer
The engine is the core subsystem of the model. This module implements a fast rendering engine, with several optimization algorithms and strategies designed for large-scale three-dimensional scenes in a big data environment, so that the real-time rendering efficiency of the scene meets the requirements of practical applications. Real-time rendering of large-scale scenes has long been a research focus, and two methods are proposed to accelerate rendering: an octree-based visibility culling method, and a multi-resolution rendering method based on a weight-function LOD. The algorithms and implementation of both methods are described in detail in the subsequent sections, and their efficiency is verified by experiment.
(5) Interface layer
The interface layer is a high-level abstraction that defines a series of interfaces for the operations a user performs directly, including operations on scenes and data and other operations such as scene import, data analysis and log export. Each interface also reserves a field indicating whether the computing type of the requested task is data-intensive or compute-intensive; for tasks that do not require parallel computing, this field should be set to NULL to avoid the time cost of unnecessary task scheduling and inter-node communication.
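The octree-based visibility culling named for the engine layer can be illustrated with a minimal sketch. The text does not give the algorithm itself, so everything below is an assumption made for illustration: the names `OctreeNode`, `build` and `visible` are hypothetical, objects are represented by axis-aligned bounding boxes, and the view volume is simplified to an axis-aligned box rather than a true view frustum.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple

# (xmin, ymin, zmin, xmax, ymax, zmax); an assumed box representation
Box = Tuple[float, float, float, float, float, float]
SceneObject = Tuple[str, Box]


def boxes_overlap(a: Box, b: Box) -> bool:
    """True if two axis-aligned boxes intersect or touch on every axis."""
    return all(a[i] <= b[i + 3] and b[i] <= a[i + 3] for i in range(3))


@dataclass
class OctreeNode:
    bounds: Box
    objects: List[SceneObject] = field(default_factory=list)
    children: List["OctreeNode"] = field(default_factory=list)


def build(bounds: Box, objects: List[SceneObject], depth: int = 0,
          max_depth: int = 3, max_objects: int = 2) -> OctreeNode:
    """Recursively split space into eight octants until a node is small enough."""
    node = OctreeNode(bounds)
    if depth == max_depth or len(objects) <= max_objects:
        node.objects = list(objects)  # leaf node keeps its objects
        return node
    x0, y0, z0, x1, y1, z1 = bounds
    xm, ym, zm = (x0 + x1) / 2, (y0 + y1) / 2, (z0 + z1) / 2
    for xs in ((x0, xm), (xm, x1)):
        for ys in ((y0, ym), (ym, y1)):
            for zs in ((z0, zm), (zm, z1)):
                child_bounds = (xs[0], ys[0], zs[0], xs[1], ys[1], zs[1])
                inside = [o for o in objects if boxes_overlap(o[1], child_bounds)]
                if inside:  # empty octants are not stored
                    node.children.append(
                        build(child_bounds, inside, depth + 1, max_depth, max_objects))
    return node


def visible(node: OctreeNode, view: Box, out: Optional[Set[str]] = None) -> Set[str]:
    """Collect names of objects overlapping the view box, skipping culled subtrees."""
    if out is None:
        out = set()
    if not boxes_overlap(node.bounds, view):
        return out  # the whole subtree is outside the view and is culled
    for name, box in node.objects:
        if boxes_overlap(box, view):
            out.add(name)
    for child in node.children:
        visible(child, view, out)
    return out


# Hypothetical scene: four power-grid objects inside an 8 x 8 x 8 volume.
scene = [
    ("tower_a", (1, 1, 1, 2, 2, 2)),
    ("tower_b", (6, 6, 6, 7, 7, 7)),
    ("line", (3, 3, 3, 5, 5, 5)),
    ("substation", (0.5, 5, 0.5, 1.5, 6, 1.5)),
]
root = build((0, 0, 0, 8, 8, 8), scene)
in_view = visible(root, (0, 0, 0, 3.5, 3.5, 3.5))
```

The benefit of the tree is that an octant lying entirely outside the view is rejected with a single box test, so its objects are never examined one by one. An object that spans several octants is stored in each of them, which is why the result is collected into a set.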
Preferably, when the interface layer receives a computing task request: for a data-intensive computing task, parallelism over the data volume should be maximized, so the distributed real-time computing framework STORM is adopted; for a compute-intensive computing task, parallelism over function threads should be maximized, so the supercomputing framework CUDA is adopted;
STORM performs parallel computing only on the CPU array and CUDA performs parallel computing only on the GPU array, so the two middle-layer computing frameworks do not need to partition the cluster hardware into separate scopes of their own. The cluster can use multiple machines as nodes to perform cooperative parallel computing on heterogeneous resources. According to the task type defined by the interface layer, the computing layer determines whether a computing task is data-intensive or compute-intensive, decomposes the task, and distributes it to the corresponding computing framework, so that different types of computing resources work in parallel. Computing tasks in the CUDA framework can be distributed by Hadoop to all GPU resources in the cluster for parallel processing, and computing tasks in the STORM framework can be distributed by Hadoop to all CPU resources in the cluster for parallel processing.
Preferably, the STORM distributed computing framework uses a single control node (Master) named Nimbus. When the interface layer receives a service request, it determines the type of computing task from the interface type. If the task is data-intensive, it is submitted to Nimbus, which generates the topology; Nimbus sends the generated task topology sequence to ZooKeeper in the control layer, and ZooKeeper schedules the tasks uniformly. The STORM computing nodes (Slave) are divided into two types: spouts (Supervisor), which distribute primitives, and bolts (Worker), which compute on primitives. STORM does not require all primitives to undergo the same operation, which makes it better suited to data-intensive tasks. STORM achieves a good speedup for non-iterative tasks (data-intensive tasks) in numerical computing, and it also achieves good efficiency for graphics computing tasks that are not suited to CUDA acceleration.
Preferably, the CUDA architecture is based on the Single Instruction Multiple Thread (SIMT) model, an extension of the Single Instruction Multiple Data (SIMD) model. A function executed on the GPU is called a kernel function (kernel). When a kernel is executed, its instructions are issued concurrently to all stream processors (SPs) in the array. A kernel is only a function, not a complete program: before the kernel is executed, the CPU must assist by completing data preprocessing and device initialization. The CUDA computing process is divided into three stages: input, execution and output. In the first stage, the main program allocates GPU memory for the input and output data and transfers the input data from CPU memory to GPU memory; in the second stage, the main program launches the kernel on the GPU and the tasks are executed in parallel; in the third stage, when the kernel finishes, the main program transfers the kernel's output data from GPU memory to CPU memory to obtain the result;
The CUDA architecture divides computing resources into two classes, and CUDA is a parallel mode with a single control node. The CUDA programming model uses a three-level Grid-Block-Thread model; each level has its own indexing, synchronization mode, shared-memory mode and cooperative computing mode, and the computing task is refined to progressively finer granularity. The Thread, the finest granularity, has the highest parallelism. CUDA is suitable for large-scale, well-balanced, highly concurrent computing tasks.
Compared with the prior art, the method and system for rapid visual analysis of electric power big data based on a three-dimensional scene provided by the invention have the following beneficial effects:
The method and system are designed on the Hadoop ecosystem. Owing to its outstanding practicality and ease of use in big data processing, Hadoop quickly attracted wide attention and research in academia after its release and has been widely adopted in industry. Hadoop is currently the most successful and widely accepted mainstream big data processing technology and system platform, and it provides the functional modules required for complete distributed cluster computing. The Hadoop platform has evolved into a complete ecosystem; it runs on computing clusters built from ordinary commodity servers or even inexpensive machines and provides an inexpensive, convenient and scalable big data solution;
The invention provides an integrated system for visual analysis of electric power big data that integrates the big data ecosystem with the visualization module, so that the graphics computation of the three-dimensional scene and the numerical computation of data analysis are integrated in the same framework and share a common storage mode and computing mode. Multi-service collaborative visual analysis is thereby achieved without the traditional approach of first mining and exporting the data and then importing it into a visualization system for analysis.
Drawings
FIG. 1 is a flow chart of a visualized analysis task of a power system in a big data environment according to the present invention;
FIG. 2 is a functional block diagram of the electric power big data visual analysis platform of the present invention;
FIG. 3 is a hierarchical architecture diagram of a power big data visualization analysis model according to the present invention;
FIG. 4 is a diagram of resource allocation for different types of tasks in the compute layer of the present invention;
FIG. 5 is a block diagram of the STORM calculation framework of the present invention;
FIG. 6 is a CUDA computational framework diagram in accordance with the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to FIG. 1, in the method for rapid visual analysis of electric power big data based on a three-dimensional scene, the processing flow of a visual analysis task is as follows:
Step one: collect data, serialize the data into the big data file system HDFS, and at the same time persist the data into the database HBase;
Step two: determine which scheme is used to mine and analyze the data, and apply the corresponding intelligent algorithm;
Step three: map the output result set to the visualization module to form graphic information, have the visualization engine integrate the result set with the scene, and output the result to the user as a three-dimensional spatial field visualization.
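The three steps above can be sketched as a small pipeline. This is only an illustration under stated assumptions: a Python list stands in for HDFS and a dictionary stands in for HBase, the record fields `station` and `load_mw`, the two analysis schemes, and the rule that bar height equals load divided by 10 are all hypothetical and are not taken from the invention.

```python
import json
from typing import Any, Callable, Dict, List, Tuple

Record = Dict[str, Any]


def collect(records: List[Record], file_store: List[str],
            table: Dict[str, Record]) -> None:
    """Step one: serialize each record to the file store and persist it by key."""
    for record in records:
        file_store.append(json.dumps(record, sort_keys=True))  # stand-in for HDFS
        table[record["station"]] = record                      # stand-in for HBase


def peak_load(rows: List[Record]) -> List[Record]:
    """Hypothetical scheme: keep only the station with the highest load."""
    return [max(rows, key=lambda r: r["load_mw"])]


def overloaded(rows: List[Record], limit: float = 100.0) -> List[Record]:
    """Hypothetical scheme: keep stations whose load exceeds a limit."""
    return [r for r in rows if r["load_mw"] > limit]


SCHEMES: Dict[str, Callable[[List[Record]], List[Record]]] = {
    "peak_load": peak_load,
    "overloaded": overloaded,
}


def analyze(table: Dict[str, Record], scheme: str) -> List[Record]:
    """Step two: choose the analysis scheme and run the matching algorithm."""
    return SCHEMES[scheme](list(table.values()))


def to_scene(result_set: List[Record],
             positions: Dict[str, Tuple[float, float, float]]) -> List[Record]:
    """Step three: map each result to a graphic placed at its scene position."""
    return [
        {
            "station": r["station"],
            "position": positions[r["station"]],
            "height": r["load_mw"] / 10.0,  # assumed mapping from value to bar height
        }
        for r in result_set
    ]


file_store: List[str] = []
table: Dict[str, Record] = {}
collect([{"station": "A", "load_mw": 80.0}, {"station": "B", "load_mw": 120.0}],
        file_store, table)
graphics = to_scene(analyze(table, "overloaded"),
                    {"A": (0.0, 0.0, 0.0), "B": (5.0, 0.0, 2.0)})
```

Keeping the schemes in a lookup table mirrors step two, in which the scheme is chosen first and the corresponding algorithm is then applied to the stored data.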
In a big data environment, data, services and scenes are all complex, and the requirements of practical applications can be met only if the whole task processing flow is organized sensibly. The design is based on the Hadoop ecosystem. Owing to its outstanding practicality and ease of use in big data processing, Hadoop quickly attracted wide attention and research in academia after its release and has been widely adopted in industry. Hadoop is currently the most successful and widely accepted mainstream big data processing technology and system platform, and it provides the functional modules required for complete distributed cluster computing. The Hadoop platform has evolved into a complete ecosystem; it runs on computing clusters built from ordinary commodity servers or even inexpensive machines and provides an inexpensive, convenient and scalable big data solution.
The visual analysis system has the following functional modules:
(1) a service module, which provides high-level abstract interfaces to meet the service requirements of the user layer;
(2) a visualization engine, which is the core subsystem of the model; it implements the integration of data and scenes and renders large-scale three-dimensional scenes fast enough to meet the requirements of practical applications;
(3) a computing module, which implements the various intelligent algorithms that carry out data mining;
(4) a control module, which schedules big data jobs and provides reasonable load balancing control;
(5) a storage module, which stores data and must implement a big data file system and a database system.
Referring to FIGS. 2-3, the system is layered, according to the services that may arise and the modules above, into an interface layer, an engine layer, a computing layer, a control layer and a persistence layer:
(1) Persistence layer
The Hadoop file system HDFS and the database system HBase are used to store all types of data, including scene data, numerical data, and log data generated during actual operation;
(2) Control layer
The Hadoop task scheduling module ZooKeeper and the chip-level parallel technology MPI control different types of computing tasks: ZooKeeper controls the data-intensive computing mode STORM, and MPI controls the compute-intensive computing mode CUDA. These control modules perform task scheduling, low-level load balancing and simple fault-tolerant processing;
(3) Computing layer
Computing tasks are classified not as graphics computation versus data computation but, according to their computational characteristics, as data-intensive or compute-intensive; this classification is intended to maximize computational efficiency. All compute-intensive computations are assigned to the CUDA module for execution;
(4) Engine layer
The engine is the core subsystem of the model. This module implements a fast rendering engine, with several optimization algorithms and strategies designed for large-scale three-dimensional scenes in a big data environment, so that the real-time rendering efficiency of the scene meets the requirements of practical applications. Real-time rendering of large-scale scenes has long been a research focus, and two methods are proposed to accelerate rendering: an octree-based visibility culling method, and a multi-resolution rendering method based on a weight-function LOD. The algorithms and implementation of both methods are described in detail in the subsequent sections, and their efficiency is verified by experiment.
(5) Interface layer
The interface layer is a high-level abstraction that defines a series of interfaces for the operations a user performs directly, including operations on scenes and data and other operations such as scene import, data analysis and log export. Each interface also reserves a field indicating whether the computing type of the requested task is data-intensive or compute-intensive; for tasks that do not require parallel computing, this field should be set to NULL to avoid the time cost of unnecessary task scheduling and inter-node communication.
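The reserved computing type field of the interface layer might look like the following minimal sketch. The class and function names (`ComputeType`, `TaskRequest`, `handle`) are hypothetical, and Python's `None` is assumed to play the role of NULL; the text does not specify the actual interface signatures.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Callable, Dict, Optional


class ComputeType(Enum):
    """The two task classes distinguished by the computing layer."""
    DATA_INTENSIVE = "data_intensive"
    COMPUTE_INTENSIVE = "compute_intensive"


@dataclass(frozen=True)
class TaskRequest:
    """A user-level request, e.g. scene import, data analysis or log export."""
    operation: str
    payload: Dict[str, Any] = field(default_factory=dict)
    # None (NULL) marks a task that does not need parallel computing.
    compute_type: Optional[ComputeType] = None


def handle(request: TaskRequest,
           run_local: Callable[[TaskRequest], str],
           schedule: Callable[[TaskRequest], str]) -> str:
    """Bypass the scheduler when the computing type field is NULL."""
    if request.compute_type is None:
        return run_local(request)   # no scheduling, no inter-node communication
    return schedule(request)        # handed on to the control and computing layers


# Hypothetical callbacks standing in for local execution and cluster scheduling.
def run_local(request: TaskRequest) -> str:
    return f"local:{request.operation}"


def schedule(request: TaskRequest) -> str:
    return f"scheduled:{request.operation}:{request.compute_type.value}"


export_result = handle(TaskRequest("log_export"), run_local, schedule)
analysis_result = handle(
    TaskRequest("data_analysis", {"table": "loads"}, ComputeType.DATA_INTENSIVE),
    run_local, schedule)
```

Making the field optional with a NULL default keeps simple operations such as log export on the fast path, which matches the stated goal of avoiding unnecessary scheduling and inter-node communication.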
Referring to FIG. 4, when the interface layer receives a computing task request: for a data-intensive computing task, parallelism over the data volume should be maximized, so the distributed real-time computing framework STORM is adopted; for a compute-intensive computing task, parallelism over function threads should be maximized, so the supercomputing framework CUDA is adopted;
STORM performs parallel computing only on the CPU array and CUDA performs parallel computing only on the GPU array, so the two middle-layer computing frameworks do not need to partition the cluster hardware into separate scopes of their own. The cluster can use multiple machines as nodes to perform cooperative parallel computing on heterogeneous resources. According to the task type defined by the interface layer, the computing layer determines whether a computing task is data-intensive or compute-intensive, decomposes the task, and distributes it to the corresponding computing framework, so that different types of computing resources work in parallel. Computing tasks in the CUDA framework can be distributed by Hadoop to all GPU resources in the cluster for parallel processing, and computing tasks in the STORM framework can be distributed by Hadoop to all CPU resources in the cluster for parallel processing.
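The routing and decomposition performed by the computing layer can be sketched as follows. The resource names, the round-robin split and the function names are assumptions chosen for illustration; the text states only that data-intensive tasks go to STORM on CPU resources and compute-intensive tasks go to CUDA on GPU resources.

```python
from typing import Dict, List, Optional, Sequence

# Mapping from the interface-layer task type to the computing framework.
FRAMEWORK_FOR_TYPE: Dict[str, str] = {
    "data_intensive": "STORM",
    "compute_intensive": "CUDA",
}

# Hypothetical cluster inventory: STORM uses only CPUs, CUDA uses only GPUs.
RESOURCES: Dict[str, List[str]] = {
    "STORM": ["cpu0", "cpu1", "cpu2"],
    "CUDA": ["gpu0", "gpu1"],
}


def split(items: Sequence[int], parts: int) -> List[List[int]]:
    """Decompose a task into `parts` chunks in round-robin order."""
    return [list(items[i::parts]) for i in range(parts)]


def dispatch(compute_type: Optional[str],
             items: Sequence[int]) -> Dict[str, List[int]]:
    """Assign work items to the resources of the framework chosen by task type."""
    if compute_type is None:
        return {"local": list(items)}  # NULL type: no parallel scheduling at all
    framework = FRAMEWORK_FOR_TYPE[compute_type]
    workers = RESOURCES[framework]
    return dict(zip(workers, split(items, len(workers))))


cpu_plan = dispatch("data_intensive", [1, 2, 3, 4, 5, 6, 7])
gpu_plan = dispatch("compute_intensive", range(4))
```

Because the two frameworks draw on disjoint resource pools, a data-intensive plan and a compute-intensive plan can run on the same cluster at the same time without either framework reserving machines for itself.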
Referring to FIG. 5, the STORM distributed computing framework uses a single control node (Master) named Nimbus. When the interface layer receives a service request, it determines the type of computing task from the interface type. If the task is data-intensive, it is submitted to Nimbus, which generates the topology; Nimbus sends the generated task topology sequence to ZooKeeper in the control layer, and ZooKeeper schedules the tasks uniformly. The STORM computing nodes (Slave) are divided into two types: spouts (Supervisor), which distribute primitives, and bolts (Worker), which compute on primitives. STORM does not require all primitives to undergo the same operation, which makes it better suited to data-intensive tasks. STORM achieves a good speedup for non-iterative tasks (data-intensive tasks) in numerical computing, and it also achieves good efficiency for graphics computing tasks that are not suited to CUDA acceleration.
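The spout and bolt roles described above can be imitated in a few lines of plain Python. This sketch does not use the Apache Storm API and does not model Nimbus or ZooKeeper; the generator-based functions, the station readings and the 100 MW limit are hypothetical and serve only to show a source that distributes items to two bolts performing different operations.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, Tuple

Reading = Tuple[str, float]  # (station, load in MW); an assumed item layout


def reading_spout(readings: Iterable[Reading]) -> Iterator[Reading]:
    """Spout role: the source that distributes items into the topology."""
    for station, load in readings:
        yield (station, load)


def threshold_bolt(stream: Iterable[Reading], limit: float) -> Iterator[Reading]:
    """First bolt: pass on only the readings above the limit."""
    for station, load in stream:
        if load > limit:
            yield (station, load)


def count_bolt(stream: Iterable[Reading]) -> Dict[str, int]:
    """Second bolt: a different operation, counting readings per station."""
    counts: Counter = Counter()
    for station, _load in stream:
        counts[station] += 1
    return dict(counts)


def run_topology(readings: Iterable[Reading], limit: float) -> Dict[str, int]:
    """Wire the topology: spout -> threshold bolt -> count bolt."""
    return count_bolt(threshold_bolt(reading_spout(readings), limit))


alarms = run_topology(
    [("A", 90.0), ("B", 120.0), ("B", 130.0), ("A", 101.0)], limit=100.0)
```

The two bolts apply different operations to the items that pass through them, which reflects the point that STORM does not require every item to undergo the same operation.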
Referring to FIG. 6, the CUDA architecture is based on the Single Instruction Multiple Thread (SIMT) model, an extension of the Single Instruction Multiple Data (SIMD) model. A function executed on the GPU is called a kernel function (kernel). When a kernel is executed, its instructions are issued concurrently to all stream processors (SPs) in the array. A kernel is only a function, not a complete program: before the kernel is executed, the CPU must assist by completing data preprocessing and device initialization. The CUDA computing process is divided into three stages: input, execution and output. In the first stage, the main program allocates GPU memory for the input and output data and transfers the input data from CPU memory to GPU memory; in the second stage, the main program launches the kernel on the GPU and the tasks are executed in parallel; in the third stage, when the kernel finishes, the main program transfers the kernel's output data from GPU memory to CPU memory to obtain the result;
The CUDA architecture divides computing resources into two classes, and CUDA is a parallel mode with a single control node. The CUDA programming model uses a three-level Grid-Block-Thread model; each level has its own indexing, synchronization mode, shared-memory mode and cooperative computing mode, and the computing task is refined to progressively finer granularity. The Thread, the finest granularity, has the highest parallelism. CUDA is suitable for large-scale, well-balanced, highly concurrent computing tasks.
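The three-stage computing process and the Grid-Block-Thread indexing can be illustrated with a sequential emulation in plain Python. No GPU is involved: the nested loops in `launch` stand in for threads that would run concurrently on real hardware, Python lists stand in for GPU memory, and the kernel `scale_kernel` and the block size of 4 are hypothetical choices made for this sketch.

```python
from typing import Callable, List


def scale_kernel(block_idx: int, block_dim: int, thread_idx: int,
                 data_in: List[float], data_out: List[float],
                 factor: float) -> None:
    """Kernel body: each thread handles the element at its global index."""
    i = block_idx * block_dim + thread_idx  # global index from block and thread
    if i < len(data_in):                    # guard threads past the end of the data
        data_out[i] = data_in[i] * factor


def launch(kernel: Callable[..., None], grid_dim: int, block_dim: int,
           *args: object) -> None:
    """Emulated launch: visit every (block, thread) pair of the grid in turn."""
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(block_idx, block_dim, thread_idx, *args)


def run(host_input: List[float], factor: float, block_dim: int = 4) -> List[float]:
    # Stage 1 (input): allocate "device" buffers and copy the input into them.
    device_in = list(host_input)
    device_out = [0.0] * len(host_input)
    # Stage 2 (execution): launch enough blocks to cover every element.
    grid_dim = (len(host_input) + block_dim - 1) // block_dim
    launch(scale_kernel, grid_dim, block_dim, device_in, device_out, factor)
    # Stage 3 (output): copy the result back to "host" memory.
    return list(device_out)


scaled = run([1.0, 2.0, 3.0, 4.0, 5.0], factor=2.0)
```

With five elements and four threads per block, two blocks are launched and three threads of the second block fall outside the data, which is why the kernel checks its index before writing.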
When the computing type field of the interface layer's service request function is designed correctly, STORM and CUDA work cooperatively and can achieve very high efficiency.
In use, the method and system for rapid visual analysis of electric power big data based on a three-dimensional scene are designed on the Hadoop ecosystem. Owing to its outstanding practicality and ease of use in big data processing, Hadoop quickly attracted wide attention and research in academia after its release and has been widely adopted in industry. Hadoop is currently the most successful and widely accepted mainstream big data processing technology and system platform, and it provides the functional modules required for complete distributed cluster computing. The Hadoop platform has evolved into a complete ecosystem; it runs on computing clusters built from ordinary commodity servers or even inexpensive machines and provides an inexpensive, convenient and scalable big data solution. The invention provides an integrated system for visual analysis of electric power big data that integrates the big data ecosystem with the visualization module, so that the graphics computation of the three-dimensional scene and the numerical computation of data analysis are integrated in the same framework and share a common storage mode and computing mode. Multi-service collaborative visual analysis is thereby achieved without the traditional approach of first mining and exporting the data and then importing it into a visualization system for analysis.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (6)

1. A method for rapid visual analysis of electric power big data based on a three-dimensional scene, characterized in that the processing flow of the visual analysis method is as follows:
Step one: collect data, serialize the data into the big data file system HDFS, and at the same time persist the data into the database HBase;
Step two: determine which scheme is used to mine and analyze the data, and apply the corresponding intelligent algorithm;
Step three: map the output result set to the visualization module to form graphic information, have the visualization engine integrate the result set with the scene, and output the result to the user as a three-dimensional spatial field visualization.
2. A system for rapid visual analysis of electric power big data based on a three-dimensional scene, characterized in that the visual analysis system has the following functional modules:
(1) a service module: provides a high-level abstract interface to meet the service requirements of the user layer;
(2) a visualization engine: as the core subsystem of the model, implements the method for integrating data with scenes and renders large-scale three-dimensional scenes fast enough to meet the requirements of practical applications;
(3) a computing module: implements the various intelligent algorithms that carry out the data mining work;
(4) a control module: schedules big data jobs and provides reasonable load-balancing control;
(5) a storage module: stores data, and implements the big data file system and the database system.
3. The system for rapid visual analysis of electric power big data based on a three-dimensional scene according to claim 2, characterized in that: the system is divided into layers by combining the services that may arise with the modules, namely an interface layer, an engine layer, a computing layer, a control layer and a persistence layer;
(1) Persistence layer
The Hadoop file system HDFS and the database system HBase are used to store all types of data, including scene data, numerical data, and log data generated during actual operation;
(2) Control layer
The Hadoop task scheduling module ZooKeeper and the chip-level parallel technology MPI control different types of computing tasks: ZooKeeper controls the data-intensive computing mode STORM, and MPI controls the compute-intensive computing mode CUDA. These control modules perform task scheduling, low-level load balancing and simple fault-tolerance handling;
(3) Computing layer
Computing tasks are classified not as graphics computation versus data computation but, according to their computational characteristics, as data-intensive or compute-intensive; this classification is intended to improve computing efficiency as far as possible. All compute-intensive computations are assigned to the CUDA module for execution;
(4) Engine layer
The engine is the core subsystem of the model. This module implements a fast rendering engine, with several optimization algorithms and strategies designed for large-scale three-dimensional scenes in a big data environment so that the real-time rendering efficiency of the scene meets practical application requirements. Real-time rendering of large-scale scenes has long been a research hotspot, and two methods are provided to accelerate rendering: an octree-based visibility culling method, and a multi-resolution rendering method based on a weight-function LOD. The algorithms and implementation of both methods are described in detail in subsequent sections, and their efficiency is verified by experiments.
(5) Interface layer
The interface layer serves as a high-level abstraction and defines a series of interfaces for the operations a user performs directly, including operations on scenes and data and other operations such as scene import, data analysis and log export. Each interface also reserves a field indicating whether the requested task is data-intensive or compute-intensive; for tasks that do not require parallel computation, this field should be set to NULL to avoid unnecessary task scheduling and inter-node communication overhead.
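The octree-based visibility culling named for the engine layer in item (4) can be sketched as follows. The claim gives no algorithmic detail, so this is only an illustration under stated assumptions: the view volume is approximated by an axis-aligned box rather than a camera frustum, every object stored at a node lies inside that node's bounds, and the names `Box`, `OctreeNode`, `collect_all` and `visible_objects` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Box:
    """Axis-aligned box given by its min and max corners (x, y, z)."""
    lo: tuple
    hi: tuple

    def intersects(self, other):
        return all(self.lo[i] <= other.hi[i] and other.lo[i] <= self.hi[i]
                   for i in range(3))

    def contains(self, other):
        return all(self.lo[i] <= other.lo[i] and other.hi[i] <= self.hi[i]
                   for i in range(3))


@dataclass
class OctreeNode:
    bounds: Box
    objects: list = field(default_factory=list)   # (name, Box) pairs at this node
    children: list = field(default_factory=list)  # up to 8 child nodes


def collect_all(node, out):
    """Append every object name in the subtree without further tests."""
    out.extend(name for name, _ in node.objects)
    for child in node.children:
        collect_all(child, out)


def visible_objects(node, view, out=None):
    """Return names of objects whose boxes may be visible inside `view`."""
    out = [] if out is None else out
    if not node.bounds.intersects(view):
        return out                      # whole subtree culled with one test
    if view.contains(node.bounds):
        collect_all(node, out)          # whole subtree accepted with one test
        return out
    out.extend(name for name, box in node.objects if box.intersects(view))
    for child in node.children:
        visible_objects(child, view, out)
    return out
```

The saving comes from the two early exits: a node that lies entirely outside or entirely inside the view volume is resolved with a single box test, so per-object tests are only needed for nodes that straddle the boundary.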
4. The system for rapid visual analysis of electric power big data based on a three-dimensional scene according to claim 3, characterized in that: when the interface layer receives a computing task request, for a data-intensive computing task the parallelism over the data volume should be maximized, so the distributed real-time computing framework STORM is used; for a compute-intensive computing task the parallelism over function threads should be maximized, so the supercomputing framework CUDA is used;
STORM performs parallel computation only on the CPU array and CUDA performs parallel computation only on the GPU array, so the two middle-layer computing frameworks do not need separately partitioned portions of the cluster hardware, and the cluster can use multiple machines as nodes to achieve cooperative parallel computation on heterogeneous resources. The computing layer determines, from the task type defined by the interface layer, whether a computing task is data-intensive or compute-intensive, then decomposes the task and distributes it to the corresponding computing framework, so that computing resources of different types are processed in parallel. Computing tasks in the CUDA framework can be distributed by Hadoop to all GPU resources in the cluster for parallel processing, and computing tasks in the STORM framework can be distributed by Hadoop to all CPU resources in the cluster for parallel processing.
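The routing decision described in claim 4 can be sketched as a small dispatcher. This is a minimal sketch under assumptions: `ComputeKind`, `TaskRequest`, `route` and `dispatch` are hypothetical names, `None` stands in for the NULL task-type field of claim 3, and the backend labels are plain strings rather than real STORM or CUDA submissions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ComputeKind(Enum):
    DATA_INTENSIVE = "data_intensive"
    COMPUTE_INTENSIVE = "compute_intensive"


@dataclass
class TaskRequest:
    name: str
    kind: Optional[ComputeKind]  # None plays the role of the NULL field


def route(task):
    """Return the label of the backend that should run the task."""
    if task.kind is None:
        return "local"   # no parallel computation requested, skip scheduling
    if task.kind is ComputeKind.DATA_INTENSIVE:
        return "storm"   # CPU array
    return "cuda"        # GPU array


def dispatch(tasks):
    """Group task names into one queue per backend."""
    queues = {"local": [], "storm": [], "cuda": []}
    for task in tasks:
        queues[route(task)].append(task.name)
    return queues
```

Because the two queues target disjoint hardware (CPUs and GPUs), they can be drained concurrently without contending for the same resources, which is the point the claim makes about heterogeneous cooperation.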
5. The system for rapid visual analysis of electric power big data based on a three-dimensional scene according to claim 4, characterized in that: the STORM distributed computing framework uses a single control node (Master) named Nimbus. When the interface layer receives a service request, it first determines from the interface type which kind of computing task is requested; if the task is data-intensive, it is submitted to Nimbus for topology generation. Nimbus sends the generated task topology sequence to ZooKeeper in the control layer, and ZooKeeper performs task scheduling uniformly. The computing nodes (Slave) of STORM are of two kinds: spout (Supervisor), which distributes primitives, and bolt (worker), which computes on the primitives. STORM does not require all primitives to undergo the same operation, which makes it better suited to data-intensive tasks. For non-iterative tasks (data-intensive tasks) in numerical computation, STORM can achieve a better speed-up ratio; for computing tasks within graphics tasks that are not suitable for CUDA acceleration, STORM can also achieve better efficiency.
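The spout and bolt roles of claim 5 can be illustrated with plain Python generators. This sketch does not use the Apache Storm API and runs in a single process; the function names, the `(station, kw)` record shape and the filter threshold are assumptions chosen for illustration. It shows only the data flow: a spout emits records, and successive bolts apply different operations to them.

```python
def reading_spout(readings):
    """Spout: emits one (station, kw) record per raw reading."""
    for station, kw in readings:
        yield (station, kw)


def filter_bolt(stream, min_kw):
    """Bolt: drops records below a threshold (for example, sensor glitches)."""
    for station, kw in stream:
        if kw >= min_kw:
            yield (station, kw)


def sum_bolt(stream):
    """Bolt: accumulates the total load per station."""
    totals = {}
    for station, kw in stream:
        totals[station] = totals.get(station, 0.0) + kw
    return totals


def run_topology(readings, min_kw=0.0):
    """Wire the topology: spout -> filter bolt -> sum bolt."""
    return sum_bolt(filter_bolt(reading_spout(readings), min_kw))
```

The two bolts perform different operations on the same stream, which mirrors the claim's remark that STORM does not require every record to undergo the same operation.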
6. The system for rapid visual analysis of electric power big data based on a three-dimensional scene according to claim 4, characterized in that: the CUDA architecture is based on the single instruction, multiple thread (SIMT) model, an extension of the single instruction, multiple data (SIMD) model. A function executed on the GPU is called a kernel function (kernel). At run time, kernel instructions are issued in parallel to all stream processors (SPs) in the array. A kernel is only a function, not a complete program; before the kernel executes, the CPU must assist by completing data preprocessing and device initialization. The CUDA computing process is divided into an input stage, an execution stage and an output stage. In the first stage, the main program allocates GPU memory for the input and output data and transfers the input data from CPU memory to GPU memory; in the second stage, the main program launches the kernel on the GPU, which executes the task in parallel; in the third stage, when the kernel finishes, the main program transfers the kernel's output data from GPU memory to CPU memory to obtain the output result;
the CUDA architecture divides computing resources into two classes, and CUDA is a parallel mode with a single control node. The CUDA programming model uses a three-level Grid-Block-Thread hierarchy in which each level has its own indexing, synchronization, shared-memory and cooperative-computing modes. The granularity of a computing task is refined level by level, and the Thread, with the finest granularity, has the highest degree of parallelism. CUDA is suited to large-scale, balanced, highly concurrent computing tasks.
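The three stages and the Grid-Block-Thread indexing of claim 6 can be imitated on the CPU with ordinary Python. This is a simulation for illustration only, not CUDA code: device memory is stood in for by Python lists, the loops in `launch` run sequentially where a GPU would run threads in parallel, and the names `scale_kernel`, `launch` and `run_on_device` are hypothetical. The global index formula (block index times block size plus thread index) is the standard one for a one-dimensional launch.

```python
def scale_kernel(block_idx, thread_idx, block_dim, data_in, data_out, factor):
    """Kernel body: each simulated thread handles at most one element."""
    i = block_idx * block_dim + thread_idx  # global thread index
    if i < len(data_in):                    # guard: last block may be partly used
        data_out[i] = data_in[i] * factor


def launch(kernel, grid_dim, block_dim, *args):
    """Run the kernel once per (block, thread) pair of a 1-D grid."""
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(block_idx, thread_idx, block_dim, *args)


def run_on_device(host_in, factor, block_dim=4):
    # Stage 1 (input): allocate "device" buffers and copy the input over.
    dev_in = list(host_in)
    dev_out = [0.0] * len(host_in)
    # Stage 2 (execution): launch enough blocks to cover every element.
    grid_dim = (len(host_in) + block_dim - 1) // block_dim
    launch(scale_kernel, grid_dim, block_dim, dev_in, dev_out, factor)
    # Stage 3 (output): copy the result back to "host" memory.
    return list(dev_out)
```

With five input elements and a block size of four, two blocks are launched and the bounds check in the kernel keeps the three surplus threads of the second block from writing out of range.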
CN202111046249.2A 2021-09-07 2021-09-07 Three-dimensional scene-based electric power big data rapid visual analysis method and system Pending CN113806606A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111046249.2A CN113806606A (en) 2021-09-07 2021-09-07 Three-dimensional scene-based electric power big data rapid visual analysis method and system

Publications (1)

Publication Number Publication Date
CN113806606A true CN113806606A (en) 2021-12-17

Family

ID=78940758

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111046249.2A Pending CN113806606A (en) 2021-09-07 2021-09-07 Three-dimensional scene-based electric power big data rapid visual analysis method and system

Country Status (1)

Country Link
CN (1) CN113806606A (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107329982A (en) * 2017-06-01 2017-11-07 华南理工大学 A kind of big data parallel calculating method stored based on distributed column and system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
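Zhang Ziqian; Zhang Yanyan; Chen Yu; Zhao Xichao: "Research on a three-dimensional visualization management method for electric power big data", Energy and Environmental Protection (能源与环保), no. 03 *
Huang Jingyuan: "Research on a rapid visual analysis model for electric power big data based on three-dimensional scenes", China Master's Theses Full-text Database, Information Science and Technology (中国优秀硕士学位论文全文数据库 信息科技辑), pages 14 - 20 *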

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117196421A (en) * 2023-09-15 2023-12-08 国能智慧科技发展(江苏)有限公司 Visual system based on power plant energy scheduling and calibration method
CN117196421B (en) * 2023-09-15 2024-05-10 大庆市庆翔热电有限公司 Visual system based on power plant energy scheduling and calibration method
CN117130491A (en) * 2023-10-26 2023-11-28 航天宏图信息技术股份有限公司 Mixed reality multi-group cooperation method, system, electronic equipment and storage medium
CN117130491B (en) * 2023-10-26 2024-02-06 航天宏图信息技术股份有限公司 Mixed reality multi-group cooperation method, system, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN111327681A (en) Cloud computing data platform construction method based on Kubernetes
CN107329828A (en) A kind of data flow programmed method and system towards CPU/GPU isomeric groups
CN102929718A (en) Distributed GPU (graphics processing unit) computer system based on task scheduling
CN113806606A (en) Three-dimensional scene-based electric power big data rapid visual analysis method and system
CN104239144A (en) Multilevel distributed task processing system
CN103677759A (en) Objectification parallel computing method and system for information system performance improvement
CN110659278A (en) Graph data distributed processing system based on CPU-GPU heterogeneous architecture
Morad et al. Generalized MultiAmdahl: Optimization of heterogeneous multi-accelerator SoC
CN103617494A (en) Wide-area multi-stage distributed parallel power grid analysis system
CN107329822A (en) Towards the multi-core dispatching method based on super Task Network of multi-source multiple nucleus system
Ranaweera et al. Scheduling of periodic time critical applications for pipelined execution on heterogeneous systems
He et al. Haas: Cloud-based real-time data analytics with heterogeneity-aware scheduling
CN107908459B (en) Cloud computing scheduling system
Li et al. Parallel computing: review and perspective
You et al. High-performance polyline intersection based spatial join on GPU-accelerated clusters
Wang et al. A CUDA-enabled parallel implementation of collaborative filtering
Hu et al. Cluster-scheduling big graph traversal task for parallel processing in heterogeneous cloud based on DAG transformation
Pilla et al. Asymptotically optimal load balancing for hierarchical multi-core systems
CN110879753A (en) GPU acceleration performance optimization method and system based on automatic cluster resource management
Liu et al. A hybrid parallel genetic algorithm with dynamic migration strategy based on sunway many-core processor
Hong et al. An optimized model for MapReduce based on Hadoop
de Rezende et al. MapReduce with components for processing big graphs
CN113076191A (en) Cluster GPU resource scheduling system
Hippold et al. Task pool teams for implementing irregular algorithms on clusters of SMPs
Singh et al. Live virtual machine migration techniques in cloud computing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination