CN114978962A - Model algorithm type selection evaluation method for power grid big data analysis - Google Patents
- Publication number
- CN114978962A (application CN202210900978.8A)
- Authority
- CN
- China
- Prior art keywords
- data
- time
- power grid
- monitoring
- real
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/08—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
- H04L43/0876—Network utilisation, e.g. volume of load or congestion level
- H04L43/0888—Throughput
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24568—Data stream processing; Continuous queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/252—Integrating or interfacing systems involving database management systems between a Database Management System and a front-end application
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02J—CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
- H02J13/00—Circuit arrangements for providing remote indication of network conditions, e.g. an instantaneous record of the open or closed condition of each circuitbreaker in the network; Circuit arrangements for providing remote control of switching means in a power distribution network, e.g. switching in and out of current consumers by using a pulse code signal carried by the network
- H02J13/00001—Circuit arrangements for providing remote indication of network conditions, e.g. an instantaneous record of the open or closed condition of each circuitbreaker in the network; Circuit arrangements for providing remote control of switching means in a power distribution network, e.g. switching in and out of current consumers by using a pulse code signal carried by the network characterised by the display of information or by user interaction, e.g. supervisory control and data acquisition systems [SCADA] or graphical user interfaces [GUI]
-
- H—ELECTRICITY
- H02—GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
- H02J—CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
- H02J13/00—Circuit arrangements for providing remote indication of network conditions, e.g. an instantaneous record of the open or closed condition of each circuitbreaker in the network; Circuit arrangements for providing remote control of switching means in a power distribution network, e.g. switching in and out of current consumers by using a pulse code signal carried by the network
- H02J13/00002—Circuit arrangements for providing remote indication of network conditions, e.g. an instantaneous record of the open or closed condition of each circuitbreaker in the network; Circuit arrangements for providing remote control of switching means in a power distribution network, e.g. switching in and out of current consumers by using a pulse code signal carried by the network characterised by monitoring
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/06—Management of faults, events, alarms or notifications
- H04L41/0604—Management of faults, events, alarms or notifications using filtering, e.g. reduction of information by using priority, element types, position or time
- H04L41/0609—Management of faults, events, alarms or notifications using filtering, e.g. reduction of information by using priority, element types, position or time based on severity or priority
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/02—Capturing of monitoring data
- H04L43/028—Capturing of monitoring data by filtering
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/12—Protocols specially adapted for proprietary or special-purpose networking environments, e.g. medical networks, sensor networks, networks in vehicles or remote metering networks
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y04—INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
- Y04S—SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
- Y04S10/00—Systems supporting electrical power generation, transmission or distribution
- Y04S10/50—Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Power Engineering (AREA)
- Human Computer Interaction (AREA)
- Computing Systems (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Environmental & Geological Engineering (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a model algorithm type selection evaluation method for power grid big data analysis, comprising the following steps: step S1, a Storm platform is used as the front-end framework for power grid data alarms and a Spark platform is used as the model training platform, together forming a cloud platform framework, and a power grid distributed scheduling module is combined with the cloud platform framework; step S2, the monitoring data are prioritized using an event insertion algorithm based on a priority event queue, and a monitoring data stream is extracted from the alarm data using a dual-threshold parameter extraction algorithm; step S3, a model markup language is introduced into the cloud platform framework to enable data exchange between the Storm platform and the Spark platform; and step S4, the system data throughput of the power grid distributed scheduling module is evaluated according to the priority of the power equipment monitoring data. The method enables model data exchange between Storm and Spark, allows different systems under the cloud platform to share data across platforms, increases the data processing speed, and improves the reliability of data reception.
Description
Technical Field
The invention relates to the technical field of power grid big data processing, and in particular to a model algorithm type selection evaluation method for power grid big data analysis.
Background
As smart grids develop and become more intelligent, condition monitoring is used more and more widely on power grid equipment. With the growth in power transmission and transformation equipment, power grid data now comes from more sources, in more types, and at larger scale, and the era of power grid big data has arrived. This data has great commercial and social value, and the requirements for processing it in power production, operation, and management are becoming stricter by the day. The volume of power equipment monitoring data is rising rapidly and its variety is increasing, so existing power grid systems struggle to guarantee processing time and efficiency when faced with massive monitoring data. In extreme weather such as strong wind, snow and ice, or hail, the values monitored on power equipment frequently exceed their limits and alarm data is sent continuously, so the monitoring center receives a massive amount of monitoring data within a short interval. The growth of power grid data therefore presents the power grid system with both challenges and opportunities.
Streaming big data from power grid monitoring is complex in type and changes quickly, and as stream data it also has high volume, a high flow rate, and is difficult to store. If a fault is found too late, a wrong judgment of its type or improper handling can have serious, incalculable consequences. Receiving the power grid monitoring stream data reliably, without loss, and processing it quickly are therefore essential to obtaining equipment state information effectively.
Analyzing streaming big data from power grid monitoring with Hadoop has two main problems. First, with large amounts of data the response time is several minutes or longer, so Hadoop may be unable to process data that has real-time requirements. Second, big data places high demands on the way data is collected and on real-time performance. As power monitoring systems develop and mature, power grid monitoring data becomes more diverse and more voluminous; the power monitoring system also stores a large number of monitoring parameters, configuration files for the monitored devices, and monitoring records such as lightning and weather data. When grid faults or severe weather occur, the alarm data generated frequently because monitored values exceed their limits can only accumulate in the monitoring background and cannot be processed effectively.
Disclosure of Invention
The invention aims to provide a model algorithm type selection evaluation method for power grid big data analysis, in order to solve the technical problems in the prior art that real-time data processing performance is poor and that alarm data generated frequently because monitored values exceed their limits accumulates continuously in the monitoring background and cannot be processed effectively.
In order to solve the technical problems, the invention specifically provides the following technical scheme:
A model algorithm type selection evaluation method for power grid big data analysis comprises the following steps:
step S1, a Storm platform is used as the front-end framework for power grid data alarms and a Spark platform is used as the model training platform, together forming a cloud platform framework; a power grid distributed scheduling module is combined with the cloud platform framework, and the power grid data flow is analyzed to monitor power grid equipment alarm data;
step S2, the monitoring data are prioritized using an event insertion algorithm based on a priority event queue, a monitoring data stream is extracted from the alarm data using a dual-threshold parameter extraction algorithm, and the priority of the power equipment monitoring data is obtained;
step S3, a model markup language is introduced into the cloud platform framework to enable data exchange between the Storm platform and the Spark platform, so that monitoring alarm data can be processed rapidly;
and step S4, the system data throughput of the power grid distributed scheduling module is evaluated according to the priority of the power equipment monitoring data, and the reliability of data processing is monitored in real time.
As a preferred scheme of the present invention, in step S1 the cloud platform framework connects to the power grid distributed scheduling modules in a grouping mode; the power grid distributed scheduling modules acquire power grid data in a timely manner through the distributed scheduling cluster and perform distributed data scheduling under the monitoring of the cloud platform framework.
As a preferred scheme of the present invention, the cloud platform framework uses two JDK resource packages, for a master node and a working node, for real-time communication; the master node and the working node communicate with each other in real time through Ping messages, and the operation of the working node is displayed in real time through the UI of the Spark platform.
As a preferred scheme of the present invention, the cloud platform orders the monitoring data according to the time-limit requirements of the monitored power grid data, specifically:
the time margin for processing alarm data on the cloud platform is calculated from the real-time limit requirement of the power grid alarm data, the delay in receiving and distributing the alarm data, and the time actually taken to process the alarm data, the margin being the time limit less the delay and the processing time;
within any time period, the time margins are sorted according to the priority-queue property, and the time margin of each item of alarm data in the queue at any given time is calculated;
this calculation takes into account the actual processing time at that moment and the sum of the pending data queued ahead of the alarm data in question.
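The margin calculation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the patent's own formula: the field names (`time_limit`, `delay`, `processing`) are hypothetical, and the rule that an alarm's queued margin is further reduced by the processing time of every alarm ahead of it is one plausible reading of the text.

```python
from dataclasses import dataclass


@dataclass
class Alarm:
    name: str
    time_limit: float   # real-time limit requirement, seconds (hypothetical field)
    delay: float        # reception and distribution delay, seconds
    processing: float   # time actually needed to process the alarm, seconds


def time_margin(alarm: Alarm) -> float:
    """Margin left if the alarm were processed as soon as it arrived."""
    return alarm.time_limit - alarm.delay - alarm.processing


def queue_margins(queue: list) -> list:
    """Margin of each queued alarm, also deducting the processing time
    of all alarms waiting ahead of it (an assumed queueing model)."""
    margins = []
    pending_ahead = 0.0
    for alarm in queue:
        margins.append(time_margin(alarm) - pending_ahead)
        pending_ahead += alarm.processing
    return margins


alarms = [
    Alarm("a", time_limit=10.0, delay=1.0, processing=2.0),
    Alarm("b", time_limit=6.0, delay=1.0, processing=1.0),
    Alarm("c", time_limit=20.0, delay=2.0, processing=3.0),
]
# Smallest margin first, so the most urgent alarm is at the head of the queue.
ordered = sorted(alarms, key=time_margin)
```

A negative value in `queue_margins(ordered)` would indicate an alarm that can no longer meet its time limit from its current position.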
As a preferred scheme of the present invention, the alarm events are prioritized according to the time margin of the alarm data at any given time, using an event insertion algorithm based on a priority event queue, wherein the event insertion algorithm is specifically:
the priority number of the event, its real-time requirement, and the queue into which it is inserted are determined according to the time margin of the alarm data;
the insertion position is determined by estimating the real-time margin, and the real-time performance of the alarm data at that position is checked: if the alarm data meets the real-time requirement there, it is inserted directly at the estimated position; if it does not, it is moved to the next position; and if the real-time requirement of the alarm data is still not met at the last position, insertion stops;
after the event has been inserted, the real-time performance of the events following the insertion position is checked, and if the real-time requirement of any subsequent event is no longer met, the event to be inserted is placed in the set of events that cannot be inserted;
and the queue is monitored using the minimum time margin of the inserted events, and if the queue does not satisfy the condition, the inserted event is placed in the set of events that cannot be inserted.
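The insertion steps above can be illustrated with a short Python sketch. It is an interpretation, not the patent's implementation: the `Event` fields, the deadline-based feasibility test, and the choice to try each position from the estimate onward until both the new event and all later events remain feasible are assumptions made for illustration.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Event:
    name: str
    deadline: float    # time limit minus reception/distribution delay (assumed)
    processing: float  # time needed to process the event


def feasible(queue):
    """True if every event in the queue finishes no later than its deadline."""
    elapsed = 0.0
    for ev in queue:
        elapsed += ev.processing
        if elapsed > ev.deadline:
            return False
    return True


def insert_event(queue, event, rejected):
    """Insert `event` at the first workable position at or after the estimate.

    The estimate comes from the event's deadline; if no position keeps the
    whole queue feasible, the event goes into `rejected` (the set of events
    that cannot be inserted) and the queue is left unchanged.
    """
    start = bisect_right([ev.deadline for ev in queue], event.deadline)
    for pos in range(start, len(queue) + 1):
        candidate = queue[:pos] + [event] + queue[pos:]
        if feasible(candidate):
            queue.insert(pos, event)
            return True
    rejected.append(event)
    return False


queue, rejected = [], []
results = [
    insert_event(queue, Event("A", deadline=5.0, processing=2.0), rejected),
    insert_event(queue, Event("B", deadline=3.0, processing=1.0), rejected),
    insert_event(queue, Event("C", deadline=4.0, processing=3.0), rejected),
    insert_event(queue, Event("D", deadline=10.0, processing=1.0), rejected),
]
```

Here event C is rejected because every position it could take would push either itself or event A past its deadline.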
As a preferred scheme of the invention, a dual-threshold parameter extraction algorithm is used to extract the monitoring data stream according to the priority order of the alarm events; a discharge model under the Storm platform framework simulates in real time the situation in which power grid alarm data from many high-voltage power facilities floods into the monitoring center under extreme conditions; the partial discharge data in the power grid data is processed and analyzed; and the partial discharge data parameters are combined with the Storm platform to construct a real-time spectrum analysis.
As a preferred scheme of the present invention, the dual-threshold parameter extraction algorithm is specifically:
first, the Storm platform is used to monitor in real time the serial number of each partial discharge monitoring source and the serial number of the spectrum that the source is currently to draw, and the number of signals is set according to the analysis requirements of that spectrum;
second, the parameters of each spectrum period are calculated and extracted for each monitoring source, the signal count at the time the spectrum is drawn is obtained from the phase parameter of the spectrum period, and it is determined whether the number of monitoring sources extracted exceeds the total number of monitoring sources;
finally, the corresponding spectrum of the monitoring source is drawn according to the signal count and the monitoring source serial number.
As a preferred scheme of the invention, a model markup language is introduced into the cloud platform framework, based on the real-time monitoring data of the Storm platform, to enable data exchange between the Storm platform and the Spark platform; a PMML-based distributed file system is introduced into the cloud platform framework to receive and distribute power grid data in real time; and an Alluxio model is introduced into the Spark platform to store the power grid data in real time.
As a preferred scheme of the present invention, the total amount of discharge data that the cloud platform framework can process per unit time is tested, according to the priority of the power grid alarm data, to evaluate the alarm data processing throughput, specifically:
a mode construction module, a source component FileFromDirSpout, and a data processing component BinaryDecimalmalbolt are introduced into the Spark platform; the number of work processes is held constant at a fixed value, the transmission interval is set in terms of the number of cycles of a single spectrum, and the throughput of the mode construction module is calculated to evaluate the reliability of real-time monitoring data processing.
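As a rough illustration of the throughput test, the sketch below times a processing function over a batch of records and reports records per second. It runs in a single Python process and does not use Storm or Spark; the function names `measure` and `throughput` and the stand-in processing step are hypothetical.

```python
import time


def throughput(records_processed: int, elapsed_seconds: float) -> float:
    """Records handled per second over the measured interval."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return records_processed / elapsed_seconds


def measure(process, records):
    """Run `process` on every record and return (count, elapsed seconds)."""
    start = time.perf_counter()
    count = 0
    for record in records:
        process(record)
        count += 1
    return count, time.perf_counter() - start


# Stand-in for the data processing component: any per-record function will do.
count, elapsed = measure(lambda sample: sample * 2, range(100_000))
```

Holding the number of worker processes fixed, as the text describes, means that differences in the measured rate can be attributed to the transmission interval rather than to parallelism.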
Compared with the prior art, the invention has the following beneficial effects:
To improve the efficiency of streaming data processing, the invention combines the parameter analysis of discharge data with a cloud platform: a dual-threshold filtering parameter extraction algorithm is designed for the cloud platform and submitted to the Storm platform, which improves the efficiency of parameter extraction and mode recognition and speeds up data processing. The structure of the cloud platform distributed scheduling system is improved by adding communication among the power grid distributed scheduling modules, and the reliability of data reception is improved by the event insertion algorithm based on a priority event queue.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It should be apparent that the drawings in the following description are merely exemplary, and that other embodiments can be derived from the drawings provided by those of ordinary skill in the art without inventive effort.
Fig. 1 is a flowchart of a model algorithm type selection evaluation method for power grid big data analysis according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in Fig. 1, the invention provides a model algorithm type selection evaluation method for power grid big data analysis, which comprises the following steps:
step S1, a Storm platform is used as the front-end framework for power grid data alarms and a Spark platform is used as the model training platform, together forming a cloud platform framework; a power grid distributed scheduling module is combined with the cloud platform framework, and the power grid data flow is analyzed to monitor power grid equipment alarm data;
step S2, the monitoring data are prioritized using an event insertion algorithm based on a priority event queue, a monitoring data stream is extracted from the alarm data using a dual-threshold parameter extraction algorithm, and the priority of the power equipment monitoring data is obtained;
step S3, a model markup language is introduced into the cloud platform framework to enable data exchange between the Storm platform and the Spark platform, so that monitoring alarm data can be processed rapidly;
and step S4, the system data throughput of the power grid distributed scheduling module is evaluated according to the priority of the power equipment monitoring data, and the reliability of data processing is monitored in real time.
In this embodiment, to solve the problem that monitoring stream alarm data accumulates in the monitoring background and is difficult to process, a Storm platform is used as the front-end framework for power data alarms and Spark is used as the model training platform, forming an overall cloud platform framework that monitors and processes data in real time.
In this embodiment, the data source is processed through a dual-threshold filtering parameter extraction algorithm under the Storm platform, so as to achieve the purpose of rapidly processing the monitoring stream data.
In this embodiment, a model markup language and a dual-threshold parameter extraction algorithm are introduced to a cloud platform composed of the Storm and Spark platforms to achieve fast processing of data, and then reliable reception of data under the cloud platform is achieved through an improved cloud platform distributed scheduling system and an event insertion algorithm based on a priority event queue.
In step S1, the cloud platform framework connects to the power grid distributed scheduling module in a grouping mode; the power grid distributed scheduling module acquires power grid data in a timely manner through the distributed scheduling cluster and performs distributed data scheduling under the monitoring of the cloud platform framework.
In this embodiment, the cloud-platform-based power grid distributed scheduling module adds communication between clusters. This retains the advantages of a distributed design, relieves the caching problem caused by congestion of large amounts of data, overcomes the inability of clusters to communicate with one another, and allows global information to be controlled in a timely manner.
The cloud platform framework uses two JDK resource packages, for a master node and a working node, for real-time communication; the master node and the working node communicate in real time through Ping messages, and the operation of the working node is displayed in real time through the UI (user interface) of the Spark platform.
In this embodiment, the cloud platform framework uses a Storm cluster to form a master node and working nodes, the master node is mainly responsible for monitoring the working conditions of each node and sending codes and tasks to each node, and the working nodes are mainly used for controlling the process-related states.
The cloud platform orders the monitoring data according to the time-limit requirements of the monitored power grid data, specifically:
the time margin for processing alarm data on the cloud platform is calculated from the real-time limit requirement of the power grid alarm data, the delay in receiving and distributing the alarm data, and the time actually taken to process the alarm data, the margin being the time limit less the delay and the processing time;
within any time period, the time margins are sorted according to the priority-queue property, and the time margin of each item of alarm data in the queue at any given time is calculated;
this calculation takes into account the actual processing time at that moment and the sum of the pending data queued ahead of the alarm data in question.
In this embodiment, each event is assigned a priority number according to the monitored power grid data. Events in the same queue are arranged so that each can meet its real-time margin, and different queues are processed in parallel. If queues of different priorities exist, the queue with the higher priority is processed first, and if a higher-priority queue appears during processing, the event currently being processed is suspended immediately so that the higher-priority queue can be processed.
In this embodiment, sorting the monitoring data according to the time-limit requirements of the monitored power grid data improves the cloud platform distributed scheduling system and mitigates the unreliable data reception that may occur on the cloud platform in extreme weather.
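The preemption behaviour described in this paragraph can be sketched with Python's standard `heapq` module. The class and method names are hypothetical, and the convention that a lower priority number means a more urgent event is an assumption; the text does not state which direction the numbering runs.

```python
import heapq
from itertools import count


class PriorityScheduler:
    """Toy scheduler: lower priority number = more urgent (assumed)."""

    def __init__(self):
        self._heap = []        # entries: (priority, margin, sequence, name)
        self._seq = count()    # tie-breaker that preserves arrival order
        self.running = None    # (priority, margin, name) or None

    def submit(self, priority, margin, name):
        """Queue an event; return the name of a preempted event, if any."""
        if self.running is not None and priority < self.running[0]:
            old_priority, old_margin, old_name = self.running
            # Suspend the running event and put it back in the queue.
            heapq.heappush(
                self._heap, (old_priority, old_margin, next(self._seq), old_name)
            )
            self.running = (priority, margin, name)
            return old_name
        heapq.heappush(self._heap, (priority, margin, next(self._seq), name))
        return None

    def start_next(self):
        """Start the most urgent waiting event (smallest margin within a priority)."""
        if not self._heap:
            self.running = None
            return None
        priority, margin, _, name = heapq.heappop(self._heap)
        self.running = (priority, margin, name)
        return name
```

Within one priority level the heap orders events by margin, which mirrors the statement that events in the same queue are arranged by their real-time margin.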
The alarm events are prioritized according to the time margin of the alarm data at any given time, using an event insertion algorithm based on a priority event queue, wherein the event insertion algorithm is specifically:
the priority number of the event, its real-time requirement, and the queue into which it is inserted are determined according to the time margin of the alarm data;
the insertion position is determined by estimating the real-time margin, and the real-time performance of the alarm data at that position is checked: if the alarm data meets the real-time requirement there, it is inserted directly at the estimated position; if it does not, it is moved to the next position; and if the real-time requirement of the alarm data is still not met at the last position, insertion stops;
after the event has been inserted, the real-time performance of the events following the insertion position is checked, and if the real-time requirement of any subsequent event is no longer met, the event to be inserted is placed in the set of events that cannot be inserted;
and the queue is monitored using the minimum time margin of the inserted events, and if the queue does not satisfy the condition, the inserted event is placed in the set of events that cannot be inserted.
In this embodiment, when an event is inserted, its priority number is considered first, and the insertion position is then determined from its real-time margin. If the insertion position does not meet the real-time requirement, the event cannot be inserted into the queue, so the priorities of the events already in the queue must also be considered once the event has been added. Because power equipment monitoring data involves more than one detection device, the monitoring data of each device is given a different priority to ensure that high-priority alarm data is processed first. On top of priority, the timeliness of monitoring data from different devices is also considered, since some devices place higher demands on data processing speed and time.
And extracting monitoring data flow from the alarm event sequencing priority by adopting a dual-threshold parameter extraction algorithm, simulating the condition that a plurality of high-voltage power facility power grid alarm data are gushed into a monitoring center under extreme conditions in real time by adopting a discharge model under a Storm platform frame, processing and analyzing partial discharge data in the power grid data, and combining the partial discharge data parameters with the Storm platform to construct real-time map analysis.
The dual-threshold parameter extraction algorithm specifically comprises the following steps:
first, the Storm platform is used to track in real time the serial number of the partial discharge monitoring source and the serial number of the spectrum currently to be drawn for that source, and the number of signals is set according to the analysis requirements of that spectrum;
second, the parameters of each spectrum period are calculated and extracted for the monitoring source, the signal count used when drawing the spectrum is obtained from the phase parameter of the spectrum period, and it is judged whether the number of monitoring sources extracted is greater than the total number of monitoring sources;
finally, the spectrum with the current serial number is drawn for the current monitoring source according to the signal count and the monitoring source serial number.
In this embodiment, a dual-threshold filtering method is used to extract the basic parameters because it is simple, effective, and easy to implement on the Storm platform. A discharge is identified by filtering the local extreme points twice, combining a threshold in the vertical direction, which represents the discharge amplitude, with a threshold in the horizontal direction. The Storm platform is mainly used to test data processing performance during mode construction; the streaming platform improves the speed and efficiency of mode construction, and the main test indexes are throughput, processing delay, and accuracy.
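The double filtering of local extreme points can be illustrated with a minimal Python sketch. It assumes that the vertical threshold is a minimum absolute amplitude and that the horizontal threshold is a minimum spacing, in samples, between two accepted pulses; the function name `dual_threshold_pulses` and both parameters are hypothetical, and the sketch runs on a plain list rather than inside a Storm bolt.

```python
def dual_threshold_pulses(samples, amp_threshold, min_gap):
    """Return the indices of samples accepted as discharge pulses.

    amp_threshold: vertical threshold on the absolute amplitude (assumption).
    min_gap: horizontal threshold, minimum index distance between two
             accepted pulses (assumption).
    """
    pulses = []
    for i in range(1, len(samples) - 1):
        v = abs(samples[i])
        # Keep only local extreme points of the magnitude.
        if not (v > abs(samples[i - 1]) and v >= abs(samples[i + 1])):
            continue
        # First filter: vertical (amplitude) threshold.
        if v < amp_threshold:
            continue
        # Second filter: horizontal (spacing) threshold.
        if pulses and i - pulses[-1] < min_gap:
            if v > abs(samples[pulses[-1]]):
                pulses[-1] = i   # keep the larger of two close extrema
            continue
        pulses.append(i)
    return pulses


if __name__ == "__main__":
    signal = [0, 0.2, 0, 1.5, 0.1, 1.8, 0, 0, -2.0, 0, 0.4, 0]
    print(dual_threshold_pulses(signal, amp_threshold=1.0, min_gap=3))
```

Keeping the larger of two extrema that fall inside the same spacing window is one possible design choice; discarding the later one would be an equally simple alternative.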
Based on the real-time monitoring data of the Storm platform, a model markup language is introduced into the cloud platform framework so that the Storm platform and the Spark platform can exchange data; a PMML-based distributed file system is introduced into the cloud platform framework to receive and distribute power grid data in real time; and an Alluxio model is introduced into the Spark platform to store the power grid data in real time.
In this embodiment, the model markup language and the Alluxio model enable the Storm platform and the Spark platform to work together and exchange models, so that the system's improvement in data processing speed can be evaluated.
The total amount of discharge data that the cloud platform framework can process per unit time is tested according to the priority of the power grid alarm data in the cloud platform framework, in order to evaluate the alarm data processing throughput. The test method is as follows:
a mode construction module, a source component FileFromDirSpout, and a data processing component BinaryDecimalmalbolt are introduced into the Spark platform; the number of worker processes is held at a fixed value, the transmission interval is set in terms of the number of single cycles of the spectrum, and the throughput of the mode construction module is calculated to evaluate the reliability of real-time monitoring data processing.
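As a rough illustration of the two indexes measured in such a test, the following Python sketch computes throughput and mean processing delay from recorded timestamps. It does not use any Storm or Spark API; the function name `evaluate_run` and the idea of logging one emit time and one completion time per record are assumptions made for the example.

```python
def evaluate_run(emit_times, done_times):
    """Return (throughput in records/s, mean processing delay in s).

    emit_times[i] is when record i left the source component and
    done_times[i] is when its processing finished (hypothetical logs).
    """
    if not emit_times or len(emit_times) != len(done_times):
        raise ValueError("need one completion time per emitted record")
    window = max(done_times) - min(emit_times)
    if window <= 0:
        raise ValueError("test window must have positive length")
    delays = [done - emit for emit, done in zip(emit_times, done_times)]
    return len(done_times) / window, sum(delays) / len(delays)


if __name__ == "__main__":
    throughput, delay = evaluate_run([0.0, 1.0, 2.0, 3.0],
                                     [0.5, 1.5, 2.5, 4.0])
    print(throughput, delay)
```

Repeating the measurement with the number of worker processes held constant and only the transmission interval varied would isolate the effect of the input rate on throughput.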
In this embodiment, the discharge test is based on the phase distribution mode, which mainly describes, for each partial discharge pulse, the corresponding power-frequency phase, the discharge magnitude or amplitude, and the discharge rate or number of discharges.
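A phase distribution pattern of this kind can be sketched as a simple histogram over the power-frequency phase. The Python example below is a minimal sketch under assumptions: pulses are given as (phase in degrees, amplitude) pairs, the phase axis is split into equal bins, and the function name `phase_resolved_pattern` and the default of 36 bins are hypothetical choices.

```python
def phase_resolved_pattern(pulses, n_bins=36):
    """Bin pulses by phase; return (counts, peak amplitudes) per bin.

    pulses: iterable of (phase_deg, amplitude) pairs.
    """
    counts = [0] * n_bins
    peaks = [0.0] * n_bins
    width = 360.0 / n_bins
    for phase, amplitude in pulses:
        # Wrap the phase into [0, 360) and map it to a bin index.
        b = min(int((phase % 360.0) // width), n_bins - 1)
        counts[b] += 1                            # number of discharges
        peaks[b] = max(peaks[b], abs(amplitude))  # largest amplitude seen
    return counts, peaks


if __name__ == "__main__":
    counts, peaks = phase_resolved_pattern(
        [(5, 1.0), (15, 2.0), (365, 3.0), (-10, 0.5)])
    print(counts[0], peaks[0], counts[35], peaks[35])
```

The per-bin count corresponds to the number of discharges and the per-bin peak to the discharge amplitude; dividing the counts by the observation time would give a discharge rate.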
To improve the efficiency of streaming data processing, the invention combines the parameter analysis of discharge data with a cloud platform: a dual-threshold filtering parameter extraction algorithm is designed for the cloud platform and submitted to the Storm platform, which improves the efficiency of parameter extraction and mode recognition and accelerates data processing. The structure of the cloud platform distributed scheduling system is also improved by adding communication among the power grid distributed scheduling modules, and the reliability of data reception is improved by an event insertion algorithm based on a priority event queue.
The above embodiments are only exemplary embodiments of the present application, and are not intended to limit the present application, and the protection scope of the present application is defined by the claims. Various modifications and equivalents may be made by those skilled in the art within the spirit and scope of the present application and such modifications and equivalents should also be considered to be within the scope of the present application.
Claims (9)
1. A model algorithm type selection evaluation method for power grid big data analysis, characterized by comprising the following steps:
step S1, using a Storm platform as the front-end framework for power grid data alarms and a Spark platform as the model training platform to form a cloud platform framework, combining a power grid distributed scheduling module with the cloud platform framework, and analyzing the power grid data stream to monitor power grid equipment alarm data;
step S2, sorting the monitoring data by priority using an event insertion algorithm based on a priority event queue, extracting the monitoring data stream from the alarm data using a dual-threshold parameter extraction algorithm, and obtaining the priority of the power equipment monitoring data;
step S3, introducing a model markup language into the cloud platform framework so that the Storm platform and the Spark platform can exchange data, in order to process and monitor alarm data rapidly;
and step S4, evaluating the system data throughput of the power grid distributed scheduling module and the reliability of real-time monitoring data processing according to the priority of the power equipment monitoring data.
2. The model algorithm type selection evaluation method for power grid big data analysis according to claim 1, wherein the step S1 includes:
the cloud platform framework interfaces with the power grid distributed scheduling modules in groups, the power grid distributed scheduling modules acquire power grid data in a timely manner through the distributed scheduling clusters, and distributed data scheduling is carried out under the monitoring of the cloud platform framework.
3. The model algorithm type selection evaluation method for power grid big data analysis according to claim 2, wherein the cloud platform framework uses two JDK resource packages for real-time communication between a master node and a worker node, the master node and the worker node communicate in real time through Ping statements, and the operation of the worker node is displayed in real time through the UI (user interface) of the Spark platform.
4. The model algorithm type selection evaluation method for power grid big data analysis according to claim 3, characterized by further comprising: the cloud platform sorts the monitoring data according to the time limit requirement of the monitored power grid data;
the sorting of the monitoring data by the cloud platform according to the time limit requirement of the monitored power grid data comprises the following steps:
calculating the time margin for processing the alarm data on the cloud platform according to the real-time limit requirement of the power grid alarm data, the delay in receiving and distributing the alarm data, and the actual processing time of the alarm data, the time margin being expressed in terms of these three quantities;
in any time period, sorting the time margins according to the priority queue property, and calculating the time margin of the alarm data at any time in the queue.
5. The model algorithm type selection evaluation method for power grid big data analysis according to claim 4, further comprising: according to the time margin of the alarm data at any time, sorting the alarm events by priority using an event insertion algorithm based on a priority event queue;
the event insertion algorithm comprises:
determining the priority number, the real-time performance, and the target queue of the event according to the time margin of the alarm data;
determining the insertion position by estimating the real-time margin, and checking the real-time performance of the alarm data at that position; if the alarm data meets the real-time requirement there, inserting it directly at the estimated position; if it does not, trying the next position; and if the real-time requirement is still not met at the last position, stopping the insertion;
after the insertion, checking the real-time performance of the events that follow the inserted position, and if a subsequent event no longer meets its real-time requirement, placing the event to be inserted in the set of events that cannot be inserted;
and monitoring the queue against a minimum time margin after the insertion, and if the queue does not satisfy this condition, placing the inserted event in the set of events that cannot be inserted.
6. The model algorithm type selection evaluation method for power grid big data analysis according to claim 5, wherein a dual-threshold parameter extraction algorithm is used to extract the monitoring data stream according to the priority ordering of the alarm events, a discharge model under the Storm platform framework is used to simulate in real time the extreme case in which alarm data from a plurality of high-voltage power facilities floods into a monitoring center, the partial discharge data in the power grid data is processed and analyzed, and the partial discharge parameters are combined with the Storm platform to construct real-time spectrum analysis.
7. The model algorithm type selection evaluation method for power grid big data analysis according to claim 6, wherein the dual-threshold parameter extraction algorithm is specifically as follows:
first, using the Storm platform to track in real time the serial number of the partial discharge monitoring source and the serial number of the spectrum currently to be drawn for that source, and setting the number of signals according to the analysis requirements of that spectrum;
second, calculating and extracting the parameters of each spectrum period for the monitoring source, obtaining from the phase parameter of the spectrum period the signal count used when drawing the spectrum, and judging whether the number of monitoring sources extracted is greater than the total number of monitoring sources;
8. The model algorithm type selection evaluation method for power grid big data analysis according to claim 6, characterized in that a model markup language is introduced into the cloud platform framework according to the real-time monitoring data of the Storm platform so that the Storm platform and the Spark platform can exchange data;
what is introduced into the cloud platform framework is a PMML-based distributed file system used to receive and distribute power grid data in real time, and what is introduced into the Spark platform is an Alluxio model used to store the power grid data in real time.
9. The model algorithm type selection evaluation method for power grid big data analysis according to claim 8, further comprising:
testing the total amount of discharge data that the cloud platform framework can process per unit time according to the priority of the power grid alarm data in the cloud platform framework, so as to evaluate the alarm data processing throughput;
wherein the testing comprises the following steps:
introducing a mode construction module, a source component FileFromDirSpout, and a data processing component BinaryDecimal bolt into the Spark platform, holding the number of worker processes at a fixed value, setting the transmission interval in terms of the number of single cycles of the spectrum, and calculating the throughput of the mode construction module to evaluate the reliability of real-time monitoring data processing.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210900978.8A CN114978962A (en) | 2022-07-28 | 2022-07-28 | Model algorithm type selection evaluation method for power grid big data analysis |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114978962A (en) | 2022-08-30 |
Family
ID=82970463
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210900978.8A Pending CN114978962A (en) | 2022-07-28 | 2022-07-28 | Model algorithm type selection evaluation method for power grid big data analysis |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114978962A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107968840A (en) * | 2017-12-15 | 2018-04-27 | 华北电力大学(保定) | A kind of extensive power equipment monitoring, alarming Real-time Data Processing Method and system |
CN108156192A (en) * | 2016-12-02 | 2018-06-12 | 联芯科技有限公司 | Android RIL message handling systems and method |
CN111432295A (en) * | 2020-03-18 | 2020-07-17 | 北京科东电力控制系统有限责任公司 | Power consumption information acquisition master station system based on distributed technology |
Non-Patent Citations (1)
Title |
---|
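Zhao Mingteng (赵铭滕): "Reliable reception and fast processing of streaming big data for power grid monitoring on a cloud platform" (云平台下电网监测流式大数据的可靠接收与快速处理), China Excellent Master's Theses, Engineering Science and Technology II, 2022 * |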
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109495317B (en) | Data network flow prediction method and device | |
CN105740975B (en) | A kind of equipment deficiency assessment and prediction technique based on data correlation relation | |
EP2266254B1 (en) | Available bandwidth estimation in a packet-switched communication network | |
CN107070683A (en) | The method and apparatus of data prediction | |
CN110348615B (en) | Cable line fault probability prediction method based on ant colony optimization support vector machine | |
CN103580905B (en) | A kind of method for predicting, system and flow monitoring method, system | |
CN110907755A (en) | Power transmission line online monitoring fault recognition method | |
CN105281945A (en) | Data flow-based deterministic network integrity fault detection method | |
CN104392069B (en) | A kind of WAMS delay character modeling method | |
CN102263676A (en) | Network bottleneck detection method | |
CN107402851A (en) | A kind of data recovery control method and device | |
CN105260253A (en) | Server failure measurement and calculation method and device | |
CN116484554A (en) | Topology identification method, device, equipment and medium for power distribution network | |
CN103246569A (en) | Method and device for representing high-performance calculation application characteristics | |
KR100576511B1 (en) | System and method for calculating real-time voltage stability risk index in power system using time series data | |
CN114978962A (en) | Model algorithm type selection evaluation method for power grid big data analysis | |
CN103529337A (en) | Method for recognizing nonlinear correlation between equipment failures and electric quantity information | |
CN109375146A (en) | A kind of filling mining method, system and the terminal device of electricity consumption data | |
CN106209404A (en) | Analyzing abnormal network flow method and system | |
CN106066415A (en) | For the method detecting the swindle in supply network | |
CN115225455B (en) | Abnormal device detection method and device, electronic device and storage medium | |
CN117130851A (en) | High-performance computing cluster operation efficiency evaluation method and system | |
CN106777313A (en) | Based on holographic time scale measurement electric network data calculated value and calculated value Component Analysis method | |
CN115660314A (en) | Shadow shielding diagnosis method and device, electronic equipment and storage medium | |
CN109462493A (en) | A kind of local area network monitoring method of PIN-based G |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20220830 |