CN114978962A - Model algorithm type selection evaluation method for power grid big data analysis

Info

Publication number
CN114978962A
Authority
CN
China
Prior art keywords
data
time
power grid
monitoring
real
Legal status
Pending
Application number
CN202210900978.8A
Other languages
Chinese (zh)
Inventor
罗金满
邹钟璐
赵善龙
叶思琪
余凌
袁咏诗
高承芳
冷颖雄
董彩红
刘丽媛
刘飘
林浩钊
Current Assignee
Dongguan Power Supply Bureau of Guangdong Power Grid Co Ltd
Original Assignee
Dongguan Power Supply Bureau of Guangdong Power Grid Co Ltd
Application filed by Dongguan Power Supply Bureau of Guangdong Power Grid Co Ltd
Priority to CN202210900978.8A
Publication of CN114978962A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0876 Network utilisation, e.g. volume of load or congestion level
    • H04L43/0888 Throughput
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24568 Data stream processing; Continuous queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/252 Integrating or interfacing systems involving database management systems between a Database Management System and a front-end application
    • H ELECTRICITY
    • H02 GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J13/00 Circuit arrangements for providing remote indication of network conditions, e.g. an instantaneous record of the open or closed condition of each circuitbreaker in the network; Circuit arrangements for providing remote control of switching means in a power distribution network, e.g. switching in and out of current consumers by using a pulse code signal carried by the network
    • H02J13/00001 Circuit arrangements for providing remote indication of network conditions, e.g. an instantaneous record of the open or closed condition of each circuitbreaker in the network; Circuit arrangements for providing remote control of switching means in a power distribution network, e.g. switching in and out of current consumers by using a pulse code signal carried by the network characterised by the display of information or by user interaction, e.g. supervisory control and data acquisition systems [SCADA] or graphical user interfaces [GUI]
    • H ELECTRICITY
    • H02 GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J13/00 Circuit arrangements for providing remote indication of network conditions, e.g. an instantaneous record of the open or closed condition of each circuitbreaker in the network; Circuit arrangements for providing remote control of switching means in a power distribution network, e.g. switching in and out of current consumers by using a pulse code signal carried by the network
    • H02J13/00002 Circuit arrangements for providing remote indication of network conditions, e.g. an instantaneous record of the open or closed condition of each circuitbreaker in the network; Circuit arrangements for providing remote control of switching means in a power distribution network, e.g. switching in and out of current consumers by using a pulse code signal carried by the network characterised by monitoring
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0604 Management of faults, events, alarms or notifications using filtering, e.g. reduction of information by using priority, element types, position or time
    • H04L41/0609 Management of faults, events, alarms or notifications using filtering, e.g. reduction of information by using priority, element types, position or time based on severity or priority
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/02 Capturing of monitoring data
    • H04L43/028 Capturing of monitoring data by filtering
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/12 Protocols specially adapted for proprietary or special-purpose networking environments, e.g. medical networks, sensor networks, networks in vehicles or remote metering networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Power Engineering (AREA)
  • Human Computer Interaction (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Environmental & Geological Engineering (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a model algorithm type selection evaluation method for power grid big data analysis, comprising the following steps: step S1, a Storm platform is used as the front-end framework for power grid data alarms and a Spark platform is used as the model training platform, together forming a cloud platform framework, and a power grid distributed scheduling module is combined with the cloud platform framework; step S2, the monitoring data are prioritized using an event insertion algorithm based on a priority event queue, and a monitoring data stream is extracted from the alarm data using a dual-threshold parameter extraction algorithm; step S3, a model markup language is introduced into the cloud platform framework to achieve data intercommunication between the Storm platform and the Spark platform; and step S4, the system data throughput of the power grid distributed scheduling module is evaluated according to the priority of the power equipment monitoring data. The method allows models to be exchanged between Storm and Spark, allows different systems under the cloud platform to share data across platforms, increases data processing speed, and improves the reliability of data reception.

Description

Model algorithm type selection evaluation method for power grid big data analysis
Technical Field
The invention relates to the technical field of power grid big data processing, and in particular to a model algorithm type selection evaluation method for power grid big data analysis.
Background
As smart grids develop and become more intelligent, condition monitoring is applied to power grid equipment more and more often. With the growth in power transmission and transformation equipment, power grid data are becoming multi-source, multi-type, and large-scale, and the era of big data has arrived for the power grid. Power grid data have great commercial and social value, and the requirements for data processing in power production, operation, and management are becoming stricter by the day. The volume of power equipment monitoring data is rising rapidly and its variety is gradually increasing, so the timeliness and efficiency of existing power grid systems are difficult to guarantee when they face massive monitoring data. In extreme weather, such as strong wind, snow and ice, or hail, the values monitored on power equipment frequently exceed their limits and alarm data are sent continuously, so that the monitoring center generates massive monitoring data within a short time. This challenges the power grid system, while the growth of power grid data also brings considerable social and commercial value, so it presents both challenges and opportunities.
Streaming big data from power grid monitoring is characterized by complex data types and rapid change, and also by high stream volume, high flow rate, and difficulty of storage. If a fault is found too late, a wrong judgment of its type or improper handling can cause serious consequences that are hard to estimate. Reliably receiving the power grid monitoring stream data without loss and processing it quickly are therefore essential for obtaining equipment state information effectively.
Analyzing streaming big data from power grid monitoring with Hadoop has the following main problems. First, when the data volume is large, the response time is several minutes or longer, so Hadoop may be unable to handle data that must be processed in real time. Second, big data places high real-time demands on data collection methods and technology. As power monitoring systems develop and mature, power grid monitoring data become more diverse and more numerous; the power monitoring system also stores a large number of monitoring parameters, configuration files for the monitored devices, and monitoring records such as lightning and weather data. During power grid faults and occasional severe weather, the alarm data generated frequently because monitored values exceed their limits can only accumulate in the monitoring background and cannot be processed effectively.
Disclosure of Invention
The invention aims to provide a model algorithm type selection evaluation method for power grid big data analysis, in order to solve the technical problems in the prior art that real-time data processing performance is poor and that alarm data generated frequently because monitored values exceed their limits accumulate in the monitoring background and cannot be processed effectively.
In order to solve the technical problems, the invention specifically provides the following technical scheme:
A model algorithm type selection evaluation method for power grid big data analysis comprises the following steps:
step S1, a Storm platform is used as the front-end framework for power grid data alarms and a Spark platform is used as the model training platform, together forming a cloud platform framework; a power grid distributed scheduling module is combined with the cloud platform framework, and the power grid data flow is analyzed to monitor power grid equipment alarm data;
step S2, the monitoring data are prioritized using an event insertion algorithm based on a priority event queue, a monitoring data stream is extracted from the alarm data using a dual-threshold parameter extraction algorithm, and the priority of the power equipment monitoring data is obtained;
step S3, a model markup language is introduced into the cloud platform framework to achieve data intercommunication between the Storm platform and the Spark platform, so that monitoring alarm data can be processed rapidly;
and step S4, the system data throughput of the power grid distributed scheduling module is evaluated according to the priority of the power equipment monitoring data, and the reliability of data processing is monitored in real time.
As a preferred scheme of the present invention, in step S1 the cloud platform framework connects to the power grid distributed scheduling modules in a grouping mode; the power grid distributed scheduling modules acquire power grid data in a timely manner through the distributed scheduling cluster and perform distributed data scheduling under the monitoring of the cloud platform framework.
As a preferred scheme of the present invention, the cloud platform framework uses two JDK resource packages, for a master node and a worker node, for real-time communication; the master node and the worker node communicate in real time through Ping statements, and the running state of the worker node is displayed in real time through the UI of the Spark platform.
As a preferred scheme of the present invention, the cloud platform orders the monitoring data according to the time-limit requirement of the monitored power grid data, specifically:
the time margin of alarm data processing on the cloud platform is calculated from the real-time limit requirement of the power grid alarm data, the delay in receiving and distributing the alarm data, and the actual processing time of the alarm data;
[Equation image not available: expression for the time margin in terms of these three quantities.]
in any time period, the alarm data are sorted by time margin according to the priority queue property, and the time margin of each item of alarm data in the queue at any time is calculated;
[Equation image not available: expression for the time margin of an item of alarm data in the queue at any time.]
wherein one term denotes the actual processing time at that time, and another denotes the total amount of pending data queued ahead of the item of alarm data.
As a preferred scheme of the present invention, the alarm events are prioritized according to the time margin of each item of alarm data at any time, using an event insertion algorithm based on a priority event queue, specifically:
the priority number of the event, its real-time requirement, and the queue into which it is inserted are determined from the time margin of the alarm data;
the insertion position is estimated from the real-time margin, and the real-time requirement of the alarm data is checked at that position; if the requirement is met, the event is inserted directly at the estimated position; if it is not met, the event is tried at the next position; if the requirement is still not met at the last position, insertion stops;
after the event is inserted, real-time checks are performed on the events that follow the insertion position; if a subsequent event no longer meets its real-time requirement, the event to be inserted is placed in the set of events that cannot be inserted;
and the queue containing the inserted event is monitored using the minimum time margin; if the queue does not meet the condition, the inserted event is placed in the set of events that cannot be inserted.
As a preferred scheme of the invention, a dual-threshold parameter extraction algorithm is used to extract the monitoring data stream according to the priority order of the alarm events. A discharge model under the Storm platform framework simulates in real time the situation in which alarm data from many high-voltage power facilities flood into the monitoring center under extreme conditions; the partial discharge data in the power grid data are processed and analyzed, and the partial discharge data parameters are combined with the Storm platform to construct real-time spectrum analysis.
As a preferred scheme of the present invention, the dual-threshold parameter extraction algorithm specifically includes:
first, the Storm platform is used to monitor in real time the serial number of the partial discharge monitoring source and the serial number of the spectrum that the monitoring source currently needs to draw, and the number of signals is set according to the analysis requirements of the spectrum with that serial number;
second, the parameters of the monitoring source for the corresponding spectrum period are calculated and extracted, the signal count used when the spectrum is drawn is obtained from the phase parameter of the spectrum period, and it is determined whether the number of monitoring sources extracted exceeds the total number of monitoring sources;
finally, the spectrum with the given serial number is drawn for the given monitoring source according to the signal count and the monitoring source serial number.
[The symbols for the monitoring source serial number, spectrum serial number, number of signals, spectrum period, and signal count were equation images in the original publication and are not available.]
As a preferred scheme of the invention, a model markup language is introduced into the cloud platform framework, based on the real-time monitoring data of the Storm platform, to achieve data intercommunication between the Storm platform and the Spark platform; a PMML-based distributed file system is introduced into the cloud platform framework to receive and distribute power grid data in real time, and Alluxio is introduced into the Spark platform to store the power grid data in real time.
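The text does not say how the PMML model is structured or exchanged, so the following is only a minimal sketch of the general idea: a model serialized as PMML by one platform can be read and scored by another. The model itself (a two-input linear regression), the field names `amplitude`, `pulse_count`, and `severity`, and the helper functions are all hypothetical; only the Python standard library is used, standing in for both platforms.

```python
import xml.etree.ElementTree as ET

# PMML 4.4 namespace; the document below is a hypothetical example model.
NS = {"pmml": "http://www.dmg.org/PMML-4_4"}

PMML_DOC = """
<PMML version="4.4" xmlns="http://www.dmg.org/PMML-4_4">
  <Header description="Hypothetical alarm-severity model"/>
  <DataDictionary numberOfFields="3">
    <DataField name="amplitude" optype="continuous" dataType="double"/>
    <DataField name="pulse_count" optype="continuous" dataType="double"/>
    <DataField name="severity" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="amplitude"/>
      <MiningField name="pulse_count"/>
      <MiningField name="severity" usageType="target"/>
    </MiningSchema>
    <RegressionTable intercept="0.5">
      <NumericPredictor name="amplitude" coefficient="2.0"/>
      <NumericPredictor name="pulse_count" coefficient="0.25"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
""".strip()


def load_regression(pmml_text):
    """Read the intercept and coefficients of the first regression table."""
    root = ET.fromstring(pmml_text)
    table = root.find("pmml:RegressionModel/pmml:RegressionTable", NS)
    intercept = float(table.get("intercept"))
    coefficients = {
        p.get("name"): float(p.get("coefficient"))
        for p in table.findall("pmml:NumericPredictor", NS)
    }
    return intercept, coefficients


def score(pmml_text, features):
    """Score one record with the linear model described by the PMML text."""
    intercept, coefficients = load_regression(pmml_text)
    return intercept + sum(c * features[name] for name, c in coefficients.items())


result = score(PMML_DOC, {"amplitude": 1.5, "pulse_count": 4.0})
print(result)  # 0.5 + 2.0 * 1.5 + 0.25 * 4.0 = 4.5
```

Because the model travels as plain XML, the producing side (training) and the consuming side (streaming alarm scoring) only need to agree on the PMML schema, not on a shared runtime.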
As a preferred scheme of the present invention, the total amount of discharge data that the cloud platform framework can process per unit time is tested according to the priority of the power grid alarm data, in order to evaluate the alarm data processing throughput, specifically:
a pattern construction module, a source component FileFromDirSpout, and a data processing component BinaryDecimalmalbolt are introduced into the Spark platform; the number of worker processes is held constant at a fixed value, the transmission interval is set to the single-period count of the spectrum, and the throughput of the pattern construction module is calculated to evaluate the reliability of real-time monitoring data processing.
[The fixed number of worker processes and the transmission interval were given as equation images in the original publication and are not available.]
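The throughput test can be pictured as "data processed divided by elapsed time". The sketch below is an assumption-laden illustration of that measurement only: it does not use Storm or Spark, it does not model FileFromDirSpout or the processing component, and the names `throughput` and `measure` are hypothetical.

```python
import time
from typing import Callable, Iterable


def throughput(total_bytes: int, elapsed_seconds: float) -> float:
    """Amount of data processed per unit time, in bytes per second."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return total_bytes / elapsed_seconds


def measure(process: Callable[[bytes], None], batches: Iterable[bytes]) -> float:
    """Feed every batch to `process` and return the observed throughput."""
    total = 0
    start = time.perf_counter()
    for batch in batches:
        process(batch)
        total += len(batch)
    # Guard against a zero reading on very coarse clocks.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return throughput(total, elapsed)


# Hypothetical stand-in for the data processing component: count non-zero bytes.
def count_nonzero(batch: bytes) -> None:
    sum(1 for b in batch if b)


rate = measure(count_nonzero, [bytes(range(256)) * 40 for _ in range(50)])
print(f"{rate:.0f} bytes/s")
```

Holding the number of worker processes fixed, as the text describes, keeps runs comparable: only the input rate and the processing logic change between measurements.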
Compared with the prior art, the invention has the following beneficial effects:
To improve the efficiency of streaming data processing, the invention combines the parameter analysis of discharge data with a cloud platform, designs a dual-threshold filtering parameter extraction algorithm for the cloud platform, and submits the algorithm to the Storm platform, which improves the efficiency of parameter extraction and pattern recognition and speeds up data processing. The structure of the cloud platform distributed scheduling system is improved by adding communication among the power grid distributed scheduling modules, and the reliability of data reception is improved by an event insertion algorithm based on a priority event queue.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It should be apparent that the drawings in the following description are merely exemplary, and that other embodiments can be derived from the drawings provided by those of ordinary skill in the art without inventive effort.
Fig. 1 is a flowchart of a model algorithm type selection evaluation method for power grid big data analysis according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in Fig. 1, the invention provides a model algorithm type selection evaluation method for power grid big data analysis, which comprises the following steps:
step S1, a Storm platform is used as the front-end framework for power grid data alarms and a Spark platform is used as the model training platform, together forming a cloud platform framework; a power grid distributed scheduling module is combined with the cloud platform framework, and the power grid data flow is analyzed to monitor power grid equipment alarm data;
step S2, the monitoring data are prioritized using an event insertion algorithm based on a priority event queue, a monitoring data stream is extracted from the alarm data using a dual-threshold parameter extraction algorithm, and the priority of the power equipment monitoring data is obtained;
step S3, a model markup language is introduced into the cloud platform framework to achieve data intercommunication between the Storm platform and the Spark platform, so that monitoring alarm data can be processed rapidly;
and step S4, the system data throughput of the power grid distributed scheduling module is evaluated according to the priority of the power equipment monitoring data, and the reliability of data processing is monitored in real time.
In this embodiment, to solve the problem that monitoring stream alarm data accumulate in the monitoring background and are difficult to process, a Storm platform is used as the front-end framework for power data alarms and a Spark platform is used as the model training platform, together forming the overall cloud platform framework, so that data are monitored and processed in real time.
In this embodiment, the data source is processed through a dual-threshold filtering parameter extraction algorithm under the Storm platform, so as to achieve the purpose of rapidly processing the monitoring stream data.
In this embodiment, a model markup language and a dual-threshold parameter extraction algorithm are introduced to a cloud platform composed of the Storm and Spark platforms to achieve fast processing of data, and then reliable reception of data under the cloud platform is achieved through an improved cloud platform distributed scheduling system and an event insertion algorithm based on a priority event queue.
In step S1, the cloud platform framework connects to the power grid distributed scheduling modules in a grouping mode; the power grid distributed scheduling modules acquire power grid data in a timely manner through the distributed scheduling cluster and perform distributed data scheduling under the monitoring of the cloud platform framework.
In this embodiment, the cloud-platform-based power grid distributed scheduling module adds communication between clusters. This retains the advantages of a distributed design, resolves the caching problem caused by heavy data congestion, overcomes the inability of clusters to communicate with one another, and keeps global information under timely control.
The cloud platform framework uses two JDK resource packages, for a master node and a worker node, for real-time communication; the master node and the worker node communicate in real time through Ping statements, and the running state of the worker node is displayed in real time through the UI of the Spark platform.
In this embodiment, the cloud platform framework uses a Storm cluster to form the master node and the worker nodes. The master node is mainly responsible for monitoring the working condition of each node and for sending code and tasks to each node, and the worker nodes mainly manage the state of their processes.
The cloud platform orders the monitoring data according to the time-limit requirement of the monitored power grid data, specifically:
the time margin of alarm data processing on the cloud platform is calculated from the real-time limit requirement of the power grid alarm data, the delay in receiving and distributing the alarm data, and the actual processing time of the alarm data;
[Equation image not available: expression for the time margin in terms of these three quantities.]
in any time period, the alarm data are sorted by time margin according to the priority queue property, and the time margin of each item of alarm data in the queue at any time is calculated;
[Equation image not available: expression for the time margin of an item of alarm data in the queue at any time.]
wherein one term denotes the actual processing time at that time, and another denotes the total amount of pending data queued ahead of the item of alarm data.
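The original expressions were equation images and cannot be recovered, so the sketch below rests on an assumed reading of the surrounding text: the margin is the real-time limit minus the reception and distribution delay minus the processing time, and an item in the queue additionally waits for the processing of everything ahead of it. The names `AlarmEvent`, `time_margin`, and `queued_margins` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AlarmEvent:
    name: str
    deadline: float    # real-time limit requirement, in seconds
    delay: float       # reception and distribution delay, in seconds
    processing: float  # actual processing time, in seconds


def time_margin(event: AlarmEvent) -> float:
    """Margin of one event handled immediately (assumed form of the expression)."""
    return event.deadline - event.delay - event.processing


def queued_margins(queue: List[AlarmEvent]) -> List[float]:
    """Margin of each event when the queue is served front to back.

    Every event also waits for the processing time of the events ahead of it.
    """
    margins = []
    waiting = 0.0
    for event in queue:
        margins.append(time_margin(event) - waiting)
        waiting += event.processing
    return margins


events = [
    AlarmEvent("line-icing", deadline=10.0, delay=0.5, processing=2.0),
    AlarmEvent("breaker-overheat", deadline=5.0, delay=0.5, processing=1.0),
]
# Sort by ascending margin so the most urgent alarm is served first.
queue = sorted(events, key=time_margin)
margins = queued_margins(queue)
print([e.name for e in queue], margins)
```

A negative entry in `margins` would mean that the corresponding alarm can no longer be handled within its real-time limit at that queue position.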
In this embodiment, each event is assigned a priority number according to the monitored power grid data. Events in the same queue are arranged so that their real-time margins can be met, and different queues are processed in parallel. If queues have different priorities, the higher-priority queue is processed first; if a higher-priority queue appears during processing, the event currently being processed is suspended immediately and the higher-priority queue is processed.
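The suspend-and-switch behaviour described here can be sketched as a small tick-based simulation. This is an illustration only, assuming that a lower priority number means higher urgency and that work is divided into unit ticks; the function `simulate` and the event names are hypothetical.

```python
import heapq
from typing import Dict, List, Optional, Tuple

# (priority number, event name, work in ticks); lower number = more urgent (assumed).
Arrival = Tuple[int, str, int]


def simulate(arrivals: Dict[int, List[Arrival]], horizon: int) -> List[Optional[str]]:
    """Return which event is being processed at each tick.

    At every tick the most urgent ready event runs, so a newly arrived
    higher-priority event suspends the one that was running.
    """
    ready: List[Tuple[int, int, str, int]] = []  # (priority, arrival order, name, remaining)
    trace: List[Optional[str]] = []
    order = 0
    for tick in range(horizon):
        for priority, name, work in arrivals.get(tick, []):
            heapq.heappush(ready, (priority, order, name, work))
            order += 1
        if not ready:
            trace.append(None)
            continue
        priority, seq, name, remaining = heapq.heappop(ready)
        trace.append(name)
        if remaining > 1:
            # Unfinished work goes back to the heap and may be preempted next tick.
            heapq.heappush(ready, (priority, seq, name, remaining - 1))
    return trace


trace = simulate({0: [(2, "routine", 3)], 1: [(1, "alarm", 2)]}, horizon=5)
print(trace)  # ['routine', 'alarm', 'alarm', 'routine', 'routine']
```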
In this embodiment, the monitoring data are ordered according to the time-limit requirement of the monitored power grid data and the cloud platform distributed scheduling system is improved, which mitigates the unreliable data reception that can occur on the cloud platform in extreme weather.
The alarm events are prioritized according to the time margin of each item of alarm data at any time, using an event insertion algorithm based on a priority event queue, specifically:
the priority number of the event, its real-time requirement, and the queue into which it is inserted are determined from the time margin of the alarm data;
the insertion position is estimated from the real-time margin, and the real-time requirement of the alarm data is checked at that position; if the requirement is met, the event is inserted directly at the estimated position; if it is not met, the event is tried at the next position; if the requirement is still not met at the last position, insertion stops;
after the event is inserted, real-time checks are performed on the events that follow the insertion position; if a subsequent event no longer meets its real-time requirement, the event to be inserted is placed in the set of events that cannot be inserted;
and the queue containing the inserted event is monitored using the minimum time margin; if the queue does not meet the condition, the inserted event is placed in the set of events that cannot be inserted.
In this embodiment, when an event is inserted, its priority number is considered first, and the insertion position is then determined from its real-time margin. If the insertion position does not meet the real-time requirement, the event cannot be inserted into the queue, so the priorities of the events already in the queue must also be considered after the event is added. Because power equipment monitoring data involve more than one monitoring device, the monitoring data of each device are given a different priority so that high-priority alarm data are processed first. On top of priority, the timeliness of monitoring data from different devices is also considered, since some devices place higher demands on data processing speed and time.
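One possible reading of the insertion procedure is sketched below. It is not the patented algorithm: the margin formula, the single-queue simplification, and the names `Event`, `margins`, and `try_insert` are assumptions. The sketch estimates a position from the event's own margin, checks the minimum margin of the whole queue at that position (which covers both the new event and the events pushed back by it), moves to the next position on failure, and places the event in the rejected set if no position works.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Event:
    name: str
    deadline: float    # time budget remaining when the event is queued (assumed)
    processing: float  # processing time


def margins(queue: List[Event]) -> List[float]:
    """Margin of each event: its deadline minus its own and all earlier processing."""
    out, elapsed = [], 0.0
    for ev in queue:
        elapsed += ev.processing
        out.append(ev.deadline - elapsed)
    return out


def try_insert(queue: List[Event], new: Event, rejected: List[Event]) -> Optional[int]:
    """Insert `new` where every event keeps a non-negative margin.

    Returns the insertion index, or None if `new` was added to `rejected`.
    """
    own = new.deadline - new.processing
    # Estimated position: ahead of the first event with a larger standalone margin.
    start = next(
        (i for i, ev in enumerate(queue) if ev.deadline - ev.processing > own),
        len(queue),
    )
    for pos in range(start, len(queue) + 1):
        candidate = queue[:pos] + [new] + queue[pos:]
        # The minimum margin covers the new event and every event behind it.
        if min(margins(candidate)) >= 0:
            queue.insert(pos, new)
            return pos
    rejected.append(new)
    return None


queue = [Event("a", deadline=2.0, processing=1.0), Event("b", deadline=6.0, processing=2.0)]
rejected: List[Event] = []
first = try_insert(queue, Event("c", deadline=4.0, processing=1.0), rejected)
second = try_insert(queue, Event("d", deadline=1.5, processing=2.0), rejected)
print(first, second, [e.name for e in queue], [e.name for e in rejected])
```

Checking the minimum margin of the candidate queue is a compact way to express the two checks in the text, the real-time check on the inserted event and the follow-up check on subsequent events.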
A dual-threshold parameter extraction algorithm is used to extract the monitoring data stream according to the sorted priority of the alarm events. A discharge model under the Storm platform framework simulates, in real time, the extreme case in which power grid alarm data from many high-voltage power facilities flood into the monitoring center; the partial discharge data in the power grid data are processed and analyzed, and the partial discharge data parameters are combined with the Storm platform to construct real-time spectrum analysis.
The dual-threshold parameter extraction algorithm is specifically as follows:
first, using the Storm platform to monitor in real time the serial number of the partial discharge monitoring source
Figure 49717DEST_PATH_IMAGE039
and the serial number of the spectrum currently to be drawn for that monitoring source
Figure 744003DEST_PATH_IMAGE040
and, according to the analysis requirements of the spectrum with serial number
Figure 333248DEST_PATH_IMAGE040
setting the required number of signals
Figure 378564DEST_PATH_IMAGE041
secondly, calculating and extracting, for the monitoring source numbered
Figure 773773DEST_PATH_IMAGE039
the parameters relative to the spectrum period numbered
Figure 8052DEST_PATH_IMAGE042
obtaining, from the phase parameter of that spectrum period, the signal count used when the spectrum is drawn
Figure 96094DEST_PATH_IMAGE043
and judging whether the number of monitoring sources extracted exceeds the total number of monitoring sources;
finally, according to the signal count
Figure 363127DEST_PATH_IMAGE043
and the monitoring source
Figure 562027DEST_PATH_IMAGE039
for the monitoring source numbered
Figure 965327DEST_PATH_IMAGE039
the spectrum numbered
Figure 161953DEST_PATH_IMAGE040
is drawn.
In this embodiment, a dual-threshold filtering method is used to extract the basic parameters, because it is simple, effective and easy to implement on the Storm platform. A discharge is identified by filtering the local extreme points twice, combining a threshold in the vertical direction, which represents the discharge amplitude, with a threshold in the horizontal direction. The Storm platform is used mainly to test data processing performance during pattern construction, where the streaming platform improves the speed and efficiency of pattern construction; the main test indicators are throughput, processing delay and accuracy.
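A minimal Python sketch of dual-threshold filtering over local extreme points is given below. It is an illustration only: the text does not state the threshold values or what the horizontal threshold measures, so the sketch assumes the vertical threshold is a minimum absolute amplitude and the horizontal threshold is a minimum spacing in samples between accepted pulses. The function name `dual_threshold_pulses` and its parameters are hypothetical.

```python
def dual_threshold_pulses(samples, amp_threshold, min_gap):
    """Return indices of local extrema accepted as discharge pulses.

    amp_threshold: vertical threshold on |amplitude| (assumed meaning).
    min_gap: horizontal threshold, minimum sample spacing (assumed meaning).
    """
    pulses = []
    for i in range(1, len(samples) - 1):
        prev, cur, nxt = abs(samples[i - 1]), abs(samples[i]), abs(samples[i + 1])
        is_peak = cur > prev and cur >= nxt      # local extreme point of |signal|
        if not is_peak or cur < amp_threshold:   # first filter: vertical threshold
            continue
        if pulses and i - pulses[-1] < min_gap:  # second filter: horizontal threshold
            # Too close to the previous pulse: keep only the larger extremum.
            if cur > abs(samples[pulses[-1]]):
                pulses[-1] = i
            continue
        pulses.append(i)
    return pulses


if __name__ == "__main__":
    signal = [0, 1, 0, 5, 0, 4, 0, 0, 0, -6, 0]
    print(dual_threshold_pulses(signal, amp_threshold=3, min_gap=3))
```

Counting the accepted indices that fall inside one spectrum period would give a per-period signal count of the kind the extraction steps refer to.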
According to the real-time monitoring data of the Storm platform, a model markup language is introduced into the cloud platform framework to achieve data interoperability between the Storm platform and the Spark platform: a distributed file system based on PMML (Predictive Model Markup Language) is introduced into the cloud platform framework to receive and distribute power grid data in real time, and an Alluxio model is introduced into the Spark platform to store the power grid data in real time.
In this embodiment, the model markup language and the Alluxio model enable the Storm platform and the Spark platform to work together and exchange models, so that the improvement in the system's data processing speed can be evaluated.
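To illustrate how a model markup language lets two platforms exchange a model, the Python sketch below writes a small linear model as a PMML `RegressionModel` document and then scores a record from the serialized text alone. It is a hedged sketch using only the standard library: the field names (`amplitude`, `rate`, `alarm_score`) and the helper names `export_linear_model` and `score` are hypothetical, the document is not validated against the PMML schema, and nothing here reflects how the Storm or Spark side is actually wired up.

```python
import xml.etree.ElementTree as ET

PMML_NS = "http://www.dmg.org/PMML-4_4"


def export_linear_model(intercept, coefficients, target="alarm_score"):
    """Serialize a linear model as a minimal PMML RegressionModel string."""
    root = ET.Element("PMML", {"version": "4.4", "xmlns": PMML_NS})
    ET.SubElement(root, "Header", {"description": "illustrative model"})
    dictionary = ET.SubElement(root, "DataDictionary")
    model = ET.SubElement(root, "RegressionModel", {"functionName": "regression"})
    schema = ET.SubElement(model, "MiningSchema")
    table = ET.SubElement(model, "RegressionTable", {"intercept": repr(intercept)})
    for name, coef in coefficients.items():
        ET.SubElement(dictionary, "DataField",
                      {"name": name, "optype": "continuous", "dataType": "double"})
        ET.SubElement(schema, "MiningField", {"name": name})
        ET.SubElement(table, "NumericPredictor",
                      {"name": name, "coefficient": repr(coef)})
    ET.SubElement(dictionary, "DataField",
                  {"name": target, "optype": "continuous", "dataType": "double"})
    ET.SubElement(schema, "MiningField", {"name": target, "usageType": "target"})
    return ET.tostring(root, encoding="unicode")


def score(pmml_text, record):
    """Evaluate the regression table found in a PMML string on one record."""
    ns = {"p": PMML_NS}
    table = ET.fromstring(pmml_text).find(".//p:RegressionTable", ns)
    total = float(table.get("intercept"))
    for pred in table.findall("p:NumericPredictor", ns):
        total += float(pred.get("coefficient")) * record[pred.get("name")]
    return total


if __name__ == "__main__":
    text = export_linear_model(1.0, {"amplitude": 2.0, "rate": 0.5})
    print(score(text, {"amplitude": 3.0, "rate": 4.0}))
```

Because the consumer needs only the serialized document, the platform that trains the model and the platform that applies it do not have to share code or runtime.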
The total amount of discharge data that the cloud platform framework can process per unit time is tested according to the priority of the power grid alarm data in the cloud platform framework, so as to evaluate the alarm data processing throughput. The test method is specifically as follows:
introducing a pattern construction module, a source component FileFromDirSpout and a data processing component BinaryDecimal bolt into the Spark platform, keeping the number of worker processes unchanged, with the fixed number of worker processes set to
Figure 916282DEST_PATH_IMAGE044
and the transmission interval set to the number of cycles of a single spectrum
Figure 653294DEST_PATH_IMAGE045
and calculating the throughput of the pattern construction module to evaluate the reliability of real-time monitoring data processing.
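As a rough illustration of the throughput test, the Python sketch below measures how many records a processing function handles per unit time, together with the mean per-record processing delay. It is a single-process stand-in rather than a Storm topology; `measure_throughput` and the injectable `clock` parameter are hypothetical names introduced for this example.

```python
import time


def measure_throughput(records, process, clock=time.perf_counter):
    """Return (records per second, mean per-record latency in seconds)."""
    latencies = []
    start = clock()
    for rec in records:
        t0 = clock()
        process(rec)
        latencies.append(clock() - t0)
    elapsed = clock() - start
    count = len(latencies)
    throughput = count / elapsed if elapsed > 0 else float("inf")
    mean_latency = sum(latencies) / count if count else 0.0
    return throughput, mean_latency


if __name__ == "__main__":
    rate, delay = measure_throughput(range(100_000), lambda r: r * r)
    print(f"{rate:.0f} records/s, mean delay {delay * 1e6:.2f} us")
```

Holding the number of worker processes fixed, as the test method does, keeps such measurements comparable between runs.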
In this embodiment, a discharge point test based on the phase distribution pattern is used, which mainly describes the power-frequency phase, the discharge magnitude or amplitude, and the discharge rate or number of discharges corresponding to the partial discharge pulses.
To improve the efficiency of streaming data processing, the invention combines parameter analysis of discharge data with a cloud platform, designs a dual-threshold filtering parameter extraction algorithm for the cloud platform and submits it to the Storm platform, which improves the efficiency of parameter extraction and pattern recognition and speeds up data processing. The structure of the cloud platform distributed scheduling system is improved by adding communication among the power grid distributed scheduling modules, and the reliability of data reception is improved by an event insertion algorithm based on a priority event queue.
The above embodiments are only exemplary embodiments of the present application, and are not intended to limit the present application, and the protection scope of the present application is defined by the claims. Various modifications and equivalents may be made by those skilled in the art within the spirit and scope of the present application and such modifications and equivalents should also be considered to be within the scope of the present application.

Claims (9)

1. A model algorithm type selection evaluation method for power grid big data analysis, characterized by comprising the following steps:
step S1, a Storm platform is used as a front end framework of power grid data alarm, a Spark platform is used as a training model platform to form a cloud platform framework, a power grid distributed scheduling module is combined with the cloud platform framework, and power grid data flow is analyzed to monitor power grid equipment alarm data;
step S2, sorting the priority of the monitoring data by an event insertion algorithm based on a priority event queue, extracting the monitoring data stream from the alarm data by a dual-threshold parameter extraction algorithm, and obtaining the priority of the power equipment monitoring data;
step S3, introducing a model markup language into the cloud platform framework to achieve data interoperability between the Storm platform and the Spark platform, so as to process and monitor the alarm data rapidly;
and step S4, evaluating the system data throughput of the power grid distributed scheduling module according to the priority of the power equipment monitoring data, and monitoring the data processing reliability in real time.
2. The model algorithm type selection evaluation method for power grid big data analysis according to claim 1, wherein the step S1 includes:
the cloud platform framework is connected to the power grid distributed scheduling modules in groups, the power grid distributed scheduling modules acquire power grid data in a timely manner through distributed scheduling clusters, and distributed data scheduling is carried out under the monitoring of the cloud platform framework.
3. The model algorithm type selection evaluation method for power grid big data analysis according to claim 2, wherein the cloud platform framework uses two JDK resource packages for real-time communication between a master node and worker nodes, the master node and the worker nodes communicate in real time through Ping statements, and the running status of the worker nodes is displayed in real time through the UI (user interface) of the Spark platform.
4. The model algorithm type selection evaluation method for power grid big data analysis according to claim 3, characterized by further comprising: the cloud platform sorts the monitoring data according to the time-limit requirement for monitoring power grid data;
wherein the cloud platform sorting the monitoring data according to the time-limit requirement for monitoring power grid data comprises:
according to the real-time limit requirement of the power grid alarm data
Figure 934065DEST_PATH_IMAGE001
the delay in receiving and distributing the alarm data
Figure 301592DEST_PATH_IMAGE002
and the actual processing time of the alarm data
Figure 277638DEST_PATH_IMAGE003
calculating the time margin of alarm data processing on the cloud platform
Figure 818341DEST_PATH_IMAGE004
wherein the time margin
Figure 461812DEST_PATH_IMAGE005
is expressed as:
Figure 796979DEST_PATH_IMAGE006
in any time period, according to the priority queue property, the time margins
Figure 525900DEST_PATH_IMAGE007
are sorted, and for any time
Figure 791665DEST_PATH_IMAGE008
the alarm data
Figure 289643DEST_PATH_IMAGE009
in the queue has a time margin
Figure 795710DEST_PATH_IMAGE010
expressed as:
Figure 11928DEST_PATH_IMAGE011
wherein
Figure 894433DEST_PATH_IMAGE012
denotes the actual processing time at any time
Figure 184601DEST_PATH_IMAGE008
and
Figure 127149DEST_PATH_IMAGE013
Figure 565083DEST_PATH_IMAGE014
denotes the sum of the data to be processed that is queued ahead of alarm data
Figure 985700DEST_PATH_IMAGE015
in the queue.
5. The model algorithm type selection evaluation method for power grid big data analysis according to claim 4, further comprising: for the alarm data at any time
Figure 458270DEST_PATH_IMAGE016
according to its time margin
Figure 306140DEST_PATH_IMAGE017
sorting the alarm events by priority using an event insertion algorithm based on a priority event queue;
the event insertion algorithm comprises:
for the alarm data
Figure 922716DEST_PATH_IMAGE016
according to its time margin
Figure 147024DEST_PATH_IMAGE018
determining the priority number and the real-time performance of the event and the queue into which it is to be inserted;
by estimating the real-time margin
Figure 474100DEST_PATH_IMAGE018
determining the insertion position, and checking whether alarm data
Figure 696134DEST_PATH_IMAGE016
meets the real-time requirement at that position; if alarm data
Figure 108661DEST_PATH_IMAGE016
meets the real-time requirement, it is inserted directly at the estimated position; if alarm data
Figure 871081DEST_PATH_IMAGE016
does not meet the real-time requirement, insertion is attempted at the next position; and if, even at the last position, alarm data
Figure 239614DEST_PATH_IMAGE016
still does not meet the real-time requirement, insertion is stopped;
after the insertion is finished, checking the real-time performance of the events located after the insertion position, and if a subsequent event no longer meets its real-time requirement, putting the event to be inserted into the set of events that cannot be inserted;
and monitoring the minimum time margin of the queue into which the event was inserted, and if the queue does not meet the condition, putting the inserted event into the set of events that cannot be inserted.
6. The model algorithm type selection evaluation method for power grid big data analysis according to claim 5, wherein a dual-threshold parameter extraction algorithm is used to extract the monitoring data stream according to the sorted priority of the alarm events, a discharge model under the Storm platform framework is used to simulate, in real time, the extreme case in which power grid alarm data from many high-voltage power facilities flood into the monitoring center, the partial discharge data in the power grid data are processed and analyzed, and the partial discharge data parameters are combined with the Storm platform to construct real-time spectrum analysis.
7. The model algorithm type selection evaluation method for power grid big data analysis according to claim 6, wherein the dual-threshold parameter extraction algorithm is specifically as follows:
first, using the Storm platform to monitor in real time the serial number of the partial discharge monitoring source
Figure 694866DEST_PATH_IMAGE019
and the serial number of the spectrum currently to be drawn for that monitoring source
Figure 594689DEST_PATH_IMAGE020
and, according to the analysis requirements of the spectrum with serial number
Figure 160800DEST_PATH_IMAGE020
setting the required number of signals
Figure 196889DEST_PATH_IMAGE021
secondly, calculating and extracting, for the monitoring source numbered
Figure 823042DEST_PATH_IMAGE019
the parameters relative to the spectrum period numbered
Figure 147844DEST_PATH_IMAGE022
obtaining, from the phase parameter of that spectrum period, the signal count used when the spectrum is drawn
Figure 252066DEST_PATH_IMAGE023
and judging whether the number of monitoring sources extracted exceeds the total number of monitoring sources;
finally, according to the signal count
Figure 408241DEST_PATH_IMAGE023
and the monitoring source
Figure 205296DEST_PATH_IMAGE019
for the monitoring source numbered
Figure 502547DEST_PATH_IMAGE019
the spectrum numbered
Figure 676040DEST_PATH_IMAGE024
is drawn.
8. The model algorithm type selection evaluation method for power grid big data analysis according to claim 6, characterized in that a model markup language is introduced into the cloud platform framework according to the real-time monitoring data of the Storm platform to achieve data interoperability between the Storm platform and the Spark platform;
what is introduced into the cloud platform framework is a distributed file system based on PMML (Predictive Model Markup Language), used for receiving and distributing power grid data in real time, and what is introduced into the Spark platform is an Alluxio model, used for storing the power grid data in real time.
9. The model algorithm type selection evaluation method for power grid big data analysis according to claim 8, further comprising:
testing the total amount of discharge data that the cloud platform framework can process per unit time according to the priority of the power grid alarm data in the cloud platform framework, so as to evaluate the alarm data processing throughput;
wherein testing the total amount of discharge data that the cloud platform framework can process per unit time according to the priority of the power grid alarm data in the cloud platform framework, so as to evaluate the alarm data processing throughput, comprises:
introducing a pattern construction module, a source component FileFromDirSpout and a data processing component BinaryDecimal bolt into the Spark platform, keeping the number of worker processes unchanged, with the fixed number of worker processes set to
Figure 686721DEST_PATH_IMAGE025
and the transmission interval set to the number of cycles of a single spectrum
Figure 654677DEST_PATH_IMAGE026
and calculating the throughput of the pattern construction module to evaluate the reliability of real-time monitoring data processing.
CN202210900978.8A 2022-07-28 2022-07-28 Model algorithm type selection evaluation method for power grid big data analysis Pending CN114978962A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210900978.8A CN114978962A (en) 2022-07-28 2022-07-28 Model algorithm type selection evaluation method for power grid big data analysis

Publications (1)

Publication Number Publication Date
CN114978962A (en) 2022-08-30

Family

ID=82970463

Country Status (1)

Country Link
CN (1) CN114978962A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107968840A (en) * 2017-12-15 2018-04-27 North China Electric Power University (Baoding) Real-time data processing method and system for large-scale power equipment monitoring alarms
CN108156192A (en) * 2016-12-02 2018-06-12 Leadcore Technology Co., Ltd. Android RIL message handling system and method
CN111432295A (en) * 2020-03-18 2020-07-17 Beijing Kedong Electric Power Control System Co., Ltd. Power consumption information acquisition master station system based on distributed technology

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
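Zhao Mingteng: "Reliable reception and fast processing of streaming big data for power grid monitoring on a cloud platform", China Excellent Master's Theses, Engineering Science and Technology Series II, 2022 *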

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20220830