CN104616205A - Distributed log analysis based operation state monitoring method of power system - Google Patents

Distributed log analysis based operation state monitoring method of power system

Info

Publication number
CN104616205A
Authority
CN
China
Prior art keywords
log
log information
power system
point
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410681737.4A
Other languages
Chinese (zh)
Other versions
CN104616205B (en)
Inventor
曹宇
王梓
张岩
孟伶智
郄洪涛
舒力
李华
阎博
王桂茹
张�浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Tianjin Electric Power Co Ltd
Beijing Kedong Electric Power Control System Co Ltd
State Grid Jibei Electric Power Co Ltd
Original Assignee
State Grid Tianjin Electric Power Co Ltd
Beijing Kedong Electric Power Control System Co Ltd
State Grid Jibei Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Tianjin Electric Power Co Ltd, Beijing Kedong Electric Power Control System Co Ltd, State Grid Jibei Electric Power Co Ltd
Priority to CN201410681737.4A
Publication of CN104616205A
Application granted
Publication of CN104616205B
Legal status: Expired - Fee Related

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06Energy or water supply
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/18File system types
    • G06F16/1805Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815Journaling file systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/18File system types
    • G06F16/182Distributed file systems
    • HELECTRICITY
    • H02GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02JCIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00Circuit arrangements for ac mains or ac distribution networks
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Economics (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Water Supply & Treatment (AREA)
  • Tourism & Hospitality (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Primary Health Care (AREA)
  • Marketing (AREA)
  • Human Resources & Organizations (AREA)
  • General Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Power Engineering (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a power system operation state monitoring method based on distributed log analysis. The method comprises the steps of: S1, acquiring log information of the power system and merging it into a log file; S2, splitting the log file, processing it to obtain log information in a unified format, and serializing the log information in the log file entry by entry into a distributed storage system; S3, extracting the log information from the distributed storage system, classifying it with a log analysis algorithm based on state-noise-removal clustering in combination with the Map-Reduce mechanism, and analyzing the classified log information to monitor the system operation state. With this method, an abnormal operation state of the power system can be found in time and handled immediately, effectively meeting the power system's requirement for timely and efficient operation.

Description

A power system operation state monitoring method based on distributed log analysis
Technical field
The present invention relates to a method for monitoring the operation state of a power system, in particular to a power system operation state monitoring method based on distributed log analysis, and belongs to the technical field of power system dispatching.
Background
As power grids grow in scale and complexity, interconnected ultra-high-voltage grids place new requirements on integrated operation and unified, coordinated control, and national requirements for safe, stable, economical and environmentally friendly grid operation keep rising. Electric power big data has emerged in response: it is the application of big data theory, techniques and methods in the power industry. It covers every link of generation, transmission, transformation, distribution, consumption and dispatching, and combines cross-organization, cross-discipline and cross-business data analysis, data mining and data visualization.
In power system dispatching, as smart grid dispatching technical support systems are put into operation, the scope and types of grid data acquisition keep expanding, which plays an important role in all-round real-time monitoring of the interconnected grid. At present, control centers at all levels have built a series of dispatching and production management systems around the smart grid dispatching technical support system, mainly including SCADA/EMS, WAMS, hydropower and new energy, online monitoring and analysis of secondary equipment, dispatch planning, security checking and dispatch operation management. These systems are in operation, basically meet the needs of dispatching production, and play an important role in dispatching production management.
The safe and stable operation of a power system relies on the protection of low-level equipment such as relay protection and automatic devices. Such equipment alone, however, cannot fully guarantee safe operation: these devices usually handle faults using only local information and cannot use global information to predict, analyze and handle the various problems that arise during system operation. Log analysis technology for monitoring the system operation state therefore urgently needs to be developed.
At present, system log analysis in domestic power enterprises is still immature. Most system faults are still discovered through fault alarms and manual checks, and in many cases the system has already been malfunctioning for a long time when an alarm or manual check reveals the fault. Operation anomalies cannot be found in time and handled immediately, which greatly prolongs operation and maintenance and fails to meet the grid's requirement for timely, efficient operation. In addition, a power enterprise may have many different data analysis requirements every day, and the log data supplied is equally diverse; how to analyze and process such diverse log data in a unified way is also a pressing problem.
Summary of the invention
The technical problem to be solved by the present invention is to provide a power system operation state monitoring method based on distributed log analysis.
To achieve the above object, the present invention adopts the following technical solution:
A power system operation state monitoring method based on distributed log analysis comprises the following steps:
S1, acquiring log information of the power system and merging it into a log file;
S2, splitting the log file, processing it to obtain log information in a unified format, and serializing the log information in the log file entry by entry into a distributed storage system;
S3, extracting the log information from the distributed storage system, classifying it with a log analysis algorithm based on state-noise-removal clustering in combination with the Map-Reduce mechanism, and monitoring the operation state of the power system by analyzing the classified log information.
Preferably, in step S1, a syslog-based log scanning and capture method is used to acquire the log information.
Preferably, the log scanning and capture method comprises the following steps:
S11, selecting and merging the log information captured by the seed modules on each node of the power system to obtain the various types of log information of that node;
S12, in each zone of the power system, capturing and merging the various types of log information of every node to obtain the integrated data of the zone, sending it to the zone's data processing node for processing, and storing the result in the log file;
S13, obtaining the selected and merged log information and the capture records of the nodes that capture log information, deriving the merge-and-capture strategy for the log information by analysis, and adjusting that strategy as required.
Preferably, monitoring the system operation state with the log analysis algorithm based on state-noise-removal clustering in combination with the Map-Reduce mechanism comprises the following steps:
S31, extracting the log information from the distributed storage system, roughly classifying it by log category according to the position of the node that captured it, constructing a similarity matrix within each category, and selecting one point in the category set as the center point;
S32, sparsifying the similarity matrix of each category with the k-nearest-neighbor algorithm, and building a shared nearest neighbor graph covering all log categories from the sparsified similarity matrices;
S33, using the Map mechanism to collect, for each point in the shared nearest neighbor graph, the distances from that point to the other points;
S34, using the Reduce mechanism to sum the distances collected by the Map mechanism and generate new key-value pairs;
S35, selecting the point with the largest distance sum as the center point of the similarity matrix, replacing the original center point, and marking points whose distance sum is below a length threshold as noise so that they are no longer used as cluster center points;
S36, among the links between points, removing the links whose weight is below a threshold and taking the points that remain linked to one another as a cluster, so that each cluster represents one category of log information;
S37, further analyzing the log information of each category to obtain information that reflects the operation state of the power system, and monitoring the grid operation state by observing changes in that information.
Preferably, in step S31, the log information categories comprise three classes: system logs, access logs and user operation logs.
Preferably, in step S32, building the shared nearest neighbor graph covering all log categories comprises the following steps:
First, the neighbor lists of log entries A and B are determined with the k-nearest-neighbor algorithm, and a link is established between the two points when each appears in the other's neighbor list. Next, the similarity in the similarity matrix between a point and any point it is not linked to is set to zero, which sparsifies the matrix. Finally, the linked point pairs and their link weights are drawn, completing the shared nearest neighbor graph covering all log categories;
The weight of the link between two points is their similarity str(i, j), computed as: str(i, j) = ∑ (k + 1 - m) * (k + 1 - n);
where k is the size of the neighbor lists of A and B, and m and n are the positions of a shared nearest neighbor in the neighbor lists of A and B respectively.
Preferably, the power system operation state monitoring method based on distributed log analysis further comprises the following step:
S4, according to the operating conditions of the power system, determining the indicators that need special attention and the log information categories they belong to, and monitoring those indicators separately within the corresponding log information categories to monitor the operation state of the power system.
Preferably, step S4 further comprises the following steps:
S41, parsing the log information and determining the log information category to which each indicator needing special attention belongs;
S42, extracting the keywords needing special attention from the parsed log results, splicing them into field names, and setting the value of each to 1;
S43, using the Reduce mechanism to count, within the log information category, the number of times each field name occurs, and generating and outputting new key-value pairs;
S44, extracting and analyzing the information in the key-value pairs to monitor the operation state of the power system.
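Steps S41 to S44 resemble a keyword count expressed as map and reduce functions. The following is a minimal single-process Python sketch of that idea, not the patented implementation: the watched keywords, the sample log lines and the function names map_record and reduce_counts are all hypothetical, and a real deployment would run the two phases as Hadoop tasks.

```python
from collections import defaultdict

# Hypothetical keywords an operator wants to watch; not taken from the patent.
WATCHED_KEYWORDS = ("timeout", "overload", "login failed")

def map_record(category, message):
    """Map step: emit (field_name, 1) for every watched keyword in a log line."""
    lowered = message.lower()
    for keyword in WATCHED_KEYWORDS:
        if keyword in lowered:
            # Field name = log category spliced with the keyword (step S42).
            yield f"{category}:{keyword}", 1

def reduce_counts(pairs):
    """Reduce step: sum the 1s per field name (step S43)."""
    totals = defaultdict(int)
    for field_name, value in pairs:
        totals[field_name] += value
    return dict(totals)

# Made-up sample records: (log category, message).
records = [
    ("system", "CPU overload on node-3"),
    ("access", "Login failed for user ops1"),
    ("system", "Disk write timeout"),
    ("system", "CPU overload on node-7"),
]
pairs = [pair for category, message in records for pair in map_record(category, message)]
keyword_counts = reduce_counts(pairs)
```

The resulting keyword_counts dictionary plays the role of the new key-value pairs of step S43; step S44 would then inspect how these counts change over time.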
The power system operation state monitoring method provided by the present invention acquires log information from the power system, classifies it with a log analysis algorithm based on state-noise-removal clustering in combination with the Map-Reduce mechanism, and monitors the operation state of the power system by analyzing the classified log information. When the system behaves abnormally, the anomaly can be found in time and handled immediately, effectively meeting the power system's requirement for timely, efficient operation.
Brief description of the drawings
Fig. 1 is a flowchart of the power system operation state monitoring method provided by the present invention;
Fig. 2 is a structural diagram of the web crawler system used in the present invention to collect log information;
Fig. 3 is a flowchart of log information collection in the present invention;
Fig. 4 is a flowchart of monitoring the system operation state with the log analysis algorithm based on state-noise-removal clustering in the present invention;
Fig. 5 is a flowchart of the statistical analysis of log information for specific fields.
Detailed description of the embodiments
The technical content of the present invention is described in further detail below with reference to the drawings and specific embodiments.
As shown in Fig. 1, the power system operation state monitoring method based on distributed log analysis provided by the present invention comprises the following steps. First, log information of the power system is acquired with a syslog-based log scanning and capture technique and merged into a log file. Next, the log file is split, and message prefixes and suffixes are combined to give the log information a unified log data format; the log information is then serialized entry by entry into a distributed storage system (HDFS/HBase). Finally, in combination with the Map-Reduce mechanism in Hadoop, a log analysis algorithm based on state-noise-removal clustering classifies the log information, and the operation state of the power system is monitored by analyzing the classified log information. This process is described in detail below.
S1, acquiring log information of the power system with a syslog-based log scanning and capture method, and merging it into a log file.
Data acquisition, also called data collection, is the process of using a tool to collect data from outside a system and feed it into the system. With the rapid development of the internet industry, the field of data acquisition has changed significantly and is now widely applied in internet and distributed settings. In the power industry, data acquisition means collecting, in some concrete way (file, syslog, HTTP, etc.), the log information from the security devices, application systems and other sources of interest that is needed for power system monitoring and fault analysis.
Log collection is one of the key technologies of log analysis. It must gather log information from various security devices, application systems and so on to supply data for upper-layer event analysis. Log collection is therefore the basis on which the system performs detection and decision-making, and its accuracy, reliability and efficiency directly affect the performance of the whole system.
In one embodiment of the present invention, the analyzed log information mainly comprises three classes: system logs, access logs and user operation logs, and the log information of the power system is acquired with a syslog-based log scanning and capture method. The system log (syslog) protocol was developed as part of the TCP/IP implementation of the Berkeley Software Distribution (BSD) at the University of California and has since become an industry-standard protocol that can be used to record the logs of systems and devices. On UNIX/Linux systems and on network devices such as routers and switches, syslog records events in the system, and administrators can keep track of system status at any time by checking the system log. On UNIX/Linux, the syslogd process records system events and can also record application events; with suitable configuration, machines running the syslog protocol can also communicate with one another. By analyzing these network behavior logs, conditions relating to systems, devices and the network can be traced and understood.
In one embodiment of the invention, the log scan grasping means based on syslog mode adopts the network crawler system being applied to system journal scanning crawl to carry out real time scan and grasping system daily record, for follow-up running state monitoring is prepared.Web crawlers (Spider) refers to follows http protocol, travels through the software program of information space according to the index relative between hyperlink wherein and Web page document.
Network crawler system comprises Seed Management Module, handling module and reptile daily record data information extraction and statistical module; Realize the network crawler system structural drawing of log information collection as shown in Figure 2, reptile daily record data information extraction and statistical module capture node from Seed Management Module and handling module and obtain log information, first back up at home server, then compress according to the mode of HadoopLzop, by Internet Transmission, packed data is uploaded to HDFS, Hive generates Map-Reduce task according to the daily record plan of resolving, submit to Hadoop cluster in Job mode, its result of calculation is stored in reptile data system.Cluster Job dispatching system is responsible for Job task scheduling, and to realize effective utilization of resource, the running status of group operation monitoring record Job task, network monitoring can be monitored the running status of system.
Acquiring log information with the web crawler system comprises the following steps:
S11, the seed management module is distributed on each node of the power system; it selects and merges the log data captured by the seed modules on that node to obtain the various types of log information of the node.
In the power system, multiple seed modules are distributed on each node to capture the log information the node produces during operation, such as system information, access information and information from the advanced applications. The seed management module is likewise distributed on each node; it selects and merges the log information captured by the seed modules to obtain the various types of log information of the node.
S12, capture modules are distributed in Zone 1, Zone 2 and Zone 3 of the power system; each captures and merges the log information collected by the seed management modules of the nodes in its zone to obtain the integrated data of the zone and sends it to the zone's data processing node, where the data is processed and stored in the log file.
Seed management modules are distributed on every node in Zone 1, Zone 2 and Zone 3 of the power system, and a capture module is deployed in each of the three zones. Each capture module captures and merges the log information collected by the seed management modules in its zone to obtain the integrated data of the zone and sends it to the zone's data processing node, where the data is processed and the resulting log information is stored in the log file.
S13, the crawler log data extraction and statistics module obtains the selected and merged log information from the seed management module and the capture module, obtains capture records from the nodes that capture log information, derives the merge-and-capture strategy for the log information by analysis, and can adjust that strategy promptly as required.
The crawler log data extraction and statistics module is responsible for adjusting the capture strategy. It obtains the selected and merged log information from the seed management module and the capture module, and it obtains capture records from the nodes that capture log information. By analyzing this information it derives the merge-and-capture strategy of the whole crawler system. When a system problem occurs, the strategy can be adjusted promptly according to the log types the problem involves, so that the seed management module and the capture module capture only the log information relevant to the problem. This reduces the volume of log information and the time needed to process it, and improves operation and maintenance efficiency.
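The two-level merge of steps S11 to S13 can be pictured with plain dictionaries. The sketch below is an assumption-laden illustration rather than the crawler described above: node names, log lines and the functions merge_node_logs and merge_zone_logs are hypothetical, and the capture strategy is reduced to an optional set of wanted categories.

```python
from collections import defaultdict

def merge_node_logs(seed_outputs):
    """S11: merge the captures of all seed modules on one node by category."""
    merged = defaultdict(list)
    for captured in seed_outputs:
        for category, lines in captured.items():
            merged[category].extend(lines)
    return dict(merged)

def merge_zone_logs(node_logs, wanted_categories=None):
    """S12: merge the nodes of one zone. S13: optionally keep only some categories."""
    merged = defaultdict(list)
    for node, by_category in node_logs.items():
        for category, lines in by_category.items():
            if wanted_categories is None or category in wanted_categories:
                # Keep the node name so the origin of each line stays traceable.
                merged[category].extend((node, line) for line in lines)
    return dict(merged)

# Made-up captures from two nodes of one zone.
node_a = merge_node_logs([{"system": ["cpu 91%"]},
                          {"access": ["GET /status"], "system": ["mem 70%"]}])
node_b = merge_node_logs([{"system": ["disk 40%"]}])
zone_nodes = {"node-a": node_a, "node-b": node_b}

zone_all = merge_zone_logs(zone_nodes)
# Adjusted strategy: a problem involves only system logs, so capture only those.
zone_system_only = merge_zone_logs(zone_nodes, wanted_categories={"system"})
```

Restricting wanted_categories mirrors the stated benefit of adjusting the strategy: less log information to move and process while a specific problem is being investigated.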
S2, splitting the log file, processing it to obtain log information in a unified format, and serializing the log information in the log file entry by entry into a distributed storage system (HDFS/HBase).
The log file is split with the Flume tool, and a custom log data format is defined by combining message prefixes and suffixes so that log information of different categories shares a unified log data format. The log information is serialized entry by entry into the distributed storage system (HDFS/HBase), which facilitates the subsequent log analysis.
According to the actual needs of the power system, the analyzed log information mainly comprises three classes: system logs, access logs and user operation logs. System logs are used to monitor the system operation state, including system resource utilization and network device usage. Access logs are used to compile statistics on interactions with system hosts, such as access volume, accessing node information and access time. User operation logs are used to mine dispatching behavior patterns, mainly by modeling and analyzing the operation data of operating personnel. The three classes of log files are captured by the crawler and sent to the distributed storage system by the Flume tool in scheduled batches. Flume is a distributed log collection and transport tool. Its basic unit is the Agent, which comprises a data receiving end (source), a sending end (sink) and a channel. It is a highly scalable and flexible distributed tool that can collect not only unstructured text but also unstructured files such as video and audio. The log information collection flow is shown in Fig. 3: the process first checks whether a new log file has been produced; if so, it splits the log file, converts the log information to the unified format, and then serializes the processed log information entry by entry into the distributed system for later centralized analysis.
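The unification and entry-by-entry serialization of step S2 can be sketched as wrapping each raw message with a common metadata prefix and suffix and writing one JSON object per line. This is a hedged illustration only: the field names (ts, host, category, msg, source), the sample entries and the choice of JSON are assumptions, and the source performs this step with Flume rather than with code like this.

```python
import json

def to_unified_record(category, host, timestamp, body, source):
    """Wrap a raw message with a common prefix and suffix of metadata."""
    prefix = {"ts": timestamp, "host": host, "category": category}
    suffix = {"source": source}
    return {**prefix, "msg": body.strip(), **suffix}

def serialize_records(records):
    """Serialize the records one by one, one JSON object per output line."""
    return [json.dumps(record, ensure_ascii=False, sort_keys=True) for record in records]

# Made-up raw entries: (category, host, timestamp, message body, source file).
raw_entries = [
    ("system", "node-a", "2014-11-24T08:00:00", " cpu usage 91% \n", "/var/log/messages"),
    ("access", "node-b", "2014-11-24T08:00:02", "GET /status 200", "/var/log/access.log"),
]
unified = [to_unified_record(*entry) for entry in raw_entries]
serialized_lines = serialize_records(unified)
```

Because every line carries the same keys regardless of log category, a downstream Map-Reduce job can read system, access and user operation logs with a single parser.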
S3, extracting the log information from the distributed storage system (HDFS/HBase), classifying it with the log analysis algorithm based on state-noise-removal clustering in combination with the Map-Reduce mechanism in Hadoop, and monitoring the operation state of the power system by analyzing the classified log information.
In the grid's distributed data architecture, multiple data collectors are deployed across the network environment (in one embodiment of the present invention, the seed modules of the web crawler act as data collectors that gather log information in the power system). The operation of the data collectors and hosts therefore needs to be monitored centrally at the control center, and the system state monitored through the log information.
Clustering is the process of dividing a data set into groups or clusters such that the similarity between objects in the same group is maximized and the similarity between objects in different groups is minimized. Cluster analysis is an important and widely used technique in data analysis. From a statistical point of view, cluster analysis is one of the main branches of multivariate statistical analysis and a way of simplifying data through modeling; it mainly comprises distance-based and similarity-based clustering methods. From a machine learning point of view, clustering is an unsupervised learning method that needs neither predefined classes nor training examples with class labels.
In one embodiment of the present invention, log information is extracted from the distributed storage system (HDFS/HBase) and classified with the log analysis algorithm based on state-noise-removal clustering in combination with the Map-Reduce mechanism in Hadoop. After processing, each cluster contains log information of only one category. Indicators that represent the operation state of the power system, such as service information, can be found within a single category, and the current operation state of the power system is obtained by comparing and analyzing the indicators' operating information. For example, when an indicator shows operating data inconsistent with normal power system operation, the indicator is abnormal, and the equipment associated with it can be serviced quickly, greatly reducing the system's operation and maintenance time.
As shown in Fig. 4, monitoring the system operation state with the log analysis algorithm based on state-noise-removal clustering in combination with the Map-Reduce mechanism in Hadoop comprises the following steps:
S31, extracting the log information from the distributed storage system (HDFS/HBase), roughly classifying it into system logs, application logs, access logs and so on according to the position of the node that captured it, constructing a similarity matrix within each category, and randomly selecting one point in the category set as the center point.
In the power system, some nodes are located on the basic platform, and the log information captured there consists mainly of system logs. Some nodes are application nodes, and the log information captured there consists mainly of application logs. Some nodes do not interact with other nodes and therefore have no access logs, while some nodes have logs of all types. In one embodiment of the present invention, the collected log information is roughly classified according to node position: the nodes that contain only a single class of log information are first grouped into classes, and the union is then taken with the nodes that contain all three classes of log information. The rough classification yields log files of different categories; a similarity matrix is constructed within each category, and one point in the category set is randomly selected as the center point. In one embodiment of the present invention, each log entry in a log file forms one point. The similarity matrix is a square matrix whose elements are the similarities between each point and every other point.
The similarity of two log entries is defined by their shared nearest neighbor interval, that is, it is determined by the nearest neighbors they have in common. In one embodiment of the present invention, the neighbor lists of log A and log X are determined with the k-nearest-neighbor algorithm, and a link is established between the two points if and only if A and X each appear in the other's neighbor list. The points X linked to A form one set, the points linked to B form another, and the intersection of the two sets is the shared nearest neighbor interval. If log A is close to log B and both are close to a set C, then A can be said to be close to B with higher confidence, because the similarity of A and B is determined jointly by set C; set C is the shared nearest neighbor interval.
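The mutual-neighbor rule described above can be sketched in a few lines of Python. The similarity matrix below is invented for illustration, and knn_lists and mutual_links are hypothetical helper names; the patent does not state how the similarity between two log entries is computed, so the matrix is simply taken as given.

```python
def knn_lists(similarity, k):
    """For every point, return its k most similar other points, best first."""
    n = len(similarity)
    lists = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: similarity[i][j], reverse=True)
        lists.append(others[:k])
    return lists

def mutual_links(neighbors):
    """Link i and j only when each appears in the other's neighbor list."""
    links = set()
    for i, near_i in enumerate(neighbors):
        for j in near_i:
            if i < j and i in neighbors[j]:
                links.add((i, j))
    return links

# Made-up symmetric similarity matrix for five log entries (points 0..4).
similarity = [
    [1.00, 0.90, 0.80, 0.10, 0.20],
    [0.90, 1.00, 0.70, 0.20, 0.10],
    [0.80, 0.70, 1.00, 0.30, 0.15],
    [0.10, 0.20, 0.30, 1.00, 0.90],
    [0.20, 0.10, 0.15, 0.90, 1.00],
]
neighbors = knn_lists(similarity, k=2)
links = mutual_links(neighbors)
```

In this example point 3 lists point 2 as a neighbor, but point 2 does not list point 3, so no link is created between them; only reciprocal neighbors are connected.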
S32, use the k-nearest-neighbor classification algorithm to sparsify the similarity matrix of each category, setting to zero the similarity between any point and the points it has no link with, and use the sparsified similarity matrix to build a shared nearest-neighbor graph covering the entire log category.
In one embodiment of the invention, the k-nearest-neighbor classification algorithm is used to sparsify the similarity matrix of each category, and, from the sparsified similarity matrix, the linked points and the weights of their edges are drawn to build the shared nearest-neighbor graph of the entire log category. Specifically, this comprises the following steps:
The neighbor lists of A and B are determined with the k-nearest-neighbor method, and a link is established between the two points if and only if A and B each appear in the other's neighbor list. In the similarity matrix, the similarity between a point and any point it has no link with is set to zero, which sparsifies the matrix. Drawing each pair of linked points together with the weight of their edge then yields the shared nearest-neighbor graph of the entire log category. The weight of a link is the similarity of its two endpoints, computed as follows:
str(i,j)=∑(k+1-m)*(k+1-n)
Here k is the size of the neighbor lists of A and B, and m and n are the ranks of a shared nearest neighbor in the neighbor lists of A and B, respectively; the sum is taken over the shared nearest neighbors.
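As a worked illustration of the formula, the following Python sketch computes str(i, j) for ranked neighbor lists and fills a sparsified weight matrix. It assumes that ranks are 1-based and that the shared neighbors are the points appearing in both k-nearest-neighbor lists; the names `snn_weight` and `sparse_snn_matrix` and the sample lists are hypothetical.

```python
from typing import List


def snn_weight(list_a: List[int], list_b: List[int], k: int) -> int:
    """str(i, j): sum of (k + 1 - m) * (k + 1 - n) over the shared neighbors,
    where m and n are the 1-based ranks of the neighbor in each list."""
    rank_b = {point: n for n, point in enumerate(list_b, start=1)}
    total = 0
    for m, point in enumerate(list_a, start=1):
        n = rank_b.get(point)
        if n is not None:
            total += (k + 1 - m) * (k + 1 - n)
    return total


def sparse_snn_matrix(neighbors: List[List[int]], k: int) -> List[List[int]]:
    """Keep a weight only where the two points are mutual neighbors; all else stays zero."""
    size = len(neighbors)
    weights = [[0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i + 1, size):
            if j in neighbors[i] and i in neighbors[j]:
                weights[i][j] = weights[j][i] = snn_weight(neighbors[i], neighbors[j], k)
    return weights


# Hypothetical ranked neighbor lists for four log messages, with k = 2.
neighbors = [[1, 2], [0, 2], [1, 0], [2, 1]]
weights = sparse_snn_matrix(neighbors, k=2)
```

With k = 2, a neighbor ranked first in both lists contributes 2 × 2 = 4, while one ranked second in both contributes 1 × 1 = 1, so neighbors shared near the top of both lists dominate the weight.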
S33, use the Map mechanism: for each point in the shared nearest-neighbor graph, collect the distance lengths from that point to the other points.
S34, use the Reduce mechanism: sum the distance lengths collected by the Map mechanism to obtain the distance-length sum of each point, and generate a new key-value pair. In this key-value pair, the key is the log message and the value is the distance-length sum of that point.
S35, select the point with the largest distance-length sum as the center point of the similarity matrix, overwriting the original center point; points whose distance-length sum is less than the length threshold are marked as noise and are no longer used as cluster center points.
S36, among the links between points, remove those whose weight is below the threshold, and take the points that remain linked to one another as a cluster, ensuring that every point in the cluster is either the center point or directly connected to the center point. Each cluster represents one category of log information.
As in step S32, the weight of a link is the similarity of its two endpoints, computed as follows:
str(i,j)=∑(k+1-m)*(k+1-n)
Here k is the size of the neighbor lists of A and B, and m and n are the ranks of a shared nearest neighbor in the neighbor lists of A and B, respectively. The larger the distance length between two points, the smaller the weight of the link and the lower the similarity. Removing the links whose weight is below the threshold ensures that the points that remain linked belong to the same category of log information. The points linked to one another are taken as a cluster, ensuring that every point in the cluster is either the center point or directly connected to the center point; each cluster represents one category of log information.
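Steps S33 to S36 can be sketched in plain Python, with ordinary functions standing in for the Hadoop Map and Reduce phases. This is an illustration under stated assumptions, not the patented implementation: the link list, the weights, both thresholds, and all function names are hypothetical, and the sketch follows the text literally in taking the point with the largest distance-length sum as the center.

```python
from collections import defaultdict


def map_lengths(links):
    """Map (S33): for every link (a, b, distance), emit the distance once per endpoint."""
    for a, b, distance in links:
        yield a, distance
        yield b, distance


def reduce_sums(pairs):
    """Reduce (S34): sum the emitted distances per log message (the key)."""
    totals = defaultdict(float)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def pick_center_and_noise(length_sums, length_threshold):
    """S35: the largest sum becomes the center; sums below the threshold are noise."""
    center = max(length_sums, key=length_sums.get)
    noise = {p for p, total in length_sums.items() if total < length_threshold}
    return center, noise


def cluster_around(center, weights, weight_threshold):
    """S36: drop links lighter than the threshold, keep points directly linked to the center."""
    members = {center}
    for (a, b), weight in weights.items():
        if weight < weight_threshold:
            continue  # link removed
        if a == center:
            members.add(b)
        elif b == center:
            members.add(a)
    return members


# Hypothetical graph: (point, point, distance length) and the link weights str(i, j).
links = [("log-a", "log-b", 1.0), ("log-a", "log-c", 2.0),
         ("log-b", "log-c", 4.0), ("log-a", "log-d", 0.5)]
weights = {("log-a", "log-b"): 4, ("log-a", "log-c"): 2,
           ("log-b", "log-c"): 1, ("log-a", "log-d"): 5}

sums = reduce_sums(map_lengths(links))
center, noise = pick_center_and_noise(sums, length_threshold=1.0)
cluster = cluster_around(center, weights, weight_threshold=2)
```

On a real cluster the two generator and aggregation functions would correspond to a Mapper and a Reducer; here they only show the shape of the key-value pairs that flow between them.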
S37, perform further analysis according to the category of log information to obtain the indicators that reflect the state of the power system, the system access volume and accessing nodes, dispatcher operation data, and similar information; the operating state of the power grid is monitored by observing changes in this information.
Further analysis is performed with Hive according to the category of log information. For log files in the system-log category, statistics yield the indicators that reflect system state, such as CPU usage, memory space, disk space, network-interface traffic, and process and service information. For log files in the access-log category, analysis yields the information of interest, such as access volume and accessing nodes. For log files in the user-operation-log category, dispatcher operation data are tallied. When the power system behaves abnormally, this information changes to some degree, and the operating state of the power system is monitored by observing those changes.
S4, according to the operating conditions of the power system, determine the indicators that need special attention and the log categories they belong to, and monitor those indicators separately within the corresponding log category in order to monitor the operating state of the power system.
During system operation, special operating requirements may arise. For example, certain indicators of the power system may be prone to anomalies during a particular period and may cause a power system fault, so that the user needs to pay special attention to the operating state in that period or to a particular indicator. In such cases the indicators needing special attention can be monitored separately, which allows anomalies in the operating state of the power system to be discovered promptly. As shown in Figure 5, this specifically comprises the following steps:
S41, parse the log information and determine the log category to which the indicator needing special attention belongs, that is, whether the issue of concern belongs to the system log, the access log, or the user operation log.
S42, from the parsed log results, extract the keywords that need special attention, such as "schedule job" and "ERROR", concatenate them into a field name, and set its value to 1.
S43, use the Reduce mechanism: within this log category, sum the values, that is, count the number of times the field name occurs in the category, then generate and output a new key-value pair.
S44, extract the information from the key-value pairs and analyze it to monitor the operating state of the power system.
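Steps S41 to S44 amount to a keyword count expressed as key-value pairs. The Python sketch below imitates the Map and Reduce phases with ordinary functions; the way the field name is spliced (`category:keyword`), the sample log lines, and the function names are assumptions made for illustration only.

```python
from collections import Counter


def map_keywords(category, lines, keywords):
    """S42: emit (field name, 1) for each watched keyword found in a parsed log line."""
    for line in lines:
        for keyword in keywords:
            if keyword in line:
                # Hypothetical splicing rule: log category plus keyword.
                yield f"{category}:{keyword}", 1


def reduce_counts(pairs):
    """S43: sum the values per field name within the log category."""
    counts = Counter()
    for field, value in pairs:
        counts[field] += value
    return dict(counts)


# Hypothetical parsed log lines from the system-log category.
lines = [
    "2014-11-24 ERROR disk full",
    "schedule job started",
    "ERROR schedule job failed",
]
counts = reduce_counts(map_keywords("system", lines, ["schedule job", "ERROR"]))
```

The resulting dictionary plays the role of the key-value pairs that step S44 extracts and analyzes.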
In summary, the power system operating-state monitoring method provided by the invention acquires the log information of the power system through a syslog-based log scanning and capture technique, then adds prefix and suffix content to the messages so that every log message carries customized prefix and suffix information, and serializes the log messages one by one into a distributed storage system (HDFS/HBase). In combination with the Map-Reduce mechanism in Hadoop, a log analysis algorithm based on state-noise-removal clustering is used to monitor the operating state of the system, so that anomalies in the operating state of the power system can be discovered promptly and handled at the earliest opportunity, effectively meeting the power system's requirements for timely and efficient operation. In addition, the web-crawler technique applied to log scanning and capture can capture diverse log data from the power system, which is then analyzed and processed in a unified way with the Flume tool, improving the efficiency of processing diverse log data.
The power grid operating-state monitoring method based on distributed log analysis provided by the invention has been described in detail above. For a person of ordinary skill in the art, any obvious modification made to it without departing from the essence of the invention will constitute an infringement of the patent right of the invention and will incur the corresponding legal liability.

Claims (8)

1. A method for monitoring the operating state of a power system based on distributed log analysis, characterized by comprising the following steps:
S1, obtain the log information of the power system and merge it into a log file;
S2, split the log file and process it to obtain log information in a unified format, so that the log messages in the log file are serialized one by one and output to a distributed storage system;
S3, extract log information from the distributed storage system and, in combination with the Map-Reduce mechanism, classify the log information with a log analysis algorithm based on state-noise-removal clustering, and monitor the operating state of the power system by analyzing the classified log information.
2. The power system operating-state monitoring method according to claim 1, characterized in that:
in step S1, a syslog-based log scanning and capture method is used to obtain the log information.
3. The power system operating-state monitoring method according to claim 2, characterized in that the log scanning and capture method comprises the following steps:
S11, select and merge the log information captured by the seed modules located on each node of the power system, obtaining the various types of log information of that node;
S12, in each region of the power system, capture and merge the various types of log information of each node to obtain the integrated data of the region, send it to the region's data processing node for processing, and store it in a log file;
S13, obtain the selected and merged log information of each type, obtain capture record data from the nodes that capture log information, derive the merge-and-capture strategy for log information through analysis, and adjust that strategy as needed.
4. The power system operating-state monitoring method according to claim 1, characterized in that monitoring the system operating state with the log analysis algorithm based on state-noise-removal clustering, in combination with the Map-Reduce mechanism, specifically comprises the following steps:
S31, extract log information from the distributed storage system, coarsely classify it by log category according to the node where it was captured, construct a similarity matrix within each category, and select one point in the category set as the center point;
S32, use the k-nearest-neighbor classification algorithm to sparsify the similarity matrix of each category, and use the sparsified similarity matrix to build a shared nearest-neighbor graph covering the entire log category;
S33, use the Map mechanism: for each point in the shared nearest-neighbor graph, collect the distance lengths from that point to the other points;
S34, use the Reduce mechanism: sum the distance lengths collected by the Map mechanism and generate a new key-value pair;
S35, select the point with the largest distance-length sum as the center point of the similarity matrix, overwriting the original center point; points whose distance-length sum is less than the length threshold are marked as noise and are no longer used as cluster center points;
S36, among the links between points, remove those whose weight is below the threshold, and take the points that remain linked to one another as a cluster, so that each cluster represents one category of log information;
S37, perform further analysis according to the category of log information to obtain information reflecting the operating state of the power system, and monitor the operating state of the power grid by observing changes in this information.
5. The power system operating-state monitoring method according to claim 4, characterized in that:
in step S31, the log categories comprise three classes: system logs, access logs, and user operation logs.
6. The power system operating-state monitoring method according to claim 4, characterized in that, in step S32, building the shared nearest-neighbor graph of the entire log category comprises the following steps:
first, determine the neighbor lists of log messages A and B with the k-nearest-neighbor method, and establish a link between the two points when A and B each appear in the other's neighbor list; then, in the similarity matrix, set to zero the similarity between any point and the points it has no link with, thereby sparsifying the similarity matrix; finally, draw each pair of linked points together with the weight of their edge, completing the shared nearest-neighbor graph of the entire log category;
the weight of the link between two points is the similarity str(i,j) of the two points, computed as: str(i,j)=∑(k+1-m)*(k+1-n);
where k is the size of the neighbor lists of A and B, and m and n are the ranks of a shared nearest neighbor in the neighbor lists of A and B, respectively.
7. The power system operating-state monitoring method according to claim 1, characterized by further comprising the following step:
S4, according to the operating conditions of the power system, determine the indicators that need special attention and the log categories they belong to, and monitor those indicators separately within the corresponding log category in order to monitor the operating state of the power system.
8. The power system operating-state monitoring method according to claim 7, characterized in that step S4 further comprises the following steps:
S41, parse the log information and determine the log category to which the indicator needing special attention belongs;
S42, from the parsed log results, extract the keywords that need special attention and concatenate them into a field name;
S43, use the Reduce mechanism: within the log category, count the number of times the field name occurs in the category, then generate and output a new key-value pair;
S44, extract the information from the key-value pairs and analyze it to monitor the operating state of the power system.
CN201410681737.4A 2014-11-24 2014-11-24 A kind of operation states of electric power system monitoring method based on distributed information log analysis Expired - Fee Related CN104616205B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410681737.4A CN104616205B (en) 2014-11-24 2014-11-24 A kind of operation states of electric power system monitoring method based on distributed information log analysis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410681737.4A CN104616205B (en) 2014-11-24 2014-11-24 A kind of operation states of electric power system monitoring method based on distributed information log analysis

Publications (2)

Publication Number Publication Date
CN104616205A true CN104616205A (en) 2015-05-13
CN104616205B CN104616205B (en) 2019-10-25

Family

ID=53150638

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410681737.4A Expired - Fee Related CN104616205B (en) 2014-11-24 2014-11-24 A kind of operation states of electric power system monitoring method based on distributed information log analysis

Country Status (1)

Country Link
CN (1) CN104616205B (en)

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103607291A (en) * 2013-10-25 2014-02-26 北京科东电力控制系统有限责任公司 Alarm analysis merging method for power secondary system intranet security monitoring platform

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
庞传军: ""电网调度日志系统的设计与开发"", 《湖北电力》 *
王高垒: ""爬虫日志数据信息抽取与统计系统设计与实现"", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *
薛文娟: ""基于层次聚类的日志分析技术研究"", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *
马茜: ""基于Web的电力系统自适应安全事件管理设计"", 《中国优秀硕士学位论文全文数据库 工程科技Ⅱ辑》 *

Cited By (39)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105138661A (en) * 2015-09-02 2015-12-09 西北大学 Hadoop-based k-means clustering analysis system and method of network security log
CN105138661B (en) * 2015-09-02 2018-10-30 西北大学 A kind of network security daily record k-means cluster analysis systems and method based on Hadoop
CN105608203A (en) * 2015-12-24 2016-05-25 Tcl集团股份有限公司 Internet of things log processing method and device based on Hadoop platform
CN105608203B (en) * 2015-12-24 2019-09-17 Tcl集团股份有限公司 A kind of Internet of Things log processing method and device based on Hadoop platform
CN105516355B (en) * 2016-01-13 2018-07-17 国家电网公司 Intelligent electric energy meter error big data safe storage device based on fountain codes and method
CN105516355A (en) * 2016-01-13 2016-04-20 国家电网公司 Device and method for safely storing error big data of smart electricity meter based on fountain code
CN105701621A (en) * 2016-02-19 2016-06-22 云南电网有限责任公司电力科学研究院 Intelligent power grid real time load analyzing method and system
CN106209826A (en) * 2016-07-08 2016-12-07 瑞达信息安全产业股份有限公司 A kind of safety case investigation method of Network Security Device monitoring
CN106022664A (en) * 2016-07-08 2016-10-12 大连大学 Big data analysis based network intelligent power saving monitoring method
CN107291614A (en) * 2017-05-04 2017-10-24 平安科技(深圳)有限公司 File method for detecting abnormality and electronic equipment
CN107483238A (en) * 2017-08-04 2017-12-15 郑州云海信息技术有限公司 A kind of blog management method, cluster management node and system
CN107704594B (en) * 2017-10-13 2021-02-09 东南大学 Real-time processing method for log data of power system based on spark streaming
CN107704594A (en) * 2017-10-13 2018-02-16 东南大学 Power system daily record data real-time processing method based on SparkStreaming
CN108133043A (en) * 2018-01-12 2018-06-08 福建星瑞格软件有限公司 A kind of server running log structured storage method based on big data
CN110389874A (en) * 2018-04-20 2019-10-29 比亚迪股份有限公司 Journal file method for detecting abnormality and device
CN108804606A (en) * 2018-05-29 2018-11-13 上海欣能信息科技发展有限公司 A kind of electric power measures class Data Migration to the method and system of HBase
CN108845560A (en) * 2018-05-30 2018-11-20 国网浙江省电力有限公司宁波供电公司 A kind of power scheduling log Fault Classification
CN108845560B (en) * 2018-05-30 2021-07-13 国网浙江省电力有限公司宁波供电公司 Power dispatching log fault classification method
CN108833156B (en) * 2018-06-08 2022-08-30 中国电力科学研究院有限公司 Evaluation method and system for simulation performance index of power communication network
CN108833156A (en) * 2018-06-08 2018-11-16 中国电力科学研究院有限公司 A kind of appraisal procedure and system of the simulation performance index for power telecom network
CN108984610A (en) * 2018-06-11 2018-12-11 华南理工大学 A kind of method and system based on the offline real-time processing data of big data frame
CN108959445A (en) * 2018-06-13 2018-12-07 云南电网有限责任公司信息中心 Distributed information log processing method and processing device
CN109213091A (en) * 2018-06-27 2019-01-15 中国电子科技集团公司第五十五研究所 A kind of semiconductor chip process equipment method for monitoring state based on document analysis
CN109685399A (en) * 2019-02-19 2019-04-26 贵州电网有限责任公司 Electric system log confluence analysis method and system
CN110069572A (en) * 2019-03-19 2019-07-30 深圳壹账通智能科技有限公司 HIVE method for scheduling task, device, equipment and storage medium based on big data platform
CN110069572B (en) * 2019-03-19 2022-08-02 深圳壹账通智能科技有限公司 HIVE task scheduling method, device, equipment and storage medium based on big data platform
CN110231998A (en) * 2019-06-13 2019-09-13 泰康保险集团股份有限公司 Detection method, device and the storage medium of distributed timing task
CN110231998B (en) * 2019-06-13 2021-07-20 泰康保险集团股份有限公司 Detection method and device for distributed timing task and storage medium
CN110555010B (en) * 2019-09-11 2022-04-05 中国南方电网有限责任公司 Power grid real-time operation data storage system
CN110555010A (en) * 2019-09-11 2019-12-10 中国南方电网有限责任公司 power grid real-time operation data storage system
CN110825873A (en) * 2019-10-11 2020-02-21 支付宝(杭州)信息技术有限公司 Method and device for expanding log exception classification rule
CN111049684A (en) * 2019-12-12 2020-04-21 闻泰通讯股份有限公司 Data analysis method, device, equipment and storage medium
CN111158997A (en) * 2019-12-24 2020-05-15 河南文正电子数据处理有限公司 Safety monitoring method and device for multi-log system
CN111260505A (en) * 2020-02-13 2020-06-09 吴龙圣 Big data analysis method and device based on power Internet of things and computer equipment
CN111260505B (en) * 2020-02-13 2020-11-10 青岛联众芯云科技有限公司 Big data analysis method and device based on power Internet of things and computer equipment
CN112948211A (en) * 2021-02-26 2021-06-11 杭州安恒信息技术股份有限公司 Alarm method, device, equipment and medium based on log processing
CN114172921A (en) * 2021-12-02 2022-03-11 国网山东省电力公司信息通信公司 Log auditing method and device for scheduling recording system
CN114169651B (en) * 2022-02-14 2022-04-19 中国空气动力研究与发展中心计算空气动力研究所 Active prediction method for supercomputer operation failure based on application similarity
CN114169651A (en) * 2022-02-14 2022-03-11 中国空气动力研究与发展中心计算空气动力研究所 Active prediction method for supercomputer operation failure based on application similarity

Also Published As

Publication number Publication date
CN104616205B (en) 2019-10-25

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20191025

Termination date: 20211124
