CN111797146A - Big data-based equipment defect correlation analysis method - Google Patents

Big data-based equipment defect correlation analysis method

Info

Publication number
CN111797146A
Authority
CN
China
Prior art keywords
equipment
defect
determining
importance
defects
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010700356.1A
Other languages
Chinese (zh)
Inventor
文屹
郑友卓
张锐峰
付宇
郝树青
邓东林
肖小兵
刘安茳
张洋
何洪流
李前敏
吴鹏
王卓月
柏毅辉
李忠
安波
陈宇
黄如云
蔡永祥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Electric Power Research Institute of Guizhou Power Grid Co Ltd
Original Assignee
Electric Power Research Institute of Guizhou Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Electric Power Research Institute of Guizhou Power Grid Co Ltd
Priority to CN202010700356.1A
Publication of CN111797146A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/245: Query processing
    • G06F16/2458: Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465: Query processing support for facilitating data mining operations in structured databases
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/20: Administration of product repair or maintenance
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00: Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/06: Electricity, gas or water supply
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Tourism & Hospitality (AREA)
  • Marketing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Public Health (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Operations Research (AREA)
  • Medical Informatics (AREA)
  • Water Supply & Treatment (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Quality & Reliability (AREA)
  • Fuzzy Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The disclosure relates to a big data-based equipment defect correlation analysis method, which comprises the following steps: determining a relation between an equipment defect and a plurality of equipment characteristics according to a preset equipment defect library, wherein the equipment defect library comprises a plurality of equipment characteristics, equipment defect grades and a plurality of defect events, and the equipment defect grades comprise critical defects, serious defects and general defects; respectively determining the importance of each equipment characteristic according to the relationship between the equipment defect and the plurality of equipment characteristics; determining a plurality of target features from a plurality of device features according to the importance; and performing correlation analysis on the target characteristics according to the equipment defect library to determine a plurality of characteristic combinations causing the equipment defects. According to the embodiment of the disclosure, through big data correlation analysis, the generation reason and the distribution rule of the equipment defects of the power system can be obtained, and an auxiliary decision is provided for the equipment defect analysis and the reduction of the equipment defect rate of the power system.

Description

Big data-based equipment defect correlation analysis method
Technical Field
The disclosure relates to the technical field of electric power big data and artificial intelligence, in particular to a big data-based equipment defect correlation analysis method.
Background
Electric power big data is an inevitable stage of technological innovation in the power industry during the energy revolution. It represents not only technological progress but also major changes in the development concept, management system, and technical route of the whole electric power system in the big data era, and a leap in the value form of the next-generation intelligent electric power system. Reshaping the core value of electric power and transforming the mode of power development are the two main threads of electric power big data.
Power systems include a large number of power devices that may experience device defects during operation. However, equipment defects in a power system involve numerous factors, so it is difficult to determine their distribution rule and generation cause by simple analysis, and the equipment defect rate cannot be effectively reduced.
Disclosure of Invention
In view of this, the present disclosure provides a device defect correlation analysis method based on big data.
According to an aspect of the present disclosure, there is provided a big data-based device defect correlation analysis method, the method including:
determining a relation between an equipment defect and a plurality of equipment characteristics according to a preset equipment defect library, wherein the equipment defect library comprises a plurality of equipment characteristics, equipment defect grades and a plurality of defect events, and the equipment defect grades comprise critical defects, serious defects and general defects;
respectively determining the importance of each equipment characteristic according to the relationship between the equipment defect and the plurality of equipment characteristics;
determining a plurality of target features from a plurality of device features according to the importance;
and performing correlation analysis on the target characteristics according to the equipment defect library to determine a plurality of characteristic combinations causing the equipment defects.
In a possible implementation manner, the determining, according to a preset device defect library, a relationship between a device defect and a plurality of device features includes:
constructing and training an equipment defect random forest model according to a preset equipment defect library to obtain a trained equipment defect random forest model;
and determining the relation between the equipment defect and a plurality of equipment characteristics according to the trained equipment defect random forest model.
In a possible implementation manner, the determining the importance of each device feature according to the relationship between the device defect and the plurality of device features includes:
according to the relation between the equipment defect and a plurality of equipment characteristics, determining the Gini index of each equipment characteristic in the trained equipment defect random forest model respectively;
and determining the Gini index of each equipment characteristic as the importance of each equipment characteristic.
In a possible implementation manner, the determining, according to the importance, a plurality of target features from a plurality of device features includes:
sorting the plurality of equipment features according to the importance of each equipment feature to obtain a plurality of sorted equipment features;
and selecting a plurality of target features with high importance from the sorted equipment features according to the preset number.
In one possible implementation manner, the device defect library is a device defect multidimensional wide table, and the plurality of device characteristics include device basic characteristics, device production characteristics, device operation and maintenance characteristics and device environment characteristics,
the basic characteristics of the equipment comprise at least one of equipment code, voltage class and operation life, the production characteristics of the equipment comprise at least one of manufacturer, equipment model and equipment production date, the operation and maintenance characteristics of the equipment comprise at least one of operation and maintenance units and operation and maintenance records, and the environmental characteristics of the equipment comprise at least one of lightning stroke, pollution flashover, high temperature, low temperature, trees and small animals.
According to another aspect of the present disclosure, there is provided an apparatus for analyzing device defect association based on big data, the apparatus including:
the relation determining module is used for determining the relation between equipment defects and a plurality of equipment characteristics according to a preset equipment defect library, wherein the equipment defect library comprises a plurality of equipment characteristics, equipment defect grades and a plurality of defect events, and the equipment defect grades comprise critical defects, serious defects and general defects;
the importance determining module is used for respectively determining the importance of each equipment characteristic according to the relationship between the equipment defect and the plurality of equipment characteristics;
the target characteristic determining module is used for determining a plurality of target characteristics from a plurality of equipment characteristics according to the importance;
and the characteristic combination determining module is used for performing correlation analysis on the target characteristics according to the equipment defect library and determining a plurality of characteristic combinations causing the equipment defects.
In one possible implementation manner, the relationship determining module includes:
the model construction submodule is used for constructing and training an equipment defect random forest model according to a preset equipment defect library to obtain a trained equipment defect random forest model;
and the relation determining submodule is used for determining the relation between the equipment defect and the plurality of equipment characteristics according to the trained equipment defect random forest model.
In one possible implementation manner, the importance determination module includes:
the index determining submodule is used for respectively determining the Gini index of each equipment characteristic in the trained equipment defect random forest model according to the relation between the equipment defect and the plurality of equipment characteristics;
and the importance determining submodule is used for determining the Gini index of each equipment characteristic as the importance of that equipment characteristic.
In one possible implementation, the target feature determination module includes:
the sorting submodule is used for sorting the plurality of equipment characteristics according to the importance of each equipment characteristic to obtain a plurality of sorted equipment characteristics;
and the selecting submodule is used for selecting a plurality of target characteristics with high importance from the sorted equipment characteristics according to the preset number.
In one possible implementation manner, the device defect library is a device defect multidimensional wide table, and the plurality of device characteristics include device basic characteristics, device production characteristics, device operation and maintenance characteristics and device environment characteristics,
the basic characteristics of the equipment comprise at least one of equipment code, voltage class and operation life, the production characteristics of the equipment comprise at least one of manufacturer, equipment model and equipment production date, the operation and maintenance characteristics of the equipment comprise at least one of operation and maintenance units and operation and maintenance records, and the environmental characteristics of the equipment comprise at least one of lightning stroke, pollution flashover, high temperature, low temperature, trees and small animals.
According to the embodiment of the disclosure, the relationship between the equipment defect and the multiple equipment features can be determined according to the equipment defect library, the importance of each equipment feature is further determined, then the multiple target features are determined from the multiple equipment features according to the importance, correlation analysis is performed on the multiple target features, and the multiple feature combinations causing the equipment defect are determined, so that the main influence factors (namely the multiple target features) of the equipment defect and the multiple factor combinations (namely the multiple feature combinations) which are easy to cause the equipment defect are mined from the equipment defect library through big data correlation analysis, the generation reason and the distribution rule of the equipment defect are obtained, and an auxiliary decision is provided for equipment defect analysis of the power system and reduction of the equipment defect rate.
Other features and aspects of the present disclosure will become apparent from the following detailed description of exemplary embodiments, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments, features, and aspects of the disclosure and, together with the description, serve to explain the principles of the disclosure.
FIG. 1 shows a flow chart of a big data based device defect correlation analysis method according to an embodiment of the present disclosure.
Fig. 2 is a schematic diagram illustrating a processing procedure of a big data-based device defect correlation analysis method according to an embodiment of the present disclosure.
Fig. 3 shows a block diagram of a device defect correlation analysis apparatus based on big data according to an embodiment of the present disclosure.
Detailed Description
Various exemplary embodiments, features and aspects of the present disclosure will be described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers can indicate functionally identical or similar elements. While the various aspects of the embodiments are presented in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The word "exemplary" is used exclusively herein to mean "serving as an example, embodiment, or illustration." Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
Furthermore, in the following detailed description, numerous specific details are set forth in order to provide a better understanding of the present disclosure. It will be understood by those skilled in the art that the present disclosure may be practiced without some of these specific details. In some instances, methods, means, elements and circuits that are well known to those skilled in the art have not been described in detail so as not to obscure the present disclosure.
The big data analysis in the embodiment of the disclosure refers to the analysis of large-scale data. Big data can be summarized as 5V: large data Volume (Volume), fast speed (Velocity), multiple types (Variety), Value (Value), and authenticity (Veracity).
In the embodiment of the disclosure, big data analysis (for example, data mining, association analysis, and the like) may be performed on the large-scale equipment defect events accumulated in the power system to find the generation cause and distribution rule of the equipment defects, so as to provide an auxiliary decision for reducing the equipment defect rate of the power system.
FIG. 1 shows a flow chart of a big data based device defect correlation analysis method according to an embodiment of the present disclosure. As shown in fig. 1, the method includes:
step S11, determining the relationship between the equipment defect and a plurality of equipment characteristics according to a preset equipment defect library, wherein the equipment defect library comprises a plurality of equipment characteristics, the grade of the equipment defect and a plurality of defect events, and the grade of the equipment defect comprises critical defect, serious defect and general defect;
step S12, respectively determining the importance of each equipment feature according to the relationship between the equipment defect and a plurality of equipment features;
step S13, determining a plurality of target characteristics from a plurality of equipment characteristics according to the importance;
step S14, performing association analysis on the target features according to the device defect library, and determining a plurality of feature combinations causing device defects.
According to the embodiment of the disclosure, the relationship between the equipment defect and the multiple equipment features can be determined according to the equipment defect library, the importance of each equipment feature is further determined, then the multiple target features are determined from the multiple equipment features according to the importance, correlation analysis is performed on the multiple target features, and the multiple feature combinations causing the equipment defect are determined, so that the main influence factors (namely the multiple target features) of the equipment defect and the multiple factor combinations (namely the multiple feature combinations) which are easy to cause the equipment defect are mined from the equipment defect library through big data correlation analysis, the generation reason and the distribution rule of the equipment defect are obtained, and an auxiliary decision is provided for equipment defect analysis of the power system and reduction of the equipment defect rate.
In one possible implementation, the device defect library may include a plurality of device characteristics, a level of device defects, and a plurality of defect events. The plurality of device characteristics may include device attributes of the power system, such as device number, voltage class, operational age, maintenance records, and the like; the levels of equipment defects may include critical defects, serious defects, and general defects. The plurality of defect events may include a plurality of equipment defect events or equipment defect records of the power system. Each defect event may include feature values corresponding to the plurality of device features and a level of the device defect.
In one possible implementation, the plurality of device features may be multi-dimensional features of the electrical device. According to the equipment attribute of the power system, the plurality of equipment characteristics may include equipment basic characteristics, equipment production characteristics, equipment operation and maintenance characteristics and equipment environment characteristics of the power equipment, wherein the equipment basic characteristics may include at least one of equipment code, voltage level and commissioning life, the equipment production characteristics may include at least one of manufacturer, equipment model and equipment production date, the equipment operation and maintenance characteristics may include at least one of operation and maintenance unit and operation and maintenance record, and the equipment environment characteristics may include at least one of lightning stroke, pollution flashover, high temperature, low temperature, trees and small animals.
In one possible implementation, the device defect library may be represented as a device defect multidimensional wide table. The header of the device defect multidimensional wide table may include a plurality of equipment characteristics and the grade of the equipment defect, and may also include other information such as the occurrence time of the defect event. The contents of the device defect multidimensional wide table include a plurality of defect events.
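As a rough illustration of such a wide table, the sketch below models one defect event per row, with the feature columns, the defect grade, and the occurrence time as the header. The column names and the sample values are hypothetical and made up for illustration; the source only names the feature categories.

```python
from dataclasses import dataclass, asdict

# Hypothetical column names and made-up values, for illustration only.
@dataclass
class DefectEvent:
    equipment_code: str      # basic feature
    voltage_class: str       # basic feature
    years_in_service: float  # basic feature
    manufacturer: str        # production feature
    om_unit: str             # operation and maintenance feature
    environment: str         # environment feature
    defect_grade: str        # "critical", "serious" or "general"
    occurred_at: str         # optional extra column: time of the defect event

events = [
    DefectEvent("EQ-001", "10kV", 2.5, "VendorA", "UnitX",
                "lightning strike", "serious", "2019-06-01"),
    DefectEvent("EQ-002", "110kV", 12.0, "VendorB", "UnitY",
                "pollution flashover", "critical", "2019-07-15"),
]

# Header of the wide table, followed by one row per defect event.
header = list(asdict(events[0]).keys())
rows = [list(asdict(event).values()) for event in events]
```

Each row carries every feature value alongside the defect grade, which is what lets the later steps treat the table directly as training data.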
In one possible implementation, the relationship between the device defect and the plurality of device features may be determined according to a preset device defect library in step S11. The relationship between the device defect and the plurality of device features may be determined by machine learning, model building (e.g., random forest model), and the like. The present disclosure is not limited to a particular manner of determining a relationship between a device defect and a plurality of device features.
In one possible implementation manner, the importance of each device feature may be determined in step S12 according to the relationship between the device defect and the plurality of device features. The importance indicates how strongly a device feature influences device defects: the higher the importance of a device feature, the greater its influence on the device defect, and the more likely that feature is considered to cause the device defect. The importance may be represented by a weight, a Gini index, an out-of-bag error rate, and the like, which is not limited by the present disclosure.
In one possible implementation, after determining the importance of each device feature, a plurality of target features may be determined from the plurality of device features according to the importance in step S13. For example, the device features whose importance is greater than or equal to a preset importance threshold may be determined as the plurality of target features; alternatively, a preset number of device features with the highest importance may be selected as the plurality of target features; the target features may also be selected in other manners. The present disclosure is not limited as to the manner in which the plurality of target features are selected.
In a possible implementation manner, in step S14, association analysis may be performed on the target features according to the device defect library by an association analysis algorithm (for example, the association rule algorithm Apriori or the frequent pattern growth algorithm FP-growth), so as to determine a plurality of feature combinations causing device defects.
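The threshold-based option can be sketched as follows. The function name `select_by_threshold`, the feature names, and the importance values are hypothetical and chosen only to illustrate the comparison against a preset threshold.

```python
def select_by_threshold(importance, threshold):
    """Return the features whose importance is >= threshold, most important first."""
    kept = [(name, score) for name, score in importance.items() if score >= threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in kept]

# Made-up importance values for illustration.
importance = {
    "voltage_class": 0.31,
    "years_in_service": 0.27,
    "manufacturer": 0.05,
    "environment": 0.22,
}
targets = select_by_threshold(importance, threshold=0.20)
```

With these sample values, `manufacturer` falls below the threshold and is dropped, while the other three features become target features.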
For example, a plurality of candidate feature combinations including the target feature may be first mined from the device defect library by using an association rule algorithm Apriori, where a support degree of the candidate feature combinations is greater than or equal to a preset minimum support degree; and then, determining the candidate feature combination with the confidence degree greater than or equal to the preset minimum confidence degree from the plurality of candidate feature combinations as the plurality of feature combinations causing the equipment defects.
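The two-stage filtering described above (minimum support first, then minimum confidence) can be sketched in plain Python. This is a minimal Apriori sketch under stated assumptions, not the patent's implementation: each defect event is assumed to be encoded as a set of "feature=value" items plus a grade item, and the function names, item strings, and sample events are hypothetical.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s

    frequent, k = {}, 1
    while current:
        frequent.update(current)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(sub) in current for sub in combinations(c, k - 1))}
        next_level = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                next_level[c] = s
        current = next_level
    return frequent

def defect_rules(transactions, defect_item, min_support, min_confidence):
    """Feature combinations X with support(X + defect) and confidence(X -> defect) above the minima."""
    frequent = apriori(transactions, min_support)
    rules = []
    for itemset, supp in frequent.items():
        if defect_item in itemset and len(itemset) > 1:
            antecedent = itemset - {defect_item}
            confidence = supp / frequent[antecedent]  # subsets of frequent sets are frequent
            if confidence >= min_confidence:
                rules.append((sorted(antecedent), supp, confidence))
    return sorted(rules, key=lambda rule: -rule[2])

# Made-up defect events for illustration.
transactions = [
    frozenset({"voltage=10kV", "env=lightning", "grade=critical"}),
    frozenset({"voltage=10kV", "env=lightning", "grade=critical"}),
    frozenset({"voltage=10kV", "env=trees", "grade=general"}),
    frozenset({"voltage=35kV", "env=lightning", "grade=general"}),
]
rules = defect_rules(transactions, "grade=critical", min_support=0.5, min_confidence=0.6)
```

In this toy data the combination of 10 kV equipment and lightning always coincides with a critical defect, so it ranks first; each feature on its own also passes both minima, but with lower confidence.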
In one possible implementation, the correlation analysis may also be performed separately according to the levels of the device defects, that is, multiple feature combinations causing device defects of respective levels may be determined separately. For example, a plurality of combinations of features that cause critical defects in the electrical equipment may be determined, and so on. The specific process is similar to the above method, and is not described herein again.
In one possible implementation, step S11 may include: constructing and training an equipment defect random forest model according to a preset equipment defect library to obtain a trained equipment defect random forest model; and determining the relation between the equipment defect and a plurality of equipment characteristics according to the trained equipment defect random forest model.
In a possible implementation manner, when constructing the random forest model of the equipment defect according to the equipment defect library, the method may first label each equipment feature and the level of the equipment defect, for example:
labels for voltage classes may include 400V, 10kV, 35kV, 110kV, 220kV, and 500 kV;
the label of the commissioning age may include 1 year and less, 1-3 years, 3-5 years, 5-30 years, 30 years and more;
the labels of the operation and maintenance records can comprise operation and maintenance years and months;
tags of environmental information characteristics may include lightning strikes, pollution flashover, high temperature, low temperature, trees, small animals;
the labels for the class of equipment defects may include critical defects, serious defects, and general defects.
After labeling the device characteristics and the levels of the device defects, a plurality of defect events in the device defect library may be preprocessed (e.g., quantized) according to the label information, and the preprocessed defect events are divided into a training set and a verification set.
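The labeling, quantization, and splitting steps can be sketched as below. The bin boundaries follow the commissioning-age labels listed above, but treating each upper bound as inclusive is an assumption, as are the function names, the integer encoding, and the 80/20 split ratio.

```python
import random

# Upper bounds (in years) for the commissioning-age labels; inclusiveness is assumed.
AGE_BINS = [(1, "<=1y"), (3, "1-3y"), (5, "3-5y"), (30, "5-30y")]

def age_label(years):
    """Map a commissioning age in years to its label."""
    for upper, label in AGE_BINS:
        if years <= upper:
            return label
    return ">30y"

def encode_column(values):
    """Quantize labels: map each distinct label to an integer code."""
    codes = {label: i for i, label in enumerate(sorted(set(values)))}
    return [codes[v] for v in values], codes

def split(rows, validation_fraction=0.2, seed=0):
    """Shuffle the rows and divide them into a training set and a verification set."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_val = max(1, int(len(rows) * validation_fraction))
    return rows[n_val:], rows[:n_val]

encoded, codes = encode_column(["10kV", "35kV", "10kV"])
train, val = split(range(10))
```

A fixed seed keeps the split reproducible, which makes it easier to compare models trained with different parameters.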
The equipment defect random forest model can be constructed according to preset initial values of the model parameters and the training set, wherein the number of decision trees in the equipment defect random forest model can be set according to actual conditions, which is not limited by the present disclosure; multiple rounds of iterative training are then performed on the equipment defect random forest model.
When the equipment defect random forest model meets the verification condition (for example, the number of training rounds reaches a preset threshold, or another condition), the model is verified using the verification set; if the verification fails, training continues in the manner described above, and if the verification passes, training ends and the trained equipment defect random forest model is obtained.
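The train-then-verify control flow can be sketched independently of any particular model. In the sketch below, `train_one_round` and `validate` are hypothetical callbacks standing in for the actual training and verification steps, and the round budget is an added assumption so that the loop always terminates.

```python
def train_until_valid(train_one_round, validate, rounds_per_check, max_checks):
    """Train in blocks of rounds_per_check rounds; stop as soon as validate() passes."""
    for check in range(1, max_checks + 1):
        for _ in range(rounds_per_check):
            train_one_round()
        if validate():
            return check * rounds_per_check  # total rounds used
    raise RuntimeError("verification did not pass within the round budget")

# Toy stand-in for a model: each round gets one more validation sample right.
state = {"correct": 5}

def train_one_round():
    state["correct"] += 1

def validate():
    return state["correct"] >= 9  # pass once 9 of 10 samples are correct

rounds_used = train_until_valid(train_one_round, validate, rounds_per_check=2, max_checks=5)
```

Here the first check after two rounds fails and the second check after four rounds passes, so training stops there.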
In one possible implementation, the relationship between the equipment defect and the plurality of equipment features may be determined according to a trained equipment defect random forest model. For example, the relationship between the device defect and the plurality of device features may be determined according to the relationship between the node corresponding to the device defect and the nodes corresponding to the plurality of device features in the trained device defect random forest model.
It should be understood that the specific verification conditions can be set by those skilled in the art according to practical situations, and the present disclosure is not limited thereto.
In this embodiment, an equipment defect random forest model can be constructed and trained according to the equipment defect library, and the relationship between the equipment defect and the plurality of equipment features can be determined according to the trained equipment defect random forest model, so that the accuracy of the relationship between the equipment defect and the plurality of equipment features can be improved.
In one possible implementation, step S12 may include: according to the relation between the equipment defect and a plurality of equipment characteristics, determining the Gini index of each equipment characteristic in the trained equipment defect random forest model respectively; and determining the Gini index of each equipment characteristic as the importance of each equipment characteristic.
That is, the Gini index of each device feature in the trained device defect random forest model can be calculated according to the relationship between the device defect and the plurality of device features, and the Gini index is determined as the importance of each device feature. The larger the Gini index, the higher the importance of the device feature; the smaller the Gini index, the lower the importance of the device feature. In this way, the accuracy of the importance of the individual device features can be increased.
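The source does not spell out how the Gini index of a feature is computed. One common reading, assumed here, is the decrease in Gini impurity obtained by splitting on that feature, which a random forest would accumulate over all splits and trees. The sketch below shows this for a single split on made-up data; the function names and sample rows are hypothetical.

```python
from collections import Counter, defaultdict

def gini(labels):
    """Gini impurity 1 - sum_k p_k^2 of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def gini_decrease(rows, feature, target):
    """Impurity decrease from splitting the rows on every value of one feature."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[feature]].append(row[target])
    parent = gini([row[target] for row in rows])
    weighted = sum(len(g) / len(rows) * gini(g) for g in groups.values())
    return parent - weighted

# Made-up defect events: the environment separates the grades, the maker does not.
rows = [
    {"env": "lightning", "maker": "A", "grade": "critical"},
    {"env": "lightning", "maker": "B", "grade": "critical"},
    {"env": "trees", "maker": "A", "grade": "general"},
    {"env": "trees", "maker": "B", "grade": "general"},
]
env_score = gini_decrease(rows, "env", "grade")
maker_score = gini_decrease(rows, "maker", "grade")
```

Under this reading, a larger value means the feature separates the defect grades better, which matches the statement that a larger Gini index indicates higher importance.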
In one possible implementation, step S13 may include: sorting the plurality of equipment features according to the importance of each equipment feature to obtain a plurality of sorted equipment features; and selecting a plurality of target features with high importance from the sorted equipment features according to the preset number.
When a plurality of target features are selected, the plurality of device features can be sorted according to the importance (such as the Gini index) of each device feature to obtain a plurality of sorted device features, wherein the sorting order can be from large to small or from small to large; a plurality of target features with high importance are then selected from the sorted device features according to the preset number.
For example, assuming that the preset number is 10, when the sorting mode is from large to small, the top 10 device features may be selected as target features from the sorted multiple device features; when the sorting mode is from small to large, the last 10 device features can be selected from the sorted multiple device features as target features.
In the embodiment, the target characteristics are selected after the plurality of equipment characteristics are sorted according to the importance degree, so that the method is simple and quick, and the processing efficiency can be improved.
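The sorting and selection just described can be sketched as follows. This is a minimal illustration, assuming the importance values are held in a dictionary; the function name `select_target_features` and the sample scores are hypothetical.

```python
def select_target_features(importance, preset_number, descending=True):
    """Return the `preset_number` most important feature names.

    With descending order the first entries are taken; with ascending
    order the last entries are taken, as in the example in the text.
    """
    if preset_number <= 0:
        return []
    ranked = sorted(importance.items(), key=lambda kv: kv[1], reverse=descending)
    chosen = ranked[:preset_number] if descending else ranked[-preset_number:]
    return [name for name, _ in chosen]


# Hypothetical importance scores for four device features.
scores = {
    "manufacturer": 0.41,
    "operating_years": 0.33,
    "voltage_class": 0.17,
    "lightning": 0.09,
}

top_desc = select_target_features(scores, 2, descending=True)
top_asc = select_target_features(scores, 2, descending=False)
```

Both sorting orders select the same set of features; only the order of the returned names differs.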
Fig. 2 is a schematic diagram illustrating a processing procedure of a big data-based device defect correlation analysis method according to an embodiment of the present disclosure. As shown in fig. 2, in step S201, determining a plurality of device characteristics and device defect levels according to device attributes of the power system and a plurality of defect events of the power system, and creating a device defect multidimensional width table, where a header of the device defect multidimensional width table may include the plurality of device characteristics and the device defect levels, and may further include occurrence times of the defect events; the content of the equipment defect multidimensional width table comprises a plurality of defect events;
then, in step S202, labeling the multiple device features and the levels of the device defects, performing quantization processing on the device defect multidimensional width table according to the label information, and then constructing and training a device defect random forest model according to the quantized device defect multidimensional width table to obtain a trained device defect random forest model;
then, in step S203, determining a relationship between the device defect and the plurality of device features according to the trained device defect random forest model; in step S204, according to the relationship between the device defect and the plurality of device features, respectively determining a Gini index of each device feature in the trained device defect random forest model, and determining the Gini index as the importance of each device feature; in step S205, the respective device features are sorted from high to low in importance, and a preset number of device features with high importance are selected from the sorted device features as a plurality of target features.
Finally, in step S206, based on the device defect multidimensional width table, association analysis is performed on the plurality of target features using the association rule algorithm Apriori or the frequent pattern growth algorithm FP-Growth, so as to determine a plurality of feature combinations causing the device defect.
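As an illustration of the association analysis in step S206, the following is a minimal level-wise Apriori sketch in plain Python. It assumes each defect event has already been reduced to a set of "feature=value" items over the target features plus the defect grade; the item names, the toy events, and the `min_support` threshold are hypothetical, and the source does not specify how support or confidence thresholds are chosen.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of events that contain every item of the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1 candidates: every single item that occurs in the data.
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        level = {}
        for candidate in current:
            s = support(candidate)
            if s >= min_support:
                level[candidate] = s
        frequent.update(level)
        k += 1
        # Join frequent (k-1)-itemsets into k-itemsets and prune any candidate
        # that has an infrequent (k-1)-subset (the Apriori property).
        current = set()
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in level for sub in combinations(union, k - 1)
            ):
                current.add(union)
    return frequent


# Hypothetical defect events expressed as item sets.
events = [
    {"manufacturer=A", "env=lightning", "grade=critical"},
    {"manufacturer=A", "env=lightning", "grade=critical"},
    {"manufacturer=A", "env=high_temp", "grade=general"},
    {"manufacturer=B", "env=lightning", "grade=general"},
]

frequent_itemsets = apriori(events, min_support=0.5)
```

Frequent itemsets that contain a defect grade item, such as the combination of manufacturer A, lightning, and a critical grade in this toy data, correspond to the feature combinations the method aims to surface.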
According to the embodiment of the disclosure, constructing and training the equipment defect random forest model makes it possible to mine the main influence factors of equipment defects (namely, the plurality of target features) from the equipment defect library. Importance ranking and association analysis then determine the factor combinations (namely, the plurality of feature combinations) that readily cause equipment defects, revealing the causes and distribution patterns of those defects. The results can guide equipment defect analysis and support decisions aimed at reducing the equipment defect rate of a power system. They can also help identify power equipment prone to critical defects and high-frequency defects, and provide a decision basis for management optimization such as differentiated operation and maintenance and the technical evaluation of manufacturers.
It should be noted that, although the above embodiment is taken as an example to describe the big data based device defect correlation analysis method, those skilled in the art can understand that the disclosure should not be limited thereto. In fact, the user can flexibly set each step according to personal preference and/or actual application scene, as long as the technical scheme of the disclosure is met.
Fig. 3 shows a block diagram of a device defect correlation analysis apparatus based on big data according to an embodiment of the present disclosure. As shown in fig. 3, the apparatus includes:
a relation determining module 31, configured to determine a relation between an equipment defect and multiple equipment features according to a preset equipment defect library, where the equipment defect library includes the multiple equipment features, grades of the equipment defect, and multiple defect events, and the grades of the equipment defect include a critical defect, a serious defect, and a general defect;
an importance determining module 32, configured to determine importance of each device feature according to a relationship between the device defect and the plurality of device features;
a target feature determining module 33, configured to determine a plurality of target features from the plurality of device features according to the importance;
and the feature combination determining module 34 is configured to perform association analysis on the multiple target features according to the device defect library, and determine multiple feature combinations causing device defects.
In a possible implementation manner, the relationship determining module 31 includes:
the model construction submodule is used for constructing and training an equipment defect random forest model according to a preset equipment defect library to obtain a trained equipment defect random forest model;
and the relation determining submodule is used for determining the relation between the equipment defect and the plurality of equipment characteristics according to the trained equipment defect random forest model.
In one possible implementation, the importance determining module 32 includes:
the index determining submodule is used for respectively determining the Gini index of each equipment characteristic in the trained equipment defect random forest model according to the relation between the equipment defect and the plurality of equipment characteristics;
and the importance determining submodule is used for determining the Gini index of each equipment characteristic as the importance of that equipment characteristic.
In a possible implementation manner, the target feature determination module 33 includes:
the sorting submodule is used for sorting the plurality of equipment characteristics according to the importance of each equipment characteristic to obtain a plurality of sorted equipment characteristics;
and the selecting submodule is used for selecting a plurality of target characteristics with high importance from the sorted equipment characteristics according to the preset number.
In one possible implementation manner, the device defect library is a device defect multidimensional width table, the plurality of device characteristics include device basic characteristics, device production characteristics, device operation and maintenance characteristics and device environment characteristics,
the basic characteristics of the equipment comprise at least one of equipment code, voltage class and operation life, the production characteristics of the equipment comprise at least one of manufacturer, equipment model and equipment production date, the operation and maintenance characteristics of the equipment comprise at least one of operation and maintenance units and operation and maintenance records, and the environmental characteristics of the equipment comprise at least one of lightning stroke, pollution flashover, high temperature, low temperature, trees and small animals.
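For illustration, one row of such a wide table and the label-based quantization mentioned for step S202 might be sketched as below. The column selection, the `DefectEvent` and `label_encode` names, and the use of simple integer label encoding are assumptions; the source states only that the features and defect grades are labeled and that the table is quantized according to the label information.

```python
from dataclasses import dataclass, asdict


@dataclass
class DefectEvent:
    """One row of a hypothetical equipment defect wide table."""
    equipment_code: str   # basic characteristic
    voltage_class: str    # basic characteristic
    manufacturer: str     # production characteristic
    operation_unit: str   # operation and maintenance characteristic
    environment: str      # environmental characteristic
    defect_grade: str     # critical, serious, or general


def label_encode(events):
    """Replace each distinct value in each column with an integer label."""
    label_maps = {}
    encoded = []
    for event in events:
        row = {}
        for column, value in asdict(event).items():
            mapping = label_maps.setdefault(column, {})
            # A new value receives the next unused integer for its column.
            row[column] = mapping.setdefault(value, len(mapping))
        encoded.append(row)
    return encoded, label_maps


# Hypothetical defect events.
events = [
    DefectEvent("T-001", "110kV", "A", "Unit-1", "lightning", "critical"),
    DefectEvent("T-002", "220kV", "B", "Unit-1", "high_temp", "general"),
    DefectEvent("T-003", "110kV", "A", "Unit-2", "lightning", "serious"),
]

encoded_rows, label_maps = label_encode(events)
```

The returned `label_maps` keeps the value-to-integer mapping for each column, so that results computed on the quantized table can be translated back into the original feature values.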
Having described embodiments of the present disclosure, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen in order to best explain the principles of the embodiments, the practical application, or improvements made to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (10)

1. A big data-based equipment defect correlation analysis method is characterized by comprising the following steps:
determining a relation between an equipment defect and a plurality of equipment characteristics according to a preset equipment defect library, wherein the equipment defect library comprises a plurality of equipment characteristics, equipment defect grades and a plurality of defect events, and the equipment defect grades comprise critical defects, serious defects and general defects;
respectively determining the importance of each equipment characteristic according to the relationship between the equipment defect and the plurality of equipment characteristics;
determining a plurality of target features from a plurality of device features according to the importance;
and performing correlation analysis on the target characteristics according to the equipment defect library to determine a plurality of characteristic combinations causing the equipment defects.
2. The method of claim 1, wherein determining the relationship between the device defect and the plurality of device features according to a preset device defect library comprises:
constructing and training an equipment defect random forest model according to a preset equipment defect library to obtain a trained equipment defect random forest model;
and determining the relation between the equipment defect and a plurality of equipment characteristics according to the trained equipment defect random forest model.
3. The method of claim 2, wherein determining the importance of each device feature according to the relationship between the device defect and the plurality of device features comprises:
according to the relation between the equipment defect and a plurality of equipment characteristics, determining the Gini index of each equipment characteristic in the trained equipment defect random forest model respectively;
and determining the Gini index of each equipment characteristic as the importance of each equipment characteristic.
4. The method of claim 1, wherein determining a plurality of target features from a plurality of device features according to the importance comprises:
sorting the plurality of equipment features according to the importance of each equipment feature to obtain a plurality of sorted equipment features;
and selecting a plurality of target features with high importance from the sorted equipment features according to the preset number.
5. The method of claim 1, wherein the device defect library is a device defect multidimensional width table, the plurality of device characteristics includes device basic characteristics, device production characteristics, device operation and maintenance characteristics, and device environment characteristics,
the basic characteristics of the equipment comprise at least one of equipment code, voltage class and operation life, the production characteristics of the equipment comprise at least one of manufacturer, equipment model and equipment production date, the operation and maintenance characteristics of the equipment comprise at least one of operation and maintenance units and operation and maintenance records, and the environmental characteristics of the equipment comprise at least one of lightning stroke, pollution flashover, high temperature, low temperature, trees and small animals.
6. An apparatus for analyzing device defect association based on big data, the apparatus comprising:
the relation determining module is used for determining the relation between equipment defects and a plurality of equipment characteristics according to a preset equipment defect library, wherein the equipment defect library comprises the plurality of equipment characteristics, equipment defect grades and a plurality of defect events, and the equipment defect grades comprise critical defects, serious defects and general defects;
the importance determining module is used for respectively determining the importance of each equipment characteristic according to the relationship between the equipment defect and the plurality of equipment characteristics;
the target characteristic determining module is used for determining a plurality of target characteristics from a plurality of equipment characteristics according to the importance;
and the characteristic combination determining module is used for performing correlation analysis on the target characteristics according to the equipment defect library and determining a plurality of characteristic combinations causing the equipment defects.
7. The apparatus of claim 6, wherein the relationship determination module comprises:
the model construction submodule is used for constructing and training an equipment defect random forest model according to a preset equipment defect library to obtain a trained equipment defect random forest model;
and the relation determining submodule is used for determining the relation between the equipment defect and the plurality of equipment characteristics according to the trained equipment defect random forest model.
8. The apparatus of claim 7, wherein the importance determination module comprises:
the index determining submodule is used for respectively determining the Gini index of each equipment characteristic in the trained equipment defect random forest model according to the relation between the equipment defect and the plurality of equipment characteristics;
and the importance determining submodule is used for determining the Gini index of each equipment characteristic as the importance of that equipment characteristic.
9. The apparatus of claim 6, wherein the target feature determination module comprises:
the sorting submodule is used for sorting the plurality of equipment characteristics according to the importance of each equipment characteristic to obtain a plurality of sorted equipment characteristics;
and the selecting submodule is used for selecting a plurality of target characteristics with high importance from the sorted equipment characteristics according to the preset number.
10. The apparatus of claim 6, wherein the device defect library is a device defect multidimensional width table, the plurality of device characteristics includes device basic characteristics, device production characteristics, device operation and maintenance characteristics, and device environment characteristics,
the basic characteristics of the equipment comprise at least one of equipment code, voltage class and operation life, the production characteristics of the equipment comprise at least one of manufacturer, equipment model and equipment production date, the operation and maintenance characteristics of the equipment comprise at least one of operation and maintenance units and operation and maintenance records, and the environmental characteristics of the equipment comprise at least one of lightning stroke, pollution flashover, high temperature, low temperature, trees and small animals.
CN202010700356.1A 2020-07-20 2020-07-20 Big data-based equipment defect correlation analysis method Pending CN111797146A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010700356.1A CN111797146A (en) 2020-07-20 2020-07-20 Big data-based equipment defect correlation analysis method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010700356.1A CN111797146A (en) 2020-07-20 2020-07-20 Big data-based equipment defect correlation analysis method

Publications (1)

Publication Number Publication Date
CN111797146A true CN111797146A (en) 2020-10-20

Family

ID=72808121

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010700356.1A Pending CN111797146A (en) 2020-07-20 2020-07-20 Big data-based equipment defect correlation analysis method

Country Status (1)

Country Link
CN (1) CN111797146A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105843210A (en) * 2016-03-22 2016-08-10 清华大学 Power transformer defect information data mining method
CN107025293A (en) * 2017-04-13 2017-08-08 广东电网有限责任公司电力科学研究院 A kind of second power equipment defective data method for digging and system
CN107423328A (en) * 2017-04-13 2017-12-01 温州市图盛科技有限公司 The construction method of big data electric power first-aid hotspot prediction system

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113342784A (en) * 2021-07-01 2021-09-03 贵州电网有限责任公司 Database design method for risk assessment of main transformer equipment of power grid
CN113377759A (en) * 2021-07-01 2021-09-10 贵州电网有限责任公司 Defect filling data management method based on expert system algorithm
CN113435652A (en) * 2021-07-01 2021-09-24 贵州电网有限责任公司 Primary equipment defect diagnosis and prediction method
CN113435759A (en) * 2021-07-01 2021-09-24 贵州电网有限责任公司 Primary equipment risk intelligent evaluation method based on deep learning
CN113435652B (en) * 2021-07-01 2023-01-24 贵州电网有限责任公司 Primary equipment defect diagnosis and prediction method
CN113379313A (en) * 2021-07-02 2021-09-10 贵州电网有限责任公司 Intelligent preventive test operation management and control system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20201020