CN107301243A - Switchgear fault feature extraction method based on a big data platform - Google Patents

Publication number
CN107301243A
CN107301243A CN201710550324.6A CN201710550324A CN107301243A CN 107301243 A CN107301243 A CN 107301243A CN 201710550324 A CN201710550324 A CN 201710550324A CN 107301243 A CN107301243 A CN 107301243A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710550324.6A
Other languages
Chinese (zh)
Inventor
孔宪光
常建涛
王佩
刘燕龙
殷磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University
Priority to CN201710550324.6A
Publication of CN107301243A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G06F16/24532 Query optimisation of parallel queries
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01M TESTING STATIC OR DYNAMIC BALANCE OF MACHINES OR STRUCTURES; TESTING OF STRUCTURES OR APPARATUS, NOT OTHERWISE PROVIDED FOR
    • G01M13/00 Testing of machine parts
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01R MEASURING ELECTRIC VARIABLES; MEASURING MAGNETIC VARIABLES
    • G01R31/00 Arrangements for testing electric properties; Arrangements for locating electric faults; Arrangements for electrical testing characterised by what is being tested not provided for elsewhere
    • G01R31/327 Testing of circuit interrupters, switches or circuit-breakers
    • G01R31/3271 Testing of circuit interrupters, switches or circuit-breakers of high voltage or medium voltage devices
    • G01R31/3275 Fault detection or status indication
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/172 Caching, prefetching or hoarding of files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/176 Support for shared access to files; File sharing support
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2471 Distributed queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/248 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/258 Data format conversion from or to a database

Abstract

The present invention proposes a switchgear fault feature extraction method based on a big data platform. It mainly addresses the inability of the prior art to extract features of the various fault types efficiently and accurately when faced with massive volumes of switchgear fault data. The method is implemented as follows: build a Hadoop sub-platform for data collection, storage and preprocessing; build a SparkR platform to perform the distributed computation of multivariate multiscale entropy (MMSE), and save the results to the distributed file system HDFS; download the results from HDFS and plot the multivariate sample entropy curve of each switchgear fault with the R software; based on the multivariate sample entropy curve of each fault, select the multivariate sample entropy values over the corresponding range of scale factors as the characteristic parameters of each fault. The overall scheme of the invention is rigorous and complete, provides mass data storage and distributed computing capability, extracts fault features efficiently and accurately, and can provide a basis for the timely diagnosis and prediction of switchgear faults.

Description

Switchgear fault feature extraction method based on a big data platform
Technical field
The invention belongs to the technical field of industrial big data processing, and specifically relates to a switchgear fault feature extraction method that can be applied to extracting features of the various faults of an enterprise's switchgear.
Background art
As a terminal device of the power system, switchgear carries the dual task of control and protection, and its reliability and level of intelligence have a far-reaching influence on the stability and degree of automation of the power system. Statistical analysis of switchgear accidents shows that the main causes of high-voltage circuit breaker faults are operating mechanism abnormality, SF6 leakage, auxiliary component damage and main component deterioration. The main factors influencing switchgear faults are the service time of the switchgear, annual load factor, operating environment class, temperature, number of operations and number of current interruptions. Extracting fault characteristic parameters from switchgear provides a basis for the timely diagnosis and prediction of switchgear faults and reduces operation and maintenance costs.
Traditional feature extraction is typically carried out on a single machine in serial mode. The volume of data such methods can handle is small, which considerably affects the accuracy of feature extraction. When faced with massive data, data storage and processing suffer from poor fault tolerance, low speed and low efficiency. As the factors influencing switchgear faults keep increasing and the scale of equipment fault data keeps growing, a single machine in serial mode can neither store the massive data nor substantially increase the processing speed. Moreover, because traditional methods on a single machine can only handle small samples, the accuracy of feature extraction is reduced. As the number of fault influencing factors grows, traditional feature extraction methods also struggle to handle multivariate data sets.
In summary, traditional feature extraction methods can generally only handle small-sample data, and as the number of variables in the data set increases they gradually become unable to handle multivariate data sets as well. Not only is the data volume of each switchgear fault large, but the factors influencing each fault go beyond the six factors of service time, load factor, operating environment class, temperature, number of operations and number of interruptions, and further factors will be added in the future. Traditional feature extraction methods therefore cannot cope simultaneously with the growing volume of fault data and the growing number of influencing factors, which affects both the speed and the accuracy of fault feature extraction.
Summary of the invention
The object of the invention is to address the above problems of the prior art by proposing a switchgear fault feature extraction method based on a big data processing platform, so as to improve the accuracy and speed of switchgear fault feature extraction.
The technical idea of the invention is as follows: by processing large-sample data and introducing the multivariate multiscale entropy (MMSE) algorithm, the problem of the growing number of fault influencing factors is solved and the accuracy of switchgear fault feature extraction is indirectly improved; by running the MMSE algorithm as a distributed parallel computation on the SparkR platform, the speed of switchgear fault feature extraction is improved. The scheme is implemented as follows:
The switchgear fault feature extraction method based on a big data platform proposed by the invention comprises the following steps:
(1) Build the SparkR big data platform:
(1a) Install the Linux system, the open-source Hadoop software and the open-source Spark software;
(1b) Determine the number of nodes of the platform cluster according to the scale of the existing fault data; the number of nodes can be expanded or reduced according to the scale of the fault data to be processed subsequently;
(1c) Configure each node of the platform cluster, i.e. take any one of the determined nodes as the master node Master and the rest as slave nodes Slave;
(1d) On the determined master node Master and all slave nodes Slave, configure the SSH (Secure Shell) service for passwordless authentication, install the Java software, configure the Java environment, and configure the Hadoop core files and the Spark core files;
(2) Data collection and storage: the master node Master collects relational fault data from outside the platform using Hadoop's Sqoop component and file-type fault data using the Flume component, and stores the collected data in Hadoop's distributed file system HDFS, where the master node Master and all slave nodes Slave share the data;
(3) Data preprocessing: the fault data in the distributed file system HDFS are preprocessed by conversion followed by normalization, providing high-quality data for the subsequent data analysis;
(4) Distributed data computation:
On the local host, the multivariate multiscale entropy MMSE algorithm, which could previously run only on a single machine, is rewritten in R as a distributed algorithm that can run on the big data platform SparkR;
The master node Master calls the distributed MMSE algorithm from the local host through the SparkR API of the big data platform SparkR, deploys it to each slave node Slave, and uses the preprocessed data as the input of the algorithm;
The slave nodes Slave compute the multivariate sample entropy of each fault in parallel and save the results to Hadoop's distributed file system HDFS;
(5) Visualization: in a stand-alone environment, the local host downloads the result data from the distributed file system HDFS of the big data platform and then plots the multivariate sample entropy curves of the various switchgear faults using the plotting functions of the open-source R software;
(6) Feature extraction: based on the multivariate sample entropy curve of each fault, select the range of scale factors over which every fault curve is relatively flat and the multivariate sample entropy values of the faults at the same scale factor differ substantially from one another, and take the multivariate sample entropy values in that range as the characteristic parameters of each fault.
Compared with the prior art, the invention has the following advantages:
1) The invention uses the big data platform SparkR, which tightly integrates the open-source R software with the open-source Spark software, so that Spark's resilient distributed dataset (RDD) and DataFrame API can be used seamlessly from R. By drawing on Spark's in-memory computing and its support for multiple computation models in a unified software stack, distributed data computation and analysis are carried out efficiently, meeting the challenge posed by large-scale data sets.
2) Although multivariate multiscale entropy MMSE has been applied in disciplines such as physics and physiology, it has not yet been applied to switchgear fault analysis. The invention applies the MMSE algorithm to switchgear fault analysis, solving the problem of the growing number of fault influencing factors and indirectly improving the accuracy of switchgear fault feature extraction; it also realizes the distributed parallel computation of the MMSE algorithm on the SparkR platform, increasing the speed of feature extraction.
3) Because the scale factor is introduced in multivariate multiscale entropy MMSE, the invention can distinguish the several fault types of switchgear more clearly and more intuitively.
Brief description of the drawings
Fig. 1 is the overall flow chart of the implementation of the invention;
Fig. 2 is the node configuration flow chart of the invention;
Fig. 3 is the overall architecture diagram of SparkR in the invention;
Fig. 4 is the flow chart of the feature extraction algorithm MMSE in the invention;
Fig. 5 shows the multivariate sample entropy curves of the 4 faults.
Detailed description of the embodiments
The invention is described in detail below with reference to the accompanying drawings and specific embodiments.
When faced with massive data on fault influencing factors, traditional switchgear fault feature extraction methods lack large-scale data storage and processing capability. Feature extraction is carried out on a single machine in serial mode, which is slow, inefficient and insecure, and this directly affects the efficiency and accuracy of fault feature extraction.
The Hadoop big data processing platform, with its HDFS distributed file system and MapReduce programming model, solves the problem of distributed storage and processing of massive data fairly well. Compared with Hadoop, Spark provides a distributed dataset abstraction, its programming model is more flexible and efficient, and it can make full use of memory to improve performance. Spark handles iterative and interactive computation well. It introduces the resilient distributed dataset (RDD), which has a fault-tolerance mechanism; the data collection can be operated on in parallel and cached in memory, so data need not be reloaded from HDFS every time as in MapReduce. In the data computation stage, Spark creates an RDD from the preprocessed data set and caches it in memory, where it is reused by multiple tasks executed in parallel. The R software has powerful statistical analysis functions and a rich set of third-party extension packages, but at present its core runtime is single-threaded and the volume of data it can process is limited by the memory of a single machine, so the massive data processing of the big data era poses a challenge to R. SparkR tightly integrates the open-source R software with the open-source Spark software, so that Spark's RDD and DataFrame API can be used seamlessly from R; by drawing on Spark's in-memory computing and its support for multiple computation models in a unified software stack, distributed data computation and analysis are carried out efficiently, meeting the challenge posed by large-scale data sets. The invention therefore introduces the SparkR platform: by processing large-sample data and introducing the multivariate multiscale entropy MMSE algorithm, the problem of the growing number of fault influencing factors is solved and the accuracy of switchgear fault feature extraction is indirectly improved; by running the MMSE algorithm as a distributed parallel computation on the SparkR platform, the speed of switchgear fault feature extraction is improved.
Referring to Fig. 1, the invention is implemented in the following steps:
Step 1: build the SparkR big data platform.
(1a) On the local host, install the CentOS-6.3 version of the Linux system, the Hadoop-2.6.0 version of the open-source Hadoop software and the Spark-1.4.0 version of the open-source Spark software.
Referring to Table 1, install the technical components required by Hadoop, the sub-platform of the SparkR platform, including Flume and Sqoop.
Table 1. Technical components required by the sub-platform Hadoop
Where:
Core: a set of components and interfaces for distributed file systems and general-purpose I/O;
Avro: an efficient, cross-language RPC and data serialization system with persistent data storage;
HDFS: the distributed file system, used to store large-scale data in blocks;
MapReduce: the distributed data processing framework and execution environment;
Zookeeper: a highly available distributed coordination service;
Pig: a data-flow language and runtime environment for exploring large-scale data sets;
Chukwa: a collector that stores data in HDFS and uses MapReduce to generate analysis reports;
Mahout: a machine learning algorithm library;
Flume: a log collection system;
Sqoop: a data synchronization tool for transferring data between traditional databases and Hadoop;
(1b) According to the scale of the existing fault data, the number of nodes of the platform cluster is set to 4; this number can be expanded or reduced according to the scale of the fault data to be processed subsequently;
(1c) Referring to Table 2, configure each node of the platform cluster, i.e. take any one of the determined nodes as the master node Master and the rest as slave nodes Slave, with the nodes connected by a local area network;
The Master node is mainly configured with the roles of name manager NameNode and task manager JobTracker, and is responsible for the overall management of distributed data and for decomposing tasks for execution. The attribute of the master node Master is NameNode; as the master server, it manages the file system namespace and client access to the file system;
The 3 slave nodes Slave1, Slave2 and Slave3 are configured with the data store DataNode and the task executor TaskTracker, and are responsible for distributed data storage and task execution. The attribute of a slave node Slave is DataNode, whose main function is to manage the stored data.
Table 2. Node structure of the platform cluster
Node name    IP address       Attribute
Master       192.168.137.2    NameNode
Slave1       192.168.137.3    DataNode
Slave2       192.168.137.4    DataNode
Slave3       192.168.137.5    DataNode
(1d) Install the relevant software and configure the relevant files on the master node and the three slave nodes:
Referring to Fig. 2, on the determined master node Master and the three slave nodes Slave1, Slave2 and Slave3, configure the SSH service for passwordless authentication, install the Java software, configure the Java environment, and configure the Hadoop core files and the Spark core files. The Hadoop core files comprise core-site.xml, hdfs-site.xml, mapred-site.xml and yarn-site.xml; the Spark core files comprise spark-env.sh, slaves and profile.
After steps (1a) to (1d) are completed, the overall architecture of the SparkR platform is obtained, as shown in Fig. 3.
Referring to Fig. 3, the SparkR platform built in this example comprises two parts: the cluster nodes and the JVM backend. SparkR provides the R software with the resilient distributed dataset RDD API and the DataFrame API. The SparkR API runs in R, while Spark Core runs in the Java virtual machine (JVM). The JVM backend is a component of Spark Core that provides a bridge between R and the JVM, allowing code written in R to create instances of Java classes and to call instance methods of Java objects or static methods of Java classes. The SparkR DataFrame API does not require R functions to be passed in, and the data in a DataFrame are stored entirely as JVM data types. The DataFrame API also includes part of the RDD API.
In operation, the DataFrame is first converted into a resilient distributed dataset RDD; the grouping, aggregation and repartitioning operations of the RDD are then called, and RWorker processes are launched to carry out the distributed computation of MMSE. Over a socket, using a simple and efficient custom binary protocol, the master node transmits the RDD partition data, the serialized algorithm program written in R, and other information to the RWorker process. The RWorker process deserializes the received partition data and R program, applies the program to the partition data, then serializes the result data into a byte array and passes it back to the JVM side.
Step 2: data collection and storage.
The master node Master collects relational fault data from outside the platform using Hadoop's Sqoop component and file-type fault data using the Flume component, and stores the collected data in Hadoop's distributed file system HDFS, where the master node Master and all slave nodes Slave share the data.
The fault data collected by the Sqoop and Flume components amount to 5000 records each, as shown in Tables 3 to 6.
Table 3. Influencing factor data for the fault category "operating mechanism abnormality"
Table 4. Influencing factor data for the fault category "SF6 leakage"
Table 5. Influencing factor data for the fault category "auxiliary component damage"
Table 6. Influencing factor data for the fault category "main component deterioration"
Step 3: preprocess the fault data stored in the distributed file system HDFS by conversion followed by normalization.
(3a) Convert data that are expressed as intervals in the data set into corresponding single values:
For the influencing factor "annual load factor", "less than 40%" is converted to 0.25, "40%~60%" to 0.5, "60%~80%" to 0.75, and "more than 80%" to 0.9.
(3b) Normalize the attributes in the data set to the interval [0, 1]:
$$y = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
where x is the actual value of an influencing factor of each fault, x_max and x_min are respectively the maximum and minimum of the actual values, and y is the normalized value.
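The two preprocessing operations of Step 3 can be sketched as follows. This is a minimal single-machine illustration in Python rather than the R code used on the platform; the function names `convert_load_factor` and `min_max_normalize` are hypothetical, the English label strings are assumed renderings of the interval labels, and mapping a constant column to 0 is an assumption the text does not cover.

```python
# Codes from step (3a); the English label strings are assumed renderings.
LOAD_FACTOR_CODES = {
    "less than 40%": 0.25,
    "40%~60%": 0.5,
    "60%~80%": 0.75,
    "more than 80%": 0.9,
}


def convert_load_factor(label):
    """Step (3a): map an interval label to its single numeric value."""
    return LOAD_FACTOR_CODES[label.strip()]


def min_max_normalize(values):
    """Step (3b): rescale one attribute column to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column is mapped to 0 to avoid dividing by zero.
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


labels = ["less than 40%", "60%~80%", "more than 80%"]
print([convert_load_factor(s) for s in labels])
print(min_max_normalize([10.0, 20.0, 30.0]))
```

Normalization is applied column by column, so each influencing factor is scaled by its own minimum and maximum.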
Step 4: distributed data computation.
On the local host, the multivariate multiscale entropy MMSE algorithm, which could previously run only on a single machine, is rewritten in R as a distributed algorithm that can run on the big data platform SparkR;
The master node Master calls the distributed MMSE algorithm from the local host through the SparkR API of the big data platform SparkR, deploys it to each slave node Slave, and uses the preprocessed data as the input of the algorithm; the slave nodes Slave compute the multivariate sample entropy of each fault in parallel and save the results to Hadoop's distributed file system HDFS.
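The division of work in Step 4 can be pictured as one independent task per fault and scale factor. The sketch below uses Python's standard thread pool purely as a local stand-in for the cluster; it is not SparkR code and does not reproduce the SparkR API. The names `compute_all` and `entropy_at_scale` are hypothetical, and the callable passed in is a placeholder for the MMSE computation.

```python
from concurrent.futures import ThreadPoolExecutor


def compute_all(datasets, scales, entropy_at_scale, workers=4):
    """Run entropy_at_scale(data, eps) for every (fault, scale) pair.

    datasets maps a fault name to its preprocessed data; the result maps
    each fault name to a list of values, one per scale factor, in order.
    """
    tasks = [(name, eps) for name in datasets for eps in scales]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the submitted tasks.
        values = list(
            pool.map(lambda t: entropy_at_scale(datasets[t[0]], t[1]), tasks)
        )
    results = {name: [] for name in datasets}
    for (name, _), value in zip(tasks, values):
        results[name].append(value)
    return results


# Toy usage with a placeholder in place of the real entropy computation.
demo = compute_all({"fault A": [1, 2], "fault B": [3]}, [1, 2],
                   lambda data, eps: sum(data) * eps)
print(demo)
```

Threads in CPython do not speed up CPU-bound work; the point here is only the task structure, in which every fault and scale factor can be processed without reference to the others.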
Referring to Fig. 4, the flow of the fault feature extraction algorithm MMSE is as follows:
(4a) Set the embedding dimension vector M = (2, 2, 2, 2, 2, 2), the time delay vector τ = (1, 1, 1, 1, 1, 1), the threshold r = 0.2*sd, where sd is the standard deviation of each variable, and the scale factor ε = 1, 2, ..., 20; set the first variable p = 6 according to the number of influencing factors of a fault, and the second variable N = 5000 according to the number of fault data records;
(4b) Using the preprocessed data, construct a data set {x_{k,i}} of length N containing p variables, where i = 1, 2, ..., N and k = 1, 2, ..., p;
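(4c) Coarse-grain the multivariate data set $\{x_{k,i}\}_{i=1}^{N}$, k = 1, 2, ..., p, according to the scale factor ε, obtaining the new data set:
$$y_{k,j}^{\varepsilon} = \frac{1}{\varepsilon} \sum_{i=(j-1)\varepsilon+1}^{j\varepsilon} x_{k,i}, \quad 1 \le j \le \frac{N}{\varepsilon}, \quad k = 1, 2, \ldots, p;$$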
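Coarse-graining in step (4c) replaces each run of ε consecutive samples of a variable by their mean. A minimal Python sketch follows; the function name `coarse_grain` is hypothetical, and dropping trailing samples that do not fill a complete window is an assumption consistent with the bound 1 ≤ j ≤ N/ε.

```python
def coarse_grain(x, eps):
    """Coarse-grain p series of equal length N at scale factor eps.

    x is a list of p series; the result is a list of p series of length
    N // eps, where each value is the mean of eps consecutive samples.
    """
    length = len(x[0]) // eps
    return [
        [sum(series[j * eps:(j + 1) * eps]) / eps for j in range(length)]
        for series in x
    ]


print(coarse_grain([[1, 2, 3, 4, 5, 6], [6, 5, 4, 3, 2, 1]], 2))
```

At ε = 1 the data are unchanged; larger scale factors yield shorter and smoother series.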
For each scale factor ε = 1, 2, ..., 20, compute the multivariate sample entropy of the coarse-grained multivariate data set $\{y_{k,j}^{\varepsilon}\}$ containing p variables, whose length is denoted N in the steps below:
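(4d) Construct N-n composite delay vectors $X_m(i) \in R^m$ of dimension m, i = 1, 2, ..., N-n, with n = max{M} × max{τ}, where $M = [m_1, m_2, \ldots, m_p] \in R^p$ is the embedding dimension vector, $m_1, m_2, \ldots, m_p$ are all positive integers and $m = \sum_{k=1}^{p} m_k$; the time delay vector is $\tau = [\tau_1, \tau_2, \ldots, \tau_p]$, where $\tau_1, \tau_2, \ldots, \tau_p$ are all positive integers. The composite delay vector $X_m(i)$ can then be expressed as:
$$X_m(i) = [x_{1,i}, x_{1,i+\tau_1}, \ldots, x_{1,i+(m_1-1)\tau_1}, x_{2,i}, x_{2,i+\tau_2}, \ldots, x_{2,i+(m_2-1)\tau_2}, \ldots, x_{p,i}, x_{p,i+\tau_p}, \ldots, x_{p,i+(m_p-1)\tau_p}];$$
(4e) Define the distance between the vectors $X_m(i)$ and $X_m(j)$ as the maximum absolute difference between their corresponding elements, i.e.:
$$d[X_m(i), X_m(j)] = \max_{l=1,\ldots,m} \left| [X_m(i)]_l - [X_m(j)]_l \right|,$$
where $[X_m(i)]_l$ denotes the l-th element of $X_m(i)$;
(4f) For each composite delay vector $X_m(i)$, compute its distance to every other vector, count the number $P_i$ of vectors whose distance does not exceed the given threshold r, and compute the probability $B_i^m(r)$ of such an occurrence:
$$P_i = \#\{\, j : d[X_m(i), X_m(j)] \le r,\ j \ne i \,\}, \qquad B_i^m(r) = \frac{P_i}{N-n-1};$$
(4g) Compute the average $B^m(r)$ of the probabilities $B_i^m(r)$:
$$B^m(r) = \frac{1}{N-n} \sum_{i=1}^{N-n} B_i^m(r);$$
(4h) Extend the composite delay vectors of (4d) from dimension m to dimension m+1. Since the vector M contains p elements, there are p ways of doing so, i.e. $M = [m_1, m_2, \ldots, m_k+1, \ldots, m_p]$, k = 1, 2, ..., p, which yields p × (N-n) composite delay vectors $X_{m+1}(i) \in R^{m+1}$;
(4i) Define the distance between two vectors $X_{m+1}(i)$ and $X_{m+1}(j)$ as the maximum absolute difference between their corresponding elements, compute the pairwise distances within the vector set $X_{m+1}(i)$, count the number $Q_i$ of vectors whose distance does not exceed the given threshold r, and compute the probability $B_i^{m+1}(r)$ of such an occurrence:
$$Q_i = \#\{\, j : d[X_{m+1}(i), X_{m+1}(j)] \le r,\ j \ne i \,\}, \qquad B_i^{m+1}(r) = \frac{Q_i}{p(N-n)-1};$$
(4j) Compute the average $B^{m+1}(r)$ of $B_i^{m+1}(r)$ in dimension m+1:
$$B^{m+1}(r) = \frac{1}{p(N-n)} \sum_{i=1}^{p(N-n)} B_i^{m+1}(r);$$
(4k) From the result $B^m(r)$ of step (4g) and the result $B^{m+1}(r)$ of step (4j), compute the multivariate sample entropy MSampEn:
$$MSampEn(M, \tau, r, N) = -\ln\left[\frac{B^{m+1}(r)}{B^m(r)}\right].$$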
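Steps (4d) to (4k) can be sketched on a single machine as follows. This is an illustrative pure-Python version, not the distributed R implementation of the invention. It assumes the variables have been scaled so that one scalar threshold r applies to all of them, and it counts matches over all p × (N-n) extended vectors as described in step (4i). The function names are hypothetical, and returning NaN or infinity when no matches are found is an assumed convention. The pairwise comparison is quadratic, so the sketch is only practical for short series.

```python
import math


def composite_vectors(x, M, tau, n):
    """Step (4d): build the N-n composite delay vectors for embedding M."""
    N = len(x[0])
    vectors = []
    for i in range(N - n):
        v = []
        for k, series in enumerate(x):
            v.extend(series[i + l * tau[k]] for l in range(M[k]))
        vectors.append(v)
    return vectors


def match_probability(vectors, r):
    """Steps (4e)-(4g): mean fraction of other vectors within distance r."""
    count = len(vectors)
    if count < 2:
        raise ValueError("need at least two delay vectors")
    matches = 0
    for i in range(count):
        for j in range(i + 1, count):
            # Maximum absolute difference between corresponding elements.
            if max(abs(a - b) for a, b in zip(vectors[i], vectors[j])) <= r:
                matches += 2  # the pair counts once for i and once for j
    return matches / (count * (count - 1))


def msampen(x, M, tau, r):
    """Steps (4d)-(4k): multivariate sample entropy of p series."""
    n = max(M) * max(tau)
    b_m = match_probability(composite_vectors(x, M, tau, n), r)
    extended = []
    for k in range(len(x)):  # step (4h): extend one variable at a time
        M_k = list(M)
        M_k[k] += 1
        extended.extend(composite_vectors(x, M_k, tau, n))
    b_m1 = match_probability(extended, r)
    if b_m == 0:
        return float("nan")  # assumed convention: entropy undefined
    if b_m1 == 0:
        return float("inf")  # assumed convention: no matches at m+1
    return -math.log(b_m1 / b_m)


x = [[0.1, 0.9, 0.2, 0.8, 0.1, 0.9, 0.2, 0.8, 0.1, 0.9],
     [0.5, 0.4, 0.5, 0.4, 0.5, 0.4, 0.5, 0.4, 0.5, 0.4]]
print(msampen(x, M=[2, 2], tau=[1, 1], r=0.2))
```

Applying `msampen` to the coarse-grained data at each scale factor ε gives one point of the multivariate sample entropy curve of a fault.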
After the computation is finished, the multivariate sample entropy of the 4 faults at the scale factors ε = 1, 2, ..., 20 is obtained, and the results are saved in the distributed file system HDFS.
Step 5: visualization.
In a stand-alone environment, the local host downloads the result data from the distributed file system HDFS of the big data platform and, using the rich visualization packages of the R software, plots the multivariate sample entropy curve of each switchgear fault over the 20 scale factors, as shown in Fig. 5.
As can be seen from Fig. 5, apart from scale factor 1, the multivariate sample entropy curves of the 4 faults do not intersect, so the faults are very clearly separated.
Step 6: feature extraction.
According to the multivariate sample entropy curves of the 4 faults shown in Fig. 5, the curves are all relatively flat in the scale factor range 10 to 20, and the multivariate sample entropy values of the 4 faults at the same scale factor differ substantially from one another. The multivariate sample entropy values in the scale factor range 10 to 20 are therefore selected as the characteristic parameters of the 4 faults, providing a basis for the timely diagnosis and prediction of switchgear faults.
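In the embodiment the scale factor range is read off the curves by eye. The sketch below shows how the selected values could be extracted and how the two selection criteria could be quantified. The function names and both measures are hypothetical illustrations, not part of the described method, and the sample values are made up for the demonstration only.

```python
def features_in_range(curves, first_scale, last_scale):
    """Entropy values at scales first_scale..last_scale for every fault.

    curves[name][s - 1] is the multivariate sample entropy at scale s.
    """
    return {name: values[first_scale - 1:last_scale]
            for name, values in curves.items()}


def flatness(curves, first_scale, last_scale):
    """Largest max-minus-min spread of any single curve over the range."""
    selected = features_in_range(curves, first_scale, last_scale)
    return max(max(v) - min(v) for v in selected.values())


def min_separation(curves, first_scale, last_scale):
    """Smallest gap between any two fault curves over the range."""
    names = list(curves)
    return min(
        abs(curves[a][s] - curves[b][s])
        for s in range(first_scale - 1, last_scale)
        for i, a in enumerate(names)
        for b in names[i + 1:]
    )


# Made-up values for two faults at four scales, for illustration only.
toy = {"fault A": [1.0, 0.9, 0.8, 0.8], "fault B": [1.0, 0.5, 0.4, 0.4]}
print(features_in_range(toy, 3, 4))
print(flatness(toy, 3, 4), min_separation(toy, 3, 4))
```

A range is a good candidate when `flatness` is small and `min_separation` is large, which mirrors the two conditions stated in Step 6.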
The above description is only one example of the invention and does not constitute any limitation of the invention. Obviously, after understanding the content and principle of the invention, a person skilled in the art may make various modifications and changes in form and detail without departing from the principle and structure of the invention, but such modifications and changes based on the inventive concept still fall within the scope of protection of the claims of the invention.

Claims (3)

1. A switchgear fault feature extraction method based on a big data platform, comprising:
(1) Building the SparkR big data platform:
(1a) Install the Linux system, the open-source Hadoop software and the open-source Spark software;
(1b) Determine the number of nodes of the platform cluster according to the scale of the existing fault data; the number of nodes can be expanded or reduced according to the scale of the fault data to be processed subsequently;
(1c) Configure each node of the platform cluster, i.e. take any one of the determined nodes as the master node Master and the rest as slave nodes Slave;
(1d) On the determined master node Master and all slave nodes Slave, configure the SSH (Secure Shell) service for passwordless authentication, install the Java software, configure the Java environment, and configure the Hadoop core files and the Spark core files;
(2) Data collection and storage: the master node Master collects relational fault data from outside the platform using Hadoop's Sqoop component and file-type fault data using the Flume component, and stores the collected data in Hadoop's distributed file system HDFS, where the master node Master and all slave nodes Slave share the data;
(3) Data preprocessing: the fault data stored in the distributed file system HDFS are preprocessed by conversion followed by normalization, providing high-quality data for the subsequent data analysis;
(4) Distributed data computation:
On the local host, the multivariate multiscale entropy MMSE algorithm, which could previously run only on a single machine, is rewritten in R as a distributed algorithm that can run on the big data platform SparkR;
The master node Master calls the distributed MMSE algorithm from the local host through the SparkR API of the big data platform SparkR, deploys it to each slave node Slave, and uses the preprocessed data as the input of the algorithm;
The slave nodes Slave compute the multivariate sample entropy of each fault in parallel and save the results to Hadoop's distributed file system HDFS;
(5) Visualization: in a stand-alone environment, the local host downloads the result data from the distributed file system HDFS of the big data platform and then plots the multivariate sample entropy curves of the various switchgear faults using the plotting functions of the open-source R software;
(6) Feature extraction: based on the multivariate sample entropy curve of each fault, select the range of scale factors over which every fault curve is relatively flat and the multivariate sample entropy values of the faults at the same scale factor differ substantially from one another, and take the multivariate sample entropy values in that range as the characteristic parameters of each fault.
2. The method according to claim 1, characterized in that the conversion and normalization preprocessing applied in step (3) to the fault data in HDFS first converts data expressed as intervals in the data set into corresponding single values, and finally normalizes the attributes in the data set to the interval [0, 1].
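The two preprocessing steps of claim 2 can be sketched as follows. The claim does not say how an interval is mapped to a single value or how intervals are written, so the midpoint rule and the `lo~hi` string format below are assumptions, and the function names are hypothetical. The normalization shown is ordinary min-max scaling, which is one common way to reach [0, 1].

```python
def interval_to_number(value):
    """Map an interval string such as '10~20' to one number.

    Assumption: the midpoint represents the interval, and '~' separates
    the bounds. The source only says 'corresponding single value'.
    """
    if isinstance(value, str) and "~" in value:
        lo, hi = value.split("~")
        return (float(lo) + float(hi)) / 2.0
    return float(value)


def min_max_normalize(column):
    """Scale one attribute column linearly onto [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map it to 0.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


raw = ["10~20", "30", "20~40", "50"]          # illustrative attribute values
numeric = [interval_to_number(v) for v in raw]
scaled = min_max_normalize(numeric)
```

Conversion has to come before normalization because the minimum and maximum of a column are only defined once every entry is a number.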
3. The method according to claim 1, characterized in that the parallel computation by the slave nodes of the multivariate sample entropy of each fault in step (4) proceeds as follows:
(3a) Determine the embedding dimension m, the delay vector τ, the threshold r = 0.2*sd, where sd is the standard deviation of each variable, and the scale factor ε; determine the first variable p from the number of factors that influence the fault, and the second variable N from the number of fault data records;
(3b) From the preprocessed data, build a data set $\{x_{k,i}\}$ of length N containing p variables, where i = 1, 2, ..., N and k = 1, 2, ..., p;
(3c) Apply coarse-graining to the multivariate data set $\{x_{k,i}\}$ based on the scale factor ε; the new data set obtained is:
$$y_{k,j}^{\varepsilon} = \frac{1}{\varepsilon} \sum_{i=(j-1)\varepsilon+1}^{j\varepsilon} x_{k,i}, \quad 1 \le j \le \frac{N}{\varepsilon}, \quad k = 1, 2, \ldots, p;$$
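The coarse-graining of step (3c) averages each variable over consecutive, non-overlapping windows of ε samples. A minimal sketch follows; the function name `coarse_grain` is hypothetical, and dropping an incomplete trailing window when N is not a multiple of ε is an assumption (the formula only defines j up to N/ε).

```python
def coarse_grain(series, eps):
    """Coarse-grain a multivariate series at scale factor eps.

    series -- list of p lists, each of length N (one list per variable)
    eps    -- scale factor, a positive integer
    Returns p lists of length N // eps. Element j (0-based) is the mean of
    samples j*eps .. (j+1)*eps - 1, which is the 0-based form of the
    1-based window (j-1)*eps + 1 .. j*eps in the formula.
    """
    n_out = len(series[0]) // eps
    return [
        [sum(x[j * eps:(j + 1) * eps]) / eps for j in range(n_out)]
        for x in series
    ]


x = [[1, 2, 3, 4, 5, 6], [10, 20, 30, 40, 50, 60]]   # p = 2, N = 6
y = coarse_grain(x, 2)
```

At ε = 1 the series is returned unchanged (as floats), so the first point of an entropy curve is the plain multivariate sample entropy of the original data.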
For each scale factor ε, compute the multivariate sample entropy of the multivariate data set of length N containing p variables, as follows:
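(3d) Construct N − n composite delay vectors $X_m(i) \in \mathbb{R}^m$ of dimension m, i = 1, 2, ..., N − n, with n = max{M} × max{τ}, where $M = [m_1, m_2, \ldots, m_p] \in \mathbb{R}^p$ is the embedding dimension vector, $m_1, m_2, \ldots, m_p$ are all positive integers, and $m = m_1 + m_2 + \cdots + m_p$ is the number of elements of $X_m(i)$; the delay vector is $\tau = [\tau_1, \tau_2, \ldots, \tau_p]$, where $\tau_1, \tau_2, \ldots, \tau_p$ are all positive integers. The composite delay vector $X_m(i)$ can then be expressed as: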
$$X_m(i) = [x_{1,i}, x_{1,i+\tau_1}, \ldots, x_{1,i+(m_1-1)\tau_1},\; x_{2,i}, x_{2,i+\tau_2}, \ldots, x_{2,i+(m_2-1)\tau_2},\; \ldots,\; x_{p,i}, x_{p,i+\tau_p}, \ldots, x_{p,i+(m_p-1)\tau_p}];$$
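The construction in step (3d) concatenates, for every variable k, its $m_k$ delayed samples starting at index i. The sketch below uses 0-based indices and a hypothetical function name `composite_delay_vectors`; the input values are illustrative only.

```python
def composite_delay_vectors(x, M, tau):
    """Build the N - n composite delay vectors of step (3d).

    x   -- list of p lists, each of length N
    M   -- embedding dimensions [m_1, ..., m_p]
    tau -- delays [tau_1, ..., tau_p]
    """
    N = len(x[0])
    n = max(M) * max(tau)
    vectors = []
    for i in range(N - n):
        v = []
        for k in range(len(x)):
            # Samples x_{k,i}, x_{k,i+tau_k}, ..., x_{k,i+(m_k-1)tau_k}.
            v.extend(x[k][i + l * tau[k]] for l in range(M[k]))
        vectors.append(v)
    return vectors


x = [[0, 1, 2, 3, 4, 5], [10, 11, 12, 13, 14, 15]]   # p = 2, N = 6
vectors = composite_delay_vectors(x, M=[2, 1], tau=[1, 2])
```

Here n = 2 × 2 = 4, so only N − n = 2 vectors are produced, each with m = 2 + 1 = 3 elements. Reserving n samples at the end guarantees that every delayed index stays inside the series, including after one embedding dimension is later increased by 1.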
(3e) Define the distance between the vectors $X_m(i)$ and $X_m(j)$ as the maximum absolute difference between their corresponding elements, i.e.:
$$d[X_m(i), X_m(j)] = \max_{l=1,\ldots,m} \{ |x(i+l-1) - x(j+l-1)| \};$$
(3f) For each composite delay vector $X_m(i)$, compute its distance to every other vector, count the number $P_i$ of those distances that do not exceed the given threshold r, and compute the corresponding probability $B_i^m(r)$:
$$P_i = \#\{\, j : d[X_m(i), X_m(j)] \le r,\ j \ne i \,\};$$
$$B_i^m(r) = \frac{1}{N-n-1} P_i;$$
(3g) Compute the average $B^m(r)$ of the probabilities $B_i^m(r)$:
$$B^m(r) = \frac{1}{N-n} \sum_{i=1}^{N-n} B_i^m(r);$$
(3h) Extend the composite delay vectors of (3d) from dimension m to dimension m+1. Because the vector M contains p elements, there are p ways to do this, i.e. $M = [m_1, m_2, \ldots, m_k + 1, \ldots, m_p]$, k = 1, 2, ..., p, which yields p × (N − n) composite delay vectors $X_{m+1}(i) \in \mathbb{R}^{m+1}$:
$$X_{m+1}(i) = [x_{1,i}, x_{1,i+\tau_1}, \ldots, x_{1,i+(m_1-1)\tau_1},\; x_{2,i}, x_{2,i+\tau_2}, \ldots, x_{2,i+(m_2-1)\tau_2},\; \ldots,\; x_{k,i}, x_{k,i+\tau_k}, \ldots, x_{k,i+[(m_k+1)-1]\tau_k},\; \ldots,\; x_{p,i}, x_{p,i+\tau_p}, \ldots, x_{p,i+(m_p-1)\tau_p}];$$
(3i) Define the distance between two vectors $X_{m+1}(i)$ and $X_{m+1}(j)$ as the maximum absolute difference between their corresponding elements; compute the pairwise distances within the vector group $X_{m+1}(i)$, count the number $Q_i$ of distances that do not exceed the given threshold r, and compute the corresponding probability $B_i^{m+1}(r)$:
$$Q_i = \#\{\, j : d[X_{m+1}(i), X_{m+1}(j)] \le r,\ j \ne i \,\};$$
$$B_i^{m+1}(r) = \frac{1}{p(N-n)-1} Q_i;$$
(3j) Compute the average $B^{m+1}(r)$ of $B_i^{m+1}(r)$ in dimension m+1:
$$B^{m+1}(r) = \frac{1}{p(N-n)} \sum_{i=1}^{p(N-n)} B_i^{m+1}(r);$$
(3k) From the result $B^m(r)$ obtained through steps (3f) and (3g) and the result $B^{m+1}(r)$ of step (3j), compute the multivariate sample entropy MSampEn:
$$\mathrm{MSampEn} = -\ln\left[\frac{B^{m+1}(r)}{B^{m}(r)}\right].$$
CN201710550324.6A 2017-07-07 2017-07-07 Switchgear fault signature extracting method based on big data platform Pending CN107301243A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710550324.6A CN107301243A (en) 2017-07-07 2017-07-07 Switchgear fault signature extracting method based on big data platform

Publications (1)

Publication Number Publication Date
CN107301243A true CN107301243A (en) 2017-10-27

Family

ID=60133884

Country Status (1)

Country Link
CN (1) CN107301243A (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104699985A (en) * 2015-03-26 2015-06-10 Xidian University Medical big-data acquisition and analysis system and method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
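Kong Xianguang et al.: "Real-time feature extraction method for complex industrial big data", Journal of Xidian University (Natural Science Edition) *
Song Yaqi: "Research on storage optimization and parallel processing technology for power equipment monitoring big data on cloud platforms", China Doctoral Dissertations Full-text Database (Electronic Journal), Engineering Science and Technology II *
Qu Zhaoyang et al.: "Spark-based visualization method for online monitoring data of power equipment", Advanced Technology of Electrical Engineering and Energy *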

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108762959A (en) * 2018-04-02 2018-11-06 Alibaba Group Holding Ltd. Method, apparatus and device for selecting system parameters
CN108762959B (en) * 2018-04-02 2021-07-06 Advanced New Technologies Co., Ltd. Method, device and equipment for selecting system parameters
CN111289888A (en) * 2019-12-11 2020-06-16 Jiaxing Hengchuang Electric Power Group Co., Ltd. Bochuang Materials Branch High-voltage circuit breaker state detection and fault diagnosis method based on big data technology
CN111289888B (en) * 2019-12-11 2022-04-08 Jiaxing Hengchuang Electric Power Group Co., Ltd. Bochuang Materials Branch High-voltage circuit breaker state detection and fault diagnosis method based on big data technology
CN111308336A (en) * 2020-03-24 2020-06-19 Electric Power Research Institute of Guangxi Power Grid Co., Ltd. High-voltage circuit breaker fast overhaul method and device based on big data
CN112357771A (en) * 2020-11-19 2021-02-12 CSIC (Qingdao) Marine Equipment Research Institute Co., Ltd. Ship-shore integrated equipment state monitoring system and method
CN112666451A (en) * 2021-03-15 2021-04-16 Nanjing University of Posts and Telecommunications Integrated circuit scanning test vector generation method
CN112666451B (en) * 2021-03-15 2021-06-29 Nanjing University of Posts and Telecommunications Integrated circuit scanning test vector generation method
CN114236374A (en) * 2021-12-13 2022-03-25 China University of Mining and Technology Real-time diagnosis method for open circuit fault of rectifier
CN114236374B (en) * 2021-12-13 2023-11-14 China University of Mining and Technology Real-time diagnosis method for open-circuit fault of rectifier

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20171027