CN106845526B - Associated-parameter fault classification method based on big-data fusion clustering analysis - Google Patents
- Publication number
- CN106845526B CN201611247433.2A
- Authority
- CN
- China
- Prior art keywords
- data
- parameter
- fault
- classification
- cluster
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24133—Distances to prototypes
- G06F18/24143—Distances to neighbourhood prototypes, e.g. restricted Coulomb energy networks [RCEN]
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Probability & Statistics with Applications (AREA)
- Testing And Monitoring For Control Systems (AREA)
- Test And Diagnosis Of Digital Computers (AREA)
Abstract
The invention provides an associated-parameter fault classification method based on big-data fusion clustering analysis. The method selects fault data from the mass of equipment operating data according to diagnostic rules and applies supervised autonomous machine clustering to it, producing an automatic classification of associated-parameter faults. It addresses two problems: current equipment fault diagnosis relies excessively on expert knowledge bases and ignores the associations among deeply, nonlinearly coupled parameters across subsystems; and the large volume of valid data produced during actual equipment operation is not well mined or used. Because the method does not depend on precise physical modeling of the target equipment, it also avoids the traditional difficulty of modeling complex systems. It achieves intelligent fault classification and associated-parameter analysis based on mass-data mining, with a fault classification capability of controllable accuracy.
Description
Technical field
The present invention relates to the field of equipment fault prognostics and health management (PHM), and in particular to an associated-parameter fault classification method based on big-data fusion clustering analysis.
Background
Prognostics and health management has developed into an important supporting technology and foundation for logistics support, maintenance, and autonomous health management of aerospace systems. The "National Program for Medium- to Long-term Scientific and Technological Development (2006-2020)" lists "life prediction technology for major products and major facilities" as a frontier technology, and recent development reports for the astronautics and aeronautics disciplines classify PHM as a key supporting technology.
PHM has become a popular interdisciplinary research direction spanning basic materials, mechanical structures, energy, electronics, automatic testing, reliability, and information, with significant application value and practical importance. In most industrial PHM applications, building a mathematical or physical model of a complex component or system is very difficult or even impossible, or identifying the model parameters is complicated. Test data and sensor history from every stage of a component or system (design, simulation, operation, and maintenance) therefore become the main means of understanding system performance degradation.
As a result, PHM methods based on mining test data or sensor history have gradually gained attention and developed rapidly, becoming an important research focus in the PHM field. For complex systems such as aerospace systems in particular, it is difficult to obtain or construct directly a physical model that characterizes the degradation and remaining life of components and systems, while large amounts of condition-monitoring and test data are available for these systems and components. Data-driven PHM methods have therefore received wide attention from the U.S. military, NASA, and numerous research institutions and industrial enterprises.
Data-driven PHM methods use advanced sensor technology to acquire characteristic parameters related to system properties, associate these parameters with useful information, and detect, analyze, and predict with intelligent algorithms and models. They provide the remaining-life distribution, degree of performance degradation, or probability of mission failure of the target system, and thereby supply decision information for system maintenance and system support.
Within data-driven PHM, issues such as method workflow, fusion of different methods, model selection, and model suitability have become the current research focus of the field, and data-driven PHM methods have been widely applied and promoted thanks to their flexibility and ease of use.
Summary of the invention
The object of the present invention is to solve the technical problem that existing data-driven PHM methods have difficulty obtaining fault data. The invention provides an associated-parameter fault classification method based on big-data fusion clustering analysis, to improve the current situation in which the operating data of complex equipment, which contains a large amount of information, is not effectively mined or used.
To achieve this object, the invention provides a complete algorithm flow that performs the computation and analysis and yields the final fault classification and parameter association probability model. The associated-parameter fault classification method includes:
Step 1) Obtain the various operating data of the target equipment.
Step 2) Based on the design data of the target equipment, establish a parameter diagnostic rule base covering all parameters of the target equipment. The rule base contains not only threshold judgment rules for the parameters but also trend judgment rules and jump judgment rules.
Step 3) Taking the rules in the parameter diagnostic rule base as the criterion, screen all operating data from step 1) to obtain the fault data, and assemble all fault data into an unclassified fault data set.
Step 4) Apply supervised autonomous clustering to the unclassified fault data set with a clustering algorithm to obtain a number of clusters and cluster centers that meet the requirements. The number of clusters is increased step by step from 2, and the smallest value at which the average weighted distance to the cluster centers no longer decreases is chosen as the total number of clusters. The unclassified fault data set is then classified by the determined cluster centers to obtain the classified fault data set.
Step 5) Apply a map-reduce (MapReduce) algorithm to the unclassified fault data set from step 3) to generate a parameter association probability model. The model contains, for each parameter of the target equipment, the probability distribution that other parameters also fail when that parameter fails, arranged in a probability table from high to low.
Step 6) Using the classified fault data set from step 4) as the fault discrimination standard, apply a nearest-neighbor algorithm to the operating data obtained in step 1) to identify fault categories and obtain the fault classification result.
Step 7) Combine the fault classification result with the parameter association probability model from step 5) to obtain the comprehensive fault diagnosis and classification result, which includes the fault classification result and the probability distribution data of all parameters for that result.
As a further improvement of the above technical solution, the operating data obtained in step 1) has the following format: each complete data entry contains the time at which the entry occurred and the values of all parameters of the target equipment at that time; each single data value in an entry is the measured value of one parameter of the target equipment at that time; and the data entries are arranged one by one in chronological order.
As a further improvement of the above technical solution, the fault data screened in step 3) has the following format: each data entry contains the time at which the entry occurred and all parameters that are faulty at that time; each faulty parameter in an entry is marked with the rule in the parameter diagnostic rule base that the fault triggered.
As a further improvement of the above technical solution, the parameter diagnostic rule base includes upper and lower limits of the parameters, abnormal-jump judgment rules, and abnormal gradual-trend judgment rules.
As a further improvement of the above technical solution, step 4) specifically includes:
Step 101) Set the initial number of clusters K to 2, and cluster the unclassified fault data set with the current K to obtain K cluster centers and their K corresponding clusters.
Step 102) Compute the average silhouette coefficient of the K clusters and compare it with the average silhouette coefficient of the K-1 clusters. If the two are unchanged, choose the current K as the total number of clusters; otherwise set K=K+1 and repeat step 101). Here the silhouette coefficient of a cluster denotes the average geometric distance from the vector points of all data entries in the cluster to the cluster center.
Step 103) Cluster the unclassified fault data set with the total number of clusters determined in step 102), and classify all fault data in the unclassified fault data set by the resulting cluster centers to obtain the classified fault data set.
As a further improvement of the above technical solution, obtaining the cluster centers in step 101) includes:
Step 101-1) Randomly select the vector point of one data entry from all operating data of the target equipment as the first cluster center, and take the vector point with the smallest geometric distance to the first cluster center as the second cluster center.
Step 101-2) Compute the geometric distance Distance(x) from each vector point to its nearest cluster center, and add all the distances Distance(x) to obtain the total distance Sum(Distance(x)).
Step 101-3) Draw a random value Random that falls within the total distance Sum(Distance(x)), take the vector point of the data entry it selects as the newly added cluster center, and repeat step 101-2) until K cluster centers have been selected.
As a further improvement of the above technical solution, step 5) specifically includes:
Step 201) Map all fault data entries that contain each parameter together, in turn, to form the mapping class of each parameter; a mapping class contains all fault data entries of one parameter and their frequencies of occurrence.
Step 202) Compute the total number of fault data entries in each mapping class, as the denominator of the probability calculation.
Step 203) In each mapping class, accumulate the number of occurrences of each parameter other than the parameter the class corresponds to, as the numerator of the probability calculation.
Step 204) Divide the numerator from step 203) by the denominator from step 202) to obtain, for each parameter, the probability distribution that other parameters also fail when that parameter fails.
As a further improvement of the above technical solution, step 6) specifically includes: computing the geometric distance between each item of operating data from step 1) and each determined cluster center, and comparing the smallest distance with the average silhouette coefficient of the corresponding cluster; if the distance is smaller than the average silhouette coefficient of that cluster, the operating data is judged to be of the fault type corresponding to that cluster.
The advantages of the associated-parameter fault classification method based on big-data fusion clustering analysis of the present invention are as follows.
The invention provides a clearly defined, practically operable, and effective associated-parameter fault classification method based on mass-data fusion clustering analysis, which improves on the following technical problems of existing fault diagnosis methods:
1. Current equipment fault diagnosis relies excessively on expert knowledge bases. When facing a complex system, an expert knowledge base suffers from combinatorial explosion, can hardly cover all fault conditions and their associated parameters, and ignores the nonlinear associations among deeply coupled parameters across subsystems. The fault classification method of the invention uses data mining to uncover parameter associations and fault modes across subsystems, and can thus effectively mitigate this problem.
2. Existing data-driven PHM methods are limited to component-level fault diagnosis. In fault diagnosis at the complex-system level, accurate modeling of the whole system is difficult, so the different kinds of fault data mixed into normal data are handled mainly by unsupervised machine-learning clustering; the resulting clusters contain both normal and fault data, and the fault data is poorly classified. Current data-driven fault diagnosis methods therefore perform well at the component level but can hardly outperform model-driven fault diagnosis methods at the complex-system level. The fault classification method of the invention combines the advantages of data-driven and model-driven methods: it uses the existing model-based expert knowledge base to classify the equipment operating data under supervision (supervision by the interpretation results), which greatly improves the classification and convergence of the data and can remedy the poor classification performance of current data-driven PHM methods.
Brief description of the drawings
Fig. 1 is the overall flowchart of the associated-parameter fault classification method based on big-data fusion clustering analysis in an embodiment of the present invention.
Figs. 2a-2d show four repeated tests performed to choose the total number of clusters in the embodiment of the present invention.
Fig. 3 is the operational flowchart of the clustering algorithm in the embodiment of the present invention.
Fig. 4 is a diagram of the parameter association probability algorithm based on the map-reduce algorithm in the embodiment of the present invention.
Detailed description of the embodiments
The associated-parameter fault classification method based on big-data fusion clustering analysis of the present invention is described in detail below with reference to the accompanying drawings and an embodiment.
Current equipment fault diagnosis relies excessively on expert knowledge bases, which can hardly cover the nonlinear associations among deeply coupled parameters across subsystems, while existing data-driven methods perform poorly in fault diagnosis of complex systems and mass data is not effectively mined. To address these problems, the invention provides a clearly defined, practically operable, and effective associated-parameter fault classification method based on mass-data fusion clustering analysis.
In this embodiment, the associated-parameter fault classification method based on big-data fusion clustering analysis is verified using the power supply system of a certain piece of equipment as an example. A comprehensive fault classification result is formed through data preprocessing, rule establishment, fault data screening, clustering, mapping, and reduction.
First, an equipment operating data set is built from data sources such as real-time operating data and fault-injection data, for training and validating the data-driven model. Second, a parameter diagnostic rule base is established for the target equipment and used to interpret and detect parameter faults in real time during operation. Then the mass data of the equipment's operation is interpreted against the rule base, and the data entries containing faulty parameters are separated out. The separated fault data is clustered by fault mode with a supervised autonomous machine-learning clustering method. The resulting clusters are used for fault determination; at the same time an error parameter matrix is generated and analyzed for associated parameters with the map-reduce (Map-Reduce) method to form the analysis result.
It follows that the fault classification method of the invention selects fault data from the mass of equipment operating data according to diagnostic rules and applies supervised autonomous machine clustering to it, producing an automatic classification of associated-parameter faults. It addresses the problem that current equipment fault diagnosis relies excessively on expert knowledge bases and ignores the associations among deeply, nonlinearly coupled parameters across subsystems, and the problem that the large volume of valid data produced during actual equipment operation is not well mined or used. Because the method does not depend on precise physical modeling of the target equipment, it also avoids the traditional difficulty of modeling complex systems.
As shown in Fig. 1, the associated-parameter fault classification method specifically includes:
Step 1) Obtain the various operating data of the target equipment. The operating data includes fault-injection simulation data, analog simulation data, bus monitoring data, BIT and IETM data, maintenance and inspection records, existing sensor data, and so on.
Step 2) Analyze the target equipment using its related data and establish its parameter diagnostic rule base. The rule base should contain diagnostic rules for all parameters of the target equipment, including but not limited to: upper and lower limits (the extreme limits of a parameter, beyond which a fault is declared); abnormal-jump rules (which specify the case where a parameter value changes sharply within a short time, and define the jump magnitude and the fault criterion); and abnormal gradual-trend rules (fault criteria for abnormal trends, such as a sudden change from gradually rising to gradually falling).
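The three kinds of rule named in step 2) might be encoded roughly as follows. This is a minimal sketch, not the patent's implementation: the `ParameterRule` fields, the use of consecutive samples to stand in for "a short time", the `trend_window` notion, and the example limits are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class ParameterRule:
    """Hypothetical rule record for one parameter (field names are assumptions)."""
    name: str
    lower: float        # lower limit of the parameter
    upper: float        # upper limit of the parameter
    max_jump: float     # largest allowed change between consecutive samples
    trend_window: int   # number of consecutive steps that defines a trend


def violates_trend(rule: ParameterRule, history: Sequence[float]) -> bool:
    """True if the last samples rise for trend_window steps and then fall for trend_window steps."""
    w = rule.trend_window
    if len(history) < 2 * w + 1:
        return False
    recent = history[-(2 * w + 1):]
    diffs = [b - a for a, b in zip(recent, recent[1:])]
    return all(d > 0 for d in diffs[:w]) and all(d < 0 for d in diffs[w:])


def triggered_rules(rule: ParameterRule, history: Sequence[float]) -> List[str]:
    """Return the names of the rules triggered by the latest sample in history."""
    triggered = []
    value = history[-1]
    if value < rule.lower or value > rule.upper:
        triggered.append("threshold")
    if len(history) >= 2 and abs(value - history[-2]) > rule.max_jump:
        triggered.append("jump")
    if violates_trend(rule, history):
        triggered.append("trend")
    return triggered
```

For a hypothetical bus-voltage rule with limits 26 to 30 and a maximum jump of 1.5, the history `[28.0, 28.1, 31.0]` would trigger both the threshold rule and the jump rule.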
Note that, to ensure the completeness of the final parameter association probability model, the minimum requirement for the rule base is that it contain an individual judgment rule for every parameter. There is therefore no need to build an accurate physical model of the target equipment to derive expressions relating the parameters.
Step 3) Provided the parameter diagnostic rule base is complete, use it as the basis for screening out the abnormal data entries from the mass operating data obtained in step 1). The diagnostic rules can be entered into a computer so that the screening is executed automatically. The operating data should have the following format:
1. Each complete data entry should contain the exact time at which the entry occurred and the values of all parameters of the target equipment at that time.
2. Each single data value in an entry should be the measured value of one parameter of the target equipment at that time.
3. The data entries are arranged one by one in chronological order.
The screened fault data should have the following format:
1. Each entry contains the exact time at which the data entry occurred.
2. Each entry contains all parameters that are faulty at that time, to support the subsequent mapping and reduction.
3. Each faulty parameter in an entry is marked, according to the parameter diagnostic rule base, with the rule that the fault triggered (threshold rule, jump rule, etc.).
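The two data formats above can be illustrated with a short screening sketch. It assumes, purely for illustration, that an operating entry is a `(timestamp, {parameter: value})` pair and that each parameter's rule is a `(lower, upper, max_jump)` tuple; trend rules are left out to keep the example short, and all names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Entry = Tuple[float, Dict[str, float]]            # (time, value of every parameter)
FaultEntry = Tuple[float, Dict[str, List[str]]]   # (time, {faulty parameter: triggered rules})
Limits = Dict[str, Tuple[float, float, float]]    # parameter -> (lower, upper, max_jump)


def screen_fault_entries(entries: List[Entry], limits: Limits) -> List[FaultEntry]:
    """Keep only entries with at least one faulty parameter, marking the triggered rules."""
    faults: List[FaultEntry] = []
    previous: Optional[Dict[str, float]] = None
    for timestamp, values in entries:  # entries are assumed to be in chronological order
        faulty: Dict[str, List[str]] = {}
        for name, value in values.items():
            lower, upper, max_jump = limits[name]
            triggered = []
            if not lower <= value <= upper:
                triggered.append("threshold")
            if previous is not None and abs(value - previous[name]) > max_jump:
                triggered.append("jump")
            if triggered:
                faulty[name] = triggered
        if faulty:
            faults.append((timestamp, faulty))
        previous = values
    return faults
```

The output is the unclassified fault data set: every retained entry carries its time, all of its faulty parameters, and the rule each one triggered.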
The data obtained at this point is the complete set of fault data, not yet classified. Once the fault data has been obtained, the clustering operation is performed on it.
Step 4) Apply supervised autonomous clustering to the unclassified fault data set with a clustering algorithm. After a number of clusters and cluster centers that meet the requirements have been obtained, classify all fault data in the unclassified fault data set by the determined cluster centers to obtain the classified fault data set.
The clustering operation uses the K-Means method to cluster autonomously the fault data separated in the previous step. The first and most important step is determining K, the number of cluster centers. The K cluster centers in effect represent K kinds of fault condition.
The invention chooses K by optimizing a silhouette coefficient. The silhouette coefficient of a cluster is the average geometric distance from the vector points of all data entries in the cluster to the cluster center. After clustering is complete, the lower the silhouette coefficient, the better the classification quality of the cluster.
As shown in Fig. 3, step 4) specifically includes:
Step 101) Starting from K=2, set the initial number of clusters K to 2, and cluster the unclassified fault data set with the current K to obtain K cluster centers and their K corresponding clusters.
Step 102) After the clustering operation for the current K is complete, compute the average silhouette coefficient of the K clusters and compare it with that of the K-1 clusters. When the silhouette coefficient has gradually converged and no longer decreases as K increases, choose the current K as the total number of clusters; otherwise set K=K+1 and repeat step 101). As shown in Figs. 2a, 2b, 2c, and 2d, four tests were run to choose K. In the four tests shown, the change in the silhouette coefficient gradually shrinks as K increases, and the coefficient converges when K reaches 11.
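The K-selection loop of steps 101) and 102) can be sketched as below. The code follows the patent's own definition of the coefficient (mean point-to-center distance per cluster, averaged over clusters), which is not the silhouette score as usually defined in the clustering literature. The `cluster_fn` callable, the tolerance `tol`, and the cap `k_max` are assumptions introduced for the example; the patent only says the coefficient should "no longer decrease".

```python
import math
from typing import Callable, List, Sequence

Point = Sequence[float]


def average_center_distance(points: List[Point], centers: List[Point]) -> float:
    """Mean, over non-empty clusters, of the mean distance from member points to the center."""
    sums = [0.0] * len(centers)
    counts = [0] * len(centers)
    for p in points:
        distance, index = min((math.dist(p, c), i) for i, c in enumerate(centers))
        sums[index] += distance
        counts[index] += 1
    per_cluster = [s / n for s, n in zip(sums, counts) if n > 0]
    return sum(per_cluster) / len(per_cluster)


def choose_k(
    points: List[Point],
    cluster_fn: Callable[[List[Point], int], List[Point]],  # hypothetical: returns K centers
    tol: float = 1e-3,
    k_max: int = 20,
) -> int:
    """Increase K from 2 until the coefficient stops decreasing by more than tol."""
    previous = None
    for k in range(2, k_max + 1):
        score = average_center_distance(points, cluster_fn(points, k))
        if previous is not None and previous - score <= tol:
            return k  # the text chooses the current K once the coefficient has converged
        previous = score
    return k_max
```

Any clustering routine that returns K centers can be passed as `cluster_fn`; in the tests described above, such a loop would stop near K=11.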
Step 103) Cluster the unclassified fault data set with the total number of clusters determined in step 102), and classify all fault data in the unclassified fault data set by the resulting cluster centers to obtain the classified fault data set.
In step 101), while K is being determined, cluster centers must be chosen for each current K. The first task is choosing the initial cluster centers (seed points): for the current K, K seed points are needed. The cluster centers are chosen as follows:
Step 101-1) First, randomly choose the vector point of one data entry from the full operating database of the target equipment as the first cluster center, and take the vector point with the smallest geometric distance to the first cluster center as the second cluster center.
Step 101-2) For each vector point, compute the geometric distance Distance(x) to its nearest cluster center and store it in an array; then add these distances to obtain the total distance Sum(Distance(x)).
Step 101-3) Draw another random value and use it, in a weighted manner, to obtain the next cluster center. Concretely, randomly choose a value Random that falls within the total distance Sum(Distance(x)), then go through the vector points, updating Random=Random-Distance(x); the point at which Random≤0 is the next cluster center. Repeat steps 101-2) and 101-3) until K cluster centers have been selected.
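Steps 101-1) to 101-3) can be sketched as a roulette-wheel selection weighted by each point's distance to its nearest center. This resembles k-means++ seeding, except that k-means++ weights by squared distance while the text above uses the distance itself. The function name, the `rng` argument, and the guard that skips points already coinciding with a center are assumptions for the example; it also assumes at least two points and K of at least 2.

```python
import math
import random
from typing import List, Optional, Sequence

Point = Sequence[float]


def select_seeds(points: List[Point], k: int, rng: Optional[random.Random] = None) -> List[Point]:
    """Pick k initial cluster centers from points (assumes len(points) >= 2 and k >= 2)."""
    rng = rng or random.Random()
    # Step 101-1: random first center, then the point nearest to it as the second center.
    first = rng.choice(points)
    others = [p for p in points if p is not first]
    centers = [first, min(others, key=lambda p: math.dist(p, first))]
    while len(centers) < k:
        # Step 101-2: distance from every point to its nearest center, and their sum.
        distances = [min(math.dist(p, c) for c in centers) for p in points]
        total = sum(distances)
        if total == 0:
            break  # every point already coincides with a center
        # Step 101-3: draw Random within the total, subtract distances until Random <= 0.
        remaining = rng.random() * total
        for p, d in zip(points, distances):
            remaining -= d
            if d > 0 and remaining <= 0:
                centers.append(p)
                break
    return centers[:k]
```

Points far from every existing center occupy a larger share of the total distance and are therefore more likely to be drawn as the next center.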
After the cluster centers are chosen, the next step is training the clusters. For each fault sample, compute the geometric distance from its vector point to every cluster center and assign it to the nearest one; then compute the geometric center of each updated cluster and replace the cluster's former center with it. Check whether the cluster centers have changed. If they have (not converged), repeat the process. When the cluster centers converge (no longer change), the clustering operation is complete.
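The training loop described above is the standard K-Means assign-and-update iteration. A minimal sketch follows; the `max_iter` cap and the choice to keep the old center for an empty cluster are assumptions, since the text does not address either case.

```python
import math
from typing import List, Sequence, Tuple

Point = Sequence[float]


def assign(points: List[Point], centers: List[List[float]]) -> List[List[Point]]:
    """Group every point under its nearest center."""
    groups: List[List[Point]] = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
        groups[nearest].append(p)
    return groups


def train_clusters(
    points: List[Point], seeds: List[Point], max_iter: int = 100
) -> Tuple[List[List[float]], List[List[Point]]]:
    """Alternate assignment and center update until the centers stop changing."""
    centers = [list(c) for c in seeds]
    for _ in range(max_iter):
        groups = assign(points, centers)
        # New center = geometric center (coordinate-wise mean) of the cluster's members.
        updated = [
            [sum(column) / len(group) for column in zip(*group)] if group else center
            for group, center in zip(groups, centers)
        ]
        if updated == centers:  # converged: no center moved
            break
        centers = updated
    return centers, assign(points, centers)
```

The returned groups are the detailed fault entries belonging to each cluster, and the returned centers are the vector parameters of each cluster center.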
After the above operations, with the optimized K chosen and the clustering performed, the valid data in hand comprises: the unclassified fault data, the number of clusters K, the vector parameters of each cluster center, and the detailed fault data entries belonging to each cluster.
Next comes the map-reduce operation, whose purpose is to discover the nonlinearly coupled fault associations between parameters in the mass of fault data.
Step 5) Apply the map-reduce algorithm to the unclassified fault data set from step 3) to generate the parameter association probability model. The model contains, for each parameter of the target equipment, the probability distribution that other parameters also fail when that parameter fails.
As shown in Fig. 4, step 5) specifically includes:
Step 201) First perform the mapping operation: starting from the unclassified fault data set, map the discrete fault data onto each parameter. Following the order of the parameters, map all fault data entries that contain each parameter together, in turn, to form the mapping class of each parameter. The result of the mapping operation is, for each parameter, all fault data entries containing it and their frequencies of occurrence.
After the mapping operation, the fault entries containing each parameter, and their frequencies, are available. For example, all fault entries in which parameter 1 is faulty are mapped into the first mapping set (the first mapping set on the left of the second layer in Fig. 4), all fault entries in which parameter 2 is faulty are mapped into the second mapping set (the second mapping set on the left of the second layer in Fig. 4), and so on, until the mapping sets of all parameters are obtained.
The reduction operation is then performed on the mapping classes obtained above. Its purpose is to compute, for the moment at which a given parameter fails, the probability that some other parameter also fails at the same time, and thereby to characterize the fault associations between parameters.
Step 202) For each class formed by the mapping, compute the total number of fault data entries in the class (by adding the frequencies), as the denominator of the probability calculation.
Step 203) In each mapping class, accumulate the number of occurrences of each parameter other than the parameter the class corresponds to, adding their frequencies, as the numerator of the probability calculation.
Step 204) Divide the numerator from step 203) by the denominator from step 202) to obtain, for each parameter, the probability distribution that other parameters also fail when that parameter fails. Take the first mapping class (all fault data combinations containing parameter 1) as an example: within this class, retrieve the combinations containing parameter 2, add their frequencies to form the numerator, and divide by the total number of fault entries in the class; this gives the probability that parameter 2 also fails when parameter 1 fails. After parameter 2, compute parameter 3 through parameter s (traversing all parameters). This forms the fault association parameter table of parameter 1.
Likewise, perform the same reduction operation on the 2nd through s-th mapping classes to form the fault association parameter tables of all s parameters.
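Steps 201) to 204) amount to a co-occurrence count followed by a division. The single-process sketch below mirrors the map and reduce phases without any distributed framework; representing each fault entry as the set of its faulty parameter names, and the function name itself, are assumptions made for the example.

```python
from collections import Counter, defaultdict
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple


def association_probabilities(
    fault_entries: Iterable[Set[str]],
) -> Dict[str, List[Tuple[str, float]]]:
    """For each parameter, list (other parameter, P(other faulty | parameter faulty)), high to low."""
    # Map (step 201): one mapping class per parameter, holding the fault
    # combinations that contain it together with their frequencies.
    classes: Dict[str, Counter] = defaultdict(Counter)
    for entry in fault_entries:
        combination: FrozenSet[str] = frozenset(entry)
        for parameter in combination:
            classes[parameter][combination] += 1

    # Reduce (steps 202 to 204): denominator = entries in the class,
    # numerator = entries in the class that also contain the other parameter.
    table: Dict[str, List[Tuple[str, float]]] = {}
    for parameter, combinations in classes.items():
        total = sum(combinations.values())
        co_occurrence: Counter = Counter()
        for combination, frequency in combinations.items():
            for other in combination - {parameter}:
                co_occurrence[other] += frequency
        table[parameter] = sorted(
            ((other, count / total) for other, count in co_occurrence.items()),
            key=lambda item: (-item[1], item[0]),
        )
    return table
```

With five fault entries `{p1, p2}`, `{p1, p2}`, `{p1, p3}`, `{p2}`, and `{p1}`, the class of p1 holds four entries, two of which contain p2 and one of which contains p3, so the table of p1 is p2 with probability 0.5 followed by p3 with probability 0.25.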
The training part is now complete: we have the clusters of K fault kinds generated by K-Means and the parameter association probability model generated by map-reduce. The equipment operating data set can next be used for actual fault diagnosis and verification.
Step 6): taking the classified fault data set from step 4) as the fault discrimination standard, fault category identification is performed on all operation data from step 1) using the nearest neighbor algorithm, yielding the fault classification result. During actual operation, for a new operation data entry, the nearest neighbor algorithm computes its geometric distance to the cluster center of each of the K fault clusters and takes the smallest distance value (the nearest neighbor). If this minimum is less than the silhouette coefficient of that cluster (in this method, the average geometric distance from the cluster's data points to its center), the operation data can be judged to belong to the fault type corresponding to that cluster, and fault diagnosis is carried out on this basis.
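The decision rule of step 6) can be sketched as follows, assuming Euclidean distance as the geometric distance and assuming that the per-cluster threshold (the method's silhouette coefficient, i.e. the mean distance of a cluster's points to its center) has already been computed during training. The function name `classify` and all numeric values are hypothetical.

```python
import math


def classify(entry, centres, thresholds):
    """Return the index of the matching fault cluster, or None.

    entry: coordinates of one operation data entry.
    centres: list of cluster centre coordinates (one per fault cluster).
    thresholds: per-cluster mean distance to the centre (assumed precomputed).
    """
    # Distance from the entry to every cluster centre.
    distances = [math.dist(entry, centre) for centre in centres]
    # Nearest neighbour: the centre with the smallest distance.
    nearest = min(range(len(centres)), key=distances.__getitem__)
    # Accept the fault type only if the entry lies within the cluster's threshold.
    if distances[nearest] < thresholds[nearest]:
        return nearest
    return None


# Hypothetical two-cluster model.
centres = [(0.0, 0.0), (10.0, 10.0)]
thresholds = [1.5, 2.0]

print(classify((0.5, 0.5), centres, thresholds))   # close to cluster 0
print(classify((5.0, 5.0), centres, thresholds))   # too far from both clusters
```

An entry that is nearest to some cluster but still farther away than that cluster's threshold is left unassigned, which corresponds to operation data that matches no known fault type.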
Step 7): the fault classification result is combined with the parameter association probability model from step 5) to obtain the comprehensive diagnosis result. The comprehensive diagnosis result includes: the fault classification result, the main fault parameters, and the parameters whose association probability with the main fault parameters is relatively large (the probability threshold can be adjusted according to the actual situation).
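One way to assemble the comprehensive diagnosis result of step 7) is sketched below. The source does not specify a data layout, so this assumes the association model is a nested dictionary of probabilities and that the main fault parameters of the identified fault class are already known; the function name, the default threshold of 0.5, and the sample values are all hypothetical.

```python
def comprehensive_diagnosis(fault_class, main_params, tables, threshold=0.5):
    """Combine a fault class with the parameter association model.

    tables: dict of dicts, tables[p][q] = probability that q also fails
    when p fails (assumed layout). threshold is adjustable, as the
    method allows.
    """
    associated = {}
    for param in main_params:
        for other, prob in tables.get(param, {}).items():
            # Report only parameters that are not themselves main fault
            # parameters and whose association probability is large enough.
            if other not in main_params and prob >= threshold:
                associated[other] = max(prob, associated.get(other, 0.0))
    return {
        "fault_class": fault_class,
        "main_parameters": list(main_params),
        "associated_parameters": associated,
    }


# Hypothetical association model and classification outcome.
tables = {"p1": {"p2": 0.8, "p3": 0.2}}
result = comprehensive_diagnosis(0, ["p1"], tables, threshold=0.5)
print(result)
```

Lowering the threshold widens the list of associated parameters reported to maintenance staff; raising it keeps only the strongest associations.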
In conclusion according to the relevant parameter failure modes side provided by the invention based on the analysis of big data Fusion of Clustering
Method realizes the intelligent fault classification and relevant parameter analysis excavated based on mass data.With the controllable failure of accuracy rate point
Class ability.And for the failure sorted out, according to parameter association probabilistic model, the association that can provide dependent failure parameter is general
Rate, to improve the formulation of the intelligent diagnostics and maintenance decision of failure.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the invention, not to limit it. Although the invention has been described in detail with reference to the embodiments, those of ordinary skill in the art should understand that modifications or equivalent replacements of the technical solution of the invention that do not depart from the spirit and scope of the technical solution shall all be covered by the scope of the claims of the invention.
Claims (7)
1. An associated-parameter fault classification method based on big data fusion clustering analysis, characterized by comprising:
Step 1) obtaining the various operation data of the target equipment;
Step 2) establishing, according to the relevant data of the target equipment, a parameter diagnostic rule library covering all parameters of the target equipment;
Step 3) screening all operation data from step 1) against the rules in the parameter diagnostic rule library to obtain fault data, and assembling all fault data into an unclassified fault data set;
Step 4) performing supervised autonomous data clustering on the unclassified fault data set by means of a clustering algorithm and, after obtaining a cluster number that meets the requirements together with each cluster center, classifying all fault data in the unclassified fault data set according to the determined cluster centers to obtain a classified fault data set;
Step 5) applying a mapping-reduction algorithm to the unclassified fault data set from step 3) to generate a parameter association probability model, the parameter association probability model comprising, for each parameter of the target equipment, the probability distribution data of the other parameters also failing when that parameter fails; step 5) specifically comprising:
Step 201) mapping together, parameter by parameter, all fault data entries that contain each parameter, forming the mapping class corresponding to each parameter, the mapping class comprising all fault data entries of one parameter and their frequencies of occurrence;
Step 202) calculating the total number of fault data entries in each mapping class as the denominator of the probability calculation;
Step 203) accumulating, within each mapping class, the number of occurrences of the other parameters, other than the parameter to which the mapping class corresponds, as the numerator of the probability calculation;
Step 204) taking the ratio of the numerator from step 203) to the denominator from step 202) to obtain the probability distribution data of the other parameters also failing when each parameter fails;
Step 6) taking the classified fault data set from step 4) as the fault discrimination standard, performing fault category identification on all operation data from step 1) using the nearest neighbor algorithm to obtain a fault classification result;
Step 7) combining the fault classification result with the parameter association probability model from step 5) to obtain the probability distribution data of all parameters when that fault classification result occurs.
2. The associated-parameter fault classification method based on big data fusion clustering analysis according to claim 1, characterized in that the format of the operation data obtained in step 1) satisfies the following: each complete data entry contains the moment at which the data entry occurred and all parameter values of the target equipment at that moment; a single data value in a data entry represents the measured value of one parameter of the target equipment at a given moment; and the data entries are arranged one by one in chronological order of occurrence.
3. The associated-parameter fault classification method based on big data fusion clustering analysis according to claim 1, characterized in that the format of the fault data screened in step 3) satisfies the following: each data entry contains the moment at which the data entry occurred and all fault parameters that failed at that moment; and each failed parameter in a data entry is labeled, according to the parameter diagnostic rule library, with the rule that was triggered when the fault occurred.
4. The associated-parameter fault classification method based on big data fusion clustering analysis according to claim 1, characterized in that the parameter diagnostic rule library comprises upper and lower limits of parameters, parameter jump anomaly determination rules, and parameter gradual-trend anomaly determination rules.
5. The associated-parameter fault classification method based on big data fusion clustering analysis according to claim 1, characterized in that step 4) specifically comprises:
Step 101) setting the initial cluster number K to 2, and performing a clustering operation on the unclassified fault data set according to the current value of K to obtain K cluster centers and their corresponding K clusters;
Step 102) calculating the mean silhouette coefficient of the K clusters and comparing it with the mean silhouette coefficient of K-1 clusters; if the two mean silhouette coefficients are unchanged, selecting the current value of K as the total number of clusters; otherwise setting K = K+1 and re-executing step 101); the silhouette coefficient denoting the average geometric distance from the vector points corresponding to all data entries contained in a cluster to the cluster center;
Step 103) performing a clustering operation on the unclassified fault data set with the total number of clusters determined in step 102), and classifying all fault data in the unclassified fault data set according to the cluster centers obtained, to obtain the classified fault data set.
6. The associated-parameter fault classification method based on big data fusion clustering analysis according to claim 5, characterized in that the operation of obtaining the cluster centers in step 101) comprises:
Step 101-1) randomly selecting, from all operation data of the target equipment, the vector point corresponding to one data entry as the first cluster center, and finding the vector point nearest in geometric distance to the first cluster center as the second cluster center;
Step 101-2) calculating the geometric distance Distance(x) between each cluster center and its nearest cluster center, and adding all geometric distances Distance(x) to obtain the total distance Sum(Distance(x));
Step 101-3) randomly selecting a vector point Random corresponding to a data entry that falls within the total distance Sum(Distance(x)) as the newly added cluster center, and re-executing step 101-2) until K cluster centers have been selected.
7. The associated-parameter fault classification method based on big data fusion clustering analysis according to claim 1, characterized in that step 6) specifically comprises: calculating the geometric distance between each item of operation data from step 1) and each determined cluster center, and comparing the smallest distance value with the mean silhouette coefficient of the corresponding cluster; if the distance value is less than the mean silhouette coefficient of the corresponding cluster, determining that the operation data belongs to the fault type corresponding to that cluster.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611247433.2A CN106845526B (en) | 2016-12-29 | 2016-12-29 | A kind of relevant parameter Fault Classification based on the analysis of big data Fusion of Clustering |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611247433.2A CN106845526B (en) | 2016-12-29 | 2016-12-29 | A kind of relevant parameter Fault Classification based on the analysis of big data Fusion of Clustering |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106845526A CN106845526A (en) | 2017-06-13 |
CN106845526B true CN106845526B (en) | 2019-12-03 |
Family
ID=59114134
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611247433.2A Active CN106845526B (en) | 2016-12-29 | 2016-12-29 | A kind of relevant parameter Fault Classification based on the analysis of big data Fusion of Clustering |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106845526B (en) |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110018980B (en) * | 2017-12-25 | 2021-07-27 | 北京金风科创风电设备有限公司 | Method and device for searching fault data from simulation data of fan controller |
WO2019167180A1 (en) * | 2018-02-28 | 2019-09-06 | 日産自動車株式会社 | Abnormality type determining device and abnormality type determining method |
CN108763289B (en) * | 2018-04-13 | 2021-11-23 | 西安电子科技大学 | Massive heterogeneous sensor format data analysis method |
CN109445306B (en) * | 2018-10-26 | 2022-01-25 | 湖南磁浮技术研究中心有限公司 | Automatic associated parameter interpretation method and system based on rule configuration analysis |
CN109991951B (en) * | 2019-04-28 | 2020-10-02 | 齐鲁工业大学 | Multi-source fault detection and diagnosis method and device |
CN110263944A (en) * | 2019-05-21 | 2019-09-20 | 中国石油大学(华东) | A kind of multivariable failure prediction method and device |
CN113392208A (en) * | 2020-03-12 | 2021-09-14 | 中国移动通信集团云南有限公司 | Method, device and storage medium for IT operation and maintenance fault processing experience accumulation |
CN113282433B (en) * | 2021-06-10 | 2023-04-28 | 天翼云科技有限公司 | Cluster anomaly detection method, device and related equipment |
CN113421176B (en) * | 2021-07-16 | 2022-11-01 | 昆明学院 | Intelligent screening method for abnormal data in student score scores |
CN113656389B (en) * | 2021-08-12 | 2022-05-27 | 北京可视化智能科技股份有限公司 | Intelligent factory abnormal data processing method, device and system and storage medium |
CN116483705B (en) * | 2023-04-17 | 2024-10-11 | 哈尔滨工业大学 | Knowledge and model driven airborne software intelligent failure mode analysis method |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105701157A (en) * | 2015-12-30 | 2016-06-22 | 芜湖乐锐思信息咨询有限公司 | Monitoring system for integrating social network site information |
CN105718935A (en) * | 2016-01-25 | 2016-06-29 | 南京信息工程大学 | Word frequency histogram calculation method suitable for visual big data |
CN105891629B (en) * | 2016-03-31 | 2017-12-29 | 广西电网有限责任公司电力科学研究院 | A kind of discrimination method of transformer equipment failure |
CN106021062B (en) * | 2016-05-06 | 2018-08-07 | 广东电网有限责任公司珠海供电局 | The prediction technique and system of relevant fault |
CN106251034A (en) * | 2016-07-08 | 2016-12-21 | 大连大学 | Wisdom energy saving electric meter monitoring system based on cloud computing technology |
Also Published As
Publication number | Publication date |
---|---|
CN106845526A (en) | 2017-06-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106845526B (en) | A kind of relevant parameter Fault Classification based on the analysis of big data Fusion of Clustering | |
CN106355030B (en) | A kind of fault detection method based on analytic hierarchy process (AHP) and Nearest Neighbor with Weighted Voting Decision fusion | |
CN103914064B (en) | Based on the commercial run method for diagnosing faults that multi-categorizer and D-S evidence merge | |
EP2930578B1 (en) | Failure cause classification apparatus | |
CN111507376B (en) | Single-index anomaly detection method based on fusion of multiple non-supervision methods | |
CN109416531A (en) | The different degree decision maker of abnormal data and the different degree determination method of abnormal data | |
CN107967485A (en) | Electro-metering equipment fault analysis method and device | |
CN107430715A (en) | Cascade identification in building automation | |
CN114358152A (en) | Intelligent power data anomaly detection method and system | |
CN111858231A (en) | Single index abnormality detection method based on operation and maintenance monitoring | |
CN106404441B (en) | A kind of failure modes diagnostic method based on non-linear similarity index | |
CN110455537A (en) | A kind of Method for Bearing Fault Diagnosis and system | |
CN113255848A (en) | Water turbine cavitation sound signal identification method based on big data learning | |
CN111191726B (en) | Fault classification method based on weak supervision learning multilayer perceptron | |
CN101021723A (en) | Melt index detection fault diagnozing system and method in propylene polymerization production | |
CN109240276B (en) | Multi-block PCA fault monitoring method based on fault sensitive principal component selection | |
CN110163075A (en) | A kind of multi-information fusion method for diagnosing faults based on Weight Training | |
CN112906764B (en) | Communication safety equipment intelligent diagnosis method and system based on improved BP neural network | |
CN110490486B (en) | Enterprise big data management system | |
CN108334898A (en) | A kind of multi-modal industrial process modal identification and Fault Classification | |
CN101738998A (en) | System and method for monitoring industrial process based on local discriminatory analysis | |
CN112257767A (en) | Product key part state classification method aiming at class imbalance data | |
CN116341901A (en) | Integrated evaluation method for landslide surface domain-monomer hazard early warning | |
CN114266289A (en) | Complex equipment health state assessment method | |
CN109871002A (en) | The identification of concurrent abnormality and positioning system based on the study of tensor label |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||