CN106845526A - Associated-parameter fault classification method based on big-data fusion clustering analysis - Google Patents
- Publication number
- CN106845526A CN201611247433.2A
- Authority
- CN
- China
- Prior art keywords
- data
- parameter
- fault
- cluster
- classification
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24133—Distances to prototypes
- G06F18/24143—Distances to neighbourhood prototypes, e.g. restricted Coulomb energy networks [RCEN]
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Probability & Statistics with Applications (AREA)
- Test And Diagnosis Of Digital Computers (AREA)
- Testing And Monitoring For Control Systems (AREA)
Abstract
The present invention provides an associated-parameter fault classification method based on big-data fusion clustering analysis. From the massive data generated during equipment operation, the method selects fault data according to diagnostic rules and performs supervised, autonomous machine clustering on them, producing an automatic classification of associated-parameter faults. This addresses two problems: current equipment fault diagnosis relies excessively on expert knowledge bases and ignores the associations between deeply, nonlinearly coupled parameters of different subsystems; and the massive valid data produced by equipment in actual operation are not well mined or used. Because the method does not depend on precise physical modeling of the target equipment, it also avoids the difficulty of modeling traditional complex systems. It realizes intelligent fault classification and associated-parameter analysis based on mining of massive data, with a fault classification capability of controllable accuracy.
Description
Technical Field
The present invention relates to the field of equipment prognostics and health management (PHM), and in particular to an associated-parameter fault classification method based on big-data fusion clustering analysis.
Background
Prognostics and health management (PHM) has developed into an important supporting technology and foundation for system logistics support, maintenance, and autonomous health management in the aerospace field. In the "National Program for Medium-to Long-term Scientific and Technological Development 2006-2020", "life prediction technology for major products and major facilities" is put forward as a frontier technology, and recent development reports on the astronautics and aeronautics disciplines list PHM as a key supporting technology.
PHM has become a popular interdisciplinary research direction spanning basic materials, mechanical structures, energy, electronics, automatic testing, reliability, and information, with important application value and practical significance. In most industrial PHM applications, building a mathematical or physical model of a complex component or system is very difficult or even impossible, or identifying the model parameters is complicated. Test data and historical sensor data from each stage of a component or system (design, simulation, operation, and maintenance) therefore become the main means of understanding system performance degradation.
As a result, PHM methods based on mining test data or historical sensor data have gradually gained attention and developed rapidly, becoming an important research focus in the PHM field. For complex systems such as aerospace systems in particular, it is difficult to directly obtain or construct physical models that characterize component and system degradation and remaining life, while these systems and components have large amounts of condition-monitoring and test data available. Data-driven PHM methods have therefore received extensive attention from the U.S. military, NASA, and numerous research institutions and industrial enterprises.
Data-driven PHM methods use advanced sensor technology to collect characteristic parameters related to system properties, associate these parameters with useful information, and apply intelligent algorithms and models for detection, analysis, and prediction. They give the remaining-life distribution, the degree of performance degradation, or the probability of mission failure of the target system, thereby providing decision information for system maintenance and support.
Within data-driven PHM, issues such as method workflow, fusion of different methods, model selection, and model adaptability have become current research priorities, and data-driven PHM methods have been widely applied and promoted thanks to their flexibility, adaptability, and ease of use.
Summary of the Invention
The object of the present invention is to solve the technical problem that existing data-driven PHM methods have difficulty acquiring fault data. The invention provides an associated-parameter fault classification method based on big-data fusion clustering analysis, in order to improve the present situation in which the operating data of existing complex equipment, which contain massive amounts of information, are not effectively mined or used.
To achieve this object, the invention provides a complete algorithm flow that performs the computation and analysis and yields the final fault classification and parameter association probability model. The associated-parameter fault classification method comprises:
Step 1) Obtain the various operating data of the target equipment.
Step 2) Based on the design data of the target equipment, establish a parameter diagnostic rule base covering all parameters of the target equipment. The rule base contains not only threshold rules for the parameters but also trend rules and jump rules.
Step 3) Using the rules in the parameter diagnostic rule base as the criterion, screen all the operating data from step 1) to obtain fault data; the set of all fault data forms the unclassified fault data set.
Step 4) Perform supervised, autonomous data clustering on the unclassified fault data set with a clustering algorithm to obtain a satisfactory number of clusters and the cluster centers. The number of clusters is increased step by step from 2, and the smallest value at which the average weighted distance of the cluster cores no longer decreases is selected as the total number of clusters. The unclassified fault data set is then classified by the determined cluster centers to obtain the classified fault data set.
Step 5) Apply a map-reduce algorithm to the unclassified fault data set from step 3) to generate a parameter association probability model. For each parameter of the target equipment, the model contains the probability distribution that other parameters also fail when that parameter fails, arranged as a probability table in descending order.
Step 6) Using the classified fault data set from step 4) as the fault discrimination standard, identify the fault category of the operating data obtained in step 1) with a nearest-neighbor algorithm to obtain the fault classification result.
Step 7) Combine the fault classification result with the parameter association probability model from step 5) to obtain the comprehensive fault diagnosis and classification result, which comprises the fault classification result and the probability distribution data of all parameters of that fault classification result.
As a further improvement of the above technical solution, the operating data obtained in step 1) have the following format: each complete data entry contains the time at which the entry occurred and the values of all parameters of the target equipment at that time; each individual data value in an entry represents the measured value of one parameter of the target equipment at a given time; and the data entries are arranged one by one in order of their time of occurrence.
As a further improvement of the above technical solution, the fault data screened in step 3) have the following format: each data entry contains the time at which the entry occurred and all parameters that failed at that time; for each failed parameter in an entry, the rule that triggered the fault is marked according to the parameter diagnostic rule base.
As a further improvement of the above technical solution, the parameter diagnostic rule base contains the upper and lower limits of the parameters, parameter jump anomaly rules, and parameter gradual-trend anomaly rules.
As a further improvement of the above technical solution, step 4) specifically comprises:
Step 101) Set the initial number of clusters K to 2, and cluster the unclassified fault data set with the current value of K to obtain K cluster centers and the corresponding K clusters;
Step 102) Calculate the mean silhouette coefficient of the K clusters and compare it with the mean silhouette coefficient of K-1 clusters; if the two are unchanged, select the current K as the total number of clusters; otherwise set K=K+1 and execute step 101) again. Here the silhouette coefficient denotes the average geometric distance from the vector points of all data entries in a cluster to the cluster center;
Step 103) Cluster the unclassified fault data set with the total number of clusters determined in step 102), and classify all fault data in the unclassified fault data set by the resulting cluster centers to obtain the classified fault data set.
As a further improvement of the above technical solution, the operation of obtaining the cluster centers in step 101) comprises:
Step 101-1) Randomly select the vector point of one data entry from all operating data of the target equipment as the first cluster center, and take the vector point with the smallest geometric distance to the first cluster center as the second cluster center;
Step 101-2) Calculate the geometric distance Distance(x) between each vector point and its nearest cluster center, and add all the distances Distance(x) to obtain the total distance Sum(Distance(x));
Step 101-3) Randomly select a value Random that falls within the total distance Sum(Distance(x)), take the vector point of the corresponding data entry as the newly added cluster center, and execute step 101-2) again until K cluster centers have been selected.
As a further improvement of the above technical solution, step 5) specifically comprises:
Step 201) Map all fault data entries that contain each parameter together, parameter by parameter, forming one mapping class per parameter; each mapping class contains all fault data entries containing that parameter and their frequencies of occurrence;
Step 202) Calculate the total number of fault data entries in each mapping class as the denominator of the probability calculation;
Step 203) In each mapping class, accumulate the number of occurrences of each parameter other than the parameter to which the class corresponds, as the numerator of the probability calculation;
Step 204) Divide the numerator from step 203) by the denominator from step 202) to obtain, for each parameter, the probability distribution that other parameters also fail when that parameter fails.
As a further improvement of the above technical solution, step 6) specifically comprises: calculate the geometric distance between each item of operating data from step 1) and each of the determined cluster centers, take the minimum distance, and compare it with the mean silhouette coefficient of the corresponding cluster; if the distance is less than that mean silhouette coefficient, the operating data are judged to belong to the fault type corresponding to that cluster.
The advantages of the associated-parameter fault classification method based on big-data fusion clustering analysis of the present invention are as follows:
The invention provides a clearly defined, practically operable, and effective associated-parameter fault classification method based on fusion clustering analysis of massive data, which improves on the following technical problems of existing fault diagnosis methods:
1. Current equipment fault diagnosis relies excessively on expert knowledge bases. When facing a complex system, an expert knowledge base runs into combinatorial explosion, can hardly cover all fault cases and their associated parameters, and ignores the nonlinear correlations between deeply coupled parameters of different subsystems. The fault classification method of the invention uses data mining to uncover the parameter associations and fault modes across subsystems, and can thus effectively mitigate this problem.
2. Existing data-driven PHM methods are limited to component-level fault diagnosis. In system-level fault diagnosis of complex systems, accurate modeling of the whole system is difficult, so the different kinds of fault data mixed into normal data are handled mainly by unsupervised machine-learning clustering. The resulting clusters contain both normal and fault data, and the fault data are poorly classified. Consequently, although current data-driven fault diagnosis methods perform well in component-level diagnosis, in complex-system-level diagnosis they can hardly outperform model-driven fault diagnosis methods. The fault classification method of the invention combines the advantages of data-driven and model-driven methods: it uses the existing model-based expert knowledge base to perform supervised classification (supervised by the interpretation results) of the equipment operating data, which greatly improves the classification and convergence of the data and can remedy the poor classification performance of current data-driven PHM methods.
Brief Description of the Drawings
Fig. 1 is the overall flow chart of the associated-parameter fault classification method based on big-data fusion clustering analysis in an embodiment of the present invention.
Fig. 2a to Fig. 2d show four repeated tests for selecting the total number of clusters in an embodiment of the present invention.
Fig. 3 is the operational flow chart of the clustering algorithm in an embodiment of the present invention.
Fig. 4 is a diagram of the parameter association probability algorithm based on the map-reduce algorithm in an embodiment of the present invention.
Detailed Description
The associated-parameter fault classification method based on big-data fusion clustering analysis of the present invention is described in detail below with reference to the accompanying drawings and embodiments.
Current equipment fault diagnosis relies excessively on expert knowledge bases, which can hardly cover the nonlinear correlations between deeply coupled parameters of different subsystems, and existing data-driven methods perform poorly in complex-system fault diagnosis and leave massive data without effective mining. To address these problems, the invention provides a clearly defined, practically operable, and effective associated-parameter fault classification method based on fusion clustering analysis of massive data.
In this embodiment, the associated-parameter fault classification method based on big-data fusion clustering analysis is verified using the power supply system of a certain piece of equipment as an example. A comprehensive fault classification result is formed through data preprocessing, rule establishment, fault data screening, clustering, mapping, and reduction.
First, an equipment operating data set is built from data sources such as real-time operating data and fault-injection data of the equipment, for training and validating the data-driven model. Second, a parameter diagnostic rule base is established for the target equipment, for interpreting and detecting parameter faults in real time during operation. Then, according to the diagnostic rule base, the massive data from the operation of the equipment are interpreted, and the data entries containing faulty parameters are separated out. After the fault data are separated, fault-type clustering is performed with a supervised, autonomous machine-learning clustering method. The generated clusters are used for fault determination; at the same time a fault parameter matrix is generated, and associated-parameter analysis is performed with the map-reduce (Map-Reduce) method to form the analysis result. It follows that the fault classification method of the invention selects fault data from the massive data of equipment operation according to diagnostic rules and performs supervised, autonomous machine clustering, producing an automatic classification of associated-parameter faults. This addresses the problems that current equipment fault diagnosis relies excessively on expert knowledge bases and ignores the associations between deeply, nonlinearly coupled parameters of different subsystems, and that the massive valid data of equipment in actual operation are not well mined or used. Moreover, because the method does not depend on precise physical modeling of the target equipment, it avoids the difficulty of modeling traditional complex systems.
With reference to Fig. 1, the associated-parameter fault classification method specifically comprises:
Step 1) Obtain the various operating data of the target equipment. The operating data include fault-injection simulation data, analog simulation data, bus monitoring data, BIT and IETM data, maintenance and inspection records, existing sensor data, and so on.
Step 2) Analyze the target equipment based on its relevant documentation and establish its parameter diagnostic rule base. The rule base should contain diagnostic rules for all parameters of the target equipment, including but not limited to: upper and lower limits (the extreme limits of a parameter, beyond which a fault is declared); jump anomaly rules (which define cases where a parameter's value jumps sharply within a short time, together with the jump magnitude and the fault criterion); and gradual-trend anomaly rules (fault criteria for abnormal trends such as a sudden change from gradually rising to gradually falling).
Note that, to ensure the completeness of the final parameter association probability model, the minimum requirement on the parameter diagnostic rule base is that it contain an individual decision rule for each parameter. It is therefore unnecessary to build a precise physical model of the target equipment in order to obtain expressions relating the parameters.
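The three rule types of step 2) can be illustrated with a minimal sketch. Everything named here (`ParameterRule`, `check_parameter`, the field names, and the way a trend reversal is detected from a fixed run of rises followed by falls) is an assumption made for illustration; the text does not specify how the rules are encoded.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ParameterRule:
    """Diagnostic rules for one parameter (all field names are illustrative)."""
    name: str
    lower: float      # lower limit
    upper: float      # upper limit
    max_jump: float   # largest allowed change between two consecutive samples
    trend_run: int    # number of consecutive rises, then falls, treated as abnormal


def check_parameter(rule: ParameterRule, values: List[float]) -> Optional[str]:
    """Return the name of the rule triggered by the latest sample, or None."""
    current = values[-1]
    # Threshold rule: the value lies outside the upper and lower limits.
    if current < rule.lower or current > rule.upper:
        return "threshold"
    # Jump rule: the value changed too much since the previous sample.
    if len(values) >= 2 and abs(current - values[-2]) > rule.max_jump:
        return "jump"
    # Trend rule: `trend_run` rises immediately followed by `trend_run` falls.
    n = rule.trend_run
    if n > 0 and len(values) >= 2 * n + 1:
        recent = values[-(2 * n + 1):]
        diffs = [b - a for a, b in zip(recent, recent[1:])]
        if all(d > 0 for d in diffs[:n]) and all(d < 0 for d in diffs[n:]):
            return "trend"
    return None
```

One rule object per parameter satisfies the minimum requirement stated above: every parameter has its own decision rule, and no expression relating different parameters is needed.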
Step 3) Provided the parameter diagnostic rule base is complete, use it as the basis for screening out the abnormal data entries in the massive operating data obtained in step 1). The diagnostic rules in the rule base can be entered into a computer so that the screening is performed automatically. The operating data should have the following format:
1. Each complete data entry should contain the exact time at which the entry occurred and the values of all parameters of the target equipment at that time;
2. Each individual data value in an entry should represent the measured value of one parameter of the target equipment at a given time;
3. The data entries are arranged one by one in order of their time of occurrence.
The screened fault data should have the following format:
1. Each entry contains the exact time at which the data entry occurred;
2. Each entry contains all parameters that failed at that time, to facilitate the subsequent mapping and reduction;
3. For each failed parameter in an entry, the rule that triggered the fault (threshold rule, jump rule, etc.) is marked according to the parameter diagnostic rule base.
The data obtained at this point are all the fault data, not yet classified. Once the fault data have been obtained, the clustering operation is performed on them.
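The screening of step 3) and the two data formats can be sketched as follows. For brevity the sketch applies threshold rules only, and the names `screen_faults`, `limits`, and the dictionary keys `time` and `faults` are hypothetical choices, not part of the described method.

```python
from typing import Dict, List, Tuple

# One operating-data entry: (timestamp, {parameter name: measured value}).
Entry = Tuple[float, Dict[str, float]]


def screen_faults(entries: List[Entry],
                  limits: Dict[str, Tuple[float, float]]) -> List[dict]:
    """Keep only the entries with at least one out-of-limit parameter."""
    fault_entries = []
    for timestamp, values in entries:
        triggered = {}
        for name, value in values.items():
            lower, upper = limits[name]
            if value < lower or value > upper:
                triggered[name] = "threshold"   # rule that flagged the parameter
        if triggered:
            # Fault entry: time of occurrence plus every failed parameter,
            # each marked with the rule it triggered.
            fault_entries.append({"time": timestamp, "faults": triggered})
    return fault_entries
```

Because the input is already ordered by time, the returned fault entries keep that order.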
Step 4) Perform supervised, autonomous data clustering on the unclassified fault data set with a clustering algorithm; after obtaining a satisfactory number of clusters and the cluster centers, classify all fault data in the unclassified fault data set by the determined cluster centers to obtain the classified fault data set.
The clustering operation applies the K-Means method to the fault data separated in the previous step to perform autonomous machine clustering. The first and most important step is determining K (the number of cluster cores). The K cluster cores in effect represent K kinds of fault.
The invention selects K by optimizing the silhouette coefficient. The silhouette coefficient of a cluster is the average geometric distance from the vector points of all data entries in the cluster to the cluster center. After clustering is complete, the lower the silhouette coefficient, the better the classification quality of the cluster.
With reference to Fig. 3, step 4) specifically comprises:
Step 101) Starting from K=2, cluster the unclassified fault data set with the current value of K to obtain K cluster centers and the corresponding K clusters.
Step 102) After the clustering operation, calculate the mean silhouette coefficient of the K clusters for the current K and compare it with the mean silhouette coefficient of K-1 clusters. When, as K increases, the silhouette coefficient gradually converges and no longer decreases, select the current K as the total number of clusters; otherwise set K=K+1 and execute step 101) again. Figs. 2a, 2b, 2c, and 2d show four experiments carried out for selecting K. In these four experiments, the change in the silhouette coefficient gradually shrinks as K increases, and the coefficient converges once K reaches 11.
Step 103) Cluster the unclassified fault data set with the total number of clusters determined in step 102), and classify all fault data in the unclassified fault data set by the resulting cluster centers to obtain the classified fault data set.
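The stopping rule of steps 101) to 103) can be sketched on its own, separately from the clustering itself. The function below is a hedged illustration: `mean_coefficient` stands for any routine that clusters the data with K clusters and returns the mean of the per-cluster coefficients, while `k_max` and `tolerance` are assumed parameters that make "no longer decreases" testable with floating-point values. Note that the quantity the text calls the silhouette coefficient is the mean member-to-center distance, which is not the same as the silhouette score commonly used in the clustering literature.

```python
from typing import Callable


def choose_cluster_count(mean_coefficient: Callable[[int], float],
                         k_max: int, tolerance: float) -> int:
    """Increase K from 2 until the mean coefficient stops decreasing."""
    k = 2
    previous = mean_coefficient(k)
    while k < k_max:
        k += 1
        current = mean_coefficient(k)
        # Converged: going from K-1 to K clusters improved by at most `tolerance`.
        if previous - current <= tolerance:
            return k
        previous = current
    return k_max
```

Passing the clustering routine as a callable keeps the sketch independent of how the clusters are actually computed.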
In step 101) above, while K is being determined, cluster centers must be selected for each current value of K. This begins with selecting the initial cluster centers (seed points); for the current K, K seed points must be chosen. The cluster centers are selected as follows:
Step 101-1) First randomly select the vector point of one data entry from all operating data of the target equipment as the first cluster center, and take the vector point with the smallest geometric distance to the first cluster center as the second cluster center.
Step 101-2) For each vector point, calculate its geometric distance Distance(x) to the nearest cluster center and store it in an array; then add these distances Distance(x) to obtain the total distance Sum(Distance(x)).
Step 101-3) Take another random value and obtain the next cluster center by weighted selection. Concretely, randomly choose a value Random that falls within the total distance Sum(Distance(x)), then for each vector point in turn compute Random=Random-Distance(x) until Random<=0; the point reached at that moment is the next cluster center. Repeat step 101-2) and step 101-3) until K cluster centers have been selected.
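The distance-weighted draw of steps 101-2) and 101-3) can be sketched as below. This is an illustration under stated assumptions, not the patented procedure in full: it starts from a single random seed and leaves out the second-center rule of step 101-1), it weights by the plain distance Distance(x) as the text describes, and the function name `choose_seeds` and the explicit `rng` argument are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def choose_seeds(points: Sequence[Point], k: int, rng: random.Random) -> List[Point]:
    """Pick k initial centers by the distance-weighted draw of steps 101-2 and 101-3."""
    centers = [rng.choice(points)]            # first center: uniformly at random
    while len(centers) < k:
        # Step 101-2: distance from every point to its nearest chosen center.
        distances = [min(math.dist(p, c) for c in centers) for p in points]
        total = sum(distances)
        if total == 0:
            raise ValueError("fewer distinct points than requested centers")
        # Step 101-3: draw Random in [0, total), then subtract until it reaches 0.
        remaining = rng.random() * total
        chosen = None
        for point, distance in zip(points, distances):
            if distance == 0:
                continue                      # coincides with a center, never re-picked
            chosen = point                    # fallback if rounding leaves remaining > 0
            remaining -= distance
            if remaining <= 0:
                break
        centers.append(chosen)
    return centers
```

Points far from every existing center occupy a larger share of the total distance, so they are more likely to be drawn, which spreads the seed points across the data.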
After the cluster centers are chosen, the next step is training the clusters. For each fault sample, calculate the geometric distance from its vector point to each cluster center and assign it to the nearest cluster center; then calculate the geometric center of each updated cluster and replace the cluster's former center with the new geometric center. Check whether the cluster centers have changed; if they have (not converged), repeat the above process. When the cluster centers converge (no longer change), the clustering operation is complete.
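The training loop described above is the standard K-Means iteration, and a minimal sketch is given here. The names `train_clusters` and `mean_distance`, the `max_iterations` safeguard, and the choice to leave the center of an empty cluster unchanged are assumptions added for the example.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def train_clusters(points: Sequence[Point], centers: List[Point],
                   max_iterations: int = 100) -> Tuple[List[Point], List[int]]:
    """Alternate assignment and center update until the centers stop changing."""
    centers = list(centers)
    labels: List[int] = []
    for _ in range(max_iterations):
        # Assign every sample to its nearest cluster center.
        labels = [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, label in zip(points, labels) if label == j]
            if not members:
                new_centers.append(old)       # keep the center of an empty cluster
                continue
            # New center: coordinate-wise mean of the cluster members.
            new_centers.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
        if new_centers == centers:
            break                             # converged: centers no longer change
        centers = new_centers
    return centers, labels


def mean_distance(points, labels, centers, j):
    """Mean distance of cluster j's members to its center (non-empty cluster assumed)."""
    members = [p for p, label in zip(points, labels) if label == j]
    return sum(math.dist(p, centers[j]) for p in members) / len(members)
```

`mean_distance` computes the per-cluster quantity that the text calls the silhouette coefficient, so its average over all clusters is the value compared when K is selected.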
After the above operations, with the optimized K selected and the clustering performed, the valid data in hand comprise: the unclassified fault data, the number of clusters K, the vector parameters of each cluster core, and the detailed fault data entries belonging to each cluster.
Next comes the map-reduce operation, whose purpose is to discover the nonlinearly coupled fault associations between parameters from the massive fault data.
Step 5) Apply a map-reduce algorithm to the unclassified fault data set from step 3) to generate a parameter association probability model, which contains, for each parameter of the target equipment, the probability distribution that other parameters also fail when that parameter fails.
With reference to Fig. 4, step 5) specifically comprises:
Step 201) First perform the mapping operation: based on the unclassified fault data set, map the discrete fault data to the individual parameters. In parameter order, map all fault data entries containing each parameter together, forming the mapping class for each parameter. The result of the mapping operation is, for each parameter, all fault data entries containing that parameter and their frequencies of occurrence.
Through the mapping operation, the fault entries containing each parameter, and their frequencies, are obtained. For example, all fault entries in which parameter 1 fails are mapped into the first mapping set (the first mapping set from the left in the second layer of Fig. 4), all fault entries in which parameter 2 fails are mapped into the second mapping set (the second mapping set from the left in the second layer of Fig. 4), and so on until the mapping sets of all parameters are obtained.
The reduction operation is then performed on the mapping classes obtained above. Its purpose is to calculate the probability that, at the moment a given parameter fails, another given parameter fails at the same time, which characterizes the fault association between parameters.
Step 202) For each class formed by the mapping, calculate the total number of fault data entries in the class (the sum of the frequencies) as the denominator of the probability calculation.
Step 203) In each mapping class, accumulate the occurrences of each parameter other than the parameter to which the class corresponds, summing their frequencies, as the numerator of the probability calculation.
Step 204) Divide the numerator from step 203) by the denominator from step 202) to obtain, for each parameter, the probability distribution that other parameters also fail when that parameter fails. Taking the first mapping class (all data combinations in which parameter 1 fails) as an example: within this class, retrieve the combinations containing parameter 2, sum their frequencies as the numerator, and divide by the total number of fault entries in the class; this gives the probability that parameter 2 also fails when parameter 1 fails. After parameter 2, calculate parameters 3 through s (traversing all parameters). This forms the fault association parameter table of parameter 1.
Similarly, the same reduction is performed on the 2nd through s-th mapping classes, forming the fault association parameter tables of all s parameters.
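Steps 201) to 204) amount to counting co-occurrences of failed parameters. The sketch below runs the map and reduce phases in a single process, with each fault entry reduced to the set of parameters that failed at one time; the function name `association_model` and the tie-breaking by parameter name are assumptions for the example, and a distributed Map-Reduce framework would split the same two phases across workers.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Iterable, List, Tuple


def association_model(fault_entries: Iterable[FrozenSet[str]]
                      ) -> Dict[str, List[Tuple[str, float]]]:
    """For each parameter, P(other parameter also failed | this parameter failed)."""
    # Map: group the fault entries by every parameter they contain.
    mapping = defaultdict(list)
    for failed in fault_entries:
        for name in failed:
            mapping[name].append(failed)

    # Reduce: within each mapping class, count the co-occurring parameters.
    model = {}
    for name, entries in mapping.items():
        denominator = len(entries)            # step 202: entries in the class
        counts = defaultdict(int)
        for failed in entries:
            for other in failed:
                if other != name:
                    counts[other] += 1        # step 203: occurrences of other parameters
        table = [(other, count / denominator) for other, count in counts.items()]
        # Step 204: arrange from high to low probability (ties broken by name).
        model[name] = sorted(table, key=lambda row: (-row[1], row[0]))
    return model
```

With entries `{U, I}`, `{U, I, T}`, `{U}`, and `{T}`, the mapping class of `U` holds three entries, of which two contain `I` and one contains `T`, so the table for `U` lists `I` at 2/3 followed by `T` at 1/3.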
This completes the training part. We now have the K fault clusters generated by K-Means and the parameter
association probability model generated by mapping-reduction. Next, the operating data set of the equipment
can be used for actual fault diagnosis and verification.
Step 6) Using the classified fault data set from step 4) as the fault discrimination standard, perform fault
category identification on all the operating data from step 1) with the nearest neighbor algorithm to obtain the
fault classification result. In actual operation, for a new operating data entry, the nearest neighbor algorithm
calculates its geometric distance to the cluster center of each of the K fault clusters and takes the minimum
distance (the nearest neighbor). If this minimum is less than the silhouette coefficient of that cluster, the
operating data is judged to belong to the fault type corresponding to the cluster, and fault diagnosis is
carried out on this basis.
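The nearest-neighbor decision in step 6) can be sketched as follows. This is a hedged illustration only: the function name `classify`, the tuple representation of a data entry and the example centers and thresholds are assumptions. Each threshold stands for the silhouette coefficient of the corresponding cluster as the text uses the term, that is, the average distance of the cluster members to the cluster center.

```python
import math

def classify(entry, centers, thresholds):
    """Return the index of the nearest fault cluster, or None if the entry is too far away."""
    distances = [math.dist(entry, c) for c in centers]        # geometric (Euclidean) distances
    nearest = min(range(len(centers)), key=distances.__getitem__)
    # Accept only if the minimum distance is below that cluster's coefficient.
    return nearest if distances[nearest] < thresholds[nearest] else None

# Hypothetical cluster centers and per-cluster coefficients
centers = [(0.0, 0.0), (10.0, 10.0)]
thresholds = [2.0, 3.0]

near_first = classify((1.0, 1.0), centers, thresholds)    # distance about 1.41 to cluster 0
near_second = classify((9.0, 12.0), centers, thresholds)  # distance about 2.24 to cluster 1
unmatched = classify((5.0, 5.0), centers, thresholds)     # about 7.07 from both centers
```

An entry that is not within the threshold of its nearest cluster is returned as `None`, which in this sketch means it is not assigned to any known fault type.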
Step 7) Combine the fault classification result with the parameter association probability model from step 5)
to produce a comprehensive diagnosis result. The comprehensive diagnosis result includes the fault classification
result, the main fault parameters, and the parameters whose association probability with the main fault
parameters is relatively high (the probability threshold can be adjusted according to actual conditions).
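A minimal sketch of how the result of step 7) might be assembled is given below. The function name `comprehensive_diagnosis`, the dictionary layout of the association table, the default threshold of 0.5 and the use of "greater than or equal to" for the threshold test are all assumptions for illustration; the text only states that the threshold is adjustable.

```python
def comprehensive_diagnosis(fault_type, main_params, assoc_table, threshold=0.5):
    """Combine a fault classification result with associated parameters above a threshold."""
    related = {}
    for p in main_params:
        for q, prob in assoc_table.get(p, {}).items():
            # Keep parameters that are not already main fault parameters
            # and whose association probability reaches the threshold.
            if q not in main_params and prob >= threshold:
                related[q] = max(prob, related.get(q, 0.0))
    return {
        "fault_type": fault_type,
        "main_parameters": sorted(main_params),
        "associated_parameters": related,
    }

# Hypothetical association table: P(q also fails | p fails)
assoc = {"p1": {"p2": 0.8, "p3": 0.2}}
report = comprehensive_diagnosis(fault_type=0, main_params={"p1"}, assoc_table=assoc)
# With the default threshold, only p2 (probability 0.8) is reported as associated.
```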
In summary, the relevant parameter fault classification method based on big data fusion clustering analysis
provided by the present invention realizes intelligent fault classification and associated parameter analysis
based on the mining of massive data, with a fault classification capability whose accuracy is controllable.
For each classified fault, the parameter association probability model provides the association probabilities
of the related fault parameters, thereby improving intelligent fault diagnosis and the formulation of maintenance decisions.
Finally, it should be noted that the above embodiments merely illustrate the technical solution of the present
invention and do not limit it. Although the present invention has been described in detail with reference to the
embodiments, those of ordinary skill in the art will understand that modifications or equivalent substitutions
of the technical solution that do not depart from its spirit and scope shall all be covered by the claims of
the present invention.
Claims (8)
1. A relevant parameter fault classification method based on big data fusion clustering analysis, characterized by comprising:
Step 1) obtaining the various operating data of the target equipment;
Step 2) establishing, according to the related data of the target equipment, a parameter diagnostic rule base covering all parameters of the target equipment;
Step 3) screening all the operating data from step 1) against the rules in the parameter diagnostic rule base to obtain fault data,
and assembling all the fault data into an unclassified fault data set;
Step 4) subjecting the unclassified fault data set to unsupervised autonomous data clustering by a clustering algorithm and,
after a satisfactory number of clusters and each cluster center have been obtained, classifying all the fault data in the
unclassified fault data set by the determined cluster centers to obtain a classified fault data set;
Step 5) applying a mapping-reduction algorithm to the unclassified fault data set from step 3) to generate a parameter association probability model,
the parameter association probability model containing, for each parameter of the target equipment, the probability distribution data
of the other parameters also failing when that parameter fails;
Step 6) using the classified fault data set from step 4) as the fault discrimination standard, performing fault category
identification on all the operating data from step 1) with the nearest neighbor algorithm to obtain a fault classification result;
Step 7) combining the fault classification result with the parameter association probability model from step 5) to obtain
the probability distribution data of all parameters under the occurrence of that fault classification result.
2. The relevant parameter fault classification method based on big data fusion clustering analysis according to claim 1,
characterized in that the operating data obtained in step 1) satisfies the following format: each complete data entry contains
the moment at which the entry occurred and the values of all parameters of the target equipment at that moment; each single
data value in a data entry represents one measured value of one parameter of the target equipment at a given moment;
and the data entries are arranged one by one in chronological order of occurrence.
3. The relevant parameter fault classification method based on big data fusion clustering analysis according to claim 1,
characterized in that the fault data screened in step 3) satisfies the following format: each data entry contains the moment
at which the entry occurred and all the parameters that failed at that moment; and for each failed parameter in a data entry,
the rule triggered by the fault is marked according to the parameter diagnostic rule base.
4. The relevant parameter fault classification method based on big data fusion clustering analysis according to claim 1,
characterized in that the parameter diagnostic rule base contains the upper and lower limits of the parameters, rules for
determining parameter jump anomalies, and rules for determining gradual parameter trend anomalies.
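The three kinds of rule named in claim 4 could, for example, be evaluated as in the sketch below. The claim does not say how jump or trend anomalies are computed, so the rule layout (`lower`, `upper`, `max_jump`, `max_drift`), the function name `triggered_rules` and the comparison logic are purely illustrative assumptions.

```python
def triggered_rules(values, rule):
    """Return the names of the rules violated by the latest sample of one parameter.

    values: chronological samples of the parameter (most recent last).
    rule:   dict with hypothetical keys lower, upper, max_jump, max_drift.
    """
    hits = []
    latest = values[-1]
    # Upper and lower limit check
    if not (rule["lower"] <= latest <= rule["upper"]):
        hits.append("limit")
    # Jump anomaly: change between the last two samples is too large
    if len(values) >= 2 and abs(latest - values[-2]) > rule["max_jump"]:
        hits.append("jump")
    # Gradual trend anomaly: total change across the window is too large
    if len(values) >= 2 and abs(latest - values[0]) > rule["max_drift"]:
        hits.append("trend")
    return hits

rule = {"lower": 0, "upper": 100, "max_jump": 10, "max_drift": 20}
normal = triggered_rules([50, 51, 52], rule)        # no rule triggered
jump = triggered_rules([50, 51, 70], rule)          # last step of 19 exceeds max_jump
drift = triggered_rules([50, 58, 66, 74], rule)     # steps of 8, total change of 24
```

Recording the rule names alongside each failed parameter corresponds to the marking of triggered rules described in claim 3.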
5. The relevant parameter fault classification method based on big data fusion clustering analysis according to claim 1,
characterized in that step 4) specifically comprises:
Step 101) setting the initial number of clusters K to 2, and performing a clustering operation on the unclassified fault
data set according to the current value of K to obtain K cluster centers and their corresponding K clusters;
Step 102) calculating the average silhouette coefficient of the K clusters and comparing it with the average silhouette
coefficient of K-1 clusters; if the two average silhouette coefficients are unchanged, choosing the current value of K as
the total number of clusters, and otherwise setting K=K+1 and re-executing step 101); the silhouette coefficient represents
the average geometric distance from the vector points corresponding to all the data entries contained in each cluster to
the cluster center;
Step 103) performing a clustering operation on the unclassified fault data set with the total number of clusters determined
in step 102), and classifying all the fault data in the unclassified fault data set by the cluster centers obtained,
to obtain the classified fault data set.
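The search for K in steps 101) and 102) can be sketched as follows. Several details are assumptions made so that the example is self-contained and repeatable: the clustering is a plain Lloyd-style K-Means with evenly spaced seeds, "unchanged" is tested with an absolute tolerance `tol`, an upper bound `k_max` stops the loop, and the value for K-1 = 1 is computed from a single cluster. The coefficient follows the definition in the claim (average distance of members to their cluster center) rather than the usual textbook silhouette score.

```python
import math

def kmeans(points, k, iters=50):
    """Plain Lloyd iteration with evenly spaced seeds (an assumption, for repeatability)."""
    centers = [points[i * len(points) // k] for i in range(k)]
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Recompute each center as the mean of its members; keep empty clusters in place.
        updated = [tuple(sum(axis) / len(c) for axis in zip(*c)) if c else centers[i]
                   for i, c in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    return centers, clusters

def avg_coefficient(centers, clusters):
    """Average over clusters of the mean member-to-center distance (the claim's coefficient)."""
    per_cluster = [sum(math.dist(p, c) for p in pts) / len(pts)
                   for c, pts in zip(centers, clusters) if pts]
    return sum(per_cluster) / len(per_cluster)

def choose_k(points, tol=1e-9, k_max=10):
    """Increase K from 2 until the coefficient no longer changes between K-1 and K."""
    previous = avg_coefficient(*kmeans(points, 1))
    for k in range(2, k_max + 1):
        current = avg_coefficient(*kmeans(points, k))
        if abs(previous - current) <= tol:
            return k
        previous = current
    return k_max

# Two perfectly tight groups of hypothetical fault vectors
points = [(0, 0)] * 3 + [(10, 10)] * 3
best_k = choose_k(points)
# The coefficient drops to 0 at K=2 and is still 0 at K=3, so the loop stops at K=3.
```

As the stopping rule is worded, the loop returns the first K whose coefficient matches that of K-1, which on this toy data is one more than the number of visible groups. On real data the tolerance would need tuning.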
6. The relevant parameter fault classification method based on big data fusion clustering analysis according to claim 5,
characterized in that the operation of obtaining the cluster centers in step 101) comprises:
Step 101-1) randomly selecting, from all the operating data of the target equipment, the vector point corresponding to one
data entry as the first cluster center, and finding the vector point nearest in geometric distance to the first cluster
center as the second cluster center;
Step 101-2) calculating the geometric distance Distance(x) between each cluster center and the cluster center nearest to it,
and adding all the geometric distances Distance(x) to obtain the total distance Sum(Distance(x));
Step 101-3) randomly selecting the vector point Random, corresponding to a data entry, that falls within the total distance
Sum(Distance(x)) as the newly added cluster center, and re-executing step 101-2) until K cluster centers have been selected.
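The selection in steps 101-2) and 101-3) resembles distance-weighted seeding of the kind used by k-means++. The sketch below illustrates that general technique and is not a literal rendering of the claim: it assumes Distance(x) is measured from every data point x to its nearest already chosen center, it uses plain rather than squared distances, and it does not reproduce the special rule for the second center in step 101-1). The function name `seed_centers` and the sample points are hypothetical.

```python
import math
import random

def seed_centers(points, k, rng=None):
    """Pick k seeds; each new seed is drawn with probability proportional to Distance(x)."""
    rng = rng or random.Random()
    centers = [rng.choice(points)]                 # first center: a random data point
    while len(centers) < k:
        # Distance(x): distance from each point to its nearest chosen center (assumption)
        d = [min(math.dist(p, c) for c in centers) for p in points]
        total = sum(d)                             # Sum(Distance(x))
        r = rng.uniform(0, total)                  # random value falling within the total
        chosen = max(range(len(points)), key=d.__getitem__)  # fallback: farthest point
        for i, dist in enumerate(d):
            if dist == 0:
                continue                           # skip points that are already centers
            r -= dist
            if r <= 0:
                chosen = i
                break
        centers.append(points[chosen])
    return centers

points = [(0, 0), (0, 1), (10, 10), (10, 11)]
seeds = seed_centers(points, 3, random.Random(0))
```

Points far from every existing center occupy a larger share of the total distance and are therefore more likely to be drawn, which tends to spread the initial centers across the data.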
7. The relevant parameter fault classification method based on big data fusion clustering analysis according to claim 1,
characterized in that step 5) specifically comprises:
Step 201) mapping together all the fault data entries that contain each parameter, forming in turn the mapping class
corresponding to each parameter, the mapping class containing all the fault data entries of the parameter and their frequencies of occurrence;
Step 202) calculating the total number of fault data entries in each mapping class as the denominator of the probability calculation;
Step 203) counting, in each mapping class, the number of occurrences of the parameters other than the parameter to which
the mapping class corresponds, as the numerator of the probability calculation;
Step 204) dividing the numerator from step 203) by the denominator from step 202) to obtain, for each parameter, the
probability distribution data of the other parameters also failing when that parameter fails.
8. The relevant parameter fault classification method based on big data fusion clustering analysis according to claim 1,
characterized in that step 6) specifically comprises: calculating the geometric distance between all the operating data from
step 1) and each determined cluster center, comparing the minimum distance value with the average silhouette coefficient of
the corresponding cluster, and, if the distance value is less than the average silhouette coefficient of the corresponding
cluster, judging the operating data to belong to the fault type corresponding to that cluster.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611247433.2A CN106845526B (en) | 2016-12-29 | 2016-12-29 | A kind of relevant parameter Fault Classification based on the analysis of big data Fusion of Clustering |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106845526A true CN106845526A (en) | 2017-06-13 |
CN106845526B CN106845526B (en) | 2019-12-03 |
Family
ID=59114134
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611247433.2A Active CN106845526B (en) | 2016-12-29 | 2016-12-29 | A kind of relevant parameter Fault Classification based on the analysis of big data Fusion of Clustering |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106845526B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105701157A (en) * | 2015-12-30 | 2016-06-22 | 芜湖乐锐思信息咨询有限公司 | Monitoring system for integrating social network site information |
CN105718935A (en) * | 2016-01-25 | 2016-06-29 | 南京信息工程大学 | Word frequency histogram calculation method suitable for visual big data |
CN105891629A (en) * | 2016-03-31 | 2016-08-24 | 广西电网有限责任公司电力科学研究院 | Transformer equipment fault identification method |
CN106021062A (en) * | 2016-05-06 | 2016-10-12 | 广东电网有限责任公司珠海供电局 | A relevant failure prediction method and system |
CN106251034A (en) * | 2016-07-08 | 2016-12-21 | 大连大学 | Wisdom energy saving electric meter monitoring system based on cloud computing technology |
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110018980A (en) * | 2017-12-25 | 2019-07-16 | 北京金风科创风电设备有限公司 | Method and device for searching fault data from simulation data of fan controller |
CN110018980B (en) * | 2017-12-25 | 2021-07-27 | 北京金风科创风电设备有限公司 | Method and device for searching fault data from simulation data of fan controller |
CN111771113A (en) * | 2018-02-28 | 2020-10-13 | 日产自动车株式会社 | Abnormal type determination device and abnormal type determination method |
US11951615B2 (en) | 2018-02-28 | 2024-04-09 | Nissan Motor Co., Ltd. | Malfunction-type determination device and malfunction-type determination method |
CN108763289B (en) * | 2018-04-13 | 2021-11-23 | 西安电子科技大学 | Massive heterogeneous sensor format data analysis method |
CN108763289A (en) * | 2018-04-13 | 2018-11-06 | 西安电子科技大学 | A kind of analytic method of magnanimity heterogeneous sensor formatted data |
CN109445306A (en) * | 2018-10-26 | 2019-03-08 | 湖南磁浮技术研究中心有限公司 | Automatic associated parameter interpretation method and system based on rule configuration analysis |
CN109445306B (en) * | 2018-10-26 | 2022-01-25 | 湖南磁浮技术研究中心有限公司 | Automatic associated parameter interpretation method and system based on rule configuration analysis |
CN109991951A (en) * | 2019-04-28 | 2019-07-09 | 齐鲁工业大学 | Multi-source fault detection and diagnosis method and apparatus |
CN110263944A (en) * | 2019-05-21 | 2019-09-20 | 中国石油大学(华东) | A kind of multivariable failure prediction method and device |
CN113392208A (en) * | 2020-03-12 | 2021-09-14 | 中国移动通信集团云南有限公司 | Method, device and storage medium for IT operation and maintenance fault processing experience accumulation |
CN113282433A (en) * | 2021-06-10 | 2021-08-20 | 中国电信股份有限公司 | Cluster anomaly detection method and device and related equipment |
CN113421176A (en) * | 2021-07-16 | 2021-09-21 | 昆明学院 | Intelligent abnormal data screening method |
CN113421176B (en) * | 2021-07-16 | 2022-11-01 | 昆明学院 | Intelligent screening method for abnormal data in student score scores |
CN113656389A (en) * | 2021-08-12 | 2021-11-16 | 北京可视化智能科技股份有限公司 | Intelligent factory abnormal data processing method, device and system and storage medium |
CN113656389B (en) * | 2021-08-12 | 2022-05-27 | 北京可视化智能科技股份有限公司 | Intelligent factory abnormal data processing method, device and system and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN106845526B (en) | 2019-12-03 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||