CN106600100B - Weighted multi-population particle swarm optimization-based hazard source reason analysis method - Google Patents

Info

Publication number
CN106600100B
Authority
CN
China
Prior art keywords
particle
cluster
particles
association rule
population
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610940992.5A
Other languages
Chinese (zh)
Other versions
CN106600100A (en
Inventor
周良
李诗瑶
谢强
郑洪源
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Aeronautics and Astronautics
Original Assignee
Nanjing University of Aeronautics and Astronautics
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Aeronautics and Astronautics filed Critical Nanjing University of Aeronautics and Astronautics
Priority to CN201610940992.5A priority Critical patent/CN106600100B/en
Publication of CN106600100A publication Critical patent/CN106600100A/en
Application granted granted Critical
Publication of CN106600100B publication Critical patent/CN106600100B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0635Risk analysis of enterprise or organisation activities
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/004Artificial life, i.e. computing arrangements simulating life
    • G06N3/006Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • G06Q50/26Government or public services

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Theoretical Computer Science (AREA)
  • Economics (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Tourism & Hospitality (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Development Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Educational Administration (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a hazard source cause analysis method based on weighted multi-population particle swarm optimization. In the data preprocessing stage, the concept of item weight is introduced and the concept of item set range is redefined accordingly. For association rule mining, an algorithm based on weighted multi-population particle swarm optimization is proposed. Building on multi-population cooperative particle swarm optimization, the algorithm introduces an inter-population communication mechanism, which increases population diversity and reduces the tendency of the algorithm to fall into a local optimal solution. The concept of particle weight is also introduced so that the algorithm can select rules that are more meaningful to the user. As a result, the accuracy and efficiency of hazard source cause analysis are improved, its analysis range is expanded, and its complexity is reduced.

Description

Weighted multi-population particle swarm optimization-based hazard source reason analysis method
Technical Field
The invention relates to the technical field of information systems, in particular to a weighted multi-population particle swarm optimization-based hazard source reason analysis method.
Background
In the civil aviation air traffic safety management system, hazard source identification and risk assessment are important components. Analyzing hazard sources in detail and determining their causes and mechanisms of action is a prerequisite for the relevant departments to evaluate risk effectively and accurately. Traditional hazard source cause analysis relies on event tree analysis, bow-tie analysis, and hazard and operability (HAZOP) analysis. Experts and scholars in China and abroad have proposed many other analysis methods, but most of them are built on this traditional analysis framework. These methods can analyze the cause of a hazard source from different angles, but the analysis is not comprehensive; a method that can analyze hazard sources more comprehensively is therefore needed.
Disclosure of Invention
Purpose of the invention: to overcome the defects of the prior art by improving the accuracy and efficiency of hazard source cause analysis, expanding its analysis range, and reducing its complexity, the invention provides a hazard source cause analysis method based on weighted multi-population particle swarm optimization.
The technical scheme is as follows:
A hazard source cause analysis method based on weighted multi-population particle swarm optimization comprises the following steps:
(1) assign a weight to each hazard source item, either manually or according to an existing algorithm; set a hazard source item set I and a hazard source transaction database D; represent each hazard source transaction in the database in binary;
(2) define the weighted item set range WRI:
[Formula image BDA0001139109770000011: definition of the weighted item set range WRI(m, n)]
where m ≠ n and m < n; m and n each denote the length of an item set, i.e. the number of items it contains; WI(m) and WI(n) are the weights of the item sets; Tran(m) and Tran(n) are the numbers of transactions containing the corresponding item sets; WT(m, n) and Tran(m, n) are, respectively, the weight and the number of transactions that contain m and n and satisfy m → n; and Σ WT(t) is the sum of the weights of all transactions. The transaction weight is given by:
[Formula image BDA0001139109770000021: definition of the transaction weight]
where I(j) is the j-th item in transaction T and |T| is the number of items in the transaction;
(3) take the m and n corresponding to the largest WRI calculated in step (2) as the front partition point and the rear partition point of the association rule, respectively; encode the association rules and generate the candidate association rule set R = {R_1, …, R_m} of the hazard source; treat each association rule as a particle of the particle swarm and determine the fitness function:
[Formula image BDA0001139109770000022: fitness function]
where WSPI(A) is the weighted support of item set A; N_1 and N_2 are weight parameters used to balance support and confidence; Support(A ∪ B) is the number of transactions containing both A and B; |N| is the total number of transactions in the transaction database; WI(A) is the weight of item set A; and Trans(A) is the number of transactions containing A;
(4) mine the hazard source association rules with the weighted multi-population particle swarm algorithm:
(41) randomly initialize the velocities and positions of the particles, and cluster the particles by position to generate different particle clusters;
(42) calculate the cluster range CR of each particle cluster using the fitness function, and sort the particle clusters in increasing order of cluster range;
(43) update the best position pbest of each particle, the global best position gbest and the global local best position gpbest according to the fitness function; and update the velocity v_ij(t) and position x_ij(t) of each particle;
(44) compare the fitness value fitvalue_ij of each particle with the minimum fitness value minfit_i and the maximum fitness value maxfit_i of the cluster in which the particle is located: if minfit_i < fitvalue_ij < maxfit_i, the position of the particle is unchanged; if fitvalue_ij < minfit_i and w_ij > minw_{i-1}, where minw_{i-1} is the minimum weight of the cluster C_{i-1} preceding the cluster in which the particle is currently located, merge the particle into cluster C_{i-1}, delete the num particles {d_{i-1,1}, …, d_{i-1,num}} with the smallest fitness values in that cluster, and generate num new particles {new_{i-1,1}, …, new_{i-1,num}}; if fitvalue_ij > maxfit_i and w_ij > minw_{i+1}, where minw_{i+1} is the minimum weight of the cluster C_{i+1} following the cluster in which the particle is currently located, merge the particle into cluster C_{i+1}, delete the num′ particles {d_{i+1,1}, …, d_{i+1,num′}} with the smallest fitness values in that cluster, and generate num′ new particles {new_{i+1,1}, …, new_{i+1,num′}};
(45) repeat steps (42) to (44) until the optimal particle that generates the association rule is found or the maximum number of iterations is reached; then trace back from the consequent of each obtained association rule to its antecedent, which gives the cause of the hazard source.
In step (1), each hazard source transaction in the hazard source transaction database is represented in binary as follows: each transaction is written as a string of binary 0s and 1s whose length equals the number of items in the item set; the bits correspond one-to-one to the N items of the hazard source item set; if the j-th item appears in the transaction, the j-th bit is set to 1; otherwise, it is set to 0.
The fitness function in step (3) measures the importance of an association rule within the population. The support and confidence of an association rule are combined by weighting to obtain:
[Formula image BDA0001139109770000031: weighted combination of support and confidence]
To better reflect the relationship between item weight and support and confidence, weighted support is introduced into this formula. The weighted support WSPI(A) of an item set A is defined as:
WSPI(A) = WI(A) · Trans(A)
The fitness function is thus obtained:
[Formula image BDA0001139109770000032: fitness function]
The encoding of the association rule in step (3) is as follows: each item is represented by a 2-bit binary code, where 00 indicates that the item belongs to the antecedent of the association rule, 11 indicates that it belongs to the consequent, and 10 and 01 both indicate that the item does not belong to the association rule; one association rule therefore has 2n bits in total.
In step (44), the velocity v_ij(t) and position x_ij(t) of the particles are updated according to:
[Formula image BDA0001139109770000033: velocity and position update]
When updating particles, step (44) uses the mutation operation of the differential evolution algorithm to generate new particles: 3 individuals are randomly selected from the population as the source of variation, and a new individual is generated by:
[Formula image BDA0001139109770000034: differential evolution mutation]
where New is a new individual in the population, r_1, r_2 and r_3 are random numbers between 0 and |N|, and w_DE is the differential weight.
Advantageous effects: compared with the prior art, the hazard source cause analysis method based on weighted multi-population particle swarm optimization has the following benefits:
(1) using a swarm intelligence algorithm reduces human intervention in the analysis process and improves the efficiency of hazard source cause analysis;
(2) introducing an inter-population communication mechanism into the swarm intelligence algorithm increases population diversity, expands the analysis range of hazard source cause analysis, and mitigates the tendency of the particle swarm algorithm to fall into a local optimum;
(3) using weights in the algorithm improves the accuracy of hazard source cause analysis.
Drawings
FIG. 1 is a transaction representation of an association rule; wherein (a) is a transaction and (b) is a binary representation of the transaction in (a);
FIG. 2 is a flow of association rule generation;
FIG. 3 is a flow chart of the method of the present invention.
Detailed Description
The present invention will be further described with reference to the accompanying drawings.
The hazard source cause analysis method based on weighted multi-population particle swarm optimization (HCA-WMPSO) treats the cause analysis of a hazard source as an association rule mining process. So that more meaningful rules are generated and the influence of item weight on rule mining is taken into account, the concept of item weight is introduced in the data preprocessing stage and the concept of item set range is redefined. For association rule mining, an algorithm based on weighted multi-population particle swarm optimization is proposed. Building on multi-population cooperative particle swarm optimization, the algorithm introduces an inter-population communication mechanism, which increases population diversity and reduces the tendency of the algorithm to fall into a local optimal solution. The concept of particle weight is also introduced so that the algorithm can select rules that are more meaningful to the user. HCA-WMPSO is implemented in 2 steps. Before describing them, we assume a hazard source item set I = {I_1, …, I_n} and a hazard source transaction database D.
First, data preprocessing
First, a weight is assigned to each hazard source item, either manually or according to an existing algorithm. Next, each hazard source transaction T_i of the hazard source transaction database is expressed in binary: the bits correspond one-to-one to the N items of the hazard source item set; if the j-th item appears in the transaction, the j-th bit is set to 1; otherwise, it is set to 0. Finally, the transaction database is scanned to calculate the weight WT(T) of each transaction according to formula (2) and to calculate the weighted item set range WRI.
1. Binary conversion
To improve the scanning efficiency of the database and to make rule support and confidence easier to calculate, each transaction is represented as a string of binary 0s and 1s. The length of a transaction is the number of items in the item set. Suppose the item set contains 4 items, I_1, I_2, I_3 and I_4, and the transaction database contains 5 transactions, T_1, T_2, T_3, T_4 and T_5. The transactions are shown in FIG. 1(a) and their binary representation in FIG. 1(b). If a transaction contains an item, the corresponding bit is set to 1; otherwise, it is set to 0.
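The binary conversion described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the item names, the sample transactions and the helper names `encode_transaction` and `support_count` are assumptions made for the example and do not reproduce the contents of FIG. 1.

```python
from typing import List, Sequence, Set

def encode_transaction(items: Sequence[str], transaction: Set[str]) -> List[int]:
    """Return one bit per item: 1 if the item occurs in the transaction, else 0."""
    return [1 if item in transaction else 0 for item in items]

def support_count(encoded: List[List[int]], mask: List[int]) -> int:
    """Count the transactions that contain every item flagged with 1 in mask."""
    return sum(all(bit >= m for bit, m in zip(row, mask)) for row in encoded)

# Hypothetical item set and transactions (illustrative only).
ITEMS = ["I1", "I2", "I3", "I4"]
TRANSACTIONS = [{"I1", "I3"}, {"I2", "I3", "I4"}, {"I1", "I2", "I3", "I4"}]

ENCODED = [encode_transaction(ITEMS, t) for t in TRANSACTIONS]
```

With this layout, counting the transactions that contain a given item set reduces to a bitwise comparison per row, which is the convenience the binary form is meant to provide.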
2. Item set range calculation
To generate more meaningful association rules, the concept of item set Range (RI) is introduced herein. In the process of actual association rule mining, different importance degrees of different items in the rules are considered. We define a Weighted Itemset Range (WRI) as follows:
definition 1: weighted itemset range
[Formula image BDA0001139109770000051: formula (1), definition of the weighted item set range WRI(m, n)]
In this definition, m ≠ n and m < n; m and n each denote the length of an item set, i.e. the number of items it contains; WI(m) and WI(n) are the weights of the item sets; Tran(m) and Tran(n) are the numbers of transactions containing the respective item sets; WT(m, n) and Tran(m, n) are, respectively, the weight and the number of transactions that contain m and n and satisfy m → n; and Σ WT(t) is the sum of the weights of all transactions. In formula (1), TWT() has the following form:
[Formula image BDA0001139109770000052: formula (2), definition of the transaction weight]
where I(j) is the j-th item in transaction T and |T| is the number of items in the transaction.
After WRI has been calculated, the m and n corresponding to the largest WRI are taken as the front partition point and the rear partition point of the association rule, respectively. The front partition point is the smallest number of items that can serve as the antecedent of the association rule, and the rear partition point is the largest number of items contained in the association rule; only the items that appear between partition points m and n can serve as the consequent of the association rule. As shown in FIG. 2, WRI(2 → 4) takes the maximum value, so in a transaction record the first two items are used as the antecedent X of the rule, and the 3rd and 4th items are used as the consequent Y of the rule.
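The use of the partition points can be illustrated with a short sketch. The WRI values in the table below are invented for the example (the WRI formula itself is not reproduced here), and `split_rule` is a hypothetical helper name; only the selection of the largest WRI and the split at m and n follow the text.

```python
from typing import Dict, List, Tuple

def split_rule(items_in_order: List[str], m: int, n: int) -> Tuple[List[str], List[str]]:
    """Split an ordered transaction at the front (m) and rear (n) partition points."""
    antecedent = items_in_order[:m]    # first m items form the antecedent X
    consequent = items_in_order[m:n]   # items between m and n form the consequent Y
    return antecedent, consequent

# Hypothetical WRI values for candidate (m, n) pairs with m < n.
WRI_TABLE: Dict[Tuple[int, int], float] = {
    (1, 2): 0.21, (1, 3): 0.34, (2, 3): 0.40, (2, 4): 0.57, (3, 4): 0.18,
}

# The pair with the largest WRI gives the two partition points.
BEST_M, BEST_N = max(WRI_TABLE, key=WRI_TABLE.get)
X, Y = split_rule(["I1", "I2", "I3", "I4"], BEST_M, BEST_N)
```

With the invented table, the pair (2, 4) wins, which mirrors the WRI(2 → 4) case of FIG. 2: the first two items become X and the 3rd and 4th items become Y.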
Second, analysis of causes of dangerous sources
1. Regular coding and fitness value calculation
The m and n corresponding to the largest WRI are taken as the front and rear partition points. The association rule is then encoded with a chromosome-style encoding so that the algorithm can calculate the fitness value efficiently. Each item is represented by a 2-bit binary code, where 00 indicates that the item belongs to the antecedent of the association rule, 11 indicates that it belongs to the consequent, and 10 and 01 both indicate that the item does not belong to the association rule, so that one association rule has a total of 2n bits.
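A minimal sketch of decoding this 2-bit-per-item encoding is given below. The function name `decode_rule` and the sample bit string are assumptions made for illustration; the code table (00 antecedent, 11 consequent, 01 and 10 unused) is taken from the text.

```python
from typing import List, Sequence, Tuple

def decode_rule(bits: str, items: Sequence[str]) -> Tuple[List[str], List[str]]:
    """Decode a 2n-bit rule string into (antecedent, consequent) item lists."""
    if len(bits) != 2 * len(items):
        raise ValueError("a rule over n items must have exactly 2n bits")
    antecedent: List[str] = []
    consequent: List[str] = []
    for k, item in enumerate(items):
        code = bits[2 * k: 2 * k + 2]
        if code == "00":
            antecedent.append(item)   # 00: item is part of the antecedent
        elif code == "11":
            consequent.append(item)   # 11: item is part of the consequent
        # 01 and 10: item does not belong to the rule
    return antecedent, consequent

ITEMS = ["I1", "I2", "I3", "I4"]
RULE = decode_rule("00001101", ITEMS)  # I1, I2 -> I3; I4 is not used
```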
When association rules are mined with a particle swarm, each association rule is treated as a particle of the swarm. The fitness function measures the importance of an association rule within the population. Association rule mining looks for rules whose support is greater than the minimum support Minsupport and whose confidence is greater than the minimum confidence. The measure Compresse combines the support and confidence of association rules by weighting and is defined as follows:
[Formula image BDA0001139109770000061: definition of Compresse]
where N_1 and N_2 are weight parameters used to balance support and confidence, Support(A ∪ B) is the number of transactions that contain both A and B, and |N| is the total number of transactions in the transaction database.
To better reflect the relationship between item weight and support and confidence, the concept of weighted support (WSP) is introduced into the above formula. The weighted support WSPI(A) of an item set A is defined as follows:
WSPI(A) = WI(A) · Trans(A)
where WI(A) is the weight of item set A and Trans(A) is the number of transactions containing A. The weighted form of the measure is defined as follows:
[Formula image BDA0001139109770000062: weighted form of the measure]
which serves as the fitness function of the algorithm.
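The weighted support WSPI(A) = WI(A) · Trans(A) can be sketched as below. The text does not state how the weight WI(A) of an item set is derived from the item weights, so this example assumes, purely for illustration, that it is the mean of the weights of the items in A; the item weights and transactions are also invented. The full fitness function is not reproduced because its formula is only available as an image.

```python
from typing import Dict, List, Sequence, Set

def itemset_weight(itemset: Sequence[str], weights: Dict[str, float]) -> float:
    """WI(A), ASSUMED here to be the mean item weight (not stated in the source)."""
    return sum(weights[i] for i in itemset) / len(itemset)

def transaction_count(itemset: Sequence[str], transactions: List[Set[str]]) -> int:
    """Trans(A): number of transactions that contain every item of A."""
    return sum(1 for t in transactions if set(itemset) <= t)

def weighted_support(itemset: Sequence[str], weights: Dict[str, float],
                     transactions: List[Set[str]]) -> float:
    """WSPI(A) = WI(A) * Trans(A)."""
    return itemset_weight(itemset, weights) * transaction_count(itemset, transactions)

# Hypothetical item weights and transactions.
WEIGHTS = {"I1": 0.8, "I2": 0.4, "I3": 0.6}
DB = [{"I1", "I2"}, {"I1", "I2", "I3"}, {"I2", "I3"}]
WSPI_12 = weighted_support(["I1", "I2"], WEIGHTS, DB)
```

Under these assumptions the item set {I1, I2} has weight 0.6 and occurs in 2 transactions, so its weighted support is about 1.2.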
2. Hazard source association rule mining
To obtain rules that are more meaningful to the user, the concept of particle weight is introduced. In the algorithm, each particle is an association rule, and the weight of the particle is the weight of the association rule; that is, the weight of the association rule is regarded as the weight of the transaction. To increase information sharing among populations, the concept of the global local best (gpbest) is introduced, which is the best of the local optimal solutions over all populations. When the particle velocity is updated, gpbest contributes to the new velocity; the velocity update after gpbest is introduced is as follows:
[Formula image BDA0001139109770000071: velocity update with gpbest]
where w_inertia is the inertia weight; c_1, c_2 and c_3 are three constants; r_1, r_2 and r_3 are random numbers uniformly distributed in [0, 1]; w_i is the particle weight; pbest is the local optimal solution of the particle within its sub-population; and gbest is the global optimal solution of the particle swarm. The factor 1 − w_i adjusts the random coefficient r so that particles with higher weights have a greater probability of approaching the optimal solution.
The weighted multi-population particle swarm optimization introduces an inter-population communication mechanism on top of the multi-population approach, which increases population diversity and reduces the tendency of the algorithm to fall into a local optimal solution. The concept of particle weight is also introduced so that the algorithm can select rules that are more meaningful to the user.
In the algorithm, to simulate the behavior of birds during predation, the K-means clustering algorithm is first used to cluster the association rules R into different particle clusters R = {C_i | i = 1, …, n}; each particle cluster C_i represents one bird "population". For each cluster C_i, the fitness values of its rules are calculated, and the maximum value maxfit and the minimum value minfit are taken as the boundary values BV of the cluster; the range [minfit, maxfit] is the cluster range CR. At the same time, the weights of all particles in the cluster are compared to find the minimum weight of C_i. To simplify later operations, the clusters are sorted in increasing order of CR. Each particle R_ij in cluster C_i searches within its own CR_i. During the search, the particle checks whether it has left its search range and acts accordingly; this process is the inter-population information exchange mechanism. The specific steps are as follows:
Step 1: calculate the fitness value fitvalue_ij of particle R_ij;
Step 2: compare fitvalue_ij with the boundary values BV_i of the cluster in which the particle is located: if minfit_i < fitvalue_ij < maxfit_i, the position of particle R_ij is unchanged;
Step 3: if fitvalue_ij < minfit_i or fitvalue_ij > maxfit_i, new particles are generated in cluster C_i; at the same time, the weight of the particle is compared with the weight of the current cluster;
Step 4: if the particle weight w_ij > minw_{i-1} (or minw_{i+1}), particle R_ij is merged into cluster C_{i-1} (or C_{i+1}), the num particles with the smallest fitness values in that cluster are deleted, and new particles
[Formula image BDA0001139109770000081: set of new particles]
are generated until the maximum number of iterations is reached and the weight of the new particle is less than the minimum weight of the current cluster; otherwise, only new particles
[Formula image BDA0001139109770000082: set of new particles]
are generated until the maximum number of iterations is reached and the weight of the new particle is less than the minimum weight of the current cluster.
The above process completes the particle update. In this process, as in differential evolution (DE), 3 individuals are randomly selected from the population as the source of variation for a new individual, and the new individual is generated by the following rule:
[Formula image BDA0001139109770000083: differential evolution mutation]
where New is a new individual in the population, r_1, r_2 and r_3 are random numbers between 0 and |N|, and w_DE is the differential weight.
To ensure that the new particles generated by the algorithm are valid, the WRI of each new particle is calculated when it is generated. If the value lies within the WRI range of the population, the particle cluster accepts the new particle; otherwise, the process is repeated until a particle satisfying the condition is generated or the maximum number of iterations is reached.
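The mutation step can be sketched as follows. Because the mutation formula is only available as an image, this example assumes the standard DE/rand/1 form, New = X_r1 + w_DE · (X_r2 − X_r3), applied to real-valued position vectors; the function name `de_mutation` and the sample population are hypothetical, and the WRI acceptance check described above is omitted.

```python
import random
from typing import List

def de_mutation(population: List[List[float]], w_de: float,
                rng: random.Random) -> List[float]:
    """Create a new individual from three distinct random individuals.

    ASSUMPTION: standard DE/rand/1 mutation, New = X_r1 + w_de * (X_r2 - X_r3).
    """
    if len(population) < 3:
        raise ValueError("at least 3 individuals are required")
    r1, r2, r3 = rng.sample(range(len(population)), 3)  # three distinct indices
    base, a, b = population[r1], population[r2], population[r3]
    return [x + w_de * (y - z) for x, y, z in zip(base, a, b)]

# Hypothetical population of 2-dimensional positions.
POPULATION = [[0.1, 0.9], [0.4, 0.2], [0.7, 0.5], [0.3, 0.3]]
NEW = de_mutation(POPULATION, w_de=0.5, rng=random.Random(0))
```

In a fuller implementation, the new individual would be mapped back to a rule encoding and accepted only if its WRI falls within the range of the target cluster.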
The parameters of the weighted multi-population particle swarm optimization are set as follows: the inertia weight w_inertia decreases from 0.9 to 0.4; the constants are c_1 = c_2 = 2 and c_3 = 1; the initial population size is N = 100; the maximum number of iterations is num_iteration = 300; the initial velocity V_0 of the particles is 0; and the initial particle positions X_0 are assigned randomly. The general flow of hazard source association rule mining is shown in FIG. 3.
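The decreasing inertia weight can be sketched as a schedule over the iterations. The text only states that the weight decreases from 0.9 to 0.4; the linear shape used here, and the function name `inertia_weight`, are assumptions made for illustration.

```python
def inertia_weight(t: int, t_max: int = 300,
                   w_start: float = 0.9, w_end: float = 0.4) -> float:
    """Inertia weight at iteration t (0-based), ASSUMING a linear decrease."""
    if not 0 <= t < t_max:
        raise ValueError("t must satisfy 0 <= t < t_max")
    if t_max == 1:
        return w_start
    return w_start - (w_start - w_end) * t / (t_max - 1)

# One value per iteration for the 300 iterations mentioned in the text.
SCHEDULE = [inertia_weight(t) for t in range(300)]
```

A large weight early on favors exploration, while the smaller weight near the end favors refinement around the best positions found so far.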
At the start, the velocities and positions of the particles are initialized randomly, and the particles are clustered by position with the k-means algorithm to generate different particle clusters. The cluster range CR of each particle cluster is calculated using the fitness function, and the particle clusters are sorted in increasing order of cluster range. During the search, particle positions are updated according to the improved velocity update formula given above. The specific steps are as follows:
Step 1: initialize the population size N_p and the maximum number of iterations N_t; initialize the initial velocity V_0 and initial position X_0 of the particles; according to V_0 and X_0, assign the particles to different particle clusters C = {C_1, …, C_l} using a clustering algorithm;
Step 2: calculate the fitness value fitvalue_ij of each particle, the cluster range CR_i = [minfit_i, maxfit_i] of each cluster C_1 to C_l, and the minimum weight minw_i in each cluster, and sort the particle clusters in increasing order of cluster range to obtain C′ = {C′_1, …, C′_l};
Step 3: update the best position pbest of each particle, the global best position gbest and the global local best position gpbest according to the fitness function;
Step 4: update the velocity v_ij(t) and position x_ij(t) of each particle;
Step 5: compare the fitness value fitvalue_ij of each particle with the minimum fitness value minfit_i and the maximum fitness value maxfit_i of the cluster in which the particle is located:
Step 5.1: if minfit_i < fitvalue_ij < maxfit_i, the position of the particle is unchanged;
Step 5.2: if fitvalue_ij < minfit_i and w_ij > minw_{i-1} (minw_{i-1} is the minimum weight of the cluster C_{i-1} preceding the cluster in which the particle is currently located), merge the particle into cluster C_{i-1}, delete the num particles {d_{i-1,1}, …, d_{i-1,num}} with the smallest fitness values in that cluster, and generate num new particles {new_{i-1,1}, …, new_{i-1,num}};
Step 5.3: if fitvalue_ij > maxfit_i and w_ij > minw_{i+1} (minw_{i+1} is the minimum weight of the cluster C_{i+1} following the cluster in which the particle is currently located), merge the particle into cluster C_{i+1}, delete the num′ particles {d_{i+1,1}, …, d_{i+1,num′}} with the smallest fitness values in that cluster, and generate num′ new particles {new_{i+1,1}, …, new_{i+1,num′}};
Step 6: repeat steps 2 to 5 until the optimal particle that generates the association rule is found or the maximum number of iterations is reached;
Step 7: trace back from the consequent of each obtained association rule to its antecedent; the antecedent is the cause of the hazard source.
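Steps 2 and 5 above (cluster ranges, ordering, and the decision on whether a particle stays or migrates) can be sketched as follows. This is a simplified illustration under stated assumptions: the `Cluster` class, the sort key (minimum fitness, then maximum fitness) and the sample values are hypothetical, and the deletion and regeneration of particles in the receiving cluster are left out.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Cluster:
    fitness: List[float]  # fitness values of the particles in the cluster
    weights: List[float]  # particle weights, in the same order

    @property
    def min_fit(self) -> float:
        return min(self.fitness)

    @property
    def max_fit(self) -> float:
        return max(self.fitness)

    @property
    def min_w(self) -> float:
        return min(self.weights)

def sort_clusters(clusters: List[Cluster]) -> List[Cluster]:
    """Sort clusters in increasing order of cluster range (ASSUMED key: min, max)."""
    return sorted(clusters, key=lambda c: (c.min_fit, c.max_fit))

def target_cluster(clusters: List[Cluster], i: int, fit: float, w: float) -> int:
    """Index of the cluster a particle of cluster i should belong to (steps 5.1-5.3)."""
    cur = clusters[i]
    if fit < cur.min_fit and i > 0 and w > clusters[i - 1].min_w:
        return i - 1  # step 5.2: migrate to the preceding cluster
    if fit > cur.max_fit and i + 1 < len(clusters) and w > clusters[i + 1].min_w:
        return i + 1  # step 5.3: migrate to the following cluster
    return i          # step 5.1, or weight too small to migrate

# Hypothetical clusters, given in arbitrary order and then sorted.
CLUSTERS = sort_clusters([
    Cluster([0.5, 0.7], [0.3, 0.6]),
    Cluster([0.1, 0.3], [0.2, 0.4]),
    Cluster([0.8, 0.9], [0.5, 0.9]),
])
```

Because the clusters are kept in increasing order of fitness range, a particle that drops below its cluster's range can only move to the neighbor with lower fitness, and a particle that rises above it can only move to the neighbor with higher fitness, provided its weight exceeds the minimum weight of that neighbor.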
The above description is only of the preferred embodiments of the present invention, and it should be noted that: it will be apparent to those skilled in the art that various modifications and adaptations can be made without departing from the principles of the invention and these are intended to be within the scope of the invention.

Claims (6)

1. A hazard source cause analysis method based on weighted multi-population particle swarm optimization, characterized by comprising the following steps:
(1) assigning a weight to each hazard source item, either manually or according to an existing algorithm; setting a hazard source item set I and a hazard source transaction database D; representing each hazard source transaction in the database in binary;
(2) defining the weighted item set range WRI:
[Formula image FDA0002455315830000011: definition of the weighted item set range WRI(m, n)]
wherein m ≠ n and m < n; m and n each denote the length of an item set, i.e. the number of items it contains; WI(m) and WI(n) are the weights of the item sets; Tran(m) and Tran(n) are the numbers of transactions containing the corresponding item sets; WT(m, n) and Tran(m, n) are, respectively, the weight and the number of transactions that contain m and n and satisfy m → n; and Σ WT(t) is the sum of the weights of all transactions;
[Formula image FDA0002455315830000012: definition of the transaction weight]
wherein I(j) is the j-th item in transaction T and |T| is the number of items in the transaction;
(3) taking the m and n corresponding to the largest WRI calculated in step (2) as the front partition point and the rear partition point of the association rule, respectively; encoding the association rules and generating the candidate association rule set R = {R_1, …, R_m} of the hazard source;
the measure Compresse combines the support and confidence of association rules by weighting and is defined as follows:
[Formula image FDA0002455315830000013: definition of Compresse]
treating each association rule as a particle of the particle swarm and determining the fitness function:
[Formula image FDA0002455315830000014: fitness function]
wherein WSPI(A) is the weighted support of item set A, WSPI(A) = WI(A) · Trans(A); N_1 and N_2 are weight parameters used to balance support and confidence; Support(A ∪ B) is the number of transactions containing both A and B; |N| is the total number of transactions in the transaction database; WI(A) is the weight of item set A; and Trans(A) is the number of transactions containing A;
(4) carrying out danger source association rule mining by using a weighted multi-population particle swarm algorithm:
(41) randomly initializing the speed and the position of the particles, and clustering the particles according to the positions to generate different particle clusters;
(42) calculating the cluster range CR of each particle cluster using the fitness function, and ordering the particle clusters according to their cluster ranges;
(43) updating, according to the fitness function, the optimal positions pbest, the global optimal positions gbest and the global local optimal positions gpest of all the particles; and updating the velocity v_ij(t) and position x_ij(t) of the particles;
(44) comparing the fitness value fitvalue_ij of each particle with the minimum fitness value minfit_i and the maximum fitness value maxfit_i of the cluster in which the particle is located: if minfit_i < fitvalue_ij < maxfit_i, the position of the particle is not changed; if fitvalue_ij < minfit_i and w_ij > minw_{i-1}, wherein minw_{i-1} is the minimum weight of the cluster C_{i-1} preceding the cluster in which the particle is currently located, the particle is incorporated into cluster C_{i-1}, the num particles {d_{i-1,1}, …, d_{i-1,num}} with the smallest fitness values in that cluster are deleted, and at the same time num new particles {new_{i-1,1}, …, new_{i-1,num}} are generated; if fitvalue_ij > maxfit_i and w_ij > minw_{i+1}, wherein minw_{i+1} is the minimum weight of the cluster C_{i+1} following the cluster in which the particle is currently located, the particle is incorporated into cluster C_{i+1}, the num′ particles {d_{i+1,1}, …, d_{i+1,num′}} with the smallest fitness values in that cluster are deleted, and at the same time num′ new particles {new_{i+1,1}, …, new_{i+1,num′}} are generated;
(45) repeating steps (42) to (44) until the optimal particles are found or the maximum number of iterations is reached, generating the association rules from the resulting particles, and then taking the antecedent of each obtained association rule as the cause of the hazard source.
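The cluster comparison in step (44) can be illustrated with a minimal Python sketch. The function name `target_cluster` and the list-based inputs are hypothetical. The claim does not say what happens at the first or last cluster, or when the weight condition fails, so the sketch assumes the particle simply stays where it is in those cases. Deleting the weakest particles and generating replacements is left out.

```python
from typing import Sequence


def target_cluster(fit: float, weight: float, i: int,
                   minfit: Sequence[float], maxfit: Sequence[float],
                   minw: Sequence[float]) -> int:
    """Return the index of the cluster a particle belongs to after step (44).

    fit, weight : fitness value and weight of the particle (fitvalue_ij, w_ij)
    i           : index of the cluster the particle is currently in
    minfit/maxfit/minw : per-cluster minimum fitness, maximum fitness and
                         minimum weight, indexed by cluster
    """
    # Below the cluster's fitness range: move to the preceding cluster,
    # provided it exists and the particle's weight exceeds its minimum weight.
    if fit < minfit[i] and i - 1 >= 0 and weight > minw[i - 1]:
        return i - 1
    # Above the cluster's fitness range: move to the following cluster
    # under the analogous weight condition.
    if fit > maxfit[i] and i + 1 < len(minw) and weight > minw[i + 1]:
        return i + 1
    # Inside the range, or a case the claim leaves open (assumption): stay.
    return i
```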
2. The hazard source reason analysis method according to claim 1, wherein: in step (1), each hazard source transaction in the hazard source transaction database is represented in binary as follows: each transaction is represented as a string of binary 0s and 1s; the length of the transaction is the number of items in the item set; the bits respectively represent the N items of the hazard source item set; if the j-th item appears in the transaction, the j-th bit is set to 1; otherwise, the bit is set to 0.
3. The hazard source reason analysis method according to claim 1, wherein: the fitness function in step (3) is used to measure the importance of an association rule in the population; the support and the confidence of the association rule are combined by a weighting method to obtain:
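As an illustration of this binary representation, the following Python sketch maps a transaction onto a 0/1 vector over the item set. The function name `encode_transaction` and the item names are hypothetical and chosen only for the example.

```python
def encode_transaction(transaction, items):
    """Encode a transaction as a list of 0/1 bits, one bit per item.

    items       : ordered list of all N items in the item set
    transaction : iterable of the items that occur in this transaction
    """
    present = set(transaction)
    # Bit j is 1 when the j-th item of the item set occurs in the transaction.
    return [1 if item in present else 0 for item in items]


# Hypothetical hazard source items, for illustration only.
items = ["fatigue", "icing", "bird_strike", "radio_failure"]
bits = encode_transaction(["icing", "radio_failure"], items)
```

Here `bits` has one entry per item, so its length equals the number of items in the item set.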
Figure FDA0002455315830000031
in order to better reflect the relation between the item weights and the support and confidence, the weighted support is introduced into the formula; the weighted support WSPI(A) of an item set is defined as follows:
WSPI(A)=WI(A)Trans(A)
thus, a fitness function is obtained:
Figure FDA0002455315830000032
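The weighted support WSPI(A) = WI(A)Trans(A) can be computed directly from binary-coded transactions, as in the following Python sketch. It reads the formula as a plain product, which is an assumption. The function names are hypothetical, and the item set weight WI(A) is passed in as a number because the claim does not state how it is derived from the individual item weights. The full fitness function is not reproduced, since it is only available as a figure.

```python
def trans_count(itemset_bits, database):
    """Trans(A): number of transactions that contain every item of A.

    itemset_bits : 0/1 list marking the items that belong to A
    database     : list of transactions, each a 0/1 list of the same length
    """
    return sum(
        all(t[j] for j, bit in enumerate(itemset_bits) if bit)
        for t in database
    )


def weighted_support(itemset_weight, itemset_bits, database):
    """WSPI(A) = WI(A) * Trans(A), with WI(A) supplied by the caller."""
    return itemset_weight * trans_count(itemset_bits, database)


# Four transactions over three items, for illustration only.
database = [[1, 1, 0], [1, 0, 1], [1, 1, 1], [0, 1, 0]]
wspi = weighted_support(0.5, [1, 1, 0], database)
```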
4. The hazard source reason analysis method according to claim 1, wherein: the encoding of the association rule in step (3) is specifically: each item is represented by a 2-bit binary code, wherein 00 indicates that the item belongs to the antecedent of the association rule, 11 indicates that the item belongs to the consequent of the association rule, and 10 and 01 both indicate that the item does not belong to the association rule; one association rule has 2n bits in total.
5. The hazard source reason analysis method according to claim 1, wherein: in step (44), the velocity v_ij(t) and position x_ij(t) of the particles are updated according to:
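A short Python sketch of decoding such a 2n-bit particle into a rule is given below. The function name `decode_rule` and the item labels are hypothetical. The bit pairs are assumed to follow the order of the item set.

```python
def decode_rule(bits, items):
    """Split a 2n-bit rule encoding into (antecedent, consequent) item lists.

    bits  : list of 2n values in {0, 1}, two bits per item
    items : ordered list of the n items
    """
    if len(bits) != 2 * len(items):
        raise ValueError("expected two bits per item")
    antecedent, consequent = [], []
    for k, item in enumerate(items):
        code = (bits[2 * k], bits[2 * k + 1])
        if code == (0, 0):
            antecedent.append(item)   # 00: item is in the antecedent
        elif code == (1, 1):
            consequent.append(item)   # 11: item is in the consequent
        # 10 and 01: item does not take part in the rule
    return antecedent, consequent


# Hypothetical items a..d; the encoding below reads as the rule a -> c.
rule = decode_rule([0, 0, 1, 0, 1, 1, 0, 1], ["a", "b", "c", "d"])
```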
Figure FDA0002455315830000033
x(t+1)=x(t)+v(t+1)
In the formula, w_inertia is the inertia weight; c_1, c_2 and c_3 are three constants; r_1, r_2 and r_3 are random numbers uniformly distributed in [0, 1]; w_i is the particle weight; pbest is the local optimal solution of the particle in the sub-population; gbest is the global optimal solution of the particle population; the factor 1 − w_i adjusts the random coefficient r so that high-weighted particles stay close to the optimal solution.
6. The hazard source reason analysis method according to claim 1, wherein: step (44) uses the mutation operation of the differential evolution algorithm to generate new particles when updating particles: 3 individuals are randomly selected from the population as the source of variation for a new individual, and the new individual is generated by:
Figure FDA0002455315830000034
wherein New represents a new individual in the population, r_1, r_2 and r_3 are random numbers between 0 and |N|, and w_DE is the differential weight.
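The claimed formula is only available as a figure, so the following Python sketch assumes the standard differential evolution mutation (DE/rand/1), New = x_r1 + w_DE · (x_r2 − x_r3), which matches the description of three randomly selected individuals and a differential weight. The function name `de_mutation`, the use of three distinct indices, and the real-valued particle vectors are assumptions of the sketch, not statements about the patented method.

```python
import random


def de_mutation(population, w_de, rng=random):
    """Create one new individual from three randomly chosen individuals.

    population : list of equal-length numeric vectors (at least 3 of them)
    w_de       : differential weight
    rng        : random source, e.g. random.Random(seed) for repeatability
    """
    # Pick three distinct indices r1, r2, r3 into the population.
    r1, r2, r3 = rng.sample(range(len(population)), 3)
    a, b, c = population[r1], population[r2], population[r3]
    # Assumed DE/rand/1 form: base vector plus a scaled difference vector.
    return [a_k + w_de * (b_k - c_k) for a_k, b_k, c_k in zip(a, b, c)]
```

How the real-valued result would be mapped back onto the binary rule encoding is not specified in the claim and is not attempted here.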
CN201610940992.5A 2016-11-01 2016-11-01 Weighted multi-population particle swarm optimization-based hazard source reason analysis method Active CN106600100B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610940992.5A CN106600100B (en) 2016-11-01 2016-11-01 Weighted multi-population particle swarm optimization-based hazard source reason analysis method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610940992.5A CN106600100B (en) 2016-11-01 2016-11-01 Weighted multi-population particle swarm optimization-based hazard source reason analysis method

Publications (2)

Publication Number Publication Date
CN106600100A CN106600100A (en) 2017-04-26
CN106600100B true CN106600100B (en) 2020-10-27

Family

ID=58590401

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610940992.5A Active CN106600100B (en) 2016-11-01 2016-11-01 Weighted multi-population particle swarm optimization-based hazard source reason analysis method

Country Status (1)

Country Link
CN (1) CN106600100B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108363728A (en) * 2018-01-10 2018-08-03 中国电力科学研究院有限公司 A kind of method and system for excavating extra-high voltage transformer equipment status data correlation rule
CN110334796A (en) * 2019-06-28 2019-10-15 北京科技大学 A kind of association rule mining method and device of social security events
CN110444022A (en) * 2019-08-15 2019-11-12 平安科技(深圳)有限公司 The construction method and device of traffic flow data analysis model

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104008417A (en) * 2014-05-27 2014-08-27 广西民族大学 Method for establishing high-rise building personnel evacuation bioluminescence particle swarm optimization algorithm model

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8346712B2 (en) * 2009-11-24 2013-01-01 King Fahd University Of Petroleum And Minerals Method for identifying hammerstein models

Also Published As

Publication number Publication date
CN106600100A (en) 2017-04-26

Similar Documents

Publication Publication Date Title
Sayed et al. A binary clonal flower pollination algorithm for feature selection
CN108334949B (en) Image classifier construction method based on optimized deep convolutional neural network structure fast evolution
WO2021155706A1 (en) Method and device for training business prediction model by using unbalanced positive and negative samples
US20220188568A1 (en) Methods and systems for mining minority-class data samples for training a neural network
CN111898689B (en) Image classification method based on neural network architecture search
CN111226232A (en) Hybrid generator model
CN107766929B (en) Model analysis method and device
CN109118013A (en) A kind of management data prediction technique, readable storage medium storing program for executing and forecasting system neural network based
CN105929690B (en) A kind of Flexible Workshop Robust Scheduling method based on decomposition multi-objective Evolutionary Algorithm
CN106600100B (en) Weighted multi-population particle swarm optimization-based hazard source reason analysis method
CN111967971B (en) Bank customer data processing method and device
CN111582325B (en) Multi-order feature combination method based on automatic feature coding
Ali et al. Identification of functional piRNAs using a convolutional neural network
CN114330659A (en) BP neural network parameter optimization method based on improved ASO algorithm
CN116167617A (en) Geological disaster risk assessment method and system integrating random forest and attention
CN104732067A (en) Industrial process modeling forecasting method oriented at flow object
CN112215278B (en) Multi-dimensional data feature selection method combining genetic algorithm and dragonfly algorithm
Robu et al. A genetic algorithm for classification
KR102154425B1 (en) Method And Apparatus For Generating Similar Data For Artificial Intelligence Learning
CN111984842B (en) Bank customer data processing method and device
CN109308565B (en) Crowd performance grade identification method and device, storage medium and computer equipment
CN112241811A (en) Method for predicting hierarchical mixed performance of customized product in 'Internet +' environment
CN112163068B (en) Information prediction method and system based on autonomous evolution learner
US20230401454A1 (en) Method using weighted aggregated ensemble model for energy demand management of buildings
CN110570048A (en) user demand prediction method based on improved online deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant