CN110442143B

CN110442143B - Unmanned aerial vehicle situation data clustering method based on combined multi-target pigeon swarm optimization

Info

Publication number: CN110442143B
Application number: CN201910603461.0A
Authority: CN
Inventors: 段海滨; 陈琳; 邓亦敏; 霍梦真; 申燕凯; 张岱峰; 魏晨; 周锐; 杨庆; 赵建霞; 仝秉达; 吴江; 夏洁
Original assignee: Beihang University
Current assignee: Beihang University
Priority date: 2019-07-05
Filing date: 2019-07-05
Publication date: 2020-10-27
Anticipated expiration: 2039-07-05
Also published as: CN110442143A

Abstract

The invention discloses an unmanned aerial vehicle situation data clustering method based on combined multi-target pigeon swarm optimization, which comprises the following steps: loading a data set; calculating a difference matrix and an adjacency matrix of the data set; solving a minimum spanning tree using Prim's algorithm; dividing the edge set of the minimum spanning tree into a strong connection edge set and a weak connection edge set; decoding the strong connection edge set to obtain a pre-clustering result; generating an initial pigeon group, decoding it to obtain clustering results, and evaluating the pigeon group at the initial moment using compactness and continuity; performing non-dominated sorting on the initial pigeon group, and determining the global historical optimal position and the central position of the pigeon group at the current moment; and updating the position and velocity of the pigeon group, decoding the pigeon positions to obtain clustering results, updating the global historical optimal position and the central position of the pigeon group, and repeating the process until a termination condition is met. The method is simple to implement, reduces the computational load and the dimensionality of the decision space, makes the optimal solution easier to find, and adapts better to different clustering requirements.

Description

Unmanned aerial vehicle situation data clustering method based on combined multi-target pigeon swarm optimization

Technical Field

The invention relates to an unmanned aerial vehicle situation data clustering method based on combined multi-target pigeon swarm optimization, and belongs to the technical field of data mining.

Background

Unmanned aerial vehicles have the advantages of low manufacturing cost, good economy, long loiter time, the ability to execute tasks in harsh environments and the avoidance of casualties, and are widely used in civil fields such as power grid inspection, search and rescue and aerial photography, and in military fields such as target reconnaissance, tracking and strike. A single unmanned aerial vehicle is limited by factors such as sensing range, weapon load and computing power, and has difficulty completing complex tasks. Multiple unmanned aerial vehicles that share information interactively and divide functions among themselves can cooperatively execute tasks such as search, reconnaissance and strike, which effectively improves the survival rate and overall execution efficiency of the system. When multiple unmanned aerial vehicles execute tasks cooperatively, situation awareness is the basis of behavior decisions: each unmanned aerial vehicle in the cluster needs to mine data from comprehensive information, such as its perception of the surrounding environment and the information received from nearby friendly aircraft, and to perform knowledge mining and pattern classification on that information so that valuable knowledge and important information are extracted. However, because many unmanned aerial vehicles coexist and the environmental information changes rapidly, the amount of information to be processed increases dramatically, while the information processing and computing capacity of each unmanned aerial vehicle is limited. It is therefore important to design a reasonable and efficient data mining method.
The invention aims to improve the capability of processing large-scale data mining by the unmanned aerial vehicle and reduce the calculation load by designing a data clustering analysis method based on combined multi-target pigeon swarm optimization, so that valuable knowledge and important information can be extracted from massive data, and the situation assessment and behavior decision of the unmanned aerial vehicle are facilitated.

Data clustering is a common method in data mining and belongs to unsupervised learning. Common data clustering methods mainly include partition-based clustering (K-means and K-center algorithms), hierarchy-based clustering (Chameleon algorithm) and density-based clustering (DBSCAN algorithm). These methods solve the data clustering problem effectively, but their performance decreases as the dimensionality of the data increases. In addition, because data clustering is unsupervised and the data are unlabeled, different evaluation indexes may lead to different clustering results when prior knowledge is lacking. Current clustering methods mostly use a single evaluation index, such as compactness or separability, to guide the clustering process, and a single evaluation index is often not effective for most data sets. Therefore, to address the problems of conventional clustering methods, the data clustering problem is converted into an optimization problem: several suitable clustering evaluation indexes, which may constrain one another, are selected according to requirements, and a data clustering analysis method that can be used for large-scale data is designed based on a combined multi-target pigeon swarm optimization method. Optimizing several evaluation indexes simultaneously reveals different characteristics of the data set, which supports the unmanned aerial vehicle in obtaining more valuable information from complex and massive situation information.

Disclosure of Invention

The invention provides an unmanned aerial vehicle situation data clustering method based on combined multi-target pigeon swarm optimization, which aims to convert a data clustering problem into an optimization problem and design a combined multi-target pigeon swarm optimization algorithm to solve the problem so as to realize cluster analysis of data.

The invention provides an unmanned aerial vehicle situation data clustering method based on combined multi-target pigeon swarm optimization aiming at the problem of data clustering, the implementation block diagram of the method is shown in figure 1, and the main implementation steps are as follows:

Step one: loading an unmanned aerial vehicle situation data set

The data set to be processed is loaded, and the number of data points N_data and the data dimension M_data are determined. The vector

$x_i = [x_{i1}, x_{i2}, \dots, x_{iM_{data}}]$

represents the i-th data node in the data set.

Step two: computing a difference matrix and an adjacency matrix for a dataset

The purpose of data clustering is to divide data with high similarity into the same cluster and data with low similarity into different clusters, according to a given criterion or the intrinsic properties and regularities of the data. The Euclidean distance between nodes,

$d_{ij} = \lVert x_i - x_j \rVert$,

represents the difference between node i and node j in the data set: the smaller the Euclidean distance, the smaller the difference between the two nodes and the more likely they are to be assigned to the same cluster. The Euclidean distances d_ij between all nodes in the data set are calculated and normalized, which gives the node difference matrix D_data:

$D_{data} = [d_{ij}]_{N_{data} \times N_{data}}$  (1)

Sorting each row of the difference matrix in equation (1) in ascending order gives the adjacency matrix shown in equation (2):

$N_{nearest} = [n_{jk}]_{N_{data} \times N_{data}}$  (2)

where n_jk is the serial number of the k-th nearest node to node j, obtained by sorting the differences between all nodes in the data set and node j from small to large.
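As an illustration of step two, the following minimal Python sketch builds a normalized difference matrix and the corresponding sorted neighbor table. The function names are hypothetical, nodes are numbered from 0 rather than 1, and dividing by the largest distance is only one possible normalization; the text states that the distances are normalized but the normalization formula is not reproduced here.

```python
import math

def difference_matrix(points):
    """Pairwise Euclidean distances, scaled to [0, 1] by the largest distance.

    Scaling by the maximum is an assumed normalization, not taken from the source.
    """
    n = len(points)
    dist = [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]
    d_max = max(max(row) for row in dist)
    if d_max == 0.0:  # all points coincide, nothing to scale
        return dist
    return [[d / d_max for d in row] for row in dist]

def adjacency_matrix(diff):
    """Row i lists node indices sorted by ascending difference from node i."""
    n = len(diff)
    # Ties are broken by node index so the result is deterministic.
    return [sorted(range(n), key=lambda j: (diff[i][j], j)) for i in range(n)]

# Small made-up example with four two-dimensional points.
points = [(1.0, 1.0), (1.0, 1.2), (2.0, 2.0), (3.0, 1.0)]
D = difference_matrix(points)
N_nearest = adjacency_matrix(D)
```

Each row of `N_nearest` starts with the node itself, because its difference from itself is zero, which matches the later remark that every row is ordered relative to its first node.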

Step three: solving a minimum spanning tree

Prim's algorithm is used to solve the minimum spanning tree according to the adjacency matrix shown in equation (2). A node i is randomly selected as the starting point and placed in the vertex set S = [i]; the node j with the minimum difference from it is then selected and added to the vertex set, so that S = S ∪ [j] = [i, j]. The generated edge j → i is added to the edge set, V = [j → i]. For the edge j → i, the start point j is stored in the start point set S_sp and the end point i is stored in the end point set S_ep, where S_ep(j) = i denotes that node j is connected to node i; i is an element of the end point set S_ep, and j is the index of element i in S_ep. This step is repeated: the node that is not in the vertex set and has the minimum difference from any node in the vertex set is selected in turn, a new edge is generated, and the vertex set, edge set, start point set and end point set are updated. The step terminates when all nodes in the data set have been added to the vertex set. Connecting the elements of the end point set S_ep with their index labels in order generates the minimum spanning tree of the data set.
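The procedure of step three can be sketched in Python as follows. This is a plain dense-matrix version of Prim's algorithm, not the patent's own implementation; the name `prim_mst`, the 0-based node numbering and the five collinear test points are assumptions made for illustration. The returned list plays the role of the end point set S_ep, with the start node pointing to itself.

```python
import math

def prim_mst(diff, start=0):
    """Prim's algorithm on a dense difference matrix.

    Returns end_point, where end_point[j] = i means the edge j -> i was
    created when node j joined the tree. The start node points to itself.
    """
    n = len(diff)
    in_tree = [False] * n
    best_cost = [math.inf] * n   # cheapest known link from each node into the tree
    end_point = list(range(n))   # every node initially points to itself
    best_cost[start] = 0.0
    for _ in range(n):
        # Pick the node outside the tree with the cheapest link into it.
        j = min((k for k in range(n) if not in_tree[k]), key=lambda k: best_cost[k])
        in_tree[j] = True
        # Update the cheapest links of the remaining nodes via the new node.
        for k in range(n):
            if not in_tree[k] and diff[j][k] < best_cost[k]:
                best_cost[k] = diff[j][k]
                end_point[k] = j
    return end_point

# Made-up example: five points on a line, differences are plain distances.
xs = [0.0, 1.0, 3.0, 6.0, 10.0]
diff = [[abs(a - b) for b in xs] for a in xs]
end_point = prim_mst(diff, start=0)
total = sum(diff[j][end_point[j]] for j in range(len(xs)))
```

Storing only one end point per node is what makes the tree usable as an encoding: the index gives the start of the edge and the stored value gives its end.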

Step four: partitioning edge sets in a minimum spanning tree into strongly connected edge sets and weakly connected edge sets

The weight value W_ji of each edge j → i is calculated according to equation (3), the adjacency matrix N_nearest and the difference matrix D_data of the data set:

where nn(i, j) gives the rank of node j among the nearest neighbors of node i. It can be read from the adjacency matrix N_nearest: if node j is the element n_ik, then node j is the k-th nearest neighbor of node i. The larger the weight value W_ji, the larger the difference between node j and node i, and the less likely they are to be assigned to the same cluster.

The weight values W of all edges of the minimum spanning tree are calculated according to equation (3) and sorted in descending order. The λ × (N_data − 1) edges with the largest weight values, 0 ≤ λ ≤ 1, are added to the weak connection edge set E_s, and the remaining (1 − λ) × (N_data − 1) edges are added to the strong connection edge set E_w. The start point set S_sp is divided accordingly into the weak connection start point set S_spw and the strong connection start point set S_sps.

In the following steps, when the pigeon swarm optimization algorithm is used to solve the data clustering optimization problem, the connection relations of all strong connection edges remain unchanged and the weak connection edge set is discarded. Every node j in the weak connection start point set S_spw is reassigned a connection end point k (k is one of the m nearest neighbors of node j), forming a new connection j → k that replaces the original weak connection edge j → i. All the weak connection end points k together form the position of the i-th pigeon,

$p_i = [p_{i1}, p_{i2}, \dots, p_{iN_{ws}}]$,

where N_ws is the number of nodes in S_spw.
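The split described in step four can be illustrated with the short Python sketch below. The edge weights are taken as given inputs, since equation (3) is not reproduced here; the function name `split_edges`, the example weights and the choice to truncate λ × (number of edges) to an integer are assumptions.

```python
def split_edges(start_points, weights, lam):
    """Split tree edges into weak and strong start point sets.

    start_points[k] is the start node of edge k and weights[k] its weight.
    The int(lam * n_edges) heaviest edges are treated as weak; truncating
    to an integer is an assumption, not stated in the source.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    n_weak = int(lam * len(start_points))
    # Edge indices ordered from the largest weight to the smallest.
    order = sorted(range(len(start_points)), key=lambda k: weights[k], reverse=True)
    weak = sorted(start_points[k] for k in order[:n_weak])
    strong = sorted(start_points[k] for k in order[n_weak:])
    return weak, strong

# Made-up example: five edges identified by their start nodes 1..5.
weak_starts, strong_starts = split_edges([1, 2, 3, 4, 5], [0.5, 2.0, 0.1, 3.0, 1.0], 0.4)
```

Only the end points of the weak edges remain free, so the length of a pigeon position equals the number of weak start points rather than the number of data points.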

Step five: decoding the strong connection edge set to obtain a pre-clustering result

Because the strong connection edge set does not change during the subsequent clustering optimization, it can be decoded once to obtain a pre-clustering result of the data. A node is randomly selected from the strong connection start point set S_sps as a starting point, and all nodes connected to it are assigned to the same class. A classification mark vector

$A = [A(1), A(2), \dots, A(N_{data})]$

is introduced to record the label of the class to which each node in the data set is assigned; for example, A(2) = 3 indicates that node 2 is assigned to class 3. For the nodes in the weak connection start point set S_spw, the classification marks are initialized to A(i) = −1, i ∈ S_spw. This step is repeated until all nodes in the strong connection start point set S_sps have been assigned.
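One possible reading of the pre-clustering in step five is sketched below in Python. It labels the groups of nodes joined by strong edges and leaves the weak start nodes at −1. Two points are assumptions rather than statements of the source: seeds are visited in ascending order instead of at random, and the traversal does not pass through weak start nodes. The function name `pre_cluster` is hypothetical; the edge lists are the ones given in the worked example of the detailed description.

```python
def pre_cluster(n_nodes, strong_edges, weak_starts):
    """Label nodes joined by strong edges; weak start nodes keep the label -1.

    Nodes are numbered 1..n_nodes. Returns a dict A with A[node] = class label.
    """
    neighbours = {v: [] for v in range(1, n_nodes + 1)}
    for j, i in strong_edges:          # edge j -> i, treated as undirected
        neighbours[j].append(i)
        neighbours[i].append(j)
    weak = set(weak_starts)
    A = {v: -1 for v in neighbours}
    label = 0
    for seed in range(1, n_nodes + 1):
        if seed in weak or A[seed] != -1:
            continue
        label += 1
        A[seed] = label
        stack = [seed]
        while stack:                   # depth-first traversal over strong edges
            v = stack.pop()
            for w in neighbours[v]:
                if w not in weak and A[w] == -1:
                    A[w] = label
                    stack.append(w)
    return A

strong_edges = [(1, 2), (2, 4), (6, 5), (7, 5), (8, 6), (9, 11), (10, 12)]
A = pre_cluster(12, strong_edges, weak_starts=[3, 4, 11, 12])
```

Because this labelling depends only on the strong edges, it can be computed once and reused for every pigeon in every iteration.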

Step six: generation of an initial Pigeon group

p_size pigeons are randomly generated, each with a spatial position

$p_i = [p_{i1}, p_{i2}, \dots, p_{iN_{ws}}]$

and a velocity

$v_i = [v_{i1}, v_{i2}, \dots, v_{iN_{ws}}]$,

where i is the number of the pigeon. The spatial position p_i of a pigeon gives the connection end points k of all new weak connection edges, so different weak connection edge sets are obtained by updating the spatial positions of the pigeons. The current simulation time is set to t = 0. R is the map and compass operator, β₁ and β₂ are random numbers drawn from a Gaussian distribution, σ is the transfer factor, and T_max is the maximum number of iterations. The update formulas also use a 1 × N_ws row vector.

Step seven: evaluating the objective function of the pigeon group at time t = 0

The full-length encoding is reconstructed from the spatial position p_i of the pigeon and the end point set S_ep, and the reconstructed pigeon is decoded to obtain the connection relations between the nodes. According to these connection relations, all nodes whose classification mark is A(i) = −1, i ∈ S_spw, are then assigned to the corresponding classes, until A(i) ≠ −1 for every node i ∈ S_spw. The clustering result of all data in the data set, C_i = [c₁, c₂, …, c_γ], is obtained from the classification mark vector A, where γ is the number of classes formed and c_τ, τ = 1, 2, …, γ, is the τ-th class.
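The reconstruction and decoding in step seven can be sketched in Python as follows. The fixed end points and the weak start points are those of the worked example in the detailed description; the two pigeon positions are made-up values, and the helper names `reconstruct` and `decode` are hypothetical. Decoding is done here with a union-find structure, which is one convenient way to group nodes joined by links and is not claimed to be the patent's own procedure.

```python
def reconstruct(fixed_end_points, weak_starts, position):
    """Fill the undetermined end points with one pigeon's position vector."""
    full = dict(fixed_end_points)
    for node, target in zip(weak_starts, position):
        full[node] = target
    return full

def decode(end_points):
    """Group nodes into clusters by following the links j -> end_points[j]."""
    parent = {v: v for v in end_points}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    for j, i in end_points.items():
        root_j, root_i = find(j), find(i)
        if root_j != root_i:
            parent[root_j] = root_i
    clusters = {}
    for v in sorted(end_points):
        clusters.setdefault(find(v), []).append(v)
    return sorted(clusters.values())

# Strong edges of the worked example; node 5 is the root and points to itself.
fixed = {1: 2, 2: 4, 5: 5, 6: 5, 7: 5, 8: 6, 9: 11, 10: 12}
weak_starts = [3, 4, 11, 12]

# Hypothetical pigeon position that keeps the three groups apart.
clusters_a = decode(reconstruct(fixed, weak_starts, [4, 2, 12, 11]))
# Position equal to the original tree links, which joins everything.
clusters_b = decode(reconstruct(fixed, weak_starts, [4, 5, 6, 11]))
```

The two positions differ only in four entries, yet they decode to three clusters and one cluster respectively, which shows how the short position vector controls the whole partition.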

Two clustering evaluation indexes (objective functions), compactness and continuity, are selected to evaluate the quality of a clustering result. The continuity of the clustering is represented by a clustering distance (objective function f₁):

The compactness of the clustering is represented by the intra-class distance (objective function f₂). The average distance from each node in each class to the cluster center is first calculated according to equation (5),

where cd_τ denotes the average distance from the nodes in the τ-th class to its cluster center, c_τ is the node set of the τ-th class, and |c_τ| is the number of nodes contained in the τ-th class.

The objective function values of all pigeons in the pigeon group are calculated in this way; pigeon i represents the i-th clustering result C_i = [c₁, c₂, …, c_γ], and its objective function value is F_i(t) = [f_i1(t), f_i2(t)].
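The quantity cd_τ described above, the average distance from the nodes of a class to its cluster center, can be computed as in the following Python sketch. Taking the arithmetic mean of the members as the cluster center and using Euclidean distance are assumptions, and the sketch stops at cd_τ because the way these values are combined into f₂ is not reproduced here. The function name is hypothetical and indices are 0-based.

```python
import math

def mean_distance_to_center(points, clusters):
    """Average Euclidean distance of each cluster's nodes to its centroid.

    clusters is a list of lists of 0-based indices into points. Using the
    arithmetic mean as the cluster center is an assumption.
    """
    result = []
    for members in clusters:
        dim = len(points[members[0]])
        center = [sum(points[i][d] for i in members) / len(members) for d in range(dim)]
        result.append(sum(math.dist(points[i], center) for i in members) / len(members))
    return result

# Made-up example: one cluster of two points and one singleton.
cd = mean_distance_to_center([(0.0, 0.0), (2.0, 0.0), (5.0, 5.0)], [[0, 1], [2]])
```

A singleton cluster has zero average distance, so compactness alone favors many small clusters; this is why it is paired with a second, opposing objective.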

Step eight: performing non-dominant sorting on the initial pigeon group, and determining the global historical optimal position and the central position of the pigeon group at the current moment

The dominance relation between pigeon i and pigeon j is compared based on their objective function values, and the whole pigeon group is layered by a Pareto non-dominated sorting algorithm. If all objective function values F_i(t) = [f_i1(t), f_i2(t)] of pigeon i are at least as good as the objective function values F_j(t) = [f_j1(t), f_j2(t)] of pigeon j, i.e. f_i1(t) ≤ f_j1(t) and f_i2(t) ≤ f_j2(t), pigeon i is said to dominate pigeon j. A pigeon that is not dominated by any other pigeon is called a non-dominated pigeon and is assigned to the first-level non-dominated layer.

All non-dominated pigeons in the first-level non-dominated layer are stored in an external archive set AS. One pigeon is randomly selected from AS and its position is taken as the global optimal position p_best(t) at time t, and the average position of all pigeons of the first-level non-dominated layer in AS is taken as the central position p_center(t).
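The dominance test and the selection of the first-level non-dominated layer in step eight can be sketched in Python as below. Both objectives are minimized. The sketch additionally requires at least one strictly better objective, which is the usual Pareto convention and an assumption beyond the wording above; it keeps two pigeons with identical objective values from dominating each other. All function names and example values are hypothetical.

```python
def dominates(f_a, f_b):
    """True if objective vector f_a dominates f_b (all objectives minimized)."""
    no_worse = all(a <= b for a, b in zip(f_a, f_b))
    strictly_better = any(a < b for a, b in zip(f_a, f_b))
    return no_worse and strictly_better

def first_front(objectives):
    """Indices of the objective vectors that no other vector dominates."""
    return [i for i, f_i in enumerate(objectives)
            if not any(dominates(f_j, f_i)
                       for j, f_j in enumerate(objectives) if j != i)]

def center_position(positions, front):
    """Element-wise average of the positions of the pigeons in front."""
    n = len(front)
    return [sum(positions[i][d] for i in front) / n for d in range(len(positions[0]))]

# Made-up objective values (f1, f2) and positions for five pigeons.
objectives = [(1, 5), (2, 2), (3, 3), (5, 1), (2, 2)]
positions = [[1, 2], [3, 4], [9, 9], [5, 6], [7, 8]]
front = first_front(objectives)
p_center = center_position(positions, front)
```

The averaged center is generally not an integer node index, which is one reason a separate conversion step is needed before it can influence a discrete pigeon position.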

Step nine: updating the position and velocity of pigeons

An auxiliary vector ζ_i = [ζ_i1, ζ_i2, …, ζ_iNws], ζ_ij ∈ {−1, 0, 1}, is introduced to convert the continuous pigeon swarm optimization algorithm into a combinatorial (combined) optimization algorithm, so that it can be used to solve the clustering optimization problem:

where p_centerj(t) is the j-th element of the central position p_center(t) of the pigeon group at time t, and p_gbestj(t) is the j-th element of the historical optimal position p_gbest(t) of the pigeon group at time t. With the auxiliary vector of equation (7), the velocity update formula of the pigeon is:

where β₁ and β₂ are random numbers drawn from a Gaussian distribution, and the row vector in the formula has dimension 1 × N_ws.

ζ_i(t+1) is then calculated from the velocity v_i(t+1) of the pigeon at time t+1:

where the threshold is a preset constant. The updated ζ_i(t+1) is substituted into equation (10) to update the position of the pigeon at time t+1:

where λ is a randomly selected nearest neighbor of p_ij(t). This step is repeated until the positions and velocities of all pigeons have been updated.

Step ten: evaluating the fitness function of the pigeon group at time t+1

The pigeons are decoded according to step six to form clusters, and the fitness function F_i(t+1) = [f_i1(t+1), f_i2(t+1)] of the pigeon group at time t+1 is calculated. The dominance relation between pigeon p_i(t+1) and pigeon p_j(t+1) is compared based on the objective function values, and all non-dominated pigeons in the first-level non-dominated layer are stored in the external archive set AS.

Step eleven: performing non-dominated sorting on the external archive set AS, and selecting the pigeons to be discarded from AS according to the crowding distance

All pigeons stored in the external archive set AS at time t+1 are sorted by non-domination. Pigeons that are not in the first-level non-dominated layer are discarded, the crowding distances of the pigeons in the first-level non-dominated layer are calculated, and the pigeons with large crowding distance are discarded.
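The crowding distance used in step eleven is not defined in the text. The Python sketch below uses the definition that is common in NSGA-II style algorithms, which is an assumption: for each objective the solutions are sorted, boundary solutions receive an infinite distance, and every interior solution accumulates the normalized gap between its two neighbors. The sketch only computes the distances and leaves the discard rule to the text above.

```python
import math

def crowding_distance(objectives):
    """NSGA-II style crowding distance; boundary solutions get infinity."""
    n = len(objectives)
    distance = [0.0] * n
    if n == 0:
        return distance
    for m in range(len(objectives[0])):
        order = sorted(range(n), key=lambda i: objectives[i][m])
        f_min = objectives[order[0]][m]
        f_max = objectives[order[-1]][m]
        distance[order[0]] = distance[order[-1]] = math.inf
        if f_max == f_min:      # objective m cannot separate the solutions
            continue
        for pos in range(1, n - 1):
            gap = objectives[order[pos + 1]][m] - objectives[order[pos - 1]][m]
            distance[order[pos]] += gap / (f_max - f_min)
    return distance

# Made-up front of three solutions with objective values (f1, f2).
cd_values = crowding_distance([(0, 4), (1, 2), (4, 0)])
```

Under this definition a large value means that a solution lies in a sparsely populated part of the front.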

Step twelve: updating the global optimal position and the central position of the pigeons, and updating the number of pigeons

A pigeon is randomly selected from the external archive set AS and its position is taken as the global optimal position p_best(t+1) at time t+1; the central position of the non-dominated pigeons in the first-level non-dominated layer is taken as the central position p_center(t+1) of the pigeon group. The number of pigeons gradually decreases with each iteration:

P_size(t+1) = P_size(t) − P_dec  (11)

where P_dec is the number of pigeons discarded.

Step thirteen: determining whether to stop iteration

The simulation iteration time is updated as t = t + 1. If t is larger than the maximum number of simulation iterations T_max, the simulation ends and the method proceeds to step fourteen; otherwise, it returns to step eight.

Step fourteen: outputting the data clustering results

The clustering results are output and the Pareto front curve is drawn.

The invention provides an unmanned aerial vehicle situation data clustering method based on combined multi-target pigeon swarm optimization. The advantages of the clustering method are as follows. First, the encoding mechanism for the pigeon position (clustering result) effectively reduces the computational load for large-scale data and the dimensionality of the decision space, so that the combined multi-target pigeon swarm optimization algorithm finds an optimal solution (an optimal clustering result) more easily. Second, the designed auxiliary vector converts the continuous pigeon swarm optimization algorithm into a combinatorial optimization algorithm, so that the original algorithm gains the ability to solve discrete optimization problems and its field of application is widened. Finally, the two clustering evaluation indexes of compactness and continuity are considered simultaneously during optimization, so the method adapts better to different clustering requirements and reveals different characteristics of the data set, which supports the unmanned aerial vehicle in obtaining more valuable information from complex and massive situation information.

Drawings

FIG. 1 is a flow chart of data clustering analysis based on combined multi-target pigeon swarm optimization.

FIGS. 2a-d are encoding diagrams of the clustering pigeon position, wherein FIG. 2a shows the minimum spanning tree of the data set and its representation; FIG. 2b shows the edges of the minimum spanning tree sorted by connection weight; FIG. 2c shows the pre-clustering result generated from the strong connection edge set after the weak connection edge set is removed; and FIG. 2d shows new connections established in place of the weak connection edges (the pigeon position stores the connection end points).

FIG. 3 is a decoding diagram of the clustering pigeon position.

FIGS. 4a-e are the clustering result plots and Pareto front curve for data set 1.

FIGS. 5a-f are the clustering result plots and Pareto front curve for data set 2.

FIGS. 6a-f are the clustering result plots and Pareto front curve for data set 3.

The reference numbers and symbols in the figures are as follows:

t - number of simulation iterations

p_size - total number of pigeons in the pigeon group

i - number of the pigeon

T_max - maximum number of simulation iterations

U - end point, still to be determined, corresponding to the start point of a weak connection edge

f₁ - objective function representing the clustering evaluation index continuity

f₂ - objective function representing the clustering evaluation index compactness

X - abscissa of a two-dimensional data set

Y - ordinate of a two-dimensional data set

A - clustering result corresponding to different clustering index values (f₁, f₂)

B - clustering result corresponding to different clustering index values (f₁, f₂)

C - clustering result corresponding to different clustering index values (f₁, f₂)

D - clustering result corresponding to different clustering index values (f₁, f₂)

Detailed Description

The effectiveness of the method provided by the invention is verified by a specific data clustering example. The method comprises the following specific steps:

Step one: loading an unmanned aerial vehicle situation data set

The data sets to be processed are loaded. Cluster analysis is carried out on three different types of data sets in total; the first data set is used here for explanation. It contains N_data = 12 data points: x₁ = [1, 1], x₂ = [1, 1.2], x₃ = [1.2, 1], x₄ = [1.2, 1.2], x₅ = [2, 2], x₆ = [2.2, 2], x₇ = [2, 2.2], x₈ = [2.2, 2.2], x₉ = [3, 1], x₁₀ = [3.2, 1], x₁₁ = [3, 1.2], x₁₂ = [3.2, 1.2]. The data dimension is M_data = 2.

Step two: computing a difference matrix and an adjacency matrix for a dataset

The Euclidean distances d_ij = ||x_i − x_j|| between all nodes in the data set are calculated and normalized,

which gives the node difference matrix D_data:

Sorting each row of the difference matrix D_data in ascending order gives the adjacency matrix N_nearest:

Each row of N_nearest lists all nodes in the data set sorted from small to large by their difference from the first node in that row.

Step three: solving a minimum spanning tree

Prim's algorithm is used to solve the minimum spanning tree according to the adjacency matrix N_nearest. Node 5 is randomly selected as the starting point, so the vertex set is S = [5]; node 6, which has the minimum difference from it, is selected and added to the vertex set, giving S = S ∪ [6] = [5, 6]. The edge set is V = [6 → 5], the start point set is S_sp = [5, 6], and in the end point set S_ep, S_ep(6) = 5. This step is repeated: the node that is not in the vertex set and has the minimum difference from any node in the vertex set is selected in turn, a new edge is generated, and the vertex set, edge set, start point set and end point set are updated. The step terminates when all nodes in the data set have been added to the vertex set. The resulting start point set is S_sp = [5, 6, 7, 8, 4, 2, 3, 1, 11, 9, 12, 10] and the end point set is S_ep = [2, 4, 4, 5, 5, 5, 5, 6, 11, 12, 6, 11]. From S_ep and its element index labels (i.e. the start node of each edge), the connection relations are: node 1 → node 2, node 2 → node 4, node 3 → node 4, node 4 → node 5, node 5 → node 5, node 6 → node 5, node 7 → node 5, node 8 → node 6, node 9 → node 11, node 10 → node 12, node 11 → node 6, node 12 → node 11. The generated minimum spanning tree is shown in FIG. 2a.
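The connection list above can be checked against the twelve data points with a few lines of Python. The sketch below uses raw (unnormalized) Euclidean distances, which is an assumption that does not affect which edges are shortest; the variable and function names are hypothetical. It confirms that the listed links contain 11 edges, that every node reaches the root node 5, and it computes the total edge length.

```python
import math

points = {
    1: (1, 1), 2: (1, 1.2), 3: (1.2, 1), 4: (1.2, 1.2),
    5: (2, 2), 6: (2.2, 2), 7: (2, 2.2), 8: (2.2, 2.2),
    9: (3, 1), 10: (3.2, 1), 11: (3, 1.2), 12: (3.2, 1.2),
}
# End point set from the text, keyed by the start node of each edge.
S_ep = {1: 2, 2: 4, 3: 4, 4: 5, 5: 5, 6: 5, 7: 5, 8: 6, 9: 11, 10: 12, 11: 6, 12: 11}

def reaches_root(node, root=5):
    """Follow end point links from node; True if the walk arrives at root."""
    seen = set()
    while node != root:
        if node in seen:        # a cycle that never reaches the root
            return False
        seen.add(node)
        node = S_ep[node]
    return True

# The self link 5 -> 5 marks the root and is not a tree edge.
tree_edges = [(j, i) for j, i in S_ep.items() if j != i]
total_length = sum(math.dist(points[j], points[i]) for j, i in tree_edges)
```

Nine of the eleven edges have length 0.2 and connect points inside one of the three groups; the remaining two, 4 → 5 and 11 → 6, bridge the groups.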

Step four: calculating the weight value of each edge in the minimum spanning tree, and dividing the edge set in the minimum spanning tree into a strong connection edge set and a weak connection edge set according to the weight values

The weight value W_ji of each edge j → i is calculated according to equation (3), the adjacency matrix N_nearest and the difference matrix D_data. As shown in FIG. 2a, the edges of the minimum spanning tree (expressed by the start point set S_sp = [5, 6, 7, 8, 4, 2, 3, 1, 11, 9, 12, 10], the end point set S_ep = [2, 4, 4, 5, 5, 5, 5, 6, 11, 12, 6, 11] and the end point set element index S_ep_index = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]) have the weight values W = [2.1, 0, 2.1, 2.1, 5.5, 2.1, 3.1, 2.1]. After all weight values W_ji are sorted in descending order, with λ = 0.4 the edges of the minimum spanning tree are divided into a strong connection edge set and a weak connection edge set as shown in FIG. 2b. The weak connection start point set is S_spw = [3, 4, 11, 12] and the weak connection edge set is E_s = {3 → 4, 4 → 5, 11 → 6, 12 → 11}; the strong connection start point set is S_sps = [1, 2, 6, 7, 8, 9, 10] and the strong connection edge set is E_w = {1 → 2, 2 → 4, 6 → 5, 7 → 5, 8 → 6, 9 → 11, 10 → 12}. After the weak connection edges are deleted, the end point set becomes S_ep = [2, 4, U, U, 5, 5, 5, 6, 11, 12, U, U], where U denotes the end points still to be determined for the weak connection start point set S_spw = [3, 4, 11, 12], as shown in FIG. 2d.
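The masking of the weak end points in this example can be reproduced with the following Python sketch. The helper name `mask_weak_end_points` is hypothetical, and truncating λ × 11 to an integer to obtain the number of weak edges is an assumption that happens to agree with the four weak edges listed above.

```python
def mask_weak_end_points(end_points, weak_starts, placeholder="U"):
    """Return a copy of the end point list with the weak edges' end points masked.

    end_points[j - 1] is the end point of the edge starting at node j (1-based).
    """
    weak = set(weak_starts)
    return [placeholder if j in weak else end
            for j, end in enumerate(end_points, start=1)]

S_ep = [2, 4, 4, 5, 5, 5, 5, 6, 11, 12, 6, 11]
S_spw = [3, 4, 11, 12]
masked = mask_weak_end_points(S_ep, S_spw)

# lambda = 0.4 applied to the 11 edges of the tree (the self link of node 5 excluded).
n_weak = int(0.4 * (len(S_ep) - 1))
```

The four masked entries are exactly the decision variables of one pigeon, so the search space has 4 dimensions instead of 12 for this data set.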

Step five: decoding the strong connection edge set to obtain a pre-clustering result

A node is randomly selected from the strong connection start point set S_sps = [1, 2, 6, 7, 8, 9, 10] as a starting point, and all nodes connected to it are assigned to the same class. A classification mark vector

$A = [A(1), A(2), \dots, A(N_{data})]$

is introduced to record the label of the class to which each node in the data set is assigned; for example, A(2) = 3 indicates that node 2 is assigned to class 3. For the nodes in the weak connection start point set S_spw, the classification marks are initialized to A(i) = −1, i ∈ S_spw. This step is repeated until all nodes in the strong connection start point set S_sps have been assigned. The pre-clustering result is shown in FIG. 2c.

Step six: generation of an initial Pigeon group

p_size pigeons are randomly generated, each with a spatial position

$p_i = [p_{i1}, p_{i2}, \dots, p_{iN_{ws}}]$

and a velocity

$v_i = [v_{i1}, v_{i2}, \dots, v_{iN_{ws}}]$,

where N_ws is the number of nodes in the weak connection start point set, N_ws = 4, and i is the number of the pigeon. The current simulation time is set to t = 0, the maximum number of iterations to T_max = 50, the map and compass operator to R = 0.3, the transfer factor to σ = 0.45, and the threshold to 3.6.

Step seven: evaluating the objective function of the pigeon group at time t = 0

The full-length encoding S_ep = [2, 4, p_i1, p_i2, 5, 5, 5, 6, 11, 12, p_i3, p_i4] is reconstructed from the spatial position p_i = [p_i1, p_i2, p_i3, p_i4] of the pigeon and the end point set S_ep = [2, 4, U, U, 5, 5, 5, 6, 11, 12, U, U]. Decoding it gives the node connection relations shown in FIG. 3. According to these connection relations, all nodes whose classification mark is A(i) = −1, i ∈ S_spw, are then assigned to the corresponding classes, until A(i) ≠ −1 for every node i ∈ S_spw. The clustering result of all data in the data set, C_i = [c₁, c₂, …, c_γ], is obtained from the classification mark vector A, where γ is the number of classes formed and c_τ, τ = 1, 2, …, γ, is the τ-th class.

From the clustering result C_i and equations (4) to (6), the objective function f_i1 (the clustering continuity index, i.e. the objective function f₁ of the i-th pigeon) and the objective function f_i2 (the clustering compactness index, i.e. the objective function f₂ of the i-th pigeon) are calculated. This step is repeated until the objective function values F_i(t) = [f_i1(t), f_i2(t)] of all pigeons in the pigeon group have been calculated.

Step eight: non-dominant ranking of initial pigeon lots

The dominance relation between pigeon p_i(t) and pigeon p_j(t) is compared based on the objective function values, and the whole pigeon group is layered by the Pareto non-dominated sorting algorithm. All non-dominated pigeons in the first-level non-dominated layer are stored in an external archive set AS. One pigeon is randomly selected from AS and its position is taken as the global optimal position p_best(t) at time t, and the average position of the non-dominated pigeons of the first-level non-dominated layer in AS is taken as the central position p_center(t) of the pigeon group at time t.

Step nine: updating the position and velocity of pigeons

p_best(t), p_center(t) and the positions of all pigeons at time t are substituted into equation (7) to obtain the auxiliary vector ζ_i(t) corresponding to the pigeon position p_i(t). The position p_i(t+1) and velocity v_i(t+1) of the pigeon at time t+1 are then calculated according to the velocity update formula (8), the auxiliary vector update formula (9) and the position update formula (10). This step is repeated until the positions and velocities of all pigeons have been updated.

Step ten: evaluating objective function values of a pigeon flock at time t +1

A pigeon is randomly selected from the external archive set AS and its position is taken as the global optimal position p_best(t+1) at time t+1; the central position of the non-dominated pigeons in the first-level non-dominated layer is taken as the central position p_center(t+1) of the pigeon group. The number of pigeons P_size(t+1) is updated according to equation (11).

Step thirteen: determining whether to stop iteration

The simulation iteration time is updated as t = t + 1. If t is larger than the maximum number of simulation iterations T_max = 50, the simulation ends; otherwise, the method returns to step eight.

Step fourteen: outputting the data clustering results

The clustering results are output and the Pareto front curves are drawn; the clustering results of the three different data sets are shown in FIGS. 4 to 6 respectively. FIG. 4a is the given clustering result of data set 1, and FIG. 4e is the clustering Pareto front curve of data set 1 obtained with the combined multi-target pigeon swarm optimization algorithm. The curve represents the set of best solutions of the clustering objective functions, from which different optimal data clustering results can be selected according to the requirements on the clustering indexes. FIG. 4b, FIG. 4c and FIG. 4d are the data clustering results corresponding to points A, B and C on the Pareto curve in FIG. 4e, respectively. FIG. 5a is the given clustering result of data set 2, and FIG. 5f is the clustering Pareto front curve of data set 2 obtained with the combined multi-target pigeon swarm optimization algorithm. FIG. 5b, FIG. 5c, FIG. 5d and FIG. 5e are the data clustering results corresponding to points A, B, C and D on the Pareto curve in FIG. 5f, respectively. FIG. 6a is the given clustering result of data set 3, and FIG. 6f is the clustering Pareto front curve of data set 3 obtained with the combined multi-target pigeon swarm optimization algorithm. FIG. 6b, FIG. 6c, FIG. 6d and FIG. 6e are the data clustering results corresponding to points A, B, C and D on the Pareto curve in FIG. 6f, respectively.

The clustering results of three different types of data sets verify that the unmanned aerial vehicle situation data clustering method based on combined multi-target pigeon swarm optimization can effectively realize data clustering analysis.

Claims

1. An unmanned aerial vehicle situation data clustering method based on combined multi-target pigeon swarm optimization is characterized in that: the method comprises the following steps:

step one: loading an unmanned aerial vehicle situation data set;

To represent the ith data node in the data set;

step two: calculating a difference matrix and an adjacency matrix of the data set;

step three: solving a minimum spanning tree;

solving a minimum spanning tree by adopting Prim's algorithm according to the adjacency matrix;

step four: dividing the edge set in the minimum spanning tree into a strong connection edge set and a weak connection edge set;

step five: decoding the strong connection edge set to obtain a pre-clustering result;

step six: generating an initial pigeon population;

randomly generating p_size pigeons, each pigeon comprising a spatial position and a velocity;

step seven: evaluating the objective function of the pigeon group at time t = 0;

step eight: carrying out non-dominated sorting on the initial pigeon group, and determining the global historical optimal position and the central position of the pigeon group at the current moment;

step nine: updating the position and the speed of the pigeons;

step ten: evaluating a fitness function of the pigeon group at time t+1;

decoding the pigeons to form clusters according to step six, and calculating the fitness function of the pigeon group at time t+1; comparing the domination relationship between pigeon p_i(t+1) and pigeon p_j(t+1) based on the objective function values, and storing all non-dominated pigeons located at the first-level non-dominated layer into an external archive set AS;

step eleven: performing non-dominated sorting on the external archive set AS, and selecting the pigeons to be discarded from AS according to the crowding distance;

step twelve: updating the global optimal position and the central position of the pigeons, and updating the number of the pigeon groups;

step thirteen: judging whether to stop iteration;

updating the simulation iteration count as t = t + 1; if t is greater than the maximum number of simulation iterations T_max, ending the simulation and entering step fourteen; otherwise, returning to step eight;

step fourteen: outputting a data clustering result;

outputting a clustering result and drawing a Pareto front curve;

the specific process of step four is as follows: calculating the weight W_ji of each edge j → i according to equation (3), the adjacency matrix N_nearest of the data set and the difference matrix D_data:

wherein nn(i, j) indicates which nearest neighbor of node i node j is; it is obtained from the adjacency matrix N_nearest: if nn(i, j) = n_ik, node j is the k-th nearest neighbor of node i; nn(j, i) likewise indicates which nearest neighbor of node j node i is;

the normalized Euclidean distance between a node j and a node i in the data set is obtained;

calculating the weights W of all edges of the minimum spanning tree of the data set, and sorting the edges in descending order of weight; adding the λ × (N_data − 1) edges with the largest weights, where 0 ≤ λ ≤ 1, to the weak connection edge set E_s, and adding the remaining (1 − λ) × (N_data − 1) edges to the strong connection edge set E_w; the starting point set S_sp is correspondingly divided into a weak connection starting point set S_spw and a strong connection starting point set S_sps; λ is the selected coding length coefficient.
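By way of illustration only, the division of the minimum spanning tree edges in step four can be sketched as follows. The edge weights W_ji are assumed to be already computed by equation (3), which is not reproduced here. The function name `split_edges` and the rounding of λ × (N_data − 1) to an integer are assumptions.

```python
def split_edges(edges, weights, lam):
    """Split MST edges into weak and strong sets by descending weight.

    edges:   list of (j, i) tuples, one per MST edge j -> i.
    weights: list of weights W_ji, in the same order as edges.
    lam:     coding length coefficient, 0 <= lam <= 1.
    Returns (weak_edges, strong_edges).
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(edges) != len(weights):
        raise ValueError("edges and weights must have the same length")
    # Edge indices ordered from the largest weight to the smallest.
    order = sorted(range(len(edges)), key=lambda k: weights[k], reverse=True)
    # Number of weak edges; rounding to the nearest integer is an assumption.
    n_weak = int(round(lam * len(edges)))
    weak_edges = [edges[k] for k in order[:n_weak]]
    strong_edges = [edges[k] for k in order[n_weak:]]
    return weak_edges, strong_edges
```

The starting points of the returned weak and strong edges then form the two starting point sets into which S_sp is divided.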

2. The unmanned aerial vehicle situation data clustering method based on combined multi-target pigeon swarm optimization according to claim 1, characterized in that: the specific process of the second step is as follows: by using Euclidean distance between nodes

to represent the difference between node i and node j in the data set; calculating the Euclidean distance d_ij between all nodes in the data set and normalizing it

wherein d_min is the minimum of the Euclidean distances between all nodes, and d_max is the maximum of the Euclidean distances between all nodes;

wherein,
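By way of illustration only, the difference matrix of step two can be sketched as follows. The normalization formula is not reproduced in the text; the sketch assumes ordinary min-max normalization, (d_ij − d_min) / (d_max − d_min), with d_min and d_max taken over pairs of distinct nodes. Both points are assumptions, as is the function name `normalized_distance_matrix`.

```python
import math


def normalized_distance_matrix(points):
    """Return the min-max normalized Euclidean distance matrix of the points.

    points: list of coordinate tuples, at least two nodes.
    Diagonal entries (a node compared with itself) are left at 0.
    """
    n = len(points)
    if n < 2:
        raise ValueError("at least two nodes are required")
    d = [[math.dist(points[a], points[b]) for b in range(n)] for a in range(n)]
    # d_min and d_max over distinct node pairs only (assumption).
    off_diagonal = [d[a][b] for a in range(n) for b in range(n) if a != b]
    d_min, d_max = min(off_diagonal), max(off_diagonal)
    span = d_max - d_min
    normalized = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            if a != b and span > 0:
                normalized[a][b] = (d[a][b] - d_min) / span
    return normalized
```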

3. The unmanned aerial vehicle situation data clustering method based on combined multi-target pigeon swarm optimization according to claim 1, characterized in that: the specific process of the third step is as follows:

randomly selecting a node i as the starting point, selecting the node j with the minimum difference from the starting point, and adding it to the vertex set S = [i], so that S = S ∪ [j] = [i, j]; in addition, adding the generated edge j → i to the edge set V, so that V = [j → i]; in the edge j → i, the starting point j of the edge is stored in the starting point set S_sp and the end point i is stored in the end point set S_ep; S_ep(j) = i denotes that node j is connected to node i, where i is an element of the end point set S_ep and j is the index of element i in the end point set S_ep; repeating this step, each time selecting the node that is not in the vertex set and has the minimum difference from any node in the vertex set, generating a new edge, and updating the vertex set, the edge set, the starting point set and the end point set, the step terminating when all nodes in the data set have been added to the vertex set; connecting the elements of the end point set S_ep with their element indices in sequence generates the minimum spanning tree of the data set.
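By way of illustration only, the construction described in this claim can be sketched as follows. The end point set S_ep is represented as a dictionary with S_ep[j] = i for each edge j → i. The function name `prim_mst` is hypothetical, and the starting node is passed as a parameter rather than chosen at random so that the result is repeatable.

```python
def prim_mst(diff, start=0):
    """Build a minimum spanning tree with Prim's algorithm.

    diff:  symmetric matrix (list of lists) of pairwise differences.
    start: index of the starting node (chosen randomly in the claim).
    Returns a dict s_ep where s_ep[j] == i means edge j -> i is in the tree.
    """
    n = len(diff)
    s_ep = {}
    # best[j] = (difference, i): cheapest known link from outside node j
    # to a node i that is already in the vertex set.
    best = {j: (diff[j][start], start) for j in range(n) if j != start}
    while best:
        # Outside node with the minimum difference to any node in the set.
        j = min(best, key=lambda k: best[k][0])
        _, i = best.pop(j)
        s_ep[j] = i  # starting point j, end point i
        # The newly added node j may offer cheaper links to remaining nodes.
        for k in best:
            if diff[k][j] < best[k][0]:
                best[k] = (diff[k][j], j)
    return s_ep
```

For four nodes placed on a line at 0, 1, 3 and 7 with absolute coordinate differences as the difference matrix, `prim_mst(diff, start=0)` links each node to its left-hand neighbor.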

4. The unmanned aerial vehicle situation data clustering method based on combined multi-target pigeon swarm optimization according to claim 1, characterized in that: further, every node j in the strong connection starting point set S_sps is reassigned a connection end point k, where k is a neighboring node within the m-neighborhood of node j, so that a new connection j → k is formed to replace the original weak connection edge j → i, and all the weak connection end points k form the position of the i-th pigeon

wherein N_ws is the number of nodes in S_sps.

5. The unmanned aerial vehicle situation data clustering method based on combined multi-target pigeon swarm optimization according to claim 1, characterized in that: the specific process of step five is as follows: randomly selecting a node from the strong connection starting point set S_sps as a starting point, dividing all the nodes connected together into the same class, and introducing a classification mark vector

to indicate the label of the class to which each node in the data set is assigned; for the nodes in the weak connection starting point set S_spw, the classification mark is initialized to A(i) = −1, i ∈ S_spw; repeating this step until all nodes in the strong connection starting point set S_sps have been assigned.
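By way of illustration only, the pre-clustering of step five can be sketched as grouping the nodes joined by strong connection edges into classes while leaving weak connection starting points marked as −1. The use of a union-find structure, the function name `pre_cluster` and the numbering of classes in order of first appearance are assumptions. The claim does not spell out how a weak connection starting point that is also the end point of a strong edge is treated; here it still links its neighbors into one class but keeps the mark −1 itself.

```python
def pre_cluster(n_nodes, strong_edges, weak_starts):
    """Return the classification mark vector A after decoding strong edges.

    n_nodes:      number of nodes in the data set.
    strong_edges: list of (j, i) tuples for strong connection edges j -> i.
    weak_starts:  iterable of node indices in the weak starting point set.
    Nodes in weak_starts keep the mark -1; all others get a class label.
    """
    parent = list(range(n_nodes))

    def find(a):
        # Follow parent links to the root, shortening the path on the way.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    # Nodes joined by a strong edge belong to the same class.
    for j, i in strong_edges:
        parent[find(j)] = find(i)

    weak = set(weak_starts)
    marks = [-1] * n_nodes
    class_of_root = {}
    for node in range(n_nodes):
        if node in weak:
            continue  # left unassigned until a pigeon is decoded
        root = find(node)
        if root not in class_of_root:
            class_of_root[root] = len(class_of_root)
        marks[node] = class_of_root[root]
    return marks
```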

6. The unmanned aerial vehicle situation data clustering method based on combined multi-target pigeon swarm optimization according to claim 1, characterized in that: the concrete process of the step six is as follows:

according to the spatial position of pigeons

and the end point set S_ep, reconstructing the full coding length of the spatial position of the pigeon; decoding the reconstructed pigeon to obtain the connection relations between the nodes, and then, according to these connection relations, assigning classes to all nodes whose classification mark is A(i) = −1, i ∈ S_spw, until A(i) ≠ −1 for all i ∈ S_spw; obtaining the clustering result C_i = [c₁, c₂, …, c_γ] of all data in the data set according to the classification mark vector A, where γ is the number of classes formed and c_τ, τ = 1, 2, …, γ, is the τ-th class;

is the position of the i-th pigeon, composed of N_ws elements; N_ws is the number of nodes in the weak connection starting point set S_sps;

selecting two cluster evaluation indexes of compactness and continuity to evaluate the quality of a cluster result, and representing the continuity of the cluster by using a cluster clustering distance:

wherein cd_i represents the average distance from each node in the i-th class to the cluster center;

representing the cluster compactness by using the cluster clustering distance; firstly, calculating the average distance from each node in each class to a cluster center according to a formula (5);

cen_τ is the cluster center of the τ-th class; repeating this step until the objective function values corresponding to all pigeons in the pigeon group have been calculated, wherein pigeon i represents the i-th clustering result C_i = [c₁, c₂, …, c_γ], and the corresponding objective function value is F_i(t) = [f_i1(t), f_i2(t)]; x_τ represents any node in the τ-th class; ‖·‖ is used to calculate the distance between x_τ and cen_τ.
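By way of illustration only, the average distance from the nodes of each class to its cluster center can be sketched as follows. Formula (5) is not reproduced in the text; the sketch assumes the cluster center cen_τ is the arithmetic mean of the class members and that the distance is Euclidean. The function name `cluster_center_distances` is hypothetical, and the combination of these per-class values into the objective functions f_i1 and f_i2 is not shown.

```python
import math


def cluster_center_distances(points, labels):
    """Return {class label: average distance of its nodes to the class center}.

    points: list of coordinate tuples, one per node.
    labels: list of class labels, one per node, in the same order.
    """
    members_of = {}
    for point, label in zip(points, labels):
        members_of.setdefault(label, []).append(point)
    result = {}
    for label, members in members_of.items():
        # Cluster center: coordinate-wise mean of the members (assumption).
        center = [sum(coord) / len(members) for coord in zip(*members)]
        # Average Euclidean distance from each member to the center.
        total = sum(math.dist(point, center) for point in members)
        result[label] = total / len(members)
    return result
```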

7. The unmanned aerial vehicle situation data clustering method based on combined multi-target pigeon swarm optimization according to claim 1, characterized in that: the concrete process of the step eight is as follows:

comparing the domination relationship between pigeon i and pigeon j based on the objective function values, and layering the whole pigeon group through a Pareto non-dominated sorting algorithm; saving all non-dominated pigeons at the first-level non-dominated layer to an external archive set AS; randomly selecting a pigeon from the external archive set AS and taking its position as the global optimal position p_best(t) at time t; and taking the average position of all pigeons of the first-level non-dominated layer in the external archive set AS as p_center(t).
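By way of illustration only, the extraction of the first-level non-dominated layer can be sketched as follows. The sketch assumes that both objective functions are to be minimized, which the claim does not state explicitly. The function names `dominates` and `first_front` are hypothetical, and the layering of the remaining pigeons is not shown.

```python
def dominates(f_a, f_b):
    """True if objective vector f_a Pareto-dominates f_b (minimization assumed)."""
    no_worse = all(a <= b for a, b in zip(f_a, f_b))
    strictly_better = any(a < b for a, b in zip(f_a, f_b))
    return no_worse and strictly_better


def first_front(objectives):
    """Return the indices of the pigeons that no other pigeon dominates.

    objectives: list of objective vectors F_i = (f_i1, f_i2), one per pigeon.
    """
    front = []
    for i, f_i in enumerate(objectives):
        dominated = any(
            dominates(f_j, f_i) for j, f_j in enumerate(objectives) if j != i
        )
        if not dominated:
            front.append(i)
    return front
```

The pigeons whose indices are returned would be the candidates stored in the external archive set AS.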

8. The unmanned aerial vehicle situation data clustering method based on combined multi-target pigeon swarm optimization according to claim 1, characterized in that: the concrete process of the ninth step is as follows: introducing auxiliary vectors

ζ_ij ∈ [−1, 0, 1], so that the continuous pigeon swarm optimization algorithm is converted into a combined optimization algorithm that can be used for solving the clustering optimization problem:

wherein p_centerj(t) is the j-th element of the central position p_center(t) of the pigeon group at time t, and p_gbestj(t) is the j-th element of the historical optimal position p_gbest(t) of the pigeon group at time t; the velocity update formula of the pigeon obtained according to formula (7) is as follows:

is a row vector of dimension 1 × N_sw; R is the map and compass operator, and σ is the transfer factor; p_i(t) is the position of the i-th pigeon at time t;

then calculating the value of ζ_i at time t+1 according to the velocity v_i(t+1) of the pigeon at time t+1:

wherein, is a predetermined constant, and ζ_ij(t+1) is the j-th element of ζ_i(t+1); substituting the updated ζ_i(t+1) into equation (10) to update the position of the pigeon at time t+1:

wherein λ is a randomly selected nearest neighbor of p_ij(t); this step is repeated until the positions and velocities of all pigeons have been updated.