CN106685686B - Network topology estimation method based on simulated annealing - Google Patents

Network topology estimation method based on simulated annealing

Info

Publication number
CN106685686B
CN106685686B (granted publication); CN106685686A (application publication); application number CN201610817914.6A
Authority
CN
China
Prior art keywords
node
tree
network
topology
nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610817914.6A
Other languages
Chinese (zh)
Other versions
CN106685686A
Inventor
费高雷
何俊武
胡光岷
蒋晴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China filed Critical University of Electronic Science and Technology of China
Priority to CN201610817914.6A priority Critical patent/CN106685686B/en
Publication of CN106685686A publication Critical patent/CN106685686A/en
Application granted granted Critical
Publication of CN106685686B publication Critical patent/CN106685686B/en


Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L 41/12: Discovery or management of network topologies
    • H04L 41/14: Network analysis or design

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a network topology estimation method based on simulated annealing, which comprises the following steps: S1, performing network tomography probing on a network whose topology is known; S2, processing the probing results with a wavelet packet decomposition algorithm to obtain clustering features; S3, selecting features with a simulated annealing algorithm; and S4, using the selected features as the features for estimating all other unknown network topologies, and estimating the network topology with agglomerative hierarchical clustering. The invention selects the clustering features with a stochastic optimization algorithm, namely simulated annealing, so that the features most beneficial to topology estimation are retained and the features that increase the topology estimation error are excluded; in traditional network-tomography topology detection, no such feature selection is performed before the topology is estimated with agglomerative hierarchical clustering.

Description

Network topology estimation method based on simulated annealing
Technical Field
The invention relates to a network topology estimation method based on simulated annealing.
Background
With the rapid development of network technologies such as Internet communication, people's daily life is ever more closely tied to the network. While the network brings great convenience to users, its scale and complexity keep growing, which makes it difficult to guarantee quality of service; as a result, users and network regulators pay increasing attention to the performance characteristics of the network. However, most existing network measurement systems assume that the network topology is known, whereas real networks change frequently; if the topology cannot be inferred accurately, the network cannot be supervised accurately. To this end, researchers have proposed methods that measure the network through network topology estimation. Network topology estimation has become an important component of modern network management systems and plays an important role in the scientific development of communication networks.
Network topology estimation refers to the process of searching for and capturing certain elements in a network, finding the interrelations among those elements, and then displaying the relations with an appropriate topology structure. Depending on which elements are of interest, network topologies can be divided into physical topologies and logical topologies: the physical topology represents the connection relationships between the physical devices in the network, while the logical topology describes how traffic is transmitted through the network. In this work, network topology estimation means measuring the network of interest and inferring its logical topology.
Network tomography is a relatively new technique for detecting the logical topology of the Internet and is a cross-domain application of tomographic imaging. Based purely on end-to-end measurements, it obtains information about the network interior that cannot be observed directly, and it assumes that the routing nodes inside the probed network return no information to the observer. Probe packets are sent and received between controllable nodes at the edge of the network under test; because the probes traverse the network, the measurements reflect its internal characteristics, from which the internal structure of the network can be inferred.
Currently, network tomography mainly consists of two parts: the first is the collection of probe data, which studies how to collect useful information about the network interior; the second is statistical inference, which extracts the information and rules of the network interior from the collected data.
Disclosure of Invention
The invention aims to overcome the shortcomings of the prior art by providing a network topology estimation method based on simulated annealing. It selects clustering features with a stochastic optimization algorithm, namely simulated annealing, so that the features most beneficial to topology estimation are retained and the features that increase the topology estimation error are excluded.
The purpose of the invention is realized by the following technical scheme: the network topology estimation method based on simulated annealing comprises the following steps:
S1, performing network tomography probing on a network whose topology is known;
S2, processing the probing results with a wavelet packet decomposition algorithm to obtain clustering features;
S3, selecting features with a simulated annealing algorithm;
S4, using the selected features as the features for estimating all other unknown network topologies, and estimating the network topology with agglomerative hierarchical clustering.
Further, the specific implementation of step S2 is as follows: each iteration of the wavelet packet decomposition splits the previous signal into a low-frequency part and a high-frequency part, where S is the original signal, A denotes approximation filtering and is regarded as the low-frequency part of the previous-level signal, and D denotes detail filtering and is regarded as the high-frequency part of the previous-level signal.
Let y[n] be the signal to be decomposed, g[n] the low-pass filter and h[n] the high-pass filter, satisfying h[n] = (-1)^n · g[1-n]; g[n] corresponds to the scaling function of the wavelet transform and is called the scaling vector, and h[n] corresponds to the wavelet function ψ(t).
Taking the scaling vector of the db4 wavelet as g[n], the computation proceeds as follows: the signal is low-pass and high-pass filtered and then down-sampled, giving the low-frequency and high-frequency wavelet packet decomposition coefficients respectively:
Y_{1,0}[n] = Σ_k y[k] · g[2n - k]
Y_{1,1}[n] = Σ_k y[k] · h[2n - k]
In the same way, Y_{1,0}[n] and Y_{1,1}[n] are processed again to obtain the 4 groups of wavelet packet decomposition coefficients of the second level, and so on for any number of levels; the recurrence is:
Y_{i+1,2j}[n] = Σ_k Y_{i,j}[k] · g[2n - k]
Y_{i+1,2j+1}[n] = Σ_k Y_{i,j}[k] · h[2n - k]
Using a 4-level wavelet packet decomposition, the original signal X_i of each destination node is processed into 16 groups of wavelet packet decomposition coefficients, denoted Y_{i,j}, where i = 0, 1, ..., n-1 is the destination node index, j = 0, 1, ..., 15 is the coefficient group index, and X_i and Y_{i,j} are all vectors.
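As an illustration of step S2, the following is a minimal NumPy sketch of the 4-level wavelet packet decomposition described above; the function names, the use of PyWavelets only to obtain the db4 filter taps, and the plain convolve-and-downsample boundary handling are illustrative assumptions rather than details fixed by the patent.

```python
import numpy as np
import pywt  # used only to fetch the db4 filter taps

def wp_split(signal, g, h):
    """One wavelet-packet split: filter with g (low-pass) and h (high-pass), then downsample by 2."""
    low = np.convolve(signal, g)[::2]
    high = np.convolve(signal, h)[::2]
    return low, high

def wavelet_packet_features(x, levels=4, wavelet="db4"):
    """Return the 2**levels groups of wavelet packet coefficients of x (16 groups for 4 levels)."""
    w = pywt.Wavelet(wavelet)
    g, h = np.array(w.dec_lo), np.array(w.dec_hi)   # decomposition low/high-pass filters
    bands = [np.asarray(x, dtype=float)]
    for _ in range(levels):
        next_bands = []
        for band in bands:
            next_bands.extend(wp_split(band, g, h))  # each band splits into low and high
        bands = next_bands
    return bands  # Y_{i,0}, ..., Y_{i,15} for one destination node

# usage: one list of 16 coefficient groups per destination node's delay curve
# delay_curves = [...]  # list of 1-D arrays, one per destination node
# features = [wavelet_packet_features(x) for x in delay_curves]
```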
Further, step S3 comprises the following sub-steps:
S31, several groups are selected at random from the 16 groups of wavelet packet decomposition coefficients obtained in S2 as the features for the first estimation; the selection is represented by a feature-selection state vector F = (f_1, f_2, ..., f_16), where f_1, f_2, ..., f_16 ∈ {0, 1}, a value of 0 meaning that the feature is not selected and 1 that it is selected.
Let R be the result set used to store key-value pairs formed by a feature-selection result and its error rate; initially R = {}. The initial temperature is T = T_0 = 2m, where m is the number of destination nodes.
The upper limit L of iterations at one temperature is set to 20; the iteration counter l, the count t of consecutively rejected new solutions, and the cooling count n are all initialized to 0.
S32, topology estimation based on agglomerative hierarchical clustering is performed on the samples according to F, the error rate e(F) of the estimated topology is computed with the tree edit distance, and R is updated as R = R ∪ {F : e(F)}.
S33, if l < L, go to step S34; otherwise jump to step S39.
S34, the selection state of the i-th feature of F is flipped while the other features are kept unchanged, giving a new solution F'_i = (f'_{i,1}, f'_{i,2}, ..., f'_{i,16}) with
f'_{i,j} = 1 - f_j if j = i, and f'_{i,j} = f_j otherwise.
Letting i run through all values yields 16 new solutions F'_1, F'_2, ..., F'_16.
S35, for each new solution that is not already in R, the same procedure as in step S32 is applied to obtain e(F'_i), and R is updated as R = R ∪ {F'_i : e(F'_i)}; let e* = min(e(F'_i)) = e(F'_m) with corresponding solution F'_m.
S36, let l = l + 1 and Δe = e* - e(F); the new solution F'_m is accepted with probability p according to the Metropolis criterion, where
p = 1 if Δe < 0, and p = exp(-Δe / T) otherwise.
If the new solution is accepted, set F = F'_m, e(F) = e*, t = 0 and go to step S33; otherwise set t = t + 1 and go to S37.
S37, if l < L, go to step S38; otherwise jump to step S39.
S38, the selection state of every feature of F is flipped independently with probability 50%, giving a new solution F'_m; the same procedure as in step S32 is applied to obtain e* = e(F'_m), R is updated as R = R ∪ {F'_m : e(F'_m)}, and the algorithm jumps to step S36.
S39, let n = n + 1, lower the temperature T according to the cooling schedule, and reset l = 0; if t < 10 and n < 10, jump to step S33; otherwise select the best solution F* from R as the result of the feature selection, and the algorithm ends.
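The annealing loop of steps S31 to S39 can be paraphrased in Python as follows. This is a sketch, not the patented implementation: the estimate_error callback (step S32, topology estimation plus tree-edit error rate) is assumed to be supplied by the caller, and the geometric cooling factor 0.9 is an assumption because the exact cooling schedule is given only as a figure in the original.

```python
import math
import random

def anneal_feature_selection(estimate_error, n_features=16, m_dest=8,
                             L=20, max_rejects=10, max_coolings=10):
    """Simulated-annealing feature selection over binary state vectors (sketch of S31-S39)."""
    F = tuple(random.randint(0, 1) for _ in range(n_features))   # S31: random initial selection
    if not any(F):
        F = (1,) * n_features                                    # ensure at least one feature is selected
    T = 2 * m_dest                                                # initial temperature T0 = 2m
    R = {F: estimate_error(F)}                                    # result set {selection: error rate}
    t = n = 0                                                     # rejected-solution and cooling counters
    while t < max_rejects and n < max_coolings:
        for _ in range(L):                                        # S33-S38: L iterations per temperature
            # S34: flip each single feature in turn and evaluate the 16 neighbours
            neighbours = [F[:i] + (1 - F[i],) + F[i + 1:] for i in range(n_features)]
            for Fp in neighbours:
                if Fp not in R:
                    R[Fp] = estimate_error(Fp)                    # S35
            best = min(neighbours, key=lambda Fp: R[Fp])
            delta = R[best] - R[F]
            if delta < 0 or random.random() < math.exp(-delta / T):   # S36: Metropolis criterion
                F, t = best, 0
            else:
                t += 1
                # S38: escape move, flip every feature independently with probability 0.5
                Fp = tuple(1 - f if random.random() < 0.5 else f for f in F)
                if Fp not in R:
                    R[Fp] = estimate_error(Fp)
                if R[Fp] - R[F] < 0 or random.random() < math.exp(-(R[Fp] - R[F]) / T):
                    F, t = Fp, 0
        n, T = n + 1, T * 0.9                                     # S39: cool (schedule assumed geometric)
    return min(R, key=R.get)                                      # best selection found in R
```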
Further, the topology estimation based on agglomerative hierarchical clustering in step S32 is implemented as follows:
S321, the distance between clustering samples is defined and a distance matrix is obtained. For each destination node i, let the selected features be Y_{i,k_1}, Y_{i,k_2}, ..., Y_{i,k_n}, where 0 ≤ k_1 < k_2 < ... < k_n ≤ 15 and k_1, k_2, ..., k_n ∈ N; these features are concatenated into a new vector Y_i = (Y_{i,k_1}, Y_{i,k_2}, ..., Y_{i,k_n}).
For any two destination nodes i, j, the correlation coefficient between their feature vectors Y_i and Y_j is computed as
ρ_{i,j} = Σ_n (Y_i[n] - E[Y_i]) · (Y_j[n] - E[Y_j]) / sqrt( Σ_n (Y_i[n] - E[Y_i])^2 · Σ_n (Y_j[n] - E[Y_j])^2 )
where E[Y_i] and E[Y_j] denote the mathematical expectations of Y_i and Y_j, i.e. their means:
E[Y_i] = (1/N) · Σ_n Y_i[n].
The correlation distance between nodes i, j is defined as
d_{i,j} = 1 - |ρ_{i,j}|.
The stronger the correlation between the feature vectors of two destination nodes i, j, the closer the absolute value of the correlation coefficient |ρ_{i,j}| is to 1 and the closer the correlation distance d_{i,j} is to 0.
The correlation distance matrix over all destination nodes is then
D = (d_{i,j}), i, j = 1, 2, ..., n,
with d_{i,j} = d_{j,i} (i, j = 1, 2, ..., n) and d_{1,1} = d_{2,2} = ... = d_{n,n} = 0; the matrix D is therefore symmetric with zero diagonal.
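The correlation-distance matrix D of step S321 reduces to a few lines of NumPy; the only assumption here is that the concatenated feature vectors of all destination nodes are stacked row-wise into one 2-D array.

```python
import numpy as np

def correlation_distance_matrix(Y):
    """Y: (n_nodes, feature_len) array of concatenated wavelet-packet features.
    Returns D with D[i, j] = 1 - |corr(Y_i, Y_j)|, symmetric with zero diagonal."""
    rho = np.corrcoef(Y)            # Pearson correlation between every pair of rows
    D = 1.0 - np.abs(rho)
    np.fill_diagonal(D, 0.0)        # remove tiny numerical residue on the diagonal
    return D
```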
S322, hierarchical clustering is performed to obtain the topology, as follows:
S3221, at initialization every sample forms its own class, n classes in total, denoted G_1, G_2, ..., G_n; they form the n leaf nodes of the clustering tree, denoted N_1, N_2, ..., N_n. Each node is given a weight w_1, w_2, ..., w_n with w_1 = w_2 = ... = w_n = 0. The inter-class distance matrix is G = D, so that the distance between any two classes G_i, G_j is g_{i,j} = d_{i,j}. The clustering round counter is initialized to l = 1.
S3222, the minimum distance g_{s,t} = min_{i≠j}(g_{i,j}) in G is found, the two corresponding classes G_s, G_t are merged into a new class G_{n+l}, and a new node N_{n+l} is constructed as the parent of N_s, N_t with weight w_{n+l} = g_{s,t}.
S3223, the distance from the new class G_{n+l} to every other class is computed; the distance between two classes G_p, G_q is the average of the pairwise sample distances:
g_{p,q} = (1 / (n_p · n_q)) · Σ_{x_i ∈ G_p} Σ_{x_j ∈ G_q} d_{i,j},
S3224, where n_p, n_q are the numbers of samples in classes G_p and G_q respectively;
S3225, the inter-class distance matrix G is updated: the rows and columns associated with G_s, G_t are removed, and a row and column for the new class G_{n+l} are appended whose entries are the distances from the new class to the other classes;
S3226, the clustering round counter is updated, l = l + 1;
S3227, steps S3222 to S3225 are repeated until only one class G_{2n-1} remains.
Further, during the clustering of step S322 two classes are merged into a new class in every round until only one class remains, producing a weighted binary tree with 2n-1 nodes. The nodes of this binary tree are then merged according to the following rule: a threshold t = 0.02 is set, and the nodes from N_{n+1} up to N_{2n-2} are examined; if a node N_i and its parent node N_j have weights that differ by less than the threshold, i.e.
w_j - w_i < t,
then N_i and N_j are merged, that is, the parent of every child of N_i is changed to N_j and N_i is deleted; the resulting tree is taken as the estimated network logical topology.
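A sketch of step S322 plus the merging rule above, using SciPy's average-linkage clustering in place of the hand-written loop; the choice of scipy.cluster.hierarchy and the pruning test w_parent - w_child < t are implementation assumptions.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import squareform

def estimate_logical_topology(D, t=0.02):
    """Average-linkage clustering on the correlation-distance matrix D,
    followed by collapsing parent/child merges whose weights differ by less than t.
    Returns (tree, root): tree maps node id -> (weight, list of child ids); leaves are 0..n-1."""
    n = D.shape[0]
    Z = linkage(squareform(D, checks=False), method="average")   # S3221-S3227
    tree = {i: (0.0, []) for i in range(n)}                      # leaf nodes with weight 0
    for k, (a, b, w, _) in enumerate(Z):
        tree[n + k] = (float(w), [int(a), int(b)])               # internal node N_{n+k}
    root = n + len(Z) - 1

    def collapse(node):
        weight, kids = tree[node]
        changed = True
        while changed:                                           # keep promoting until stable
            changed = False
            new_kids = []
            for c in kids:
                cw, ckids = tree[c]
                if ckids and weight - cw < t:                    # internal child merged at almost the same height
                    new_kids.extend(ckids)                       # re-parent its children to `node`, drop c
                    changed = True
                else:
                    new_kids.append(c)
            kids = new_kids
        tree[node] = (weight, kids)
        for c in kids:
            collapse(c)

    collapse(root)
    return tree, root
```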
Further, the error rate of the topology estimation result in step S32 is computed with the tree edit distance as follows.
Three tree editing operations are defined:
Relabel a node: the labels of the leaf nodes of the tree topology are defined to be their respective numbers, and the labels of all other nodes are empty; the cost of changing a node label from a to b is set to r(a → b) = 2m, where m is the number of destination nodes in the probing process, i.e. the number of leaves of the tree.
Delete a node: a non-root node v of the tree T is deleted; let its parent be v', then the parent of every child of v becomes v'. The cost of deleting a node is set to r(a → Λ) = 1, where a is the label of the deleted node and Λ denotes the empty node.
Insert a node: a new node v is added as a child of a node v', and some of the children of v' become children of v. The cost of inserting a node is set to r(Λ → a) = 1, where Λ denotes the empty node and a is the label of the inserted node.
Let E be a sequence of editing operations e_1, e_2, ..., e_n that transforms tree T_1 into tree T_2, and denote the total cost of transforming T_1 into T_2 by r(E) = r(e_1) + r(e_2) + ... + r(e_n). The tree edit distance is min(r(E)), i.e. the minimum total cost of transforming T_1 into T_2.
To compute the tree edit distance, the problem is converted into finding the maximum matching subtree between the two trees T_1 and T_2: let the common subtree be M and the matching relationship between T_1 and T_2 be (M, T_1, T_2); let N_1 be the set of nodes that belong to T_1 but not to M, let N_2 be the set of nodes that belong to T_2 but not to M, and denote the label of node i in T_1 by l_1(i) and the label of node i in T_2 by l_2(i). Then
r((M, T_1, T_2)) = Σ_{(i,j) ∈ M, l_1(i) ≠ l_2(j)} r(l_1(i) → l_2(j)) + Σ_{i ∈ N_1} r(l_1(i) → Λ) + Σ_{j ∈ N_2} r(Λ → l_2(j)),
and the minimum of r((M, T_1, T_2)) over all matchings, found with a dynamic programming algorithm, is the tree edit distance.
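To make the cost decomposition r((M, T1, T2)) concrete, the sketch below evaluates the cost of one given matching M; the dynamic program that minimizes over all matchings is not reproduced, and representing a matching as a list of (node of T1, node of T2) pairs is an illustrative assumption.

```python
def matching_cost(M, labels1, labels2, m_dest):
    """Cost of a matching M between trees T1 and T2 (sketch of r((M, T1, T2))).

    M        : list of (i, j) pairs, node i of T1 matched to node j of T2
    labels1  : dict node -> label for T1 (leaf number, or None for internal nodes)
    labels2  : dict node -> label for T2
    m_dest   : number of destination nodes (relabel cost is 2 * m_dest)
    """
    relabel_cost, indel_cost = 2 * m_dest, 1
    matched1 = {i for i, _ in M}
    matched2 = {j for _, j in M}
    cost = sum(relabel_cost for i, j in M if labels1[i] != labels2[j])   # relabel matched pairs
    cost += sum(indel_cost for i in labels1 if i not in matched1)        # delete nodes only in T1
    cost += sum(indel_cost for j in labels2 if j not in matched2)        # insert nodes only in T2
    return cost
```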
The beneficial effects of the invention are:
1. The invention selects the clustering features with a stochastic optimization algorithm, namely simulated annealing, so that the features most beneficial to topology estimation are retained and the features that increase the topology estimation error are excluded; in the traditional network-tomography topology detection method, no such feature selection is performed before the topology is estimated with agglomerative hierarchical clustering.
2. In the feature selection with the simulated annealing algorithm, the invention uses two methods of generating new solutions: the first quickly finds a local optimum within a certain range, and the second can jump out of that range to search for optima in other ranges.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a schematic diagram of a wavelet packet decomposition process;
fig. 3 is a flowchart of wavelet packet decomposition computation.
Detailed Description
The invention uses wavelet packet decomposition to process the original input signals, namely the delay-variation curves from the source node to each destination node, and obtains the time-frequency characteristics of each signal in different frequency bands. By comparing the correlation between the wavelet packet coefficient vectors of different paths, a characterization of the shared path length between different destination nodes is obtained, from which the network logical topology is derived by agglomerative hierarchical clustering.
In principle, the more completely the variation of the transmission delay is captured in every time-frequency band, the more clearly the similarity of the paths from the source node to different destination nodes can be characterized. In practice, however, not all wavelet packet decomposition coefficients help to obtain a correct result; some of them are noise. For example, the low-frequency DC component is affected by the actual physical link length, and when the non-shared path accounts for a large fraction of that length it severely distorts the judgement of the shared path length. To eliminate the negative influence of such features on the topology estimation and to improve its accuracy as much as possible, feature selection must be performed on the coefficients. Besides improving the accuracy of the topology estimation, reducing the number of features also reduces the subsequent amount of computation.
To evaluate whether the selected features can estimate the topology accurately, a network with known topology is required as a reference; a fairly accurate topology can be obtained with tools such as traceroute. After this reference topology has been obtained, tomography probing is performed on the same network, the topology estimated from the tomography data is compared with the reference topology, and the feature set that makes the two topologies most similar is selected. This feature set is then used in later practical applications to probe networks with unknown topology and estimate their topologies.
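A sketch of this workflow, wiring together the helpers introduced in the earlier sketches (wavelet_packet_features, anneal_feature_selection, correlation_distance_matrix, estimate_logical_topology); those names, and the tree_error_rate callback that compares an estimated tree with the reference topology, are assumptions made for illustration only.

```python
import numpy as np

def select_features_on_reference(delay_curves, reference_tree, tree_error_rate, m_dest):
    """Run steps S1-S3 on a network with known (reference) topology; return the chosen feature mask."""
    coeffs = [wavelet_packet_features(x) for x in delay_curves]          # S2: 16 groups per destination

    def estimate_error(mask):                                            # S32: topology + tree-edit error
        if not any(mask):
            return float("inf")                                          # empty selection: worst case
        Y = np.array([np.concatenate([c[j] for j in range(16) if mask[j]]) for c in coeffs])
        D = correlation_distance_matrix(Y)
        tree, root = estimate_logical_topology(D)
        return tree_error_rate(tree, root, reference_tree)

    return anneal_feature_selection(estimate_error, m_dest=m_dest)       # S3

def estimate_unknown_topology(delay_curves, mask):
    """Step S4: apply the selected feature mask to a network with unknown topology."""
    coeffs = [wavelet_packet_features(x) for x in delay_curves]
    Y = np.array([np.concatenate([c[j] for j in range(16) if mask[j]]) for c in coeffs])
    return estimate_logical_topology(correlation_distance_matrix(Y))
```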
The technical scheme of the invention is further explained below with reference to the attached drawings.
As shown in fig. 1, the network topology estimation method based on simulated annealing includes the following steps:
S1, performing network tomography probing on a network whose topology is known;
S2, in a real network environment the traffic is highly bursty, so the delay curves contain many medium- and high-frequency components; to capture the time-frequency characteristics of these high-frequency components more finely, the invention processes the probing results with a wavelet packet decomposition algorithm to obtain the clustering features. The specific implementation is as follows: as shown in fig. 2, each iteration of the wavelet packet decomposition splits the previous signal into a low-frequency part and a high-frequency part, and the more iterations, the finer the frequency bands; S is the original signal, A denotes approximation filtering and is regarded as the low-frequency part of the previous-level signal, and D denotes detail filtering and is regarded as the high-frequency part of the previous-level signal.
The computation flow of the wavelet packet decomposition is shown in fig. 3, where y[n] is the signal to be decomposed, g[n] is the low-pass filter and h[n] is the high-pass filter, satisfying h[n] = (-1)^n · g[1-n]; g[n] corresponds to the scaling function of the wavelet transform and is called the scaling vector, and h[n] corresponds to the wavelet function ψ(t).
Taking the scaling vector of the db4 wavelet as g[n], the computation proceeds as follows: the signal is low-pass and high-pass filtered and then down-sampled, giving the low-frequency and high-frequency wavelet packet decomposition coefficients respectively:
Y_{1,0}[n] = Σ_k y[k] · g[2n - k]
Y_{1,1}[n] = Σ_k y[k] · h[2n - k]
In the same way, Y_{1,0}[n] and Y_{1,1}[n] are processed again to obtain the 4 groups of wavelet packet decomposition coefficients of the second level, and so on for any number of levels; the recurrence is:
Y_{i+1,2j}[n] = Σ_k Y_{i,j}[k] · g[2n - k]
Y_{i+1,2j+1}[n] = Σ_k Y_{i,j}[k] · h[2n - k]
Using a 4-level wavelet packet decomposition, the original signal X_i of each destination node is processed into 16 groups of wavelet packet decomposition coefficients, denoted Y_{i,j}, where i = 0, 1, ..., n-1 is the destination node index, j = 0, 1, ..., 15 is the coefficient group index, and X_i and Y_{i,j} are all vectors. Since not all 16 groups of coefficients reflect the correlation between nodes correctly, feature selection is applied to them to obtain the several groups with the best effect.
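If the PyWavelets library is available, the sixteen fourth-level coefficient groups can also be obtained from its built-in wavelet packet transform instead of the manual recursion sketched earlier; the mode='symmetric' boundary extension is an assumption, since the patent does not specify how signal borders are handled.

```python
import pywt

def wavelet_packet_features_pywt(x, wavelet="db4", level=4):
    """Fourth-level wavelet packet coefficients of one delay curve, as 2**level = 16 arrays."""
    wp = pywt.WaveletPacket(data=x, wavelet=wavelet, mode="symmetric", maxlevel=level)
    return [node.data for node in wp.get_level(level, order="natural")]
```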
S3, features are selected with a simulated annealing algorithm, and whether a feature set is kept is judged from the error between the estimated topology and the known topology; this comprises the following sub-steps:
S31, several groups are selected at random from the 16 groups of wavelet packet decomposition coefficients obtained in S2 as the features for the first estimation; the selection is represented by a feature-selection state vector F = (f_1, f_2, ..., f_16), where f_1, f_2, ..., f_16 ∈ {0, 1}, a value of 0 meaning that the feature is not selected and 1 that it is selected.
Let R be the result set used to store key-value pairs formed by a feature-selection result and its error rate; initially R = {}. The initial temperature is T = T_0 = 2m, where m is the number of destination nodes.
The upper limit L of iterations at one temperature is set to 20; the iteration counter l, the count t of consecutively rejected new solutions, and the cooling count n are all initialized to 0.
S32, topology estimation based on agglomerative hierarchical clustering is performed on the samples according to F, the error rate e(F) of the estimated topology is computed with the tree edit distance, and R is updated as R = R ∪ {F : e(F)}.
The topology estimation based on agglomerative hierarchical clustering is implemented as follows:
S321, the distance between clustering samples is defined and a distance matrix is obtained. For each destination node i, let the selected features be Y_{i,k_1}, Y_{i,k_2}, ..., Y_{i,k_n}, where 0 ≤ k_1 < k_2 < ... < k_n ≤ 15 and k_1, k_2, ..., k_n ∈ N; these features are concatenated into a new vector Y_i = (Y_{i,k_1}, Y_{i,k_2}, ..., Y_{i,k_n}).
For any two destination nodes i, j, the correlation coefficient between their feature vectors Y_i and Y_j is computed as
ρ_{i,j} = Σ_n (Y_i[n] - E[Y_i]) · (Y_j[n] - E[Y_j]) / sqrt( Σ_n (Y_i[n] - E[Y_i])^2 · Σ_n (Y_j[n] - E[Y_j])^2 )
where E[Y_i] and E[Y_j] denote the mathematical expectations of Y_i and Y_j, i.e. their means:
E[Y_i] = (1/N) · Σ_n Y_i[n].
The correlation distance between nodes i, j is defined as
d_{i,j} = 1 - |ρ_{i,j}|.
The stronger the correlation between the feature vectors of two destination nodes i, j, the closer the absolute value of the correlation coefficient |ρ_{i,j}| is to 1 and the closer the correlation distance d_{i,j} is to 0.
The correlation distance matrix over all destination nodes is then
D = (d_{i,j}), i, j = 1, 2, ..., n,
with d_{i,j} = d_{j,i} (i, j = 1, 2, ..., n) and d_{1,1} = d_{2,2} = ... = d_{n,n} = 0; the matrix D is therefore symmetric with zero diagonal.
S322, hierarchical clustering is performed to obtain the topology, as follows:
S3221, at initialization every sample forms its own class, n classes in total, denoted G_1, G_2, ..., G_n; they form the n leaf nodes of the clustering tree, denoted N_1, N_2, ..., N_n. Each node is given a weight w_1, w_2, ..., w_n with w_1 = w_2 = ... = w_n = 0. The inter-class distance matrix is G = D, so that the distance between any two classes G_i, G_j is g_{i,j} = d_{i,j}. The clustering round counter is initialized to l = 1.
S3222, the minimum distance g_{s,t} = min_{i≠j}(g_{i,j}) in G is found, the two corresponding classes G_s, G_t are merged into a new class G_{n+l}, and a new node N_{n+l} is constructed as the parent of N_s, N_t with weight w_{n+l} = g_{s,t}.
S3223, the distance from the new class G_{n+l} to every other class is computed; the distance between two classes G_p, G_q is the average of the pairwise sample distances:
g_{p,q} = (1 / (n_p · n_q)) · Σ_{x_i ∈ G_p} Σ_{x_j ∈ G_q} d_{i,j},
S3224, where n_p, n_q are the numbers of samples in classes G_p and G_q respectively;
S3225, the inter-class distance matrix G is updated: the rows and columns associated with G_s, G_t are removed, and a row and column for the new class G_{n+l} are appended whose entries are the distances from the new class to the other classes;
S3226, the clustering round counter is updated, l = l + 1;
S3227, steps S3222 to S3225 are repeated until only one class G_{2n-1} remains.
During the clustering, two classes are merged into a new class in every round until only one class remains, producing a weighted binary tree with 2n-1 nodes. The nodes of this binary tree are then merged according to the following rule: a threshold t = 0.02 is set, and the nodes from N_{n+1} up to N_{2n-2} are examined; if a node N_i and its parent node N_j have weights that differ by less than the threshold, i.e.
w_j - w_i < t,
then N_i and N_j are merged, that is, the parent of every child of N_i is changed to N_j and N_i is deleted; the resulting tree is taken as the estimated network logical topology.
The error rate of the topology estimation result is computed with the tree edit distance as follows.
Three tree editing operations are defined:
Relabel a node: the labels of the leaf nodes of the tree topology are defined to be their respective numbers, and the labels of all other nodes are empty; the cost of changing a node label from a to b is set to r(a → b) = 2m, where m is the number of destination nodes in the probing process, i.e. the number of leaves of the tree.
Delete a node: a non-root node v of the tree T is deleted; let its parent be v', then the parent of every child of v becomes v'. The cost of deleting a node is set to r(a → Λ) = 1, where a is the label of the deleted node and Λ denotes the empty node.
Insert a node: a new node v is added as a child of a node v', and some of the children of v' become children of v. The cost of inserting a node is set to r(Λ → a) = 1, where Λ denotes the empty node and a is the label of the inserted node.
Let E be a sequence of editing operations e_1, e_2, ..., e_n that transforms tree T_1 into tree T_2, and denote the total cost of transforming T_1 into T_2 by r(E) = r(e_1) + r(e_2) + ... + r(e_n). The tree edit distance is min(r(E)), i.e. the minimum total cost of transforming T_1 into T_2.
To compute the tree edit distance, the problem is converted into finding the maximum matching subtree between the two trees T_1 and T_2: let the common subtree be M and the matching relationship between T_1 and T_2 be (M, T_1, T_2); let N_1 be the set of nodes that belong to T_1 but not to M, let N_2 be the set of nodes that belong to T_2 but not to M, and denote the label of node i in T_1 by l_1(i) and the label of node i in T_2 by l_2(i). Then
r((M, T_1, T_2)) = Σ_{(i,j) ∈ M, l_1(i) ≠ l_2(j)} r(l_1(i) → l_2(j)) + Σ_{i ∈ N_1} r(l_1(i) → Λ) + Σ_{j ∈ N_2} r(Λ → l_2(j)),
and the minimum of r((M, T_1, T_2)) over all matchings, found with a dynamic programming algorithm, is the tree edit distance.
S33, when L is less than L, executing step S34, otherwise jumping to step S39;
s34, changing the selection state of the ith feature in the F, and keeping the selection states of other features unchanged to obtain a new group of solutions F'i=(f′i,1,f′i,2,...,f′i,16) Which satisfies:
Figure DEST_PATH_GDA0001248192080000102
let iAll values were taken over to obtain 16 new solutions: f'1,F′2,...,F′16
S35, judging whether the new solution exists in R or not, and executing the method same as the step S32 on the solution which does not exist in R to obtain e (F'i) Update R ═ R ∪ { F'i:e(F′i) Get e*=min(e(F′i))=e(F′m) And (b) corresponding F'm
S36, let l be l +1, calculate Δ e be e*-e (F) accepting a new solution F 'with probability p according to the metropolis criterion'mP is calculated as follows:
Figure DEST_PATH_GDA0001248192080000103
if new solution is received, let F ═ F'm,e(F)=e*When t is 0, go to step S33; otherwise, carrying out S37 when t is t + 1;
s37, when L is less than L, executing step S38, otherwise jumping to step S39;
s38, changing the existing selection state with a probability of 50% for each feature in F, obtaining a new set of solutions F'mThe same method as step S32 is performed to obtain e*=e(F′m) Update R ═ R ∪ { F'm:e(F′m) Jumping to step S36;
s39, let n be n +1,
Figure DEST_PATH_GDA0001248192080000104
l is 0; when t is less than 10 and n is less than 10, jumping to step S33; otherwise, selecting the optimal solution F from R*As a result of the feature selection, the algorithm ends.
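The two ways of generating new solutions, the single-feature flip of step S34 and the 50%-per-feature random flip of step S38, can be written as two small helpers; representing a solution as a tuple of 0/1 values is an assumption carried over from the earlier sketches.

```python
import random

def single_flip_neighbours(F):
    """S34: flip exactly one feature at a time, yielding the neighbours F'_1 ... F'_16."""
    return [F[:i] + (1 - F[i],) + F[i + 1:] for i in range(len(F))]

def random_flip_neighbour(F, p_flip=0.5):
    """S38: flip every feature independently with probability 0.5 to escape a local optimum."""
    return tuple(1 - f if random.random() < p_flip else f for f in F)
```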
S4, the selected features are used as the features for estimating all other unknown network topologies, and the network topology is estimated with the same agglomerative hierarchical clustering method as in step S32.
It will be appreciated by those of ordinary skill in the art that the embodiments described herein are intended to help the reader understand the principles of the invention, and that the scope of the invention is not limited to these specifically recited embodiments and examples. Those skilled in the art can make various other specific changes and combinations based on the teachings of the present invention without departing from its spirit, and these changes and combinations remain within the scope of the invention.

Claims (2)

1. A network topology estimation method based on simulated annealing, characterized by comprising the following steps:
S1, performing network tomography probing on a network whose topology is known;
S2, processing the probing results with a wavelet packet decomposition algorithm to obtain clustering features; the specific implementation is as follows: each iteration of the wavelet packet decomposition splits the previous signal into a low-frequency part and a high-frequency part, where S is the original signal, A denotes approximation filtering and is regarded as the low-frequency part of the previous-level signal, and D denotes detail filtering and is regarded as the high-frequency part of the previous-level signal;
y[n] is the signal to be decomposed, g[n] is the low-pass filter and h[n] is the high-pass filter, satisfying h[n] = (-1)^n · g[1-n]; g[n] corresponds to the scaling function of the wavelet transform and is called the scaling vector, and h[n] corresponds to the wavelet function ψ(t);
taking the scaling vector of the db4 wavelet as g[n], the computation proceeds as follows: the signal is low-pass and high-pass filtered and then down-sampled, giving the low-frequency and high-frequency wavelet packet decomposition coefficients respectively:
Y_{1,0}[n] = Σ_k y[k] · g[2n - k]
Y_{1,1}[n] = Σ_k y[k] · h[2n - k]
in the same way, Y_{1,0}[n] and Y_{1,1}[n] are processed again to obtain the 4 groups of wavelet packet decomposition coefficients of the second level, and so on for any number of levels; the recurrence is:
Y_{i+1,2j}[n] = Σ_k Y_{i,j}[k] · g[2n - k]
Y_{i+1,2j+1}[n] = Σ_k Y_{i,j}[k] · h[2n - k]
using a 4-level wavelet packet decomposition, the original signal X_i of each destination node is processed into 16 groups of wavelet packet decomposition coefficients, denoted Y_{i,j}, where i = 0, 1, ..., n-1 is the destination node index, j = 0, 1, ..., 15 is the coefficient group index, and X_i and Y_{i,j} are all vectors;
S3, selecting features with a simulated annealing algorithm, comprising the following sub-steps:
S31, selecting several groups at random from the 16 groups of wavelet packet decomposition coefficients obtained in S2 as the features for the first estimation; the selection is represented by a feature-selection state vector F = (f_1, f_2, ..., f_16), where f_1, f_2, ..., f_16 ∈ {0, 1}, a value of 0 meaning that the feature is not selected and 1 that it is selected;
letting R be the result set used to store key-value pairs formed by a feature-selection result and its error rate, with R = {} initially; the initial temperature is T = T_0 = 2m, where m is the number of destination nodes;
setting the upper limit L of iterations at one temperature to 20, and initializing the iteration counter l = 0, the count of consecutively rejected new solutions t = 0, and the cooling count n = 0;
S32, performing topology estimation based on agglomerative hierarchical clustering on the samples according to F, computing the error rate e(F) of the estimated topology with the tree edit distance, and updating R = R ∪ {F : e(F)};
the topology estimation based on agglomerative hierarchical clustering is implemented as follows:
S321, the distance between clustering samples is defined and a distance matrix is obtained; for each destination node i, the selected features are Y_{i,k_1}, Y_{i,k_2}, ..., Y_{i,k_n}, where 0 ≤ k_1 < k_2 < ... < k_n ≤ 15 and k_1, k_2, ..., k_n ∈ N, and these features are concatenated into a new vector Y_i = (Y_{i,k_1}, Y_{i,k_2}, ..., Y_{i,k_n});
for any two destination nodes i, j, the correlation coefficient between their feature vectors Y_i and Y_j is computed as
ρ_{i,j} = Σ_n (Y_i[n] - E[Y_i]) · (Y_j[n] - E[Y_j]) / sqrt( Σ_n (Y_i[n] - E[Y_i])^2 · Σ_n (Y_j[n] - E[Y_j])^2 )
where E[Y_i] and E[Y_j] denote the mathematical expectations of Y_i and Y_j, i.e. their means:
E[Y_i] = (1/N) · Σ_n Y_i[n];
the correlation distance between nodes i, j is defined as
d_{i,j} = 1 - |ρ_{i,j}|;
the stronger the correlation between the feature vectors of two destination nodes i, j, the closer the absolute value of the correlation coefficient |ρ_{i,j}| is to 1 and the closer the correlation distance d_{i,j} is to 0;
the correlation distance matrix over all destination nodes is then
D = (d_{i,j}), i, j = 1, 2, ..., n,
with d_{i,j} = d_{j,i} (i, j = 1, 2, ..., n) and d_{1,1} = d_{2,2} = ... = d_{n,n} = 0, so the matrix D is symmetric with zero diagonal;
S322, hierarchical clustering is performed to obtain the topology, as follows:
S3221, at initialization every sample forms its own class, n classes in total, denoted G_1, G_2, ..., G_n; they form the n leaf nodes of the clustering tree, denoted N_1, N_2, ..., N_n; each node is given a weight w_1, w_2, ..., w_n with w_1 = w_2 = ... = w_n = 0; the inter-class distance matrix is G = D, so that the distance between any two classes G_i, G_j is g_{i,j} = d_{i,j}; the clustering round counter is initialized to l = 1;
S3222, the minimum distance g_{s,t} = min_{i≠j}(g_{i,j}) in G is found, the two corresponding classes G_s, G_t are merged into a new class G_{n+l}, and a new node N_{n+l} is constructed as the parent of N_s, N_t with weight w_{n+l} = g_{s,t};
S3223, the distance from the new class G_{n+l} to every other class is computed; the distance between two classes G_p, G_q is the average of the pairwise sample distances:
g_{p,q} = (1 / (n_p · n_q)) · Σ_{x_i ∈ G_p} Σ_{x_j ∈ G_q} d_{i,j},
S3224, where n_p, n_q are the numbers of samples in classes G_p and G_q respectively;
S3225, the inter-class distance matrix G is updated: the rows and columns associated with G_s, G_t are removed, and a row and column for the new class G_{n+l} are appended whose entries are the distances from the new class to the other classes;
S3226, the clustering round counter is updated, l = l + 1;
S3227, steps S3222 to S3225 are repeated until only one class G_{2n-1} remains;
the error rate of the topology estimation result is computed with the tree edit distance as follows:
three tree editing operations are defined:
relabel a node: the labels of the leaf nodes of the tree topology are defined to be their respective numbers, and the labels of all other nodes are empty; the cost of changing a node label from a to b is set to r(a → b) = 2m, where m is the number of destination nodes in the probing process, i.e. the number of leaves of the tree;
delete a node: a non-root node v of the tree T is deleted; let its parent be v', then the parent of every child of v becomes v'; the cost of deleting a node is set to r(a → Λ) = 1, where a is the label of the deleted node and Λ denotes the empty node;
insert a node: a new node v is added as a child of a node v', and some of the children of v' become children of v; the cost of inserting a node is set to r(Λ → a) = 1, where Λ denotes the empty node and a is the label of the inserted node;
let E be a sequence of editing operations e_1, e_2, ..., e_n that transforms tree T_1 into tree T_2, and denote the total cost of transforming T_1 into T_2 by r(E) = r(e_1) + r(e_2) + ... + r(e_n); the tree edit distance is min(r(E)), i.e. the minimum total cost of transforming T_1 into T_2;
to compute the tree edit distance, the problem is converted into finding the maximum matching subtree between the two trees T_1 and T_2: let the common subtree be M and the matching relationship between T_1 and T_2 be (M, T_1, T_2); let N_1 be the set of nodes that belong to T_1 but not to M, let N_2 be the set of nodes that belong to T_2 but not to M, and denote the label of node i in T_1 by l_1(i) and the label of node i in T_2 by l_2(i); then
r((M, T_1, T_2)) = Σ_{(i,j) ∈ M, l_1(i) ≠ l_2(j)} r(l_1(i) → l_2(j)) + Σ_{i ∈ N_1} r(l_1(i) → Λ) + Σ_{j ∈ N_2} r(Λ → l_2(j)),
and the minimum of r((M, T_1, T_2)) over all matchings, found with a dynamic programming algorithm, is the tree edit distance;
S33, if l < L, go to step S34; otherwise jump to step S39;
S34, the selection state of the i-th feature of F is flipped while the other features are kept unchanged, giving a new solution F'_i = (f'_{i,1}, f'_{i,2}, ..., f'_{i,16}) with
f'_{i,j} = 1 - f_j if j = i, and f'_{i,j} = f_j otherwise;
letting i run through all values yields 16 new solutions F'_1, F'_2, ..., F'_16;
S35, for each new solution that is not already in R, the same procedure as in step S32 is applied to obtain e(F'_i), and R is updated as R = R ∪ {F'_i : e(F'_i)}; let e* = min(e(F'_i)) = e(F'_m) with corresponding solution F'_m;
S36, let l = l + 1 and Δe = e* - e(F); the new solution F'_m is accepted with probability p according to the Metropolis criterion, where
p = 1 if Δe < 0, and p = exp(-Δe / T) otherwise;
if the new solution is accepted, set F = F'_m, e(F) = e*, t = 0 and go to step S33; otherwise set t = t + 1 and go to S37;
S37, if l < L, go to step S38; otherwise jump to step S39;
S38, the selection state of every feature of F is flipped independently with probability 50%, giving a new solution F'_m; the same procedure as in step S32 is applied to obtain e* = e(F'_m), R is updated as R = R ∪ {F'_m : e(F'_m)}, and the algorithm jumps to step S36;
S39, let n = n + 1, lower the temperature T according to the cooling schedule, and reset l = 0; if t < 10 and n < 10, jump to step S33; otherwise select the best solution F* from R as the result of the feature selection, and the algorithm ends;
S4, using the selected features as the features for estimating all other unknown network topologies, and estimating the network topology with agglomerative hierarchical clustering.
2. The simulated-annealing-based network topology estimation method according to claim 1, characterized in that in the clustering process of step S322 two classes are merged into a new class in every round until only one class remains, producing a weighted binary tree with 2n-1 nodes; the nodes of this binary tree are merged according to the following rule: a threshold t = 0.02 is set, and the nodes from N_{n+1} up to N_{2n-2} are examined; if a node N_i and its parent node N_j have weights that differ by less than the threshold, i.e.
w_j - w_i < t,
then N_i and N_j are merged, that is, the parent of every child of N_i is changed to N_j and N_i is deleted; the resulting tree is taken as the estimated network logical topology.
CN201610817914.6A 2016-09-12 2016-09-12 Network topology estimation method based on simulated annealing Active CN106685686B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610817914.6A CN106685686B (en) 2016-09-12 2016-09-12 Network topology estimation method based on simulated annealing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610817914.6A CN106685686B (en) 2016-09-12 2016-09-12 Network topology estimation method based on simulated annealing

Publications (2)

Publication Number Publication Date
CN106685686A CN106685686A (en) 2017-05-17
CN106685686B true CN106685686B (en) 2020-09-18

Family

ID=58840010

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610817914.6A Active CN106685686B (en) 2016-09-12 2016-09-12 Network topology estimation method based on simulated annealing

Country Status (1)

Country Link
CN (1) CN106685686B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107426207B (en) * 2017-07-21 2019-09-27 哈尔滨工程大学 A kind of network intrusions method for detecting abnormality based on SA-iForest
CN111478807B (en) * 2020-04-02 2023-03-24 山东省计算中心(国家超级计算济南中心) Construction method of minimum feedback node set of directed multilayer network

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015161872A1 (en) * 2014-04-23 2015-10-29 Telefonaktiebolaget L M Ericsson (Publ) Network tomography through selection of probing paths

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101540061B (en) * 2009-04-10 2011-06-22 西北工业大学 Topological and ordering matching method for disordered images based on simulated annealing
CN102711206B (en) * 2012-05-14 2014-08-06 南京邮电大学 Simulated annealing-based wireless sensor network (WSN) hierarchical routing method
CN103281256B (en) * 2013-04-26 2016-05-25 北京邮电大学 The end-to-end path packet loss detection method of chromatography Network Based
CN103678917B (en) * 2013-12-13 2016-11-23 杭州易和网络有限公司 A kind of real-time arrival time Forecasting Methodology of public transport based on simulated annealing

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015161872A1 (en) * 2014-04-23 2015-10-29 Telefonaktiebolaget L M Ericsson (Publ) Network tomography through selection of probing paths

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"Classification feature selection for transient stability prediction based on PMU measurement data"; 汪马翔; China Masters' Theses Full-text Database (Engineering Science and Technology II); 2007-05-15 (No. 05); pp. C042-151 *
"Research on non-stationary network topology estimation methods based on wavelet transform"; 王鑫; China Masters' Theses Full-text Database (Information Science and Technology); 2016-02-15 (No. 02); pp. I139-12 *

Also Published As

Publication number Publication date
CN106685686A (en) 2017-05-17

Similar Documents

Publication Publication Date Title
WO2021089013A1 (en) Spatial graph convolutional network training method, electronic device and storage medium
US20220309334A1 (en) Graph neural networks for datasets with heterophily
CN115455471A (en) Federal recommendation method, device, equipment and storage medium for improving privacy and robustness
CN106685686B (en) Network topology estimation method based on simulated annealing
Zhu et al. Portal nodes screening for large scale social networks
Bautista et al. L γ-PageRank for semi-supervised learning
Bi et al. Large-scale network traffic prediction with LSTM and temporal convolutional networks
Ghavipour et al. A streaming sampling algorithm for social activity networks using fixed structure learning automata
Hinder et al. Concept Drift Segmentation via Kolmogorov-Trees.
Fish et al. Entropic regression with neurologically motivated applications
Ren et al. Causal discovery with flow-based conditional density estimation
Liu et al. Distributed recursive filtering for time-varying systems with dynamic bias over sensor networks: Tackling packet disorders
Cai et al. Weighted message passing and minimum energy flow for heterogeneous stochastic block models with side information
CN115359297B (en) Classification method, system, electronic equipment and medium based on higher-order brain network
Dash DECPNN: A hybrid stock predictor model using Differential Evolution and Chebyshev Polynomial neural network
CN115908419A (en) Unsupervised hyperspectral image change detection method for optimizing pseudo label by using Bayesian network
Ma et al. Fast Monte Carlo dropout and error correction for radio transmitter classification
Khan et al. Stitching algorithm: A network performance analysis tool for dynamic mobile networks
Sun et al. Reinforced contrastive graph neural networks (RCGNN) for anomaly detection
Kovács et al. Optimistic search: Change point estimation for large-scale data via adaptive logarithmic queries
CN117808125B (en) Model aggregation method, device, equipment, federal learning system and storage medium
CN112884067B (en) Hop count matrix recovery method based on decision tree classifier
Mohammadi et al. High-Dimensional Bayesian Structure Learning in Gaussian Graphical Models using Marginal Pseudo-Likelihood
Radaelli et al. Parameter estimation for quantum jump unraveling
Le et al. VEAD: Variance profile Exploitation for Anomaly Detection in real-time IoT data streaming

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant