AU2019101412A4 - A distributed adaptive clustering strategy algorithm - Google Patents

A distributed adaptive clustering strategy algorithm

Info

Publication number
AU2019101412A4
Authority
AU
Australia
Prior art keywords
node
task
algorithm
mcc
clustering
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
AU2019101412A
Inventor
Feng Chen
Rui Hu
Zhifeng Liu
Qing Shi
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southwest University
Original Assignee
Southwest University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southwest University filed Critical Southwest University
Priority to AU2019101412A priority Critical patent/AU2019101412A4/en
Application granted granted Critical
Publication of AU2019101412A4 publication Critical patent/AU2019101412A4/en
Ceased legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G06N 20/10 Machine learning using kernel methods, e.g. support vector machines [SVM]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 15/00 Digital computers in general; Data processing equipment in general
    • G06F 15/16 Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
    • G06F 15/161 Computing infrastructure, e.g. computer clusters, blade chassis or hardware partitioning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F 9/4806 Task transfer initiation or dispatching
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W 4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W 4/70 Services for machine-to-machine communication [M2M] or machine type communication [MTC]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W 84/00 Network topologies
    • H04W 84/18 Self-organising networks, e.g. ad-hoc networks or sensor networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

Abstract: A distributed adaptive clustering algorithm over dynamic multi-task networks, mainly comprising two processes: abnormal task adaptation and same-task clustering. In the abnormal task adaptation process, the algorithm uses task anomaly detection based on the non-cooperative least-mean-squares (NC-LMS) algorithm; in the same-task clustering process, it uses task switching detection based on the diffusion maximum correntropy criterion (D-MCC) algorithm. A series of scenarios over multi-task networks is considered, including dynamic networks, time-varying tasks, and non-stationary environments (Gaussian and impulse interference). We also discuss optimization schemes to design the NC-LMS and D-MCC weights, and examine the estimation performance and clustering effects of the proposed algorithm through simulation results. Compared to the prior art, the present invention provides an effective distributed adaptive clustering strategy for distributed adaptive estimation over dynamic multi-task networks, where nodes are constrained by communication power consumption and external interference in a non-stationary environment, so the objective pursued by a node is prone to change or abnormality.
[Figs. 1 and 2: network topology plots over the x-coordinate and y-coordinate axes.]

Description

1. Background and Purpose
Distributed estimation for adaptation, learning, modeling, and optimization through cooperation between nodes plays a key role in reinforcement learning, signal processing, online supervised learning, and many other application areas; it typically aims to estimate a single parameter vector collaboratively. In reality, however, many problems of interest are multitask-oriented: multiple optimum parameter vectors must be inferred simultaneously in a collaborative manner. Multi-task problems have been studied in many important applications, such as multi-task clustering, multi-target tracking, and multi-model classification. In our work, we consider the situation where there are connected clusters of nodes, and each cluster has a parameter vector to estimate.
In recent years, several useful distributed strategies have been proposed, including incremental strategies, consensus strategies, and diffusion strategies. In particular, diffusion strategies have attracted many researchers because of their scalability and reliability. It is worth noting that diffusion strategies have been proved to offer better stability and a wider performance range than consensus-based strategies for data processing on adaptive networks. Accordingly, the diffusion adaptive learning algorithm is mainly considered in our work. Adaptive networks are well-suited for decentralized inference, filtering, and clustering tasks. However, previous work on topology design and tuning techniques was not dynamic, in the sense that it cannot track changes in the network. Motivated by this problem, we develop an adaptive clustering algorithm over dynamic multi-task networks in this paper, which can reduce the impact of weak links on network estimation by selecting data subsets from neighbor nodes with normal tasks.
The purpose of this work is to motivate and develop a distributed clustering strategy algorithm based on diffusion MCC for robust distributed multi-task network estimation in a non-stationary environment. The algorithm considers a general situation where there are connected clusters of nodes, and each cluster has a parameter vector to estimate. In addition, nodes in the network do not know in advance which cluster they belong to, nor do they know which neighbors are interested in their task. In summary, the main contributions are as follows:
1) Algorithm implementation: task anomaly detection and task switching detection steps are added to the clustering strategy to improve the accuracy of clustering;
2) Simulation: we simulate a variety of scenarios and investigate the estimation performance and clustering effect of the proposed algorithm.
2. System Model
Each node k in the connected network observes random measurements {d_{k,i}, u_{k,i}}, where d_{k,i} is scalar data and u_{k,i} is a 1 x L regression vector, assumed to be related to some unknown L x 1 parameter vector w_k^o by a linear regression model of the form:

$$d_{k,i} = u_{k,i}\, w_k^{o} + v_{k,i} \qquad (1)$$

where u_{k,i}, at time instant i, is zero-mean, temporally white, and independent over space with covariance matrix $R_{u,k} = E[u_{k,i}^{T} u_{k,i}] > 0$. The noise v_{k,i} is an additive, temporally and spatially independent, zero-mean process with time-independent variance $\sigma_{v,k}^{2}$, and it is independent of every other signal over space. It is considered that nodes of different clusters track different objectives (also called tasks), and there is one task per cluster, namely,

$$w_k^{o} = w_{\mathcal{C}_q}^{o} \quad \text{for } k \in \mathcal{C}_q \qquad (2)$$

The time-varying task model is:

$$w_{k,i}^{o} = s_{k,i}\, w_{k,i-1}^{o} + (1 - s_{k,i})\, w_{\mathcal{C}_q}^{o} + z_{k,i-1} \qquad (3)$$

where z_{k,i-1} is the process noise for node k at time instant i-1, independent of the measurement noise v_{k,i} and the regression vector u_{k,i}. Let s_{k,i} denote a random indicator variable:

$$s_{k,i} = \begin{cases} 1, & \text{if } c_{k,i-1} > c_r \\ 0, & \text{otherwise} \end{cases} \qquad (4)$$

where c_{k,i-1} is a positive potential communication cost for each node k at time instant i-1, and c_r represents the network's tolerable communication cost threshold. It is given by $c_{k,i-1} = c_0\, n_{k,i-1}$, where c_0 is the communication cost between connected nodes and n_{k,i-1} counts node k's active links.
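The measurement and task models above can be summarized in a short simulation sketch. The following Python fragment is a minimal illustration only: the network size, parameter length, noise variances, cost threshold, and all helper names are our assumptions, not values fixed by the description.

    import numpy as np

    # Minimal sketch of the models in Eqs. (1)-(4); sizes and variances below
    # are illustrative assumptions, not values prescribed by the patent.
    rng = np.random.default_rng(0)

    L = 2                          # parameter length (assumed)
    sigma_v2 = 0.5                 # measurement-noise variance (assumed)
    c0, c_r = 1.0, 3.0             # per-link cost c_0 and threshold c_r (assumed)

    w_cluster = rng.standard_normal(L)   # cluster task w_{Cq}^o (assumed values)

    def measure(w_k):
        """One realization of d_{k,i} = u_{k,i} w_k^o + v_{k,i} (Eq. 1)."""
        u = rng.standard_normal(L)       # zero-mean, temporally white regressor
        v = np.sqrt(sigma_v2) * rng.standard_normal()
        return u @ w_k + v, u

    def evolve_task(w_prev, n_links, sigma_z2=1e-4):
        """Time-varying task model of Eqs. (3)-(4): keep the previous task while
        the cost c_0 * n_links exceeds c_r, otherwise fall back to the cluster
        task, plus process noise z_{k,i-1}."""
        s = 1.0 if c0 * n_links > c_r else 0.0
        z = np.sqrt(sigma_z2) * rng.standard_normal(L)
        return s * w_prev + (1.0 - s) * w_cluster + z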
3. Distributed Clustering Algorithm Description
3.1 Task anomaly detection based on the non-cooperative least-mean-squares (NC-LMS) algorithm

Step 1: Each node k in the network updates its estimate through the non-cooperative learning strategy, namely,

$$w_{k,i} = w_{k,i-1} + \mu_k\, u_{k,i}^{T}\,(d_{k,i} - u_{k,i} w_{k,i-1}) \qquad (5)$$

Due to process noise interference, especially impulse noise, the task of a node may become abnormal; sending or receiving abnormal tasks not only increases power consumption but also reduces the accuracy of parameter estimation. A hypothesis test based on the updated estimate w_{k,i} is developed to ascertain whether the task of node k is abnormal.

Step 2: Detection for abnormal tasks:

$$\| w_{k,i} - w_{k,i-1} \|^{2} \underset{H_{1}}{\overset{H_{0}}{\lessgtr}} \theta_{0} \qquad (6)$$

where the H_0 hypothesis denotes that the task of node k is normal, and node k sends data {d_{k,i}, u_{k,i}} to its neighbors l. Conversely, the hypothesis H_1 denotes that the task of node k is abnormal, and node k does not send data {d_{k,i}, u_{k,i}} to its neighbors l. The threshold θ_0 is predefined. Besides, no exchange of data is needed for an abnormal task during the adaptation, which keeps the communication cost relatively low.
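As a rough illustration of Steps 1 and 2, the sketch below implements the NC-LMS update of Eq. (5) and the threshold test of Eq. (6) in Python. The test direction (a large update deviation flags H_1, an abnormal task) and the function names are our assumptions.

    import numpy as np

    def nc_lms_step(w_prev, d, u, mu):
        """Non-cooperative LMS update of Eq. (5)."""
        e = d - u @ w_prev                 # local prediction error
        return w_prev + mu * e * u         # a 1-D array u stands in for u^T

    def task_is_abnormal(w_new, w_prev, theta0):
        """Hypothesis test of Eq. (6): H1 (abnormal) when the squared
        deviation of the update exceeds the predefined threshold theta0."""
        return float(np.sum((w_new - w_prev) ** 2)) > theta0

A node that tests abnormal simply withholds {d_{k,i}, u_{k,i}} from its neighbors, which is what keeps the communication cost low during adaptation.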
3.2 Task switching detection based on the diffusion maximum correntropy criterion (D-MCC) algorithm

Step 1: To reduce the impact of noise on the estimate, each node k in the network updates an intermediate estimate through the diffusion learning strategy over MCC:

$$\psi_{k,i} = w_{k,i-1} + \mu_k \sum_{l \in N_k} a_{lk}\, u_{l,i}^{T}\,(d_{l,i} - u_{l,i} w_{k,i-1}), \qquad w_{k,i} = \sum_{l \in N_k} c_{lk}\, \psi_{l,i} \qquad (7)$$
Step 2: The correntropy between two random variables X and Y is associated with a generalized correlation function, which scales the similarity of X and Y via

$$V_{\beta}(X, Y) = E\!\left[\frac{1}{\sqrt{2\pi}\,\beta} \exp\!\left(-\frac{(X - Y)^{2}}{2\beta^{2}}\right)\right] \qquad (8)$$

where β is the Gaussian kernel size.
Step 3: With the Gaussian kernel and local error $e_{k,i} = d_{k,i} - u_{k,i} w_{k,i-1}$, the instantaneous correntropy cost function is

$$J_{k}^{MCC}(w_{k}) = E\!\left[\frac{1}{\sqrt{2\pi}\,\beta} \exp\!\left(-\frac{e_{k,i}^{2}}{2\beta^{2}}\right)\right] \qquad (9)$$

By using an instantaneous approximation, an approximation of the gradient vector is

$$\nabla_{w} J_{k}^{MCC}(w_{k}) \approx \frac{1}{\sqrt{2\pi}\,\beta^{3}}\, G_{\beta}^{MCC}(e_{k,i})\, e_{k,i}\, u_{k,i}^{T} \qquad (10)$$

The diffusion algorithm based on MCC (D-MCC) is

$$\psi_{k,i} = w_{k,i-1} + \eta_{k} \sum_{l \in N_{k}} a_{lk}\, G_{\beta}^{MCC}(e_{l,i})\, e_{l,i}\, u_{l,i}^{T} \qquad (11)$$

where $G_{\beta}^{MCC}(e) = \exp(-e^{2}/(2\beta^{2}))$ is a Gaussian kernel; as the kernel size β → ∞, $G_{\beta}^{MCC}(e_{l,i}) \rightarrow 1$. Note that the kernel size for the correntropy function is quite important: the Gaussian kernel $G_{\beta}^{MCC}(e(i))$ versus the local error e(i) behaves differently for different values of the kernel size.
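To make the role of the kernel size concrete, here is a small Python sketch of the Gaussian kernel $G_{\beta}^{MCC}$ and a sample estimate of the correntropy in Eq. (8); the function names are ours, and the kernel normalization follows the reconstruction above.

    import numpy as np

    def gaussian_kernel(e, beta):
        """G_beta^MCC(e) = exp(-e^2 / (2 beta^2)); tends to 1 as beta grows,
        and shrinks toward 0 for large |e|, which suppresses impulse noise."""
        return np.exp(-(np.asarray(e) ** 2) / (2.0 * beta ** 2))

    def correntropy(x, y, beta):
        """Sample estimate of Eq. (8) for two sequences x and y."""
        e = np.asarray(x) - np.asarray(y)
        return np.mean(gaussian_kernel(e, beta) / (np.sqrt(2.0 * np.pi) * beta))

    # A large error is heavily down-weighted for a small kernel size:
    # gaussian_kernel(10.0, 1.0) ~ 2e-22, while gaussian_kernel(10.0, 100.0) ~ 1.0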
Step 4: The neighborhood set N_k will be time-dependent and is expressed as N_{k,i}, and the combination coefficient a_{lk,i} is

$$a_{lk,i} = \begin{cases} 1, & \text{if } \|\psi_{l,i} - w_{k,i-1}\|^{2} < \theta_{0} \\ 0, & \text{otherwise} \end{cases} \qquad (12)$$

The D-MCC algorithm then becomes

$$\psi_{k,i} = w_{k,i-1} + \eta_{k} \sum_{l \in N_{k,i}} a_{lk,i}\, G_{\beta}^{MCC}(e_{l,i})\, e_{l,i}\, u_{l,i}^{T} \qquad (13)$$

where η_k denotes the step size.
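The adaptation step of Eq. (13) can be sketched as follows. The dictionary-based neighbor interface and the closed form used for η_k (absorbing the $1/(\sqrt{2\pi}\beta^{3})$ factor from Eq. (10) into the step size) are our assumptions.

    import numpy as np

    def d_mcc_adapt(w_prev, neighbor_data, a, mu, beta):
        """Adaptation step of Eq. (13). `neighbor_data` maps neighbor l to its
        measurement pair (d_l, u_l); `a` maps l to a_{lk,i} in {0,1} (Eq. 12)."""
        eta = mu / (np.sqrt(2.0 * np.pi) * beta ** 3)   # assumed form of eta_k
        psi = w_prev.astype(float).copy()
        for l, (d_l, u_l) in neighbor_data.items():
            e = d_l - u_l @ w_prev                      # local error e_{l,i}
            g = np.exp(-(e ** 2) / (2.0 * beta ** 2))   # G_beta^MCC(e_{l,i})
            psi += eta * a[l] * g * e * u_l
        return psi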
Step 5: The weights can be chosen to minimize the instantaneous mean-square deviation (MSD) of the network:

$$\min\; \mathrm{MSD}(i) \triangleq \frac{1}{N} \sum_{k=1}^{N} E\,\| w_{k,i} - w_{k}^{o} \|^{2} \qquad (14)$$

The combination coefficients c_{lk,i} can then be obtained, approximated by

$$c_{lk,i} = \begin{cases} \dfrac{\|\psi_{l,i} - w_{k,i-1}\|^{-2}}{\sum_{m \in N_{k,i}} \|\psi_{m,i} - w_{k,i-1}\|^{-2}}, & \text{if } l \in N_{k,i}^{-} \\[1ex] 1 - \sum_{m \in N_{k,i}^{-}} c_{mk,i}, & \text{if } l = k \\[1ex] 0, & \text{otherwise} \end{cases} \qquad (15)$$

where $N_{k,i}^{-} = N_{k,i} \setminus \{k\}$. The combination rule gives larger weights to neighbors from the same cluster and smaller weights to neighbors from different clusters. The combination step is then rewritten as

$$w_{k,i} = \sum_{l \in N_{k,i}} c_{lk,i}\, \psi_{l,i} \qquad (16)$$
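A sketch of the combination rule of Eqs. (15)-(16): inverse squared deviations are normalized over the neighborhood, and the self-weight closes the sum to one. The small eps guard and the dictionary interface are our additions.

    import numpy as np

    def combination_weights(psi, w_prev, k, neighborhood, eps=1e-12):
        """Relative-deviation weights of Eq. (15); `psi` maps node index to its
        intermediate estimate psi_{l,i}, `neighborhood` is N_{k,i} incl. k."""
        inv = {l: 1.0 / (float(np.sum((psi[l] - w_prev) ** 2)) + eps)
               for l in neighborhood}
        total = sum(inv.values())
        c = {l: inv[l] / total for l in neighborhood if l != k}
        c[k] = 1.0 - sum(c.values())    # self-weight: 1 - sum over N_{k,i}^-
        return c

    def combine(psi, c):
        """Combination step of Eq. (16): w_{k,i} = sum_l c_{lk,i} psi_{l,i}."""
        return sum(weight * psi[l] for l, weight in c.items())

Neighbors whose intermediate estimates moved little from w_{k,i-1} receive the largest weights, which is exactly the behavior the rule above describes for same-cluster neighbors.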
Step 6: Using these dynamically-evolving estimates, we introduce another hypothesis test based on the updated estimate w_{k,i}, developed to ascertain whether the tasks of node k and node l are the same at time i:

$$\| w_{k,i} - w_{l,i} \|^{2} \underset{H_{1}}{\overset{H_{0}}{\lessgtr}} \theta_{3} \qquad (17)$$

where the threshold θ_3 is predefined. The H_0 hypothesis denotes that the tasks of node k and node l are the same, and the link between node k and neighbor l stays active. Conversely, the hypothesis H_1 denotes that the tasks of node k and node l are different, and the link between node k and neighbor l is dropped. Then, the cluster connection coefficient l_{kl,i} is given by

$$l_{kl,i} = l_{lk,i} = \begin{cases} 1, & \text{if } H_{0} \text{ succeeds} \\ 0, & \text{otherwise} \end{cases} \qquad (18)$$
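The pairwise test of Eqs. (17)-(18) reduces to a distance comparison between the two nodes' current estimates. A minimal sketch follows, with the test direction (a small distance means H_0, i.e. the same task) taken from the text above and the function names our own.

    import numpy as np

    def same_task(w_k, w_l, theta3):
        """Eq. (17): H0 (same task) when the squared distance is below theta3."""
        return float(np.sum((w_k - w_l) ** 2)) < theta3

    def connection_coefficient(w_k, w_l, theta3):
        """Cluster connection coefficient of Eq. (18): 1 keeps the link
        between nodes k and l active, 0 drops it."""
        return 1.0 if same_task(w_k, w_l, theta3) else 0.0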
4. Brief Description of the Drawings
Fig. 1 is the initial network topology.
Fig. 2 is the resulting topology of the subnetworks with no task switching or exception.
Fig. 3 is the subnetwork average MSDs for NC-LMS and D-MCC with no task switching or exception.
Fig. 4 is the resulting topology of the subnetworks with task switching and exception.
Fig. 5 is the subnetwork average MSDs for NC-LMS and D-MCC with task switching and exception.
5. Detailed Description
5.1 Model Validation
The topology of the network, consisting of N = 20 nodes divided into Q = 3 clusters, i.e., C_1 = {1, 2, 3, 4, 5, 6}, C_2 = {7, 8, 9, 10, 11, 12, 13, 14}, and C_3 = {15, 16, 17, 18, 19, 20}, is generated as a random geometric graph model, as shown in Fig. 1. The location coordinates (x_k(i), y_k(i)) of each node k lie in the square region [0, 110] x [0, 110]. In the time-varying scenario, they vary according to the first-order Markov vector process:

$$x_{k}(i) = b\, x_{k}(i-1) + h(i), \qquad y_{k}(i) = b\, y_{k}(i-1) + h(i)$$

where b = 0.98 and h(i) is an independent zero-mean Gaussian vector process with variance

$$\sigma_{h}^{2} = \begin{cases} 0.01 & \rightarrow \text{small interference} \\ 1 & \rightarrow \text{big interference} \end{cases}$$
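For reference, this first-order Markov motion model can be simulated as below; the initial uniform placement and the random seed are our assumptions, while b and the two variance settings are the values given above.

    import numpy as np

    rng = np.random.default_rng(1)
    N, b = 20, 0.98
    sigma_h2 = 0.01                    # 0.01 -> small, 1 -> big interference
    xy = rng.uniform(0.0, 110.0, size=(N, 2))   # initial (x, y) in [0, 110]^2

    def step_positions(xy):
        """x_k(i) = b x_k(i-1) + h(i), and likewise for y_k(i)."""
        h = np.sqrt(sigma_h2) * rng.standard_normal(xy.shape)
        return b * xy + h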
5.2 Illustrative Example
Computer simulations are carried out to evaluate the performance of the proposed algorithm under the assumption that all nodes have no prior knowledge about the clusters. The clustering effects of the proposed method are illustrated in the following two scenarios. Scenario 1: The location coordinates of the nodes fluctuate gently (the network structure changes slightly, i.e., $\sigma_h^2 = 0.01$); then, for zero-mean Gaussian interference with variance $\sigma_v^2 = 0.5$, the tasks of the nodes do not switch and do not evolve into exceptions. Scenario 2: The location coordinates of the nodes fluctuate dramatically (the network structure changes wildly, i.e., $\sigma_h^2 = 1$); then, under impulse interference with variance $\sigma_v^2 = 10^3$, the tasks of the nodes switch and become abnormal.
Scenario l(No task switching or exception)
The network node position fluctuates slightly, and the network suffers from constrained Gaussian interference. After approximately 400 iterations of the MSD curves, the clustering decision of the proposed algorithm no longer changes with time. The neighboring links within the same cluster are active, whereas the neighboring links that come from different clusters are dropped. Fig. 2 illustrates the resulting topology when the network is in steady-state. From the simulation results, we find that there is no task switching or exception when the nodes are under Gaussian interference with a small constraint. The MSD learning curves for the proposed clustering algorithm, consisting of the recursions NC-LMS and D-MCC, are plotted in Fig. 3. It is obvious that the three clusters adopt the MCC cooperative clustering policy to improve their average MSD performance.
Scenario 2(Task switching and exception)
The network node position fluctuates wildly, and the network suffers from impulse interference. After approximately 400 iterations of the MSD curves, similarly, the clustering decision of the proposed algorithm no longer changes with time. In addition, the neighboring links within the same cluster are active, whereas the neighboring links that come from different clusters are dropped. Fig. 4 illustrates that the three subnetworks are themselves connected when the network is at steady-state. There is a result similar to Scenario 1: the proposed clustering strategy can suppress the interference between clusters. From Fig. 4, we can see that there are task switching and exceptions when the nodes are under impulse interference. The MSD learning curves for the proposed clustering algorithm, consisting of the recursions NC-LMS and D-MCC, are plotted in Fig. 5. Obviously, the proposed algorithm's recursion D-MCC has superior performance in comparison with recursion NC-LMS.

Claims (1)

  1. The claims defining the invention are as follows:
    A distributed adaptive clustering strategy algorithm
    1. Distributed Clustering Algorithm Description
    1.1 Task anomaly detection based on the non-cooperative least-mean-squares (NC-LMS) algorithm
    Step 1: Each node k in the network updates its estimate through the non-cooperative learning strategy, namely,
    $$w_{k,i} = w_{k,i-1} + \mu_k\, u_{k,i}^{T}\,(d_{k,i} - u_{k,i} w_{k,i-1}) \qquad (5)$$
    Due to process noise interference, especially impulse noise, the task of a node may become abnormal; sending or receiving abnormal tasks not only increases power consumption but also reduces the accuracy of parameter estimation. A hypothesis test based on the updated estimate w_{k,i} is developed to ascertain whether the task of node k is abnormal.
    Step 2: Detection for abnormal tasks:
    $$\| w_{k,i} - w_{k,i-1} \|^{2} \underset{H_{1}}{\overset{H_{0}}{\lessgtr}} \theta_{0} \qquad (6)$$
    where the H_0 hypothesis denotes that the task of node k is normal, and node k sends data {d_{k,i}, u_{k,i}} to its neighbors l. Conversely, the hypothesis H_1 denotes that the task of node k is abnormal, and node k does not send data {d_{k,i}, u_{k,i}} to its neighbors l. The threshold θ_0 is predefined. Besides, no exchange of data is needed for an abnormal task during the adaptation, which keeps the communication cost relatively low.
    1.2 Task switching detection based on the diffusion maximum correntropy criterion (D-MCC) algorithm
    Step 1: To reduce the impact of noise on the estimate, each node k in the network updates an intermediate estimate through the diffusion learning strategy over MCC:
    $$\psi_{k,i} = w_{k,i-1} + \mu_k \sum_{l \in N_k} a_{lk}\, u_{l,i}^{T}\,(d_{l,i} - u_{l,i} w_{k,i-1}), \qquad w_{k,i} = \sum_{l \in N_k} c_{lk}\, \psi_{l,i} \qquad (7)$$
    Step 2: The correntropy between two random variables X and Y is associated with a generalized correlation function, which scales the similarity of X and Y via
    $$V_{\beta}(X, Y) = E\!\left[\frac{1}{\sqrt{2\pi}\,\beta} \exp\!\left(-\frac{(X - Y)^{2}}{2\beta^{2}}\right)\right] \qquad (8)$$
    where β is the Gaussian kernel size.
    Step 3: With the Gaussian kernel and local error $e_{k,i} = d_{k,i} - u_{k,i} w_{k,i-1}$, the instantaneous correntropy cost function is
    $$J_{k}^{MCC}(w_{k}) = E\!\left[\frac{1}{\sqrt{2\pi}\,\beta} \exp\!\left(-\frac{e_{k,i}^{2}}{2\beta^{2}}\right)\right] \qquad (9)$$
    By using an instantaneous approximation, an approximation of the gradient vector is
    $$\nabla_{w} J_{k}^{MCC}(w_{k}) \approx \frac{1}{\sqrt{2\pi}\,\beta^{3}}\, G_{\beta}^{MCC}(e_{k,i})\, e_{k,i}\, u_{k,i}^{T} \qquad (10)$$
    The diffusion algorithm based on MCC (D-MCC) is
    $$\psi_{k,i} = w_{k,i-1} + \eta_{k} \sum_{l \in N_{k}} a_{lk}\, G_{\beta}^{MCC}(e_{l,i})\, e_{l,i}\, u_{l,i}^{T} \qquad (11)$$
    where $G_{\beta}^{MCC}(e) = \exp(-e^{2}/(2\beta^{2}))$ is a Gaussian kernel; as the kernel size β → ∞, $G_{\beta}^{MCC}(e_{l,i}) \rightarrow 1$. Note that the kernel size for the correntropy function is quite important: the Gaussian kernel $G_{\beta}^{MCC}(e(i))$ versus the local error e(i) behaves differently for different values of the kernel size.
    Step 4: The neighborhood set N_k will be time-dependent and is expressed as N_{k,i}, and the combination coefficient a_{lk,i} is
    $$a_{lk,i} = \begin{cases} 1, & \text{if } \|\psi_{l,i} - w_{k,i-1}\|^{2} < \theta_{0} \\ 0, & \text{otherwise} \end{cases} \qquad (12)$$
    The D-MCC algorithm then becomes
    $$\psi_{k,i} = w_{k,i-1} + \eta_{k} \sum_{l \in N_{k,i}} a_{lk,i}\, G_{\beta}^{MCC}(e_{l,i})\, e_{l,i}\, u_{l,i}^{T} \qquad (13)$$
    where η_k denotes the step size.
    Step 5: The weights can be chosen to minimize the instantaneous mean-square deviation (MSD) of the network:
    $$\min\; \mathrm{MSD}(i) \triangleq \frac{1}{N} \sum_{k=1}^{N} E\,\| w_{k,i} - w_{k}^{o} \|^{2} \qquad (14)$$
    The combination coefficients c_{lk,i} can then be obtained, approximated by
    $$c_{lk,i} = \begin{cases} \dfrac{\|\psi_{l,i} - w_{k,i-1}\|^{-2}}{\sum_{m \in N_{k,i}} \|\psi_{m,i} - w_{k,i-1}\|^{-2}}, & \text{if } l \in N_{k,i}^{-} \\[1ex] 1 - \sum_{m \in N_{k,i}^{-}} c_{mk,i}, & \text{if } l = k \\[1ex] 0, & \text{otherwise} \end{cases} \qquad (15)$$
    where $N_{k,i}^{-} = N_{k,i} \setminus \{k\}$. The combination rule gives larger weights to neighbors from the same cluster and smaller weights to neighbors from different clusters. The combination step is then rewritten as
    $$w_{k,i} = \sum_{l \in N_{k,i}} c_{lk,i}\, \psi_{l,i} \qquad (16)$$
    Step 6: Using these dynamically-evolving estimates, we introduce another hypothesis test based on the updated estimate w_{k,i}, developed to ascertain whether the tasks of node k and node l are the same at time i:
    $$\| w_{k,i} - w_{l,i} \|^{2} \underset{H_{1}}{\overset{H_{0}}{\lessgtr}} \theta_{3} \qquad (17)$$
    where the threshold θ_3 is predefined. The H_0 hypothesis denotes that the tasks of node k and node l are the same, and the link between node k and neighbor l stays active. Conversely, the hypothesis H_1 denotes that the tasks of node k and node l are different, and the link between node k and neighbor l is dropped. Then, the cluster connection coefficient l_{kl,i} is given by
    $$l_{kl,i} = l_{lk,i} = \begin{cases} 1, & \text{if } H_{0} \text{ succeeds} \\ 0, & \text{otherwise} \end{cases} \qquad (18)$$
AU2019101412A 2019-11-17 2019-11-17 A distributed adaptive clustering strategy algorithm Ceased AU2019101412A4 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2019101412A AU2019101412A4 (en) 2019-11-17 2019-11-17 A distributed adaptive clustering strategy algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
AU2019101412A AU2019101412A4 (en) 2019-11-17 2019-11-17 A distributed adaptive clustering strategy algorithm

Publications (1)

Publication Number Publication Date
AU2019101412A4 true AU2019101412A4 (en) 2020-01-02

Family

ID=68982453

Family Applications (1)

Application Number Title Priority Date Filing Date
AU2019101412A Ceased AU2019101412A4 (en) 2019-11-17 2019-11-17 A distributed adaptive clustering strategy algorithm

Country Status (1)

Country Link
AU (1) AU2019101412A4 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113987860A (en) * 2021-10-08 2022-01-28 中山大学 Structure optimization method and device based on dynamic Gaussian kernel convolution filtering


Similar Documents

Publication Publication Date Title
Casbeer et al. Distributed information filtering using consensus filters
Pin et al. Robust deadbeat continuous-time observer design based on modulation integrals
Farina et al. Identification of polynomial input/output recursive models with simulation error minimisation methods
Gholami et al. Denoising and detection of bad data in distribution phasor measurements using filtering, clustering, and Koopman mode analysis
AU2019101412A4 (en) A distributed adaptive clustering strategy algorithm
Panigrahi et al. Error saturation nonlinearities for robust incremental LMS over wireless sensor networks
Bazzi et al. A robust diffusion adaptive network based on the maximum correntropy criterion
Yu et al. Temporal and spatial correlation based distributed fault detection in wireless sensor networks
Liu et al. Risk of Cascading Failures in Multi-agent Rendezvous with Communication Time Delay
Mahmoud Distributed estimation based on information‐based covariance intersection algorithms
Simonetto et al. Distributed time-varying stochastic optimization and utility-based communication
Liu et al. Gaussian process learning for distributed sensor networks under false data injection attacks
Mustafa et al. Analysis and detection of cyber-physical attacks in distributed sensor networks
Jianhong et al. A neural network based approach for modeling of severity of defects in function based software systems
Zhang et al. Quantifying the bias dynamics in a mode-based Kalman filter for stochastic hybrid systems
Wang et al. Distributed two‐stage state estimation with event‐triggered strategy for multirate sensor networks
Kawano et al. Revisit input observability: A new approach to attack detection and privacy preservation
Chen et al. Performance analysis of diffusion LMS in multitask networks
Pillonetto et al. Kernel-based model order selection for identification and prediction of linear dynamic systems
Hajshirmohamadi et al. Event-triggered simultaneous fault detection and consensus control for linear multi-agent systems
Chen et al. Universal structural estimator and dynamics approximator for complex networks
Stanković et al. Distributed change detection based on a randomized consensus algorithm
Boem et al. Distributed model-based fault diagnosis with stochastic uncertainties
Zhao et al. Dynamic joint outage identification and state estimation in power systems
Yazidi et al. A stochastic search on the line-based solution to discretized estimation

Legal Events

Date Code Title Description
FGI Letters patent sealed or granted (innovation patent)
MK22 Patent ceased section 143a(d), or expired - non payment of renewal fee or expiry