CN112188495B

CN112188495B - Cache pollution attack detection method based on federal learning in ultra-dense network

Info

Publication number: CN112188495B
Application number: CN202010902459.6A
Authority: CN
Inventors: 姚琳; 李佳; 吴国伟
Original assignee: Dalian University of Technology
Current assignee: Dalian University of Technology
Priority date: 2020-09-01
Filing date: 2020-09-01
Publication date: 2021-11-19
Anticipated expiration: 2040-09-01
Also published as: CN112188495A

Abstract

The invention belongs to the technical field of information security, and provides a cache pollution attack detection method based on federal learning in an ultra-dense network. Firstly, calculating a weighted distance and determining whether to form a cluster independently by isolated small base stations which are not adjacent to any cluster in the network, and calculating distance similarity and load similarity by the small base stations adjacent to the cluster to select a proper cluster to join. Then, each small base station carries out data statistics according to the received interest packet and sends the data statistics to a cluster head, the cluster head is used as a working node in federal learning to be responsible for integrating data to train a local classifier, and the macro base station is used as a parameter server to be responsible for aggregating the received local classifiers to construct an improved global classifier. And finally, the final global classifier is broadcasted to all the small base stations, and after receiving the interest packets, the small base stations classify the contents by using the classifier, so that the contents requested by the malicious interest packets cannot be cached and the popularity can not be updated.

Description

Cache pollution attack detection method based on federal learning in ultra-dense network

Technical Field

The invention relates to a cache pollution attack detection method based on federal learning in an ultra-dense network, belonging to the technical field of information security.

Background

An Ultra-Dense Network (Ultra-Dense Network) is one of the key technologies of 5G, and by deploying Dense small base stations in hot spot areas such as shopping malls or transportation hubs, the distance between a terminal user and an access node can be further shortened, so that communication services such as low delay, high data rate and real-time transmission are provided. However, the traditional communication method based on the TCP/IP protocol is mainly designed for the host, and cannot meet the content requirement of a large number of users in the mobile network environment, and as the traffic from the small base station to the core network increases rapidly, the wireless backhaul link may become the performance bottleneck of the network. To solve this problem, in recent years, a Content Centric Network (Content Centric Network) is applied to the UDN as a new Network architecture. Unlike IP address based communication, the CCN does not need the IP address of the content provider to obtain the content, but directly obtains the required content according to the name of the content in a pull-based mode. There are two types of packets in the CCN, namely Interest packets (Interest) and Data packets (Data). When a user needs to request data, the interest packet is sent according to the given content name, and the interest packet is forwarded to the data provider by the intermediate node until the lifetime is over. When the data provider receives the interest packet, it will reply to the user with a data packet, which will be returned along the reverse path of the arrival of the interest packet, and when the intermediate node forwards the data packet, he can decide whether to store the corresponding content according to his own caching policy.

An intra-network caching mechanism in the CCN may facilitate distribution of popular content, reducing traffic on the backhaul link in the UDN, thereby reducing latency. However, this in-network caching mechanism is vulnerable to Cache Pollution attacks (Cache Pollution attach). An attacker caches such otherwise unpopular content in its limited cache space by frequently sending interest packets requesting such content, causing the network node to misunderstand that such otherwise unpopular content suddenly becomes popular. As such, the service experience of legitimate users is reduced due to reduced hit rates and increased access latency. There are two main types of cache pollution attacks: distributed Attack (Locality-distribution Attack) and collective Attack (False-Locality Attack). In LDA, an attacker destroys the locality of content in the cache by continually sending interest packets requesting new unpopular content. In FLA, an attacker repeatedly requests the same set of unpopular content to occupy cache space. Both attacks can severely impact the network performance of the UDN. Some cache replacement policies such as Least frequently used (Least recently used), Least recently used (Least recently used), and popularity based caching policies may all mitigate the effects of LDA. At present, research on cache pollution attack mainly focuses on static network environment, and cannot be directly applied to dynamic UDN (Universal description network) scenes. In UDN, the coverage of small base stations is small and a moving user may quickly migrate from one small base station to another. This results in the statistics collected by a single small cell not accurately capturing the characteristics of the abnormal interest packets. In addition, in life, a large number of users are interested in contents which are unpopular in the past suddenly, at the moment, the requests of the legal users should be responded normally, but most of research at present judges the interest packets as malicious interest packets and directly discards the malicious interest packets, so that the service experience of the legal users is seriously damaged.

Disclosure of Invention

In order to effectively detect and defend cache pollution attacks in the UDN, the invention provides a cache pollution attack detection method based on federal learning, and a high-quality classifier is obtained by efficiently utilizing distributed statistics values existing in a network under the cooperation of a plurality of small base stations. According to the scheme, a clustering algorithm based on distance and load similarity is firstly provided, and all small base stations form clusters in a distributed mode to increase training data on a single working node in federal learning. The cluster head is then responsible for training the local classifiers, and the macro base station is responsible for aggregating a plurality of local classifiers to form an improved global classifier. Finally, all small base stations use the improved classifier to detect whether cache pollution attack occurs. In addition, to mitigate the impact of cache pollution attacks, the scheme uses a popularity-based cache replacement policy.

The technical scheme of the invention is as follows:

a cache pollution attack detection method based on federal learning in an ultra-dense network comprises the following steps:

(1) firstly, all small base stations form clusters in a distributed mode according to distance and load similarity, so that a cluster head generates small cost when collecting statistical values, the statistical values distributed among the clusters are balanced as much as possible, and the situation that the convergence speed of a shared classifier is reduced due to overlarge training time difference of a local classifier in federal learning is avoided;

the cluster forming and adjusting process comprises the following specific steps:

(1.1) first, for each isolated small cell, the weighted distance sum to the neighboring small cells is calculated:

wherein N is_nIs a small base station s_kOf the neighboring small base stations, L_kAnd e_kAre respectively s_kIn this scheme, assuming that all users request content at the same frequency, the load can be reduced to the number of users connected to the small base station.

The small base station with the smallest weighted distance and the smallest weighted distance becomes a cluster alone and considers itself as a cluster head. The cluster head then encapsulates the cluster state information into a communication frame, including the location of the cluster head and the total load e of the cluster_sum(g_k) Wherein.

(1.2) when a small base station s_iAfter periodically receiving the communication frame of the adjacent cluster from the neighbor, it will calculate the joint similarity with the adjacent cluster and select the cluster head g of the cluster with the highest similarity_haSending the position, the load and the joint similarity of the node to request to join, wherein the joint similarity is calculated based on the position and the load in the following way:

distance similarity: small base station s_iAnd adjacent cluster g_kThe formula is defined as follows:

wherein L is_iAnd L_hkRespectively represents s_iAnd g_kCluster head g of_hkThe physical coordinates of (a). Epsilon_lIs a constant for controlling the range of similarity.

Load similarity: small base station s_iAnd adjacent cluster g_kBased on the similarity of the loads, the formula is defined as follows:

wherein e_sum(g_k) ' is an assumption of s_iAdding g_kThe calculation formula of the total load is defined as

I.e. the current g_kThe sum of the loads of all small base stations in the cell plus s_iOf N, wherein N_kIs the current g_kThe number of inner small base stations.

Represents a sum of s_iAll clusters g adjacent to each other_j，1≤j≤N_bIs defined as

Wherein N is_bIs and s_iThe number of adjacent clusters. And epsilon_nFor controlling the range of similarity.

Joint similarity: and combining the distance similarity and the load similarity to obtain the load similarity.

J(s_i，g_k)＝θ·D(s_i，g_k)+(1-θ)·N(s_i，g_k)

Where theta is used to coordinate the effects of distance and load similarity.

(1.3) Cluster head g_haIn the last step, multiple join requests may be received, but it only selects the small base station with the highest joint similarity to allow the joining, and updates the cluster head:

wherein N is_aIs an adjacent cluster g_aNumber of inner small base stations，g′_haIndicating the newly elected cluster head.

Then g_haAnd encapsulating the latest state information of the cluster into a communication frame to reply the join request of the small base station.

And (1.4) all the small base stations in the cluster encapsulate the latest cluster state information into own communication frame after receiving the latest cluster state information.

(1.5) the above four steps will be iterated until there are no isolated small base stations. If the members of the cluster are too few, the collected statistics may not be sufficient to train the classifier; conversely, when there are too many members, the training time of a single working node may be too long, thereby slowing federal learning. When the members of one cluster are less than 10% or more than 20% of all the small base stations, it will be merged with other clusters or split into two clusters, and the splitting or merging may cause load imbalance between the clusters, so that the merging and splitting process will need to be continued for a new cluster to adjust the clusters many times (1.1) to (1.4).

(2) When the cluster is stable, in order to reasonably utilize the statistical values distributed in the network, the scheme trains a classifier based on a FedAvg which is a classical federal learning algorithm. The whole process is carried out by a shared initial classifier omega₀Start, ω₀The macro base station is trained by using statistics collected by the macro base station in a preheating stage. Then, a final global model can be generated by iterating a plurality of local training and central aggregation processes. In each iteration, a new shared classifier is generated for the next iteration. In the scheme, a cluster head serving as a working node in the FedAvg is responsible for integrating statistical values of all members in the cluster to train a local classifier, and a macro base station located in the center serves as a parameter server and is responsible for aggregating all local classifiers in a weighted average mode.

The specific process of classifier generation is as follows:

(2.1) before training, firstly, data collection is needed, and in order to ensure that the training data on each working node is sufficient, the data collection of each small base station including the cluster head is carried out by the following statistics:

request ratio: in the k-th time slice, the time slice,request content c_iThe number of interest packages to all received requests.

γ_k(c_i)＝n(c_i)/N

Wherein n (c)_i) Is requesting content c_iN is the number of all requests.

Individual request strength variance: individual to content c_iThe variance of the request strength of (c).

Wherein N is_uIs to the content c_iNumber of users sending requests, nj₍c_i) Is sent to content c by the jth user_iNumber of requests of, N_jIs the number of all requests sent by the jth user.

Request ratio variance: past m time slices of content c_iThe variance of the request rate of (c).

Wherein E (gamma (c)_i) Is content c)_iThe request rate of (c).

Request diversity degree: the degree of request diversity in the kth time slice.

d(γ_k(C))＝||γ_k(C)||₀

Wherein gamma is_k(C)＝[γ_k(c₁)，γ_k(c₂)，…，γ_k(c_n)]。

Request time interval variance: variance of requested time intervals over the past m time slices.

Wherein t is_j(c_i) Is for content c in the jth time slice_iRequest time interval of (d), E (t (c)_i) Is content c)_iThe request time interval of (2).

Cache retention ratio: the ratio of the number of the contents cached in the (k-1) th time slice and the k-th time slice in the cache to the number of the contents cached in the k-th time slice.

Wherein C is_kIs the set of contents in the cache in the kth time slice.

(2.2) this scheme only detects FLA type attacks because some cache replacement strategies can mitigate the effects of LDA. Using LSTM as a classifier to learn the relationship between request patterns and corresponding content tags, all parameters of the classifier implemented based on LSTM are labeled ω with a loss function as follows:

where y is the original tag of the content, 0 indicates normal, 1 indicates malicious,

representing the probability that the content is predicted to be malicious.

Each small base station periodically transmits to the respective cluster head after collecting the above statistics. The cluster head constructs six statistical values of each time slice into six-dimensional feature vectors, and organizes the six statistical values into a time sequence with the length L in time sequence for calculating the current shared classifier omega_pUpdate of

p denotes the p-th iteration and i denotes the identification of the cluster head.

(2.3) each time the cluster head completes training of the local classifier, the updated local classifier is sent to the macro base station serving as the parameter server, and in each iteration, after L time slices from the first updated local classifier are received, the macro base station aggregates the local classifiers in a weighted average mode to obtain a new shared classifier for the next iteration:

wherein | D_iI is the number of samples collected by the ith cluster head, N_gIs the total number of clusters.

The three steps are iterated for multiple times until the shared classifier is converged, and the finally obtained global classifier is broadcast to the small base station for detecting FLA.

(3) And after receiving the interest packet, all the small base stations count and construct a specific time sequence label for predicting the content based on the classifier. In addition, in order to reduce the influence of LDA, the scheme adopts a cache replacement strategy based on popularity, and content c_iThe popularity calculation formula is as follows:

ρ(c_i)＝ρ(c_i)·α^Δt+β

where α is a decay constant and Δ t is two consecutive requests c_iIs a popularity growth constant. The prevalence decays exponentially with increasing time interval.

Therefore, the scheme only needs to detect the FLA type attack, once the content is predicted to be malicious, the scheme still continues to forward the interest packets considering that a large number of users suddenly interest some unpopular content, but the corresponding content cannot be cached, and the popularity cannot be updated, so that the content is prevented from occupying the cache space.

The invention has the beneficial effects that: the Ultra Dense Network (UDN) can realize higher data transmission rate and lower time delay by deploying dense small base stations, and the distribution of popular content can be further promoted by applying a Content Centric Networking (CCN) name-centric routing mode and an in-network caching mechanism to the UDN, so that the pressure of a wireless backhaul link is relieved. However, an in-network cache mechanism is easily attacked by cache pollution, and currently, research on cache pollution attack is mainly focused on a static network environment, and cache pollution attack cannot be accurately detected in a dynamic UDN scene, so that a cache pollution attack detection method based on federal learning is designed, and statistics values distributed in a network are efficiently utilized under cooperation of a plurality of small base stations to obtain a high-quality classifier.

Drawings

Fig. 1 is an organizational chart of a cache pollution attack detection method according to the present invention.

Fig. 2 is a flowchart of clustering of small base stations according to the present invention.

FIG. 3 is a flow chart of classifier generation according to the present invention.

Fig. 4 is a flow chart of the detection of the cache pollution attack according to the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail by examples and drawings.

A cache pollution attack detection method based on federal learning in an ultra-dense network comprises the steps that a small base station is clustered based on distance and load distribution, statistics is carried out after an interest packet is received, and a classifier is trained based on federal learning, and is used by the small base station to detect cache pollution attacks.

Referring to fig. 2, the specific operation process of the small cell clustering is as follows:

step 1, each isolated small base station in the network broadcasts the position and load of the base station.

Step 2, after receiving the position and load information of all neighbor base stations, each small base station in the network calculates the weighted distance sum to all the neighbors and broadcasts the weighted distance sum to the neighbors, wherein the formula is as follows:

wherein N is_nIs a small base station s_kOf the neighboring small base stations, L_kAnd e_kAre respectively s_kI.e. the number of users connected to the small cell.

And 3, the small base station with the smallest weighted distance and the smallest weighted distance is singly formed into a cluster and is regarded as a cluster head. The cluster head then encapsulates the cluster state information into a communication frame, including the location of the cluster head and the total load e of the cluster_sum(g_k)。

Step 4, when a small base station s_iAfter periodically receiving the communication frames of the adjacent clusters from the neighbors, it will calculate the joint similarity with the adjacent clusters and select the cluster with the highest joint similarity to send its position, load and joint similarity to request joining. The joint similarity is obtained from the distance similarity and the load similarity, and the calculation formula is as follows:

wherein e_sum(g_k) ' represents s_iAdding g_kThe calculation formula of the total load is defined as

Wherein N is_kIs a cluster g_kThe number of inner small base stations.

Represents a sum of s_iThe average load of all adjacent clusters is defined as

Wherein N is_nIs s_iThe number of contiguous clusters. And epsilon_nFor controlling the range of similarity.

Joint similarity: combining the distance similarity and the load similarity.

J(s_i，g_k)＝θ·D(s_i，g_k)+(1-θ)·N(s_i，g_k)

Where theta is used to coordinate the effects of distance and load similarity.

Step 5. Cluster head g_haIn the last step, multiple join requests may be received, but it only selects the small base station with the highest joint similarity to allow the joining, and updates the cluster head:

wherein N is_aIs g_aNumber of inner small base stations, g'_haIndicating the newly elected cluster head.

Step 6.g_haThe latest state information of the cluster is encapsulated into a communication frame to reply to the small base station.

And 7, all the small base stations in the cluster package the new cluster state information into own communication frames after receiving the new cluster state information.

And 8, repeating the steps 1 to 7 until no isolated small base station exists.

Step 9, each cluster checks whether the number of members is less than 10% of the number of all the small base stations, and if the number of members is less than 10%, the step 10 is carried out; otherwise, go to step 11.

And 10, selecting the adjacent cluster with the highest similarity for merging by the cluster head of the cluster, then reselecting the cluster head by a new cluster, and simultaneously entering an adjusting period.

Step 11, each cluster checks whether the number of members is more than 20%, and if so, the step 12 is carried out; otherwise, the cluster reaches a steady state.

And 12, firstly finding two small base stations with the farthest distance as the centers of two new clusters by the cluster head of the cluster, then dividing other small base stations in the current cluster into the two new clusters according to the shortest distance, and then reselecting the cluster head of the two newly formed new clusters and simultaneously entering an adjustment period.

Referring to fig. 3, the specific operation process of calculating the statistical value and generating the classifier based on the FedAvg algorithm is as follows:

step 13, each small base station and the macro base station perform the following statistics according to the interest packets received by each time slice:

request ratio: in the k-th time slice, for the content c_iThe ratio of requests to all received requests.

γ_k(c_i)＝n(c_i)/N

Wherein n (c)_i) Is to the content c_iN is the number of all requests.

Wherein N is_uIs to the content c_iNumber of users sending requests, n_j(c_i) Is sent to content c by the jth user_iNumber of requests of, N_jIs the number of all requests sent by the jth user.

Wherein E (gamma (c)_i) Is content c)_iThe request rate of.

d(γ_k(C))＝||γ_k(C)||₀

Wherein gamma is_k(C)＝[γ_k(c₁)，γ_k(c₂)，…，γ_k(c_n)]。

Wherein t is_j(c_i) Is the jth related content c_iRequest time interval of (d), E (t (c)_i) Is content c)_iThe request time interval of (2).

Wherein C is_kIs the set of contents in the cache in the kth time slice.

And step 14, the small base station periodically transmits the statistic value to the cluster head.

Step 15, the cluster head and the macro base station construct six statistical values of each time slice into a six-dimensional characteristic vector

Where t denotes the t-th time slice, and then organizes the statistics of consecutive L time slices into a time series X ═ X⁽¹⁾，x⁽²⁾，…，x^(L)]Each time series corresponding to a content tag, 1Indicating malicious and 0 indicating normal.

Step 16, pre-training an initial shared classifier omega by the macro base station according to the collected data₀The cross entropy loss function used is as follows:

where y is the original tag of the content,

representing the probability that the content is predicted to be malicious.

Step 17, downloading the current shared classifier from the macro base station by all cluster heads, and inputting the cluster heads into the current shared classifier omega every time the cluster heads obtain a time sequence_pWhere p denotes the p-th iteration, using gradient descent for multiple iterations to compute an updated local classifier

i represents the identification of the current cluster head, and the loss function used is the same as the previous step.

And 18, the cluster head sends the updated local classifier to the macro base station.

Step 19, after L time slices of the first updated local classifier for the current shared classifier are received, the macro base station aggregates all the received local classifiers in a weighted average manner to obtain a new improved shared classifier:

wherein | D_iI is the number of samples collected by the ith cluster head, N_gIs the total number of clusters and is,

updated local classifier computed in p-th iteration being computed by ith cluster head。

And 20, repeating the steps 17, 18 and 19 until the accuracy of the shared classifier reaches the requirement.

And step 21, the macro base station broadcasts the final global classifier to all the small base stations.

Referring to fig. 4, the specific operation process of the detection of the cache pollution attack is as follows:

and 22, each small base station constructs a time sequence of each content according to the mode of the step 15 based on the statistic value obtained in the step 13.

Step 23, detecting whether FLA occurs or not based on the classifier, and if the content is normal, performing step 23; otherwise, go to step 24.

And 24, the base station uses a cache replacement strategy based on popularity, and the popularity calculation formula of the normal content is as follows:

ρ(c_i)＝ρ(c_i)·α^Δt+β

where α is a decay constant and Δ t is two consecutive requests c_iIs a popularity growth constant. The popularity tends to decay exponentially with increasing time interval.

And 25, all the small base stations still continue to forward the interest packets judged to be malicious, but do not cache corresponding contents and calculate the popularity of the contents so as to avoid the contents occupying cache space.

Claims

1. A cache pollution attack detection method based on federal learning in an ultra-dense network is characterized by comprising the following steps:

wherein N is_nIs a small base station s_kOf the neighboring small base stations, L_kAnd e_kAre respectively s_kIn the scheme, assuming that all users request contents at the same frequency, the load is reduced to the number of users connected to the small cell;

the small base station with the smallest weighted distance sum independently becomes a cluster and is regarded as a cluster head; the cluster head then encapsulates the cluster state information into a communication frame, including the location of the cluster head and the total load e of the cluster_sum(g_k)；

wherein L is_iAnd L_hkRespectively represents s_iAnd g_kCluster head g of_hkPhysical coordinates of (a); epsilon_lIs a constant for controlling the range of similarity;

wherein e is_sum(g_k) ' is an assumption of s_iAdding g_kThe calculation formula of the total load is defined as

I.e. the current g_kThe sum of the loads of all small base stations in the cell plus s_iOf N, wherein N_kIs the current g_kThe number of inner small base stations;

Wherein N is_bIs and s_iThe number of adjacent clusters; and epsilon_nA range for controlling the similarity;

joint similarity: combining the distance similarity and the load similarity to obtain the load similarity;

J(s_i，g_k)＝θ·D(s_i，g_k)+(1-θ)·N(s_i，g_k)

wherein θ is used to coordinate the effects of distance and load similarity;

wherein N is_aIs an adjacent cluster g_aNumber of inner small base stations, g'_haRepresenting newly elected cluster heads;

then g_haPackaging the latest state information of the cluster into a communication frame to reply the join request of the small base station;

(1.4) all the small base stations in the cluster package the latest cluster state information into own communication frame after receiving the latest cluster state information;

(1.5) repeating the above four steps until there is no isolated small base station; when the members of one cluster are less than 10% or more than 20% of all the small base stations, the cluster is merged with other clusters or split into two clusters, the splitting or merging may cause load imbalance among the clusters, so that the merging and splitting process generates new clusters which need to be continued for a plurality of times (1.1) to (1.4) to adjust the clusters;

(2) after the cluster is stable, in order to reasonably utilize the distributed statistical values existing in the network, the scheme trains a classifier based on a FedAvg which is a classical federal learning algorithm; the whole process is carried out by a shared initial classifier omega₀Start, ω₀The macro base station is obtained by training by using statistics values collected by the macro base station in a preheating stage; then, a final global model can be generated by iterating a plurality of local training and central aggregation processes; in each iteration, a new shared classifier is generated for the next iteration; in the scheme, a cluster head serving as a working node in the FedAvg is responsible for integrating the statistical values of all members in the cluster to train a local classifier, and a macro base station located in the center serves as a parameter server and is responsible for aggregating all the local classifiers in a weighted average mode;

the specific process of classifier generation is as follows:

request ratio: in the k-th time slice, content c is requested_iThe ratio of the number of interest packages to all received requests;

γ_k(c_i)＝n(C_i)/N

wherein n (c)_i) Is requesting content c_iN is the number of all requests；

Individual request strength variance: individual to content c_iThe variance of the request strength of (c);

wherein N is_uIs to the content c_iNumber of users sending requests, n_j(c_i) Is sent to content c by the jth user_iNumber of requests of, N_jIs the number of all requests sent by the jth user;

request ratio variance: past m time slices of content c_iThe variance of the request rate of (a);

wherein, E (gamma (c)_i) Is content c)_i(ii) a desire for a request rate of;

request diversity degree: the degree of request diversity in the kth time slice;

d(γ_k(C))＝||γ_k(C)||₀

wherein, γ_k(C)＝[γ_k(c₁)，γ_k(c₂)，…，γ_k(c_n)]；

Request time interval variance: variance of requested time intervals within the past m time slices;

wherein, t_j(c_i) Is for content c in the jth time slice_iRequest time interval of (d), E (t (c)_i) Is content c)_i(ii) a request time interval of (d);

cache retention ratio: the ratio of the number of the cached contents in the (k-1) th time slice and the kth time slice in the cache to the number of the cached contents in the kth time slice;

wherein, C_kIs the set of contents in the cache in the kth time slice;

(2.2) because some cache replacement strategies can reduce the influence of LDA, the scheme only detects the attack of the FLA type; using LSTM as a classifier to learn the relationship between request patterns and corresponding content tags, all parameters of the classifier implemented based on LSTM are labeled ω with a loss function as follows:

representing a probability that the content is predicted to be malicious;

each small base station periodically sends the collected statistics to each cluster head; the cluster head constructs six statistical values of each time slice into six-dimensional feature vectors, and organizes the six statistical values into a time sequence with the length L in time sequence for calculating the current shared classifier omega_pUpdate of

p represents the p iteration, i represents the identification of the cluster head;

wherein, | D_iI is the number of samples collected by the ith cluster head, N_gIs the total number of clusters;

the three steps are iterated for multiple times until the shared classifier is converged, and the finally obtained global classifier is broadcast to a small base station for detecting FLA;

(3) all small base stations count and construct a special time sequence label for predicting content based on a classifier after receiving the interest packet; in addition, in order to reduce the influence of LDA, the scheme adopts a cache replacement strategy based on popularity, and content c_iThe popularity calculation formula is as follows:

ρ(c_i)＝ρ(c_i)·α^Δt+β

where α is a decay constant and Δ t is two consecutive request contents c_iIs a popularity growth constant; the prevalence decays exponentially with increasing time interval;