CN110611582A - Opportunistic social network effective data transmission method based on node socialization - Google Patents


Publication number
CN110611582A
Authority
CN
China
Prior art keywords
node
nodes
value
community
clustering
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910347872.8A
Other languages
Chinese (zh)
Inventor
吴嘉
严晔琴
陈志刚
刘佳琦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Central South University
Original Assignee
Central South University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/01Social networking
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/12Discovery or management of network topologies
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/52User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail for supporting social networking services
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14Network analysis or design
    • H04L41/142Network analysis or design using statistical or mathematical methods
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14Network analysis or design
    • H04L41/145Network analysis or design involving simulating, designing, planning or modelling of a network

Abstract

The invention provides an opportunistic social network effective data transmission method based on node socialization. The method divides the nodes in a network into several communities and then performs community reduction, removing inefficient nodes according to the attributes of an optimal relay node. Node availability is measured through four proposed concepts: sending trust, receiving trust, residual cache, and activity index. In an opportunistic social network, a node that satisfies all of these characteristics at the same time is more likely to be the optimal relay node. Transmitting data packets through the reduced, efficient communities helps maintain the continuity, stability, and efficiency of the data transmission process. Simulation results show that the packet delivery rate of the proposed method (ETNS) is 13% higher than that of the Epidemic algorithm, and that ETNS has lower transmission delay and routing overhead.

Description

Opportunistic social network effective data transmission method based on node socialization
Technical Field
The invention belongs to the technical field of computers, and particularly relates to an opportunistic social network effective data transmission method based on node socialization.
Background
With the popularization of networks and the development of social informatization, disseminating information through online social platforms has become an extremely important means of communication. Many social platforms, such as Facebook, Instagram, and Twitter, have sufficient capacity to support billions of users participating in information transfer. Through these platforms, people can attract attention by sharing interesting things from their lives. When users communicate and browse the internet on mobile devices, they can publish photos or videos anytime and anywhere.
In social networks, it is important that the network can communicate at high speed. By evaluating human communication activities and interest preferences, historical information about data exchange can be recorded and analyzed. With the development of online communication platforms, personalized product recommendation has become effective. However, retrieving large amounts of structured data from human activity is very complex and requires significant storage and computational resources. This makes some conventional wireless sensor network approaches unsuitable.
To address this problem, it is necessary to establish a suitable environment in the wireless network that ensures stable data transmission. Opportunistic networks are an architecture suited to wireless communication research. Their defining characteristic is that information transmission among nodes must wait for an "opportunity". This mode of transfer provides communication services through node movement and cooperation between nodes. Opportunistic networking is increasingly applied in social networking scenarios because people's movements cause intermittent connections between the wireless devices they carry. In an online social network, "opportunities" may be provided by reliable neighbors that have sufficient resources and cache space to hold what we want to share, such as pictures and videos, or that have similar interests with which to share our experiences. The "store" and "carry" states also apply to the online social network, since its nodes must wait for appropriate neighbors to appear. A "forward" state in a social network may represent an efficient data transfer process. In the study of social networks, an "opportunity" is what decides whether useful information can be propagated. In a social network, only reliable neighbors can take part in the selection of an optimal relay node when node communication is established.
Despite the emergence of community-based opportunistic network propagation strategies, how to partition effective communities remains an open issue. This is not because existing community partitioning methods are infeasible, but because they do not consider whether every node in a community can meet the transmission requirements. Most community division processes consider the interests and social relationships of nodes in real scenarios, yet not every node in a community is suitable for propagation. An unsuitable node is an inefficient node: it consumes a large amount of resources for transmission while delivering poor transmission performance, so a method for reducing this cost is needed. Social network communication that transmits large amounts of data simultaneously may cause excessive energy consumption, low transmission rates, and transmission delays. Therefore, a community reduction method is needed to improve the performance of community-based algorithms.
Disclosure of Invention
The invention provides an opportunistic social network effective data transmission method based on node socialization, which aims to divide the nodes in a network into several communities by using a clustering method and additionally provides a community reduction method based on optimal relay node attributes, so that message transmission among the source node, the communities, and the target node is more efficient.
An opportunistic social network effective data transmission method based on node socialization comprises the following steps:
constructing an undirected graph of network nodes, and calculating a clustering coefficient of each node according to the undirected graph;
selecting a node with the largest clustering coefficient from a node set to be subjected to community clustering division as a clustering center, performing community clustering division on neighbor nodes of the selected clustering center, and transmitting data to be transmitted by using nodes in the divided communities;
judging whether the similarity between each neighbor node in the circular area of the clustering center and the clustering center is higher than the average similarity between all neighbor nodes in that circular area and the clustering center; if so, dividing the neighbor node into the community of the clustering center; otherwise, leaving it to wait for the next clustering division;
if no node in the circular area of the current clustering center has a similarity greater than the average similarity, or the number of nodes in the current community reaches the set value Minpts, ending the current clustering and selecting the node with the largest clustering coefficient in the current node set as the next clustering center;
finishing node clustering after all nodes have been divided into communities or when the maximum number of clustering rounds is reached;
the node set to be subjected to community clustering division initially comprises all nodes in the network, each node is deleted from the node set once it has been divided into a community, and each clustering center serves as the initial node of one community;
the neighbor nodes of a clustering center are the nodes contained in the circular area centered on the clustering center with radius equal to the set value Eps.
The set values Eps and Minpts are empirical values, adjusted experimentally for different scenarios.
Dividing the nodes in the network into several communities by clustering makes data transmission more effective;
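The clustering steps above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function name `partition_communities`, the use of 2-D coordinates for the circular area, and the placeholder `toy_similarity` function are all assumptions introduced here, and Minpts is treated as the community size at which a round ends.

```python
import math

def partition_communities(positions, coeff, similarity, eps, min_pts, max_rounds):
    """Greedy community clustering sketch.

    positions:  node -> (x, y) coordinates (assumed to be available)
    coeff:      node -> clustering coefficient
    similarity: callable (u, v) -> float (assumed; stands in for the source formula)
    """
    remaining = set(positions)
    communities = []
    while remaining and len(communities) < max_rounds:
        # The remaining node with the largest clustering coefficient is the centre.
        center = max(sorted(remaining), key=lambda v: coeff[v])
        remaining.discard(center)
        community = [center]
        # Neighbours are the remaining nodes inside the circle of radius eps.
        neighbours = [v for v in sorted(remaining)
                      if math.dist(positions[center], positions[v]) <= eps]
        if neighbours:
            sims = {v: similarity(center, v) for v in neighbours}
            mean_sim = sum(sims.values()) / len(sims)
            for v in neighbours:
                if len(community) >= min_pts:
                    break  # the community has reached Minpts; end this round
                if sims[v] > mean_sim:
                    community.append(v)
                    remaining.discard(v)
        communities.append(community)
    return communities, remaining

# Hypothetical toy network.
positions = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (0.0, 1.5),
             "d": (5.0, 5.0), "e": (5.0, 6.0)}
coeff = {"a": 1.0, "b": 0.5, "c": 0.4, "d": 0.8, "e": 0.2}

def toy_similarity(u, v):
    # Placeholder similarity: closer nodes count as more similar.
    return 1.0 / (1.0 + math.dist(positions[u], positions[v]))

communities, unassigned = partition_communities(
    positions, coeff, toy_similarity, eps=2.0, min_pts=3, max_rounds=3)
```

Because a neighbor must score strictly above the average, a clustering center with a single neighbor never absorbs it in this sketch; that follows the rule as literally stated and may differ from the behavior intended in practice.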
Further, the clustering coefficient of the nodes is calculated according to the following formula:
$$C_i = \frac{E_i}{T_i}, \qquad T_i = \frac{k_i (k_i - 1)}{2}$$
wherein $C_i$ represents the clustering coefficient of node $v_i$; $k_i$ is the degree of node $v_i$ in the undirected graph; $E_i$ represents the actual number of connecting edges among the $k_i$ neighbor nodes of node $v_i$ in the undirected graph; and $T_i$ represents the maximum number of connecting edges that the $k_i$ neighbor nodes of node $v_i$ could form.
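As a small worked illustration, the clustering coefficient can be computed from an adjacency-set representation of the undirected graph. The sketch assumes the standard local clustering coefficient, E_i divided by T_i with T_i = k_i(k_i - 1)/2; the function name, the toy graph, and the convention of returning 0 for nodes with fewer than two neighbors are assumptions made here.

```python
from itertools import combinations

def clustering_coefficient(adj, node):
    """Local clustering coefficient E_i / T_i of one node.

    adj maps each node to the set of its neighbours in the undirected graph.
    """
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # assumption: fewer than two neighbours gives 0
    # E_i: edges that actually exist among the k neighbours.
    e = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
    # T_i: maximum number of edges the k neighbours could form.
    t = k * (k - 1) / 2
    return e / t

# Hypothetical graph: a triangle (a, b, c) plus a pendant node d attached to a.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
coefficients = {v: clustering_coefficient(graph, v) for v in graph}
```

In this toy graph, node a has three neighbors of which only b and c are connected, so its coefficient is 1/3, while b and c each score 1.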
Further, in the node community clustering and dividing process, if overlapping communities exist, the overlapping nodes are divided into the community with the higher modularity value;
wherein the modularity value of community $X_n$ is $Q_c(X_n)$. It is computed from the number of nodes inside community $X_n$; the number of connecting edges when $X_n$ contains $x$ nodes; $d_x$, the degree of a node when $X_n$ contains $x$ nodes; and the total number of connecting edges among the nodes currently in $X_n$. In $X_n$, the index $n$ numbers the communities: its initial value is 0, and its maximum is the maximum number of clustering rounds.
Further, the similarity between nodes is calculated from their shortest paths in the undirected graph, wherein $S(x_i, x_j)$ represents the similarity between node $v_i$ and node $v_j$; $x_{ik}$ and $x_{jk}$ represent the shortest paths from nodes $v_i$ and $v_j$, respectively, to node $k$ in the undirected graph; and $m$ represents the number of nodes contained in the undirected graph corresponding to the network nodes.
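The shortest-path vectors that the similarity is built on can be obtained with a breadth-first search. The sketch below is only an illustration: the similarity formula itself is not reproduced in this text, so cosine similarity between the two shortest-path vectors is used as a plainly labeled stand-in, and the function names and the toy graph are assumptions.

```python
from collections import deque
import math

def hop_distances(adj, source):
    """Breadth-first search: shortest hop count from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def path_similarity(adj, vi, vj):
    """Stand-in similarity: cosine of the two shortest-path vectors (an assumption)."""
    nodes = sorted(adj)
    di, dj = hop_distances(adj, vi), hop_distances(adj, vj)
    unreachable = len(nodes)  # assumption: unreachable nodes count as m hops away
    xi = [di.get(k, unreachable) for k in nodes]
    xj = [dj.get(k, unreachable) for k in nodes]
    dot = sum(a * b for a, b in zip(xi, xj))
    norm = math.sqrt(sum(a * a for a in xi)) * math.sqrt(sum(b * b for b in xj))
    return dot / norm if norm else 0.0

# Hypothetical path graph a - b - c - d.
chain = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
sim_near = path_similarity(chain, "a", "b")  # adjacent nodes
sim_far = path_similarity(chain, "a", "d")   # opposite ends of the chain
```

With this stand-in, adjacent nodes see the rest of the network from almost the same distances and therefore score higher than nodes at opposite ends of the chain.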
Further, the comprehensive attribute characteristic value $\kappa_i$ of each node $v_i$ is calculated and compared with a set threshold $\kappa$; if $\kappa_i$ is less than $\kappa$, the corresponding node $v_i$ does not transmit data as a relay node. The quantities involved are defined as follows:
$N_R(j,i) = N_{Rh}(j,i) + N_{Rm}(j,i)$ and $N_S(j,i) = N_{Sh}(j,i) + N_{Sm}(j,i)$;
$a_{ij}$ is the element in row $i$ and column $j$ of matrix $A$, where $A$ is the characteristic evaluation matrix of the nodes;
$\omega$ represents the weighting coefficient of the node reputation value in the calculation, with value range [0, 1]; the initial value of $n$ is 0, and its maximum is the maximum number of clustering rounds;
a normalization coefficient is applied to the information entropy to keep the entropy value of each attribute positive. When $\omega$ is 0, the local reputation value is the average rating of the nodes participating in the transaction. If the network contains many malicious nodes that collude, setting $\omega$ to 0 makes the trust value fairer: experiments show that when malicious nodes make up more than 40% of all nodes, setting $\omega$ to 0 avoids collusion cheating, whereas when they make up less than 40% of all nodes, $\omega$ = 1.6 gives the best effect;
$Vr_i$ represents the set of nodes that have transacted with node $v_i$ as receivers, and $Vs_i$ represents the set of nodes that have transacted with node $v_i$ as senders; $LR_{ij}$ represents the reputation evaluation that node $v_i$, acting as receiver, gives to sender node $v_j$, and $LS_{ij}$ represents the reputation evaluation that node $v_i$, acting as sender, gives to receiver node $v_j$; $LR_{ij}$ and $LS_{ij}$ take values in [0, 1], and both are initialized to 0.5;
$N_{Rh}(i,j)$ and $N_{Rm}(i,j)$ respectively represent the numbers of honest and malicious transactions between node $v_i$, acting as receiver, and node $v_j$;
$N_{Sh}(i,j)$ and $N_{Sm}(i,j)$ respectively represent the numbers of honest and malicious transactions between node $v_i$, acting as sender, and node $v_j$; these counts are obtained from statistics over the historical transmission records;
within a time interval $t$, the amount of data received by node $v_i$ is denoted $r_i(t)$; $B_S r_i(t)$ represents the cache occupied during data reception, which is linear in the amount of data collected $r_i(t)$; $B_S$ is the cache occupied per unit of data collected by the node, and $B_T$ is the cache consumed when sending data; $J_{n,k}(t)$ represents the number of channels assigned to a node, with $\sum_{k \in \kappa} J_{n,k}(t) \le 1$;
the activity level of a node is measured by converting the daily timestamp $T$ into a mapped time. The seconds of the mapped time and the seconds of the daily time, $\tau$, both take values in [0, 86400), covering each second of the day;
the current timestamp of the node is defined as $T$, and the time zone in which the node is located is denoted $N_{zone}$; $Max_\tau$ is the maximum value of $\tau$; $V$ represents the average velocity of all nodes in the history.
This is a community reduction method: it screens out the inefficient nodes in the community structure that do not meet the transmission conditions, and thereby improves the efficiency of community-based data transmission strategies.
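The screening step can be illustrated with the generic entropy weight method, which combines several attributes into one score per node. This is a sketch under stated assumptions rather than the exact model of the invention: the normalization coefficient is taken to be 1/ln(n), the four columns stand for sending trust, receiving trust, residual cache, and activity index scaled to [0, 1], the threshold of 0.5 and all function names are hypothetical, and the matrix is assumed to have at least two rows, positive column sums, and at least one non-uniform column.

```python
import math

def entropy_weights(matrix):
    """Entropy weight method over the columns (attributes) of the matrix."""
    n = len(matrix)
    norm = 1.0 / math.log(n)  # assumed normalization; keeps each entropy in [0, 1]
    diversities = []
    for j in range(len(matrix[0])):
        column = [row[j] for row in matrix]
        total = sum(column)
        entropy = -norm * sum((v / total) * math.log(v / total)
                              for v in column if v > 0)
        diversities.append(1.0 - entropy)
    spread = sum(diversities)
    return [d / spread for d in diversities]

def relay_candidates(matrix, kappa):
    """Keep nodes whose weighted score is not below the threshold kappa."""
    weights = entropy_weights(matrix)
    scores = [sum(w * a for w, a in zip(weights, row)) for row in matrix]
    keep = [i for i, score in enumerate(scores) if score >= kappa]
    return keep, scores

# Hypothetical evaluation matrix: one row per node, one column per attribute
# (sending trust, receiving trust, residual cache, activity index).
A = [
    [0.9, 0.8, 0.7, 0.6],
    [0.8, 0.9, 0.6, 0.7],
    [0.2, 0.3, 0.1, 0.2],
    [0.7, 0.7, 0.8, 0.9],
]
relays, scores = relay_candidates(A, kappa=0.5)
```

Attributes whose values vary more across nodes receive larger weights, so the score emphasizes the characteristics that actually distinguish good relays from poor ones; node 2 falls below the threshold on every attribute and is excluded.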
Further, the maximum value of the clustering times is the number of important nodes;
the important node is a node with a nomination value larger than the average nomination value of all nodes;
the nomination value of a node is derived from the interaction relationships among users, which are obtained from the users' behavior logs; the nomination value of a node is increased by 1 every time the node successfully forwards data, and the initial nomination value of each node is 1;
the nomination value of node $v_i$ after its $s$-th successful data transmission in the history record is computed from the adjacency matrix of node $v_i$,
wherein the number of adjacent nodes of node $v_i$ is pi - 1, and the adjacency matrix of node $v_i$ has size pi x pi. The nodes are arranged in order along the row and column headers of the adjacency matrix; if a connecting edge exists between two nodes, the corresponding element of the adjacency matrix takes the value 1, otherwise it takes the value 0.
The nomination value is determined by the number of times the node has successfully transmitted information in the current history record, and each node is given one nomination in the initial stage. After each successful data forwarding, the nomination value of a node contains its original nomination and the nominations of the other nodes connected to it, so the nomination value must be updated after every successful forwarding. The nomination value measures the importance of a node: a node whose nomination value is higher than the average is considered an important node, and the number of important nodes determines the number of clustering rounds.
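The relationship between nomination values, important nodes, and the number of clustering rounds can be sketched as follows. The sketch uses only the simple rule that a node starts at 1 and gains 1 per successful forward; it does not model the accumulation of neighbors' nominations, and the function name and the forwarding log are hypothetical.

```python
def important_nodes(nomination):
    """Nodes whose nomination value exceeds the average over all nodes."""
    average = sum(nomination.values()) / len(nomination)
    return sorted(v for v, value in nomination.items() if value > average)

# Every node starts with a nomination value of 1.
nomination = {"a": 1, "b": 1, "c": 1, "d": 1}

# Hypothetical log of successful forwards, one entry per forwarding node.
for forwarder in ["a", "a", "b", "b", "c"]:
    nomination[forwarder] += 1

key_nodes = important_nodes(nomination)
max_clustering_rounds = len(key_nodes)
```

Here the average nomination value is 2.25, so only nodes a and b count as important and the clustering runs for at most two rounds.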
Further, assume node $v_i$ and node $v_j$ meet while $v_i$ is storing data to be forwarded;
if node $v_j$ is the target node of the data transmission, node $v_i$ transmits the message to node $v_j$ and deletes the message from its send queue;
if node $v_j$ is not the target node of the data transmission, node $v_i$ transmits in one of the following two ways:
step 7.1: intra-community transmission;
if the target node is in the community of the current node $v_i$ and $v_j$ is also in that community, node $v_i$ transmits the data to node $v_j$; otherwise it does not transmit;
step 7.2: inter-community transmission;
if the target node is not in the community of node $v_i$, node $v_i$ sends node $v_j$ a request frame asking whether the target node is in the community to which $v_j$ belongs; after receiving the request frame, node $v_j$ checks its current community, confirms whether the target node is in it, and sends a response frame to node $v_i$; if the target node is in the community of $v_j$, node $v_i$ sends the data to node $v_j$; otherwise it does not send.
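The forwarding rules above reduce to a small decision function. In this sketch the request and response frames are modeled as a direct lookup in a shared `community_of` mapping, which is a simplification; the function name, the mapping, and the node labels are hypothetical.

```python
def should_forward(carrier, encountered, target, community_of):
    """Decide whether the carrier hands its message to the encountered node.

    community_of maps node -> community id. Returns (forward, reason).
    """
    if encountered == target:
        return True, "direct delivery"
    if community_of[target] == community_of[carrier]:
        # Intra-community case: relay only through members of the same community.
        if community_of[encountered] == community_of[carrier]:
            return True, "intra-community relay"
        return False, "encountered node is outside the target's community"
    # Inter-community case: stands in for the request/response frame exchange.
    if community_of[encountered] == community_of[target]:
        return True, "inter-community relay"
    return False, "encountered node's community does not contain the target"

# Hypothetical network with two communities, 0 and 1.
community_of = {"s": 0, "r1": 0, "t0": 0, "r2": 1, "t1": 1}

decisions = [
    should_forward("s", "t1", "t1", community_of),  # meets the target itself
    should_forward("s", "r1", "t0", community_of),  # same community as target
    should_forward("s", "r2", "t0", community_of),  # relay outside the community
    should_forward("s", "r2", "t1", community_of),  # relay in target's community
    should_forward("s", "r1", "t1", community_of),  # relay not in target's community
]
```

After a direct delivery the carrier would also delete the message from its send queue; queue handling is left out of the sketch.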
Most community-based routing algorithms take node attributes and social relationships into account but ignore the energy consumption of inefficient nodes, which accounts for a large proportion of the routing cost. To improve the network propagation strategy, the invention provides an effective propagation strategy based on node socialization in which the nodes in the network are divided into several communities. The scheme also includes a community reduction method that removes inefficient nodes according to the attributes of the optimal relay node. The invention proposes the concepts of sending trust, receiving trust, residual cache, and activity index to measure the availability of a node. In an opportunistic social network, a node that satisfies all of these characteristics at the same time is more likely to be the optimal relay node. The availability of a node is determined by considering these characteristics together, weighted according to the entropy value of each characteristic, which effectively reduces the number of inefficient nodes in a community. Transmitting data packets through the reduced, efficient communities helps maintain the continuity, stability, and efficiency of the data transmission process. Simulation results show that the packet delivery rate of ETNS is 13% higher than that of the Epidemic algorithm, and that ETNS has lower transmission delay and routing overhead.
Advantageous effects
The invention provides an opportunistic social network effective data transmission method based on node socialization, which comprises three stages. In the first stage, the interaction relationships among users are obtained from the users' behavior logs, and communities are divided by a clustering method according to node similarity. In the second stage, an attribute quantization strategy for user node characteristics is constructed from the attribute characteristics that a relay node must satisfy for successful message transmission in the opportunistic social network, and is used as the basis for identifying inefficient nodes in a community. Finally, effective data transmission is carried out over the reduced, efficient communities. Based on the behavior records of online social network users and the association relationships of heterogeneous nodes, the invention proposes the characteristic conditions that nodes must satisfy for successful data transmission and reduces the community structure according to these conditions, thereby reducing the inefficient nodes in the community;
the experiments are modeled on The ONE (Opportunistic Network Environment) simulation platform using the MapReduce and RDD computing frameworks and real social network data sets; the experimental results show that, compared with the FCNS algorithm, the ESR algorithm, the EWDCR algorithm and the traditional Epidemic algorithm, the method achieves a better transmission success rate and routing overhead.
Drawings
FIG. 1 is a schematic flow diagram of the process of the present invention;
FIG. 2 is a diagram illustrating a clustering-based community partitioning process proposed by the present invention;
FIG. 3 is a process flow diagram of the clustering-based community partitioning process proposed by the present invention;
FIG. 4 shows quartile plots of the transmission success rate for the data transmission strategies of five methods (ETNS, FCNS, ESR, EWDCR, and Epidemic) in Embodiment 1, wherein (a) is the quartile plot when the Infocom5 data set is selected for simulation, (b) is the quartile plot when the Infocom6 data set is selected for simulation, (c) is the quartile plot when the Cambridge data set is selected for simulation, and (d) is the quartile plot when the Intel data set is selected for simulation;
FIG. 5 shows comparison graphs of the transmission success rate for the data transmission strategies of the same five methods in Embodiment 1, wherein (a) is the comparison graph when the Infocom5 data set is selected for simulation, (b) is the comparison graph when the Infocom6 data set is selected for simulation, (c) is the comparison graph when the Cambridge data set is selected for simulation, and (d) is the comparison graph when the Intel data set is selected for simulation;
FIG. 6 shows comparison graphs of the end-to-end delay for the data transmission strategies of the same five methods in Embodiment 1, wherein (a) is the comparison graph when the Infocom5 data set is selected for simulation, (b) is the comparison graph when the Infocom6 data set is selected for simulation, (c) is the comparison graph when the Cambridge data set is selected for simulation, and (d) is the comparison graph when the Intel data set is selected for simulation;
FIG. 7 shows comparison graphs of the routing overhead for the data transmission strategies of the same five methods in Embodiment 1, wherein (a) is the comparison graph when the Infocom5 data set is selected for simulation, (b) is the comparison graph when the Infocom6 data set is selected for simulation, (c) is the comparison graph when the Cambridge data set is selected for simulation, and (d) is the comparison graph when the Intel data set is selected for simulation.
Detailed Description
The invention will be further described with reference to the following figures and examples.
The opportunistic social network effective data transmission method based on node socialization provided by the invention is shown schematically in FIGS. 1-3 and comprises the following implementation steps:
step 1: constructing an undirected graph of network nodes, and calculating a clustering coefficient of each node according to the undirected graph;
calculating the number of important nodes in the network to obtain the number of clustering rounds;
the nodes of the network that actively participate in information transmission and data forwarding are considered important nodes, and the importance of a node is measured through a nomination mechanism, as follows: the nomination value of node $v_i$ is increased by 1 every time the node successfully forwards data. The nomination accumulation process tracks the number of nominations of node $v_i$ after its $s$-th successful data transmission, with the initial nomination defined as 1, and uses the element in row $i$ and column $j$ of the adjacency matrix of the network. To enable comparison of the nodes in the network, the accumulated nomination values are normalized;
the normalization may also be performed after each iteration of the accumulation process.
step 2: carrying out community division on the nodes;
firstly, the node with the largest clustering coefficient in the node set is selected as the clustering center; the similarity between each surrounding node and the clustering center is then compared with the average similarity. Nodes above the average are divided into the community, while nodes below the average are not divided into the current community and wait for the next division process. If overlapping communities exist, the overlapping nodes are divided into the community with the higher modularity value. The quantities used in the calculation are as follows:
let the degree of node $v_i$ be $k_i$; $E_i$ represents the actual number of connecting edges among the $k_i$ neighbor nodes of node $v_i$; $T_i$ represents the maximum number of connecting edges that the $k_i$ neighbor nodes of node $v_i$ could form; and $C_i$ represents the clustering coefficient of node $v_i$. In equation (5), $G$ is assumed to be an undirected graph with $m$ nodes, $G = \{x_1, x_2, \ldots, x_m\}$; $x_{ik}$ and $x_{jk}$ respectively represent the shortest paths from nodes $v_i$ and $v_j$ to node $k$, and $S(x_i, x_j)$ represents the similarity between node $v_i$ and node $v_j$. In equation (6), $Q_c(X_n)$ represents the modularity value of community $X_n$, which is computed from the number of nodes inside community $X_n$; the number of connecting edges when $X_n$ contains $x$ nodes; $d_x$, the degree of a node when $X_n$ contains $x$ nodes; and the total number of connecting edges among the nodes currently in $X_n$. In $X_n$, the index $n$ numbers the communities: its initial value is 0, and its maximum is the maximum number of clustering rounds.
step 3: based on the transmission attributes present in users' interactions, the attribute characteristics that a node must satisfy during transmission are inferred and then quantified to obtain the evaluation criteria for inefficient nodes. The specific steps are as follows:
step 3.1: each node performs a local reputation evaluation of other nodes based on historical transaction records and social relationships. Since social relationships are difficult to quantify, the integrity assessment is based only on historical transaction records. An evaluation mechanism is defined for each transaction: after every transaction the node gives a good or bad rating, and the local reputation value is quantified from this rating information. The calculation model uses the following quantities:
$LR_{ij}$ represents the reputation evaluation that node $v_i$, acting as receiver, gives to sender node $v_j$, and $LS_{ij}$ represents the reputation evaluation that node $v_i$, acting as sender, gives to receiver node $v_j$. $LR_{ij}$ and $LS_{ij}$ take values in [0, 1], and both are initialized to 0.5; a value greater than 0.5 indicates a trustworthy node, and the closer the value is to 1 the more trustworthy the node, while otherwise the node is considered untrustworthy. $N_{Rh}(i,j)$ and $N_{Rm}(i,j)$ respectively represent the numbers of honest and malicious transactions between node $v_i$, acting as receiver, and node $v_j$. $N_{Sh}(i,j)$ and $N_{Sm}(i,j)$ respectively represent the numbers of honest and malicious transactions between node $v_i$, acting as sender, and node $v_j$; these counts are obtained from statistics over the historical transmission records. A penalty factor $N_{pun}$ is set so that the reputation value falls faster than it rises, reflecting the punishment of malicious transactions.
The local reputation evaluation gives an intuitive measure of how honest a particular node is. In a trust mechanism, however, the global reputation value of a node as a sender (or receiver) should be evaluated by all nodes that have transacted with it. The global reputation value of a node as a sender is defined here as $GR_i$, and its global reputation value as a receiver is defined as $GS_i$. When a node is evaluated, the opinion of a node with higher integrity matters more than that of a node with lower integrity. Similarly, if a node has had many stable connections and data transmissions with the evaluated node, its evaluation is more reliable. The global reputation values $GR_i$ and $GS_i$ should therefore combine the local reputation values $LR_{ij}$ and $LS_{ij}$ of all nodes connected to the node, taking into account both the number of transactions and the reputation of the evaluating nodes. They are expressed as weighted averages,
wherein $N_R(j,i) = N_{Rh}(j,i) + N_{Rm}(j,i)$ and $N_S(j,i) = N_{Sh}(j,i) + N_{Sm}(j,i)$; $\omega$ represents the weighting coefficient of the node reputation value in the calculation; $Vr_i$ represents the set of nodes that have transacted with node $v_i$ as receivers, and $Vs_i$ represents the set of nodes that have transacted with node $v_i$ as senders; $LR_{ij}$ represents the reputation evaluation that node $v_i$, acting as receiver, gives to sender node $v_j$, and $LS_{ij}$ represents the reputation evaluation that node $v_i$, acting as sender, gives to receiver node $v_j$. The weighting function $1 - \exp(-N(j,i)/5)$ increases with the number of connections between the two nodes, following a negative exponential, so that the evaluation from a node with more connections carries more weight.
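The effect of the weighting function 1 - exp(-N/5) can be seen in a short numerical sketch. Only the weighting function is taken from the text; the weighted average below omits the coefficient ω and the evaluators' own reputation, so it is a simplified assumption, and the function names and sample values are hypothetical.

```python
import math

def interaction_weight(n_transactions):
    """Weighting function 1 - exp(-N/5): more transactions, more weight."""
    return 1.0 - math.exp(-n_transactions / 5.0)

def weighted_reputation(evaluations):
    """Weighted average of local reputation values.

    evaluations: list of (local_reputation, n_transactions) pairs.
    Assumption: the weights are the interaction weights only; the omega term
    and the evaluators' own reputation are not modelled here.
    """
    weights = [interaction_weight(n) for _, n in evaluations]
    total = sum(weights)
    if total == 0:
        return 0.5  # neutral initial value used for local reputations
    return sum(w * r for w, (r, _) in zip(weights, evaluations)) / total

# Hypothetical evaluations of one node: a long-standing partner rates it 0.9
# after 20 transactions, a near-stranger rates it 0.2 after 1 transaction.
evaluations = [(0.9, 20), (0.2, 1)]
plain_mean = sum(r for r, _ in evaluations) / len(evaluations)
global_value = weighted_reputation(evaluations)
```

The weight saturates toward 1 as the transaction count grows, so the long-standing partner dominates and the weighted value lands well above the plain mean of 0.55.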
If a node goes through n rounds of transactions over equal time slices, its global reputation value as a sender can be regarded as building on its global reputation value as a receiver from the previous round. This reduces the number of iterations of equation (8) and saves communication and computation overhead:
Step 3.2: The remaining cache of a node is an important factor to consider during information transmission and data forwarding.
Within a time interval t, the amount of data received by node v_i is denoted r_i(t). The cache occupied during data reception is linear in the amount of data collected, r_i(t), and is denoted B_S r_i(t), where B_S is the cache a node occupies to collect one unit of data. If the sink node allocates a channel to node v_i, node v_i consumes a cache of B_T to send the data. The total cache used by node v_i in time slot t can therefore be expressed as:
Since the amount of data received satisfies r_i(t) ≤ r_max and the number of allocated channels satisfies Σ_{k∈κ} J_{n,k}(t) ≤ 1, the upper limit of the cache of any node in one time slot is B_max = B_S r_max + B_T. The remaining cache of the node at the current time t is:
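As a side illustration of the cache bookkeeping in Step 3.2, the following Python sketch computes the cache used in a slot and the upper bound B_max = B_S r_max + B_T. The remaining-cache expression is not legible in this text, so the sketch assumes it is the upper bound minus the cache used in the slot; that assumption and all function names are illustrative only.

```python
def cache_used(b_s, r_t, b_t, channel_allocated):
    """Cache occupied in one slot: B_S * r_i(t), plus B_T if a channel was allocated."""
    return b_s * r_t + (b_t if channel_allocated else 0.0)


def cache_upper_bound(b_s, r_max, b_t):
    """Upper bound B_max = B_S * r_max + B_T stated in the description."""
    return b_s * r_max + b_t


def remaining_cache(b_s, r_t, r_max, b_t, channel_allocated):
    """Assumed definition: remaining cache = B_max - cache used in the slot."""
    if r_t > r_max:
        raise ValueError("r_i(t) must not exceed r_max")
    return cache_upper_bound(b_s, r_max, b_t) - cache_used(b_s, r_t, b_t, channel_allocated)
```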
Step 3.3: Define an activity index. To convert daily time into mapped time, two quantities are used: the second of the mapped time and the second of the daily time, τ. Both take values in [0, 86400], covering every second of the day. The difference between them is that the second of the daily time can only be a positive integer, whereas the second of the mapped time may be fractional. By analyzing the data set, the average number of messages transmitted per second, M*, and the number of messages transmitted in each second, M_τ, can be obtained. M_τ changes continuously over the day, while M* stays constant. M_τ during working hours is clearly higher than M_τ during night-time rest, so it can describe the continuous changes related to node activity. The mapping function is defined as follows:
For an original timestamp T in the data set, a corresponding mapped timestamp is defined. To convert the original timestamp into the mapped timestamp, T is first converted into its corresponding second of the day, τ*, by the following formula:
τ* = mod(T + N_zone · N_seconds, Max_τ)
where mod(a, b) returns the remainder of a divided by b, N_zone denotes the time zone in which the node is located, and Max_τ is the maximum value of τ. Because the time zone depends on the node's location, the value N_zone · N_seconds is added to each T in the calculation. Combining equations (12) and (13) gives the mapped time:
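The timestamp-to-second conversion τ* = mod(T + N_zone · N_seconds, Max_τ) can be tried out with a few lines of Python. The sketch covers only this first conversion step, not the mapped time itself. It assumes that N_seconds is the number of seconds per hour (3600), that Max_τ is 86400, and that T is a UTC timestamp in seconds; none of these values is stated explicitly in the text, and the function name is hypothetical.

```python
SECONDS_PER_DAY = 86400   # assumed value of Max_tau
SECONDS_PER_HOUR = 3600   # assumed value of N_seconds


def second_of_day(timestamp, n_zone):
    """Return tau* = mod(T + N_zone * N_seconds, Max_tau).

    timestamp: raw timestamp T in seconds (assumed UTC).
    n_zone: time-zone offset of the node in hours (may be negative).
    """
    # Python's % already returns a non-negative remainder for a positive modulus.
    return (timestamp + n_zone * SECONDS_PER_HOUR) % SECONDS_PER_DAY
```

For example, a timestamp at UTC midnight recorded by a node in time zone +8 maps to second 28800 of the local day.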
In a social network, the movement of nodes creates communication opportunities. Whether a node is in an active period is determined from the total distance the node travels over the mapped period, as given by equation (15), where v denotes the average speed of the node in the history:
Step 4: Assign a weight to each attribute feature according to the information entropy function. For each community in the network, the social attributes of the nodes are analyzed and the influence of each attribute on information transmission is measured, so that inefficient nodes in the community can be removed. The concept of information entropy is used for weight assignment. Information entropy is a quantity describing the degree of disorder of information: the larger the entropy, the more disordered the information and the lower its utility. The information entropy is defined as:
where E(F_i) denotes the entropy of each feature F_i and p(x_i) denotes the function associated with F_i. How this function is chosen differs from scenario to scenario, so it can be fixed once the scenario is selected.
Given the nature of attribute-feature weights in the application scenario of this scheme, the function E must be symmetric, monotonic, continuous and additive. When information entropy is used for weight analysis, changing the order of the feature values does not change the weights assigned to them. In addition, when the number of features is fixed, the function is continuous in its variables and changes monotonically with the importance of the evaluated feature. Based on these principles, the function is constructed as:
where E(x_1, ..., x_u) denotes the information entropy function used in the method, x_i denotes the i-th attribute feature of a node, and u denotes the number of attribute features.
Step 5: Analyze each node's global trust value as a sender, its global trust value as a receiver, its cache, and its frequency of movement over a period of time, and build the evaluation matrix A of these features:
The data matrix is normalized to obtain the calculation matrix Y, where max x_ij, min x_ij and the column mean denote the maximum, minimum and average values, respectively, of the elements in column j of the data matrix A.
The entropy value of each feature index is calculated according to formula (19). The negative sign is taken to ensure that the entropy value is positive, and the normalization coefficient is defined as
For each feature index, the relative weight is obtained as:
Step 6: For all nodes in the community, filter the nodes according to their comprehensive feature index. The attribute features of each node are combined by weighting to obtain the comprehensive feature:
To measure whether a node is capable of transmission, a threshold κ is set to decide whether the node meets the transmission conditions required of a relay node, and inefficient nodes are then deleted. The core of the method is the observation that not every node in a community can meet the transmission conditions, together with a community reduction method that takes the conditions required for transmission into account comprehensively. With this reduction scheme, nodes in the community that do not meet the transmission requirements are filtered out and deleted. After this small number of nodes that do not meet the forwarding conditions has been removed, the remaining nodes in the community are closely connected and have high transmission capability.
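Steps 4 to 6 can be sketched in Python as follows. Because the entropy, normalization and weight formulas are not legible in this text, the sketch substitutes the textbook entropy weight method (min-max normalization, normalization coefficient 1/ln m for m nodes, weight proportional to 1 minus the entropy); this is an assumption and may differ from the patent's exact formulas. The function names, the use of raw feature values in the weighted sum, and the example threshold are hypothetical.

```python
import math


def entropy_weights(matrix):
    """Return one weight per column (rows = nodes, columns = features)."""
    rows, cols = len(matrix), len(matrix[0])
    if rows < 2:
        raise ValueError("at least two nodes are needed")
    k = 1.0 / math.log(rows)  # normalization coefficient of the textbook method
    diversities = []
    for j in range(cols):
        column = [row[j] for row in matrix]
        low, high = min(column), max(column)
        span = high - low
        # Min-max normalization; a constant column carries no information.
        scaled = [(x - low) / span if span else 1.0 for x in column]
        total = sum(scaled)
        entropy = -k * sum((y / total) * math.log(y / total) for y in scaled if y > 0)
        diversities.append(max(0.0, 1.0 - entropy))
    norm = sum(diversities)
    if norm == 0:
        return [1.0 / cols] * cols  # all features equally uninformative
    return [d / norm for d in diversities]


def comprehensive_scores(matrix, weights):
    """Weighted combination of each node's attribute features."""
    return [sum(w * x for w, x in zip(weights, row)) for row in matrix]


def select_relays(scores, kappa):
    """Indices of nodes whose comprehensive score reaches the threshold kappa."""
    return [i for i, s in enumerate(scores) if s >= kappa]
```

With a feature matrix such as [[0.9, 0.5], [0.6, 0.5], [0.2, 0.5]], the constant second column receives almost no weight, so the ranking is driven by the first feature, and a threshold of 0.5 keeps the first two nodes as relay candidates.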
Step 7: Through the above steps, several communities with close social relations are obtained in the network. The nodes in these communities have high trustworthiness, high activity and sufficient cache space. Effective transmission is carried out by passing data between communities. Suppose node v_i, which holds data to be forwarded, meets node v_j. If node v_j is the target node, node v_i transmits the message to node v_j and deletes the message from its send queue. If the encountered node v_j is not the target node, the transmission by node v_i falls into one of the following two cases:
Step 7.1: Intra-community transmission. If the target node is in the community of the current node v_i and v_j is also in that community, node v_i transmits the data to node v_j; otherwise it does not transmit.
Step 7.2: Inter-community transmission. If the target node is not in the community of node v_i, node v_i sends node v_j a request frame asking whether the target node is in the community to which node v_j belongs. After receiving the request frame, node v_j checks its current community, confirms whether the target node is in it, and sends a response frame back to node v_i. If the target node is in the community of v_j, node v_i sends the data to node v_j; otherwise it does not send.
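The forwarding rules of Step 7 can be summarized in a small Python decision function. This is a simplified sketch: the request/response frame exchange of Step 7.2 is collapsed into a direct lookup, each node is assumed to belong to exactly one community, and the `community_of` mapping and the returned labels are hypothetical names introduced for illustration.

```python
def forwarding_decision(carrier, encountered, target, community_of):
    """Return the action taken by `carrier` (v_i) on meeting `encountered` (v_j).

    community_of: dict mapping node id -> community id (assumed structure).
    """
    if encountered == target:
        # Direct delivery; the carrier then deletes the message from its queue.
        return "deliver"
    if community_of[target] == community_of[carrier]:
        # Step 7.1: forward only if v_j is in the same community as well.
        if community_of[encountered] == community_of[carrier]:
            return "forward_intra"
        return "hold"
    # Step 7.2: the request/response exchange is modelled as a lookup of
    # whether the target lies in the community of v_j.
    if community_of[encountered] == community_of[target]:
        return "forward_inter"
    return "hold"
```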
Example 1:
In this example, data sets from CRAWDAD recording the social networks of people moving with iMote devices are used. The four original data sets are social data provided by the University of Cambridge. Key fields concerning user behavior records and user attribute information were extracted, including 4546 photos, 2662 photo publisher nodes, 40808 user nodes and 618491 edges. The four data sets used are the Infocom5, Infocom6, Cambridge and Intel data sets.
The implementation is carried out on The ONE simulation tool. HDFS (a distributed file system) serves as the data storage layer, and a computation model built on the MapReduce and RDD computing frameworks serves as the data computation layer, so that the data can be processed efficiently and quickly in parallel. The model and algorithm are built to find the initial nodes that maximize influence, and different comparative experiments are designed to analyze the selection effect and quality of the initial nodes, thereby verifying the correctness of the theoretical analysis.
This embodiment mainly designs the ETNS algorithm based on node socialization information. Comparison experiments among the ETNS, FCNS, ESR, EWDCR and Epidemic models are designed to compare their propagation effects and to verify the effectiveness of the model and algorithm as a data transmission strategy.
Simulation results show that the ETNS algorithm performs well in community division. Among the four data sets used in the experiment, the algorithm performs best on the Infocom6 data set. It is the only data set for which the actual 6 clusters are obtained, making it the data set closest to real human grouping. The community division results on the Infocom5 and Cambridge data sets are also good, but slightly below those on the Infocom6 data set. On the remaining data set, performance is poorer than on the other three, either because the number of experimental copies in the data set is small or because the nodes are randomly distributed and do not exhibit human clustering behavior. Overall, the simulation results show that the ETNS algorithm is feasible and effective for community division on real data sets.
In fig. 4, a box plot (quartile plot) is used to analyze the experimental results. The plot has five markers: the minimum, the first quartile (1/4 value), the median, the third quartile (3/4 value) and the maximum. The quartiles show the center of the distribution, how concentrated it is, and its range. In fig. 4, ETNS has a higher center, a smaller spread and a more concentrated range than the other algorithms.
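For readers who want to reproduce the five markers of such a plot from their own measurements, a minimal Python sketch using the standard library is given below. It is not taken from the patent; the choice of the "inclusive" quantile method is an assumption, and other quartile conventions give slightly different values.

```python
import statistics


def five_number_summary(samples):
    """Return (minimum, first quartile, median, third quartile, maximum)."""
    data = sorted(samples)
    if len(data) < 2:
        raise ValueError("at least two samples are needed")
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]
```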
FIG. 5 compares the transmission success rates of the five methods ETNS, FCNS, ESR, EWDCR and Epidemic in example 1. When the simulation time is less than one day, the advantage of the ETNS algorithm is not obvious and its performance is similar to that of the other four algorithms. As the simulation time increases, the transmission rate of ETNS is consistently the highest among these algorithms, because filtering for active nodes in the community makes successful data transmission more likely. In the ETNS algorithm, the nodes in the network are divided into several communities, and pairs of highly similar nodes within a community are likely to communicate frequently. In addition, the ETNS algorithm provides a multi-attribute node reduction strategy that removes a large number of unsuitable and inefficient nodes, and the resulting high availability of community nodes yields the highest delivery rate. The ESR algorithm is also a community-based routing algorithm, but its node reduction method does not consider social attributes, so ETNS performs better. In the FCNS and EWDCR algorithms, the similarity measure does not take into account node trustworthiness or available buffer space, so the selected relay node may be unavailable. Finally, the Epidemic algorithm generates a large number of message copies that impair data transmission efficiency, so the Epidemic method has a relatively low transmission rate compared with the other four algorithms.
FIG. 6 compares the end-to-end latency of the five methods ETNS, FCNS, ESR, EWDCR and Epidemic in example 1. ETNS has the lowest average end-to-end delay of the five algorithms. Because ETNS analyzes the comprehensive characteristics of the nodes and provides a community reduction strategy, it removes the inefficient nodes that hinder transmission and thereby reduces the average end-to-end delay. In contrast, the Epidemic algorithm places no requirement on the next-hop node and transmits messages blindly, which sharply increases routing and forwarding delays. The ESR algorithm effectively limits the number of copies, so its transmission delay is lower than that of the Epidemic algorithm. The FCNS algorithm analyzes transmission preferences before data transmission, and in the EWDCR algorithm data is passed through neighbors and related nodes. The average end-to-end latency of the FCNS and EWDCR algorithms is therefore lower than that of conventional routing algorithms. Among the five algorithms, ETNS has the best average end-to-end delay.
FIG. 7 compares the routing overhead of the five methods ETNS, FCNS, ESR, EWDCR and Epidemic in example 1. The average overhead of the ETNS algorithm stays at the lowest level throughout, because it uses a community-aware strategy that considers the comprehensive characteristics relevant to transmission. In the ETNS algorithm, nodes are divided into several closely related groups, and the probability of successful transmission between nodes is high. The ETNS routing scheme therefore uses less time and fewer resources, and its average overhead is greatly reduced. The ESR algorithm considers only the effect of nodes on the information flow and ignores the current availability of the next-hop node, which causes latency overhead. In the Epidemic algorithm, redundant message replication consumes a great deal of time and resources, which is the main cause of its large routing overhead. In the FCNS and EWDCR algorithms, the similarity between nodes effectively reduces routing overhead, but there is still room for optimization because the resources consumed by unavailable nodes could be saved. In summary, ETNS has the lowest routing overhead among the five algorithms.
The above experiments show that the proposed method, based on user behavior records and the complex social relationships among users, takes full account of the advantages of communities in the transmission process and provides a way of screening out inefficient nodes within a community, which makes community-based transmission more efficient. The experiments show that the proposed method achieves higher data transmission efficiency and lower routing overhead.

Claims (7)

1. An opportunistic social network effective data transmission method based on node socialization is characterized by comprising the following steps:
constructing an undirected graph of network nodes, and calculating a clustering coefficient of each node according to the undirected graph;
selecting a node with the largest clustering coefficient from a node set to be subjected to community clustering division as a clustering center, performing community clustering division on neighbor nodes of the selected clustering center, and transmitting data to be transmitted by using nodes in the divided communities;
judging whether the similarity between the neighbor nodes in the circular area where the clustering center is located and the clustering center is higher than the average value of the similarity between all the neighbor nodes in the circular area where the clustering center is located and the clustering center, if so, dividing the corresponding neighbor nodes into the community where the clustering center is located, otherwise, waiting for the next clustering division;
if no node with a similarity greater than the average similarity exists in the circular area where the current clustering center is located, or the number of nodes contained in the current community reaches the set value Minpts, ending the current clustering and selecting the node with the largest clustering coefficient in the current node set as the next clustering center;
finishing node clustering after all nodes finish community division or when the maximum value of clustering times is reached;
the initial value of a node set to be subjected to community clustering division comprises all nodes in a network, each node is deleted from the node set after being divided into communities, and each clustering center is used as an initial node in one community;
the neighbor nodes of the clustering center are the nodes contained in a circular area centered on the clustering center whose radius is the set value Eps.
2. The method of claim 1, wherein the clustering coefficients of the nodes are calculated according to the following formula:
wherein C_i denotes the clustering coefficient of node v_i, k_i is the degree of node v_i in the undirected graph, E_i is the actual number of connecting edges among the k_i neighbor nodes of node v_i in the undirected graph, and T_i is the maximum number of connecting edges that the k_i neighbor nodes of node v_i could form.
3. The method according to claim 1, wherein in the node community clustering division process, if there are overlapping communities, the nodes are divided into communities with higher modularity values;
wherein the modularity value of the community is Qc(Xn),
Q_c(X_n) denotes the modularity value of community X_n, which is defined in terms of the following quantities: the number of nodes inside community X_n; the number of connecting edges when community X_n contains x nodes; d_x, the degree of a node when community X_n contains x nodes; and the total number of connecting edges among the nodes currently in community X_n. X_n denotes the n-th community; the initial value of n is 0 and its maximum value is the maximum number of clustering rounds.
4. The method of claim 1, wherein the similarity between nodes is calculated according to the following formula:
wherein S(x_i, x_j) denotes the similarity between node v_i and node v_j; x_ik and x_jk denote the shortest paths in the undirected graph from node v_i and node v_j, respectively, to node k; and m denotes the number of nodes contained in the undirected graph corresponding to the network nodes.
5. The method of claim 1, wherein a comprehensive attribute feature value κ_i is computed for each node v_i and compared with a set threshold κ; if κ_i is less than κ, the corresponding node v_i does not transmit data as a relay node:
τ* = mod(T_i + N_zone · N_seconds, Max_τ)
N_R(j, i) = N_Rh(j, i) + N_Rm(j, i), N_S(j, i) = N_Sh(j, i) + N_Sm(j, i),
a_ij is the value of the element in row i and column j of matrix A, where A is the feature evaluation matrix of the nodes,
ω denotes the weighting coefficient of the node reputation value in the calculation, with value range [0, 1]; n has an initial value of 0, and its maximum value is the maximum number of clustering rounds;
Vr_i denotes the set of nodes that have transacted with the node as receivers, and Vs_i denotes the set of nodes that have transacted with the node as senders; LR_ij denotes the reputation evaluation that node v_i, as a receiver, gives to sender node v_j, and LS_ij denotes the reputation evaluation that node v_i, as a sender, gives to receiver node v_j; LR_ij and LS_ij take values in [0, 1] and are both initialized to 0.5;
N_Rh(i, j) and N_Rm(i, j) denote the numbers of honest and malicious transactions, respectively, that node v_i has had with v_j as a receiver;
N_Sh(i, j) and N_Sm(i, j) denote the numbers of honest and malicious transactions, respectively, that node v_i has had with v_j as a sender, obtained from statistics over historical transmission records;
within a time interval t, the amount of data received by node v_i is denoted r_i(t); B_S r_i(t) denotes the cache occupied during data reception, which is linear in the amount of data collected r_i(t); B_S is the cache a node occupies to collect one unit of data, and B_T is the cache consumed to send the data; J_{n,k}(t) denotes the number of channels allocated to a node, with Σ_{k∈κ} J_{n,k}(t) ≤ 1;
the current timestamp of the node is defined as T, the time zone in which the node is located is denoted N_zone, and Max_τ is the maximum value of τ; v denotes the average speed of all nodes in the history.
6. The method according to any one of claims 1-5, wherein the maximum number of clustering rounds is the number of important nodes;
an important node is a node whose nomination value is greater than the average nomination value of all nodes;
the nomination value of a node is obtained from the interaction relationships among users, which are derived from the users' behavior logs; the nomination value of a node is increased by 1 each time the node successfully forwards data, and the initial nomination value of each node is 1;
node viThe nomination value after the s-th successful data transmission in the history record is
Wherein, the node viThe number of adjacent nodes of (a) is pi-1;representing a node viIn the ith row of the adjacency matrixElement of column j, node viThe size of the adjacent matrix is pi x pi, the nodes are sequentially arranged at the row head and the column head of the adjacent matrix, if a connecting edge exists between the two nodes, the element of the corresponding two nodes in the adjacent matrix takes the value of 1, otherwise, the element takes the value of 0,the value of (d) is 0.
7. The method of claim 6, wherein, supposing that node v_i, which holds data to be forwarded, meets node v_j:
if node v_j is the target node of the data transmission, node v_i transmits the message to node v_j and deletes the message from its send queue;
if node v_j is not the target node of the data transmission, the transmission by node v_i falls into one of the following two cases:
step 7.1: intra-community transmission;
if the target node is in the community of the current node v_i and v_j is also in that community, node v_i transmits the data to node v_j; otherwise it does not transmit;
step 7.2: inter-community transmission;
if the target node is not in the community of node v_i, node v_i sends node v_j a request frame asking whether the target node is in the community to which node v_j belongs; after receiving the request frame, node v_j checks its current community, confirms whether the target node is in it, and sends a response frame back to node v_i; if the target node is in the community of v_j, node v_i sends the data to node v_j; otherwise it does not send.
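The clustering steps recited in claims 1 and 2 can be illustrated with a hedged Python sketch. Since the formula of claim 2 is not legible in this text, the sketch assumes the standard local clustering coefficient C_i = E_i / T_i with T_i = k_i(k_i − 1)/2, which is consistent with the symbol definitions given there. The similarity measure and the Eps range test are passed in as caller-supplied functions, and all function and parameter names are hypothetical.

```python
from itertools import combinations


def clustering_coefficient(adjacency, node):
    """Assumed form C_i = E_i / T_i, with T_i = k_i * (k_i - 1) / 2.

    adjacency: dict mapping node id -> set of neighbor ids (undirected graph).
    """
    neighbours = adjacency[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    possible = k * (k - 1) / 2
    actual = sum(1 for a, b in combinations(neighbours, 2) if b in adjacency[a])
    return actual / possible


def cluster_communities(nodes, coeff, similarity, in_range, minpts, max_rounds):
    """Greedy community division following the steps of claim 1.

    coeff: dict node -> clustering coefficient.
    similarity(a, b) -> float and in_range(center, node) -> bool (the Eps test)
    are supplied by the caller. Node ids must be sortable so ties break
    deterministically.
    """
    remaining = set(nodes)
    communities = []
    rounds = 0
    while remaining and rounds < max_rounds:
        rounds += 1
        center = max(sorted(remaining), key=lambda n: coeff[n])
        remaining.discard(center)
        community = [center]
        neighbours = [n for n in sorted(remaining) if in_range(center, n)]
        if neighbours:
            sims = {n: similarity(center, n) for n in neighbours}
            average = sum(sims.values()) / len(sims)
            for n in neighbours:
                if len(community) >= minpts:
                    break  # community has reached the Minpts limit
                if sims[n] > average:
                    community.append(n)
                    remaining.discard(n)
        communities.append(community)
    return communities
```

Nodes whose similarity to the current center does not exceed the average stay in the pool and are considered again in a later round, as the claim describes.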
CN201910347872.8A 2019-04-28 2019-04-28 Opportunistic social network effective data transmission method based on node socialization Pending CN110611582A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910347872.8A CN110611582A (en) 2019-04-28 2019-04-28 Opportunistic social network effective data transmission method based on node socialization

Publications (1)

Publication Number Publication Date
CN110611582A true CN110611582A (en) 2019-12-24

Family

ID=68889440

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910347872.8A Pending CN110611582A (en) 2019-04-28 2019-04-28 Opportunistic social network effective data transmission method based on node socialization

Country Status (1)

Country Link
CN (1) CN110611582A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20191224
