WO2015126297A1 - Privacy preservation of influential users in a communication environment

Info

Publication number: WO2015126297A1
Application number: PCT/SE2014/050219
Authority: WO
Inventors: Saravanan Mohan; Kumaresh SREEDHAR
Original assignee: Telefonaktiebolaget L M Ericsson (Publ)
Priority date: 2014-02-21
Filing date: 2014-02-21
Publication date: 2015-08-27

Abstract

A privacy retaining arrangement in a communication environment obtains a representation of users as nodes in a group of nodes, where the nodes are connected to each other with directional links representing relationships, where the direction is based on which of the users is initiator of the relationship, the links have weights, the nodes are provided in clusters of different types and a cluster type is based on between which nodes links are provided and their direction, the arrangement also determines at least one preferred node having a link directed away from itself with a weight that is among the highest in the weights of the group, locates a primary cluster (PCL1) comprising the preferred node, selects a corresponding secondary cluster (SCL) of the same type and replaces the weights (46, 52) in the primary cluster (PCL1) with the weights (4, 1) in the secondary cluster (SCL1).

Description

PRIVACY PRESERVATION OF INFLUENTIAL USERS IN A COMMUNICATION ENVIRONMENT

TECHNICAL FIELD

The invention relates to retention of privacy in communication environments. More particularly, the invention relates to a method and computer program product for retaining the privacy of a user in a communication environment as well as to a privacy retaining arrangement and a communication environment comprising such a privacy retaining arrangement.

BACKGROUND

For an operator operating a communication environment, such as a telecom operator operating a telecommunication network like a mobile communication network or an operator of a social community network, it may be of interest to provide data about the users of the environment to third parties. This may be of interest for a third party in order to find out trends of usage, which may in turn lead to the development of new products and services. In the field of telecommunication it may for instance lead to the development of new applications, often termed apps.

However, even if it is of interest for the operator to export such data, it is even more important that privacy is retained. This is especially true for those users that use the services of the environment the most and influence others, which users are sometimes termed influential users. These users may even leave the environment if they find out that data about them has been exported. There is thus a need for exporting data of users while at the same time masking the identity of at least some of the users in order to retain privacy. Aspects of the invention are directed towards solving this problem.

SUMMARY

One object of the invention is thus to retain the privacy of users in a communication environment when connection related data about them is exported out from the communication environment.

This object is according to a first aspect achieved by a privacy retaining arrangement for retaining the privacy of a user in a communication environment. The privacy retaining arrangement comprises a processor acting on computer instructions whereby the privacy retaining arrangement

obtains a representation of at least some users as nodes in a group of nodes within the communication environment,

where the nodes are connected to each other with directional links representing relationships between the users, where the direction is based on which of the corresponding users is initiator of the relationship, where the directional links each have a weight representing the strength of the relationship, where the nodes of the group are provided in clusters of different types and where a cluster type is based on between which nodes links are provided and their direction,

determines at least one preferred node for which the user identity is to be masked, where the at least one preferred node has a link directed away from itself with a weight that is among the highest in the weights of the group,

locates a primary cluster comprising the preferred node,

selects a corresponding secondary cluster of the same type as the located primary cluster, and

replaces the weights of the links in the located primary cluster with the weights of the links of the selected secondary cluster. In this way a changed group of nodes that masks the identity of the user corresponding to the preferred node is obtained.
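The steps of the first aspect can be illustrated with a minimal Python sketch. It is not the claimed implementation: the dictionary representation, the function names, the `top_k` cut-off used to interpret "among the highest" and the node labels are all assumptions made for illustration. The sketch also assumes that the replacement is carried out as an exchange, so that the secondary cluster receives the primary weights in return. Only the weight values 46, 52, 4 and 1 are taken from the abstract.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]        # (initiator, recipient); the link points at the recipient
Weights = Dict[Edge, float]   # directed link -> weight (strength of the relationship)


def preferred_nodes(weights: Weights, top_k: int = 1) -> List[str]:
    """Return the initiators of the top_k heaviest links (top_k is an assumed cut-off)."""
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    result: List[str] = []
    for (initiator, _recipient), _weight in ranked[:top_k]:
        if initiator not in result:
            result.append(initiator)
    return result


def replace_weights(weights: Weights, primary: List[Edge], secondary: List[Edge]) -> Weights:
    """Give the links of the primary cluster the weights of the secondary cluster.

    Both lists must describe clusters of the same type, with their links listed
    in corresponding order. Assumption: the weights are exchanged, so the
    secondary links take over the primary weights and the node structure is kept.
    """
    if len(primary) != len(secondary):
        raise ValueError("clusters of the same type must have the same number of links")
    changed = dict(weights)  # leave the original group of nodes untouched
    for p_edge, s_edge in zip(primary, secondary):
        changed[p_edge], changed[s_edge] = weights[s_edge], weights[p_edge]
    return changed


# Hypothetical group: A is influential, D has a cluster of the same type with low weights.
group = {("A", "B"): 46, ("A", "C"): 52, ("D", "E"): 4, ("D", "F"): 1}
masked = replace_weights(group, [("A", "B"), ("A", "C")], [("D", "E"), ("D", "F")])
```

In the changed group the heaviest outgoing links no longer start at node A, while the set of nodes and links is the same as before.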

This object is according to a second aspect also achieved by a communication environment comprising a privacy retaining arrangement. The privacy retaining arrangement in turn comprises a processor acting on computer instructions whereby the privacy retaining arrangement is operative to

obtain a representation of at least some users as nodes in a group of nodes within the communication environment,

determine at least one preferred node for which the user identity is to be masked, where the at least one preferred node has a link directed away from itself with a weight that is among the highest in the weights of the group,

locate a primary cluster comprising the preferred node,

select a corresponding secondary cluster of the same type as the located primary cluster, and

replace the weights of the links in the located primary cluster with the weights of the links of the selected secondary cluster in order to obtain a changed group of nodes that masks the identity of the user corresponding to the preferred node.

The object is according to a third aspect achieved through a method for retaining the privacy of a user in a communication environment. The method is performed in a privacy retaining arrangement in the communication environment and comprises

obtaining a representation of at least some users as nodes in a group of nodes within the communication environment,

determining at least one preferred node for which the user identity is to be masked, where the at least one preferred node has a link directed away from itself with a weight that is among the highest in the weights of the group,

locating a primary cluster comprising the preferred node,

selecting a corresponding secondary cluster of the same type as the located primary cluster, and

replacing the weights of the links in the located primary cluster with the weights of the links of the selected secondary cluster in order to obtain a changed group of nodes that masks the identity of the user corresponding to the preferred node.

The object is according to a fourth aspect achieved through a computer program for retaining the privacy of a user in a communication environment. The computer program comprises computer program code which when run in a privacy retaining arrangement in the communication environment, causes the privacy retaining arrangement to:

obtain a representation of at least some users as nodes in a group of nodes within the communication environment, where the nodes are connected to each other with directional links representing relationships between the users, where the direction is based on which of the corresponding users is initiator of the relationship, where the directional links each have a weight representing the strength of the relationship, where the nodes of the group are provided in clusters of different types and where a cluster type is based on between which nodes links are provided and their direction,

locate a primary cluster comprising the preferred node,

The object is according to a fifth aspect finally also achieved through a computer program product for retaining the privacy of a user in a communication environment, said computer program product being provided on a data carrier and comprising said computer program code according to the fourth aspect.

The invention according to the above-mentioned aspects has a number of advantages. It allows the identities of influential users to be masked, since their influence, which is represented by a high weight, is now placed at nodes that actually represent influenced users. At the same time the node structure is unchanged and data is retained in the structure. No data is thus lost. This means that an interested party may still obtain valuable information even though the identities of influential users have been masked.

In an advantageous variation of the first aspect, the privacy retaining arrangement is further configured to obtain more than one secondary cluster of the same type as the primary cluster and select an obtained secondary cluster the links of which have the lowest weights.

In a corresponding variation of the third aspect, the method further comprises obtaining more than one secondary cluster of the same type as the primary cluster and the selecting of a corresponding secondary cluster comprises selecting an obtained secondary cluster the links of which have the lowest weights. A preferred node may be included in more than one primary cluster and a corresponding node of a secondary cluster may be present in more than one selected secondary cluster.
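The selection among several candidate secondary clusters can be sketched as follows. This is only an illustration: the function name and data layout are hypothetical, and "lowest weights" is interpreted here as the lowest sum of link weights, which is an assumption since the text does not fix the measure.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]        # (initiator, recipient)
Weights = Dict[Edge, float]   # directed link -> weight


def select_secondary(weights: Weights, candidates: List[List[Edge]]) -> List[Edge]:
    """Pick the candidate cluster whose links have the lowest total weight.

    Every candidate is assumed to be of the same type as the primary cluster.
    """
    if not candidates:
        raise ValueError("no secondary cluster of the required type was obtained")
    return min(candidates, key=lambda cluster: sum(weights[edge] for edge in cluster))


# Hypothetical candidates: two clusters in which a central node points at two others.
weights = {("D", "E"): 4, ("D", "F"): 1, ("G", "H"): 7, ("G", "I"): 3}
chosen = select_secondary(weights, [[("G", "H"), ("G", "I")], [("D", "E"), ("D", "F")]])
```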

The types of clusters may comprise three major types, a first type where one central node is linked to and points at all the other nodes in the cluster, a second type where the nodes follow each other in a chain and the links point in the same direction and a third type where one central node is linked to and points at all the other nodes and a limited number of these other nodes has a link pointing back to the central node. There may also exist minor types of clusters. It is in some cases possible to consider also these minor types of clusters should the need arise. The number of nodes in the clusters may also be three.

According to a further variation of the first aspect, the privacy retaining arrangement is further operative to receive a request for the group of nodes from a third party interested in data about the communication environment and provide the changed group of nodes as a response. According to a corresponding variation of the third aspect, the method further comprises receiving a request for the group of nodes from a third party interested in data about the communication environment and providing the changed group of nodes as a response.

The communication environment may comprise a telecommunication network, which may be a mobile telecommunication network. The communication environment may also comprise more than one group of nodes, where the strength of the relationship between groups is lower than the strength of the relationship within a group.

It should be emphasized that the term "comprises/comprising" when used in this specification is taken to specify the presence of stated features, integers, steps or components, but does not preclude the presence or addition of one or more other features, integers, steps, components or groups thereof.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention will now be described in more detail in relation to the enclosed drawings, in which:

fig. 1 schematically shows two users in a communication environment exemplified by a mobile communication network as well as a third party device,

fig. 2 shows a block schematic of a first way of realizing a privacy retaining arrangement in the mobile communication network,

fig. 3 shows a block schematic of a second way of realizing the privacy retaining arrangement in the mobile communication network,

fig. 4 shows a flow chart of method steps for forming a node representation of users in the mobile communication network,

fig. 5 schematically shows three major types of node clusters that may exist in the node representation,

fig. 6 shows a flow chart of method steps in a method for retaining the privacy of users in the mobile communication network according to a first embodiment,

fig. 7 shows a flow chart of method steps in a method for retaining the privacy of users in the mobile communication network according to a second embodiment,

fig. 8 shows an exemplifying group of nodes in the node representation having three preferred nodes, where the nodes are connected with directional links having weights,

fig. 9a and 9b show primary clusters and corresponding secondary clusters in the group of nodes,

fig. 10a and 10b show the replacing of links between one primary cluster and a corresponding secondary cluster in the group of nodes,

fig. 11a and 11b show the replacing of links between another primary cluster and a corresponding secondary cluster in the group of nodes,

fig. 12a and 12b show the replacing of links between a further primary cluster and a corresponding secondary cluster in the group of nodes,

fig. 13a and 13b show the replacing of links between yet another primary cluster and a corresponding secondary cluster in the group of nodes,

fig. 14 shows the exemplifying group of nodes after the changing of weights has been performed, and

fig. 15 shows a computer program product comprising a data carrier with computer program code for implementing the functionality of the privacy retaining arrangement.

DETAILED DESCRIPTION

In the following description, for purposes of explanation and not limitation, specific details are set forth such as particular architectures, interfaces, techniques, etc. in order to provide a thorough understanding of the invention. However, it will be apparent to those skilled in the art that the invention may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known arrangements, devices, circuits and methods are omitted so as not to obscure the description of the invention with unnecessary detail.

Fig. 1 schematically shows an example of a communication environment. In this example the communication environment is a telecommunication network, which more particularly is an exemplifying mobile telecommunication network. The mobile telecommunication network will in the following be referred to simply as a mobile communication network MN 20. The mobile communication network 20 comprises a base station BS 14 connected to a serving gateway SGW 16. The serving gateway 16 is in turn connected to a PDN Gateway PGW 18, where PDN is an acronym for Packet Data Network. In the mobile communication network 20 there is also a privacy retaining arrangement 22. The arrangement 22 is in the mobile communication network 20 connected to the SGW 16, the PGW 18 and to a Graph DataBase GDB 23. It should be realized that the mobile communication network 20 may comprise several more devices. There may also be more devices of the same type, such as PGWs, SGWs and base stations. The mobile communication network may furthermore be a network allowing Internet connectivity such as Long Term Evolution (LTE) or Wideband Code Division Multiple Access (WCDMA). Aspects of the invention will in the following be described in relation to the mobile communication network 20. However, the telecommunication network is not limited to mobile communication networks, but may for instance be a Public Switched Telephone Network (PSTN). Furthermore, the communication environment is not limited to telecommunication but may for instance be a social community environment such as LinkedIn, Facebook etc. The base station 14, which is often termed eNodeB or just NodeB, is provided in a part of the mobile communication network 20 termed access network, while the other devices are provided in a part of the mobile communication network 20 termed a core network.

A first user U1 of the mobile communication network 20 is furthermore equipped with a first terminal 10 or a first mobile station MS1, via which he or she may communicate with other users via the mobile communication network 20. In fig. 1 there is also shown a second user U2 of the mobile communication network 20 also equipped with a terminal in the form of a second mobile station MS2 12. Both of these are also shown as communicating with the base station 14. The two users U1 and U2 and their mobile stations 10 and 12 are only shown to exemplify how communication occurs in the network. It should be realized that the mobile communication network does comprise several more users and terminals. A mobile station may furthermore be any type of cellular phone, where one example is a smart phone. It may also be a computer such as a palmtop or a laptop computer. As can be seen in fig. 1, the privacy retaining arrangement 22 is also connected to a third party device 24 outside of the mobile network 20. The privacy retaining arrangement 22 receives a request RQ from this third party device 24 and answers the request with a response RP. The request RQ and response RP typically concern a group of nodes representing users in the mobile communication network. This will be discussed in more detail later.

Fig. 2 shows a block schematic of a first way of realizing the privacy retaining arrangement PRA 22. It may be provided in the form of a processor PR 26 connected to a program memory M 28. The program memory 28 may comprise a number of computer instructions implementing the functionality of the privacy retaining arrangement 22 and the processor 26 implements this functionality when acting on these instructions. It can thus be seen that the combination of processor 26 and memory 28 provides the privacy retaining arrangement 22.

Fig. 3 shows a block schematic of a second way of realizing the privacy retaining arrangement PRA 22. The privacy retaining arrangement 22 may comprise a Graph Creator GC 30, a Group Identifier GI 32, an Influential User Identifier IUI 34, a Cluster Identifier CI 36 and a Weight Changer WC 38.

The elements in fig. 3 may be provided as software blocks, for instance as software blocks in a program memory, but also as a part of dedicated special purpose circuits, such as Application Specific Integrated Circuits (ASICs) and Field-Programmable Gate Arrays (FPGAs). It is also possible to combine more than one element in such a circuit.

The users of the mobile communication network 20 communicate with each other. This communication may involve a user calling another user, as well as sending a message to the other user. It may also involve the first mentioned user posting an item, such as a text, an image, a sound or video clip at a server and allowing other users to access the item and possibly also comment on it. A user making a call, sending a message or posting an item thus initiates a relationship or contact with one or more other users and is therefore a relationship or contact initiator or just initiator, while a user receiving a call or a message, or reading or commenting on a posted item, is a responding user or a recipient. A dominant initiator of a relationship, i.e. an initiator having a strong relationship to a recipient, is furthermore often termed an influential user, since he or she may influence one or more of the recipients, which are thus influenced users. Data concerning different users, and especially data pertaining to with whom they initiate a relationship, may be of interest to collect and use for various purposes. It may for instance be of interest to provide third parties with data to analyse and target specific key individuals for effective influencer marketing. Influencer marketing is a form of marketing that has emerged from a variety of recent practices and studies, in which focus is placed on specific key individuals, i.e. the influential users, rather than the entire network. The individuals who have influence over potential customers are identified, and business activities are oriented around these influencers. While it may lie in the interest of the operator of the community or communication environment to provide such data about influential users to a third party, it is at the same time troublesome in that the privacy of the user may be violated.
An influencer's privacy is of great concern to the operator since the lack of a privacy-preserving technique may lead to a higher churn rate (a measure of the number of individuals moving out of a network) among influential customers. As an influential user influences other users, he or she may in such a case also influence them to leave the communication environment. The operator may thus also lose a significant number of influenced users, which may cause a complete collapse of the environment. To help guide public policy to protect individuals' privacy as well as to promote complex mobile analytics, operators are faced with the problem of providing third parties with a fairly precise picture of the quantities or trends of the network without disclosing information about specific key individuals of the environment.

One way of providing data of the users that has emerged lately is the representation of the users in the communication environment as nodes in a graph. The users are thus nodes in a graph, where the nodes representing the users are connected to each other with directional links, sometimes termed edges, representing relationships between the users. In these links the direction is based on which of the corresponding users is the initiator in the relationship. Furthermore the communication among a group of users of the communication environment is often more extensive than outside the group and hence groups of users or sub-graphs are often of more interest than the complete graph of the whole communication environment. Sub-graphs, for instance based on location and time in various networks, especially telecommunication networks, have provided third parties with unprecedented opportunities.

This way of exporting data in the form of a graph or sub-graph does provide the third party with data about the users while still retaining some privacy. However, on-going research in security and privacy has shown that there is a potential risk of influential individuals being identified even with an anonymized graph representation of such networks. It may thus still be possible to learn the identity of the users when data is provided in this format.

There is therefore a need for a node-level privacy preservation technique for specific key individuals in a communication environment that guarantees the maintaining of the privacy of influencers while providing an accurate sub-graph to the third party for analysis.

Aspects of the invention are directed towards such privacy preservation.

Aspects of the invention are related to providing a non-interactive mechanism for releasing the statistical graph data that maintains both privacy and utility. It is to be duly noted that every influential individual's identity is at stake when communication environment data is published in the form of a graph or sub-graph to third parties, even when anonymized. A third party is nevertheless interested in obtaining such graphs or sub-graphs.

Therefore there will now follow a description of how such a graph may be created. A graph may be based on data concerning communication made by the users of the communication environment. Therefore such data may have to be collected.

How this data collection may be performed in order to create a graph will now be described with reference also being made to fig. 4, which shows a flow chart of method steps for forming a node representation of users in the mobile communication network.

The users of the mobile communication network 20, exemplified by the first and second users U1 and U2, are involved in various types of communications with each other. Thereby they have different relationships. A user, such as the first user U1, may for instance call another user, such as the second user U2, and/or send a message to the other user, such as a Short Message Service (SMS) message or Multimedia Messaging Service (MMS) message or an e-mail. It is also possible that the first user U1 posts an item at a site, such as a text, an image, a sound clip or a video clip on a server accessed by the other users and that the second user U2 then accesses and perhaps also comments on the posted item. In these types of communications the first user U1 is an initiator or influencer while the second user U2 is a recipient or responding or influenced user. Here it is possible that the second user U2 in a similar manner connects to the first user U1, in which case also the second user U2 is an influencer.

In order to be able to form a graph, the graph creator GC 30 obtains network data for the users of the network, step 40. The network data may comprise session specific data such as data about actual communication sessions set up between users. In relation to the first user U1, the communication network 20 may thus collect data of the calls made by him or her, the messages sent by him or her as well as other types of activities, such as file transfers to different servers. It is in the same way possible to collect records of which other users connect to a server to which the first user has transferred a file (for instance based on identified Uniform Resource Locators used by these other users). It can thereby be established which other users, such as the second user U2, are recipients in relation to such a file transfer. Such session specific data may be collected in session data records (SDRs), which in the case of calls may be so-called call data records (CDRs). SDRs may thus be analysed in order to find out with which other users the user U1 has relationships, as is for instance evidenced through connections being set up. Also the strength of the relationship, for instance based on the frequency and length of sessions involving such connections as well as what response the connection generates, may be of interest. Such session data records may be obtained from the SGW 16 and/or the PGW 18. The graph creator 30 may therefore communicate with the SGW 16 and/or the PGW 18 in order to obtain SDRs. This activity may be performed in respect of all the users of the mobile communication network 20.

After having obtained such network data, the graph creator 30 then analyses the network data with regard to the relationships between the users, step 42. The analysis may comprise an analysis of the connections being set up by the various users. The graph creator 30 thus analyses the network data with regard to which other users in the network the first user U1 has a relationship, such as which other users he or she connects to, i.e. acts as initiator, as well as with which other users the same user acts as recipient. All the connections made from the first user U1 to another user, such as the second user U2, may then be combined and given a value. In the same manner all connections initiated by the second user U2 to the first user may also be combined and given a value. The value of the connections initiated by the first user U1 may then be the sum of connections in a given time interval. Some types of connections may also be considered to be more important. Telephone calls may for instance be considered more important than messages and a posting may be considered the most important. Each sum may therefore be multiplied by a corresponding importance factor in order to form the value. It is also possible to use an averaging. With regard to the first user U1, there is thus obtained a value for all connections to other users in which the first user acts as initiator. The same operations are also performed for all users. As the same operations are performed for the other users, corresponding values are also obtained for all connections from other users to the first user U1 where the first user is a recipient. The values then represent the strength of the relationships between the nodes. It can also be seen that the strength of the relationship may have a dependency on the amount of contact between the users.
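The forming of a value per initiator and recipient pair can be sketched as below. The record layout, the function name and the numerical importance factors are hypothetical. The text only states that calls may be considered more important than messages and that postings may be considered the most important. Adding the factored sums of the different connection types into one value is likewise an assumption.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical importance factors, ordered as in the text: posting > call > message.
IMPORTANCE = {"message": 1.0, "call": 2.0, "posting": 3.0}

Record = Tuple[str, str, str]  # (initiator, recipient, connection type) within one time interval


def link_values(records: Iterable[Record]) -> Dict[Tuple[str, str], float]:
    """Combine all connections from an initiator to a recipient into one value."""
    counts: Dict[Tuple[str, str], Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for initiator, recipient, kind in records:
        counts[(initiator, recipient)][kind] += 1  # sum of connections per type
    return {
        pair: sum(IMPORTANCE[kind] * n for kind, n in per_kind.items())
        for pair, per_kind in counts.items()
    }


records = [
    ("U1", "U2", "call"),
    ("U1", "U2", "call"),
    ("U1", "U2", "message"),
    ("U2", "U1", "posting"),
]
values = link_values(records)  # {("U1", "U2"): 5.0, ("U2", "U1"): 3.0}
```

The two directions between U1 and U2 get separate values, which matches the later use of two directional links between the same two nodes.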

Thereafter a graph with nodes is formed, step 44, where the graph is made for all the users in the mobile communication network 20. Each node in the graph then represents one user and the nodes are furthermore interconnected with directional links or edges. The link between two nodes thereby represents the relationship between the corresponding users and the direction indicates which of the users is initiator of the relationship. A link between an initiator and a recipient is then directed towards or points at the recipient. The link will furthermore be provided with a weight corresponding to the above mentioned value. The links are thus provided with weights representing the strength of the relationship between two nodes. As can be seen from the discussion above there may be two links between the same two nodes, each representing the role of initiator or recipient for a corresponding node. The link of an initiator thus points away from the node of the initiator while the link of a recipient points towards the node of the recipient. The communication patterns of the users in the communication environment are thus represented by a graph, where each user is a node in the graph.

In the study of complex communication environments (such as telecommunication networks), an environment is said to have community structure if the nodes of the graph can easily be grouped into groups of nodes such that each group of nodes is densely connected internally. This implies that the environment divides naturally into groups of nodes with dense connections internally and sparser connections between groups. The more general definition is based on the principle that pairs of nodes are more likely to be connected if they are both members of the same community, and less likely to be connected if they do not share communities. A suitable community detection algorithm may therefore also be employed on the graph to detect various communities involved.

Thus, in a communication environment like the mobile communication network 20, the users communicate within communities, where a group of nodes corresponding to users in such a community may be represented by a sub-graph. A user does thus not have a relationship with every other user in the mobile communication network 20 but only with a limited group, such as friends, customers and colleagues. The amount of communication or the number of relationships within a community may be fairly high while the amount of communication or the number of relationships outside of the community may be considerably lower. It is then possible to locate or determine groups of nodes in the graph representing users communicating more with each other and thus having more relationships with each other than the rest of the users. Therefore after the graph has been formed by the graph creator 30, the group identifier 32 determines or identifies groups of nodes corresponding to such communities based on the links, weights and nodes, step 46. The determination of a group may be based on the fact that the strength of the relationship between groups is lower than the strength of the relationship within a group. This may be done through comparing the weights of the links with a group defining threshold. It is more particularly possible to obtain a first average of the weights of all links within a prospective group. It is also possible to obtain a second average of the weights of all the links leaving such a prospective group. These two averages may then be compared. If the first average is considerably higher than the second, such as if the difference is higher than a first communication density threshold or the quotient is higher than a second communication density threshold, then a sub-graph may be considered to exist. The users in the community are thus represented in the sub-graph as nodes in a group of nodes within the communication environment, which group corresponds to the community.
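The comparison of the two averages can be sketched as follows. The function name, the parameter names and the threshold values are hypothetical, and the sketch assumes non-negative weights so that the quotient test can be written without a division.

```python
from typing import Dict, Optional, Set, Tuple

Weights = Dict[Tuple[str, str], float]  # (initiator, recipient) -> weight


def is_group(
    weights: Weights,
    members: Set[str],
    diff_threshold: Optional[float] = None,   # first communication density threshold
    ratio_threshold: Optional[float] = None,  # second communication density threshold
) -> bool:
    """Decide whether a prospective group of nodes may be considered a sub-graph."""
    internal = [w for (a, b), w in weights.items() if a in members and b in members]
    leaving = [w for (a, b), w in weights.items() if a in members and b not in members]
    if not internal:
        return False
    first_avg = sum(internal) / len(internal)                    # links within the group
    second_avg = sum(leaving) / len(leaving) if leaving else 0.0  # links leaving the group
    if diff_threshold is not None and first_avg - second_avg > diff_threshold:
        return True
    # Quotient test, rearranged to avoid dividing by a zero second average.
    if ratio_threshold is not None and first_avg > ratio_threshold * second_avg:
        return True
    return False


weights = {("a", "b"): 10, ("b", "c"): 8, ("c", "a"): 12, ("a", "x"): 1, ("c", "y"): 2}
print(is_group(weights, {"a", "b", "c"}, diff_threshold=5.0))  # averages are 10.0 and 1.5
```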
The nodes are furthermore found to be connected to each other in one of three major types of clusters, where a cluster type is based on between which nodes links are provided and their direction. A cluster may be a triad, i.e. a cluster of three. There is in this case a first type T1 where one central node is linked to and points at all the other nodes in the cluster, a second type T2 where the nodes follow each other in a chain and the links point in the same direction, and a third type T3 where one central node is linked to and points at all the other nodes and a limited number of these other nodes has a link pointing back to the central node. In the case of a triad the limited number of nodes pointing back is one. The group identifier 32 thus also notes the types of clusters that are present and more particularly what type of cluster each node is connected in, step 48. There may in some cases exist minor types of clusters, which minor types thus differ from the major types. Fig. 5 schematically shows the three different major node types T1, T2 and T3 for a triad cluster, i.e. for a cluster of three nodes.
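The three major triad types may be expressed as simple link-pattern tests. The following Python sketch is illustrative only: the function `matches_triad`, the node labels and the toy link set are hypothetical, and the sketch merely checks whether the links of a pattern are present among three nodes. It does not attempt to resolve cases where the same three nodes satisfy more than one pattern.

```python
def matches_triad(links, kind, a, b, c):
    """Return True if the links of the given pattern are present.

    links: set of (initiator, recipient) pairs
    T1: a is central and points at b and c.
    T2: the nodes form a chain a -> b -> c.
    T3: a is central, points at b and c, and exactly one of them points back.
    """
    if len({a, b, c}) != 3:
        return False
    if kind == "T1":
        return (a, b) in links and (a, c) in links
    if kind == "T2":
        return (a, b) in links and (b, c) in links
    if kind == "T3":
        central = (a, b) in links and (a, c) in links
        pointing_back = ((b, a) in links) + ((c, a) in links)
        return central and pointing_back == 1
    raise ValueError(f"unknown cluster type: {kind}")


# Hypothetical toy links, not taken from any figure.
LINKS = {("c", "a"), ("c", "b"), ("a", "c"), ("b", "d"), ("d", "e")}

is_t1 = matches_triad(LINKS, "T1", "c", "a", "b")   # c points at a and b
is_t2 = matches_triad(LINKS, "T2", "b", "d", "e")   # chain b -> d -> e
is_t3 = matches_triad(LINKS, "T3", "c", "a", "b")   # a also points back at c
```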

When this has been done there is data present in the form of the graph and sub-graphs providing information about the communication pattern and relationships of the users in the communication environment. This data may then be stored by the group identifier 32 in the graph database 23. This graph database 23 may furthermore be continuously updated based on the communications made by the users. The graph data may then later be exported to a third party, for instance in order to allow the third party to use directed advertising towards influential users.

The identity of the user cannot be directly gathered from such node data. However, it may be possible for the third party to learn the identity of a user representing a node, which is in many cases not acceptable. The user has not necessarily accepted that data about his or her communication patterns is exported in this way, especially if he or she has not been notified of the delivery and finds out that it has been made.

Further problems may occur if the user in question is influential. Influential users are the important users who possess a certain amount of influence on other users within a community. By identifying the influential users in each community, it is possible to analyse the behaviour of the entire community and also increase the probability of recommendations being more relevant to the users. But this identification mechanism must not result in loss of privacy of individuals involved, especially while publishing the network sub-graph to third-parties.

Furthermore if an influential user proceeds and leaves the communication environment, he or she will most likely be followed by others, so called followers, which may be detrimental to the operator and perhaps the whole environment. It is thus necessary that the privacy of the influential user is enhanced. Aspects of the invention are directed towards such privacy retention when exporting graph or sub-graph data to a third party. A first embodiment outlining this will now be described with reference also being made to fig. 6, which shows a flow chart of method steps in a method for retaining the privacy of users in the mobile communication network 20 and being performed in the privacy retaining arrangement 22.

The method starts with the influential user investigator 34 obtaining a group of nodes, step 50, for instance a group of nodes or a sub-graph desired by the third party 24. This group or sub-graph may be fetched from the previously mentioned graph database 23. As an alternative the graph creator 30 and group identifier 32 may create a graph or a sub-graph from new data. If creating a sub-graph from new data, it is possible to use previous knowledge about what users are parts of the community based on previously generated graphs with sub-graphs. Thereafter the influential user investigator 34 determines at least one preferred node in the group, step 52. This preferred node is a node for which the user identity is to be masked. The node is furthermore a node representing an influential user. The preferred node may be a node having an outgoing directional link with a weight that is among the highest in the weights of the group, which may be the case if the weight exceeds a weight threshold. Alternatively an average of the weights of the outgoing links may be compared with a threshold. It is also possible that one or more of the nodes having the highest weights or averages in the group are determined to be preferred nodes, such as the top node or the top ten nodes in the group. The above described preferred node determination is a reliable influencer identification algorithm, which may be applied on each of the communities represented by a network sub-graph to make a list of individuals whose identities are to be preserved to prevent third parties from violating their privacy.

Each such node determined to be a preferred node is then analysed with regard to at least one cluster that it is a part of as well as to cluster type. The cluster identifier 36 therefore locates at least one primary cluster, step 54, where a primary cluster is a cluster comprising a preferred node. If there is only one preferred node, the cluster identifier thus looks at the sub-graph or group and identifies at least one primary cluster that comprises this preferred node. Also the type of each of the primary clusters is noted. This locating is made for all of the preferred nodes. Thereafter the cluster identifier 36 goes on and selects a secondary cluster of the same type, step 56. For a given primary cluster of a certain type, it may thus identify at least one secondary cluster that is of the same type. If only one such secondary cluster is identified, this is also selected for matching to the primary cluster. However, if there is more than one such candidate secondary cluster of the same type, the cluster identifier 36 selects a secondary cluster according to a selection scheme. The secondary cluster selected may for instance be the cluster having the lowest average weight or the cluster having the link with the lowest weight. Each secondary cluster is then mapped to the corresponding primary cluster. The weight changer 38 is then informed of the mapping.

Thereafter the weight changer 38 changes the weights of the mapped clusters, step 58. It thus replaces the weights of the links in the primary clusters with weights of the corresponding links of the selected secondary clusters. This may mean that the link in a primary cluster switches weight with a corresponding link of the corresponding secondary cluster.

After the weights have been changed in this way, the identities of the users have been masked as the weights, which represent the influence of the influential users, are now placed at nodes that actually represent influenced users. At the same time the data is retained in the structure. No data is thus lost.

The changed sub-graph may then be stored in the graph database 23 or perhaps delivered to the third party 24.

A second embodiment will now be described with reference also being made to fig 7, which shows a flow chart of method steps in a method for retaining the privacy of users in the mobile communication network being performed by the privacy retaining arrangement 22, as well as to fig. 8 - 12, which show an exemplifying group of nodes and how these are handled for changing weights.

The method does in this case start with the mobile communication network receiving a request RQ for a group of nodes from the third party device 24, step 60. The third party 24 may wish to target specific key individuals on the mobile communication network 20, and the request RQ may therefore be a request to the operator to release a sub-graph containing users and their links. The request may be received by an operator, who after accepting it obtains the corresponding sub-graph from the graph database 23 (based on location and time) and provides this sub-graph as input to the influential user identifier 34 of the privacy retaining arrangement 22. As an alternative, the request RQ may be directly received by the influential user investigator 34, which then goes on and obtains the group or sub-graph from the graph database or from the group identifier 32.

The sub-graph may as an example be the sub-graph of a group G of nodes shown in fig. 8, which group G represents the users of the community that the third party 24 is interested in. In fig. 8 it can be seen that there is a first node 1, a second node 2 and a third node 3, which are all linked to a fourth node 4 as recipients. The node 4 is thus initiator and has a directional link with a weight 46 to node 3, a directional link with a weight 25 to node 1 and a directional link with a weight 32 to node 2. There is also a fifth node 5 acting as initiator in relation to node 4 and recipient in relation to node 2. Node 2 thereby has a directional link with a weight 19 to node 5 and node 5 has a directional link with weight 14 to node 4. There is also a sixth node 6 acting both as initiator and recipient in relation to a seventh node 7. Node 6 thus has a directional link with a weight 24 to node 7 and node 7 has a directional link with a weight 37 to node 6. Node 7 also has a directional link with a weight 41 to node 3 and a directional link with a weight 52 to an eighth node 8. Node 8 acts as a recipient in relation to a ninth node 9 and thus node 9 has a directional link with a weight 3 to node 8. Node 9 also has a directional link with a weight 5 to node 5. There is also a tenth node 10 having a directional link with weight 12 to node 7, an eleventh node 11 having a directional link with weight 46 to node 7 and a twelfth node 12 having a directional link with weight 6 to node 8. Here it may be mentioned that node 12 is recipient in relation to node 9. Node 9 therefore has a directional link with weight 7 to node 12. There is also a thirteenth node 13 and a fourteenth node 14, where node 13 has a directional link with weight 9 to node 10 and a directional link with weight 8 to node 14. Also node 10 has a directional link with weight 4 to node 14. Finally node 14 has a directional link with weight 10 to node 11, a directional link with weight 1 to node 12 and a directional link with weight 2 to node 13.

After having obtained the sub-graph with the group G, the influential user identifier 34 then determines influential users, step 62. Generally, the influencers of a community in a network sub-graph are characterized by high weighted degree values, which are defined as the average of the weights of outgoing links from a node. The average weighted degree of a sub-graph can in turn be regarded as the average of weighted degrees of all the nodes present in the sub-graph. In order to determine influential users, the influential user identifier 34 therefore looks at the nodes that have initiator links with the highest weights. It may furthermore look at the nodes having the highest average weights. The weights going out from each node may thus be averaged, and the node with the highest average may be considered to correspond to the most influential user of the community represented by the group G. This means that the nodes that have links with a direction away from themselves, the averages of which are the highest in the group or sub-graph, are considered to correspond to influential users. It is here possible that one or more of the nodes having the highest average are selected, such as the top ten, top five or the single node having the highest average. As an alternative it is possible to compare the average with a threshold. All nodes having a link with a value above an influential user threshold may be considered to be preferred nodes and thus correspond to influential users.

In the example of fig. 8 the three top nodes are selected, i.e. the three nodes that have the highest averages. As can be seen these are node 11, node 7 and node 4, since node 11 has an average of 46, node 7 has an average of about 43.3 and node 4 has an average of about 34.3. Node 7 may here represent the first user U1. Node 11 is thereby a first preferred node PN1, node 7 is a second preferred node PN2 and node 4 a third preferred node PN3. The influential user identifier 34 then informs the cluster identifier 36 about the preferred nodes PN1, PN2 and PN3.
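The averages quoted for the example group G can be reproduced with a short Python sketch. The link weights are transcribed from the description of fig. 8; the function names and the dictionary layout are assumptions made for illustration and are not part of the described arrangement.

```python
from collections import defaultdict

# Directed links (initiator, recipient) -> weight, transcribed from the
# description of the example group G of fig. 8.
EDGES = {
    (4, 3): 46, (4, 1): 25, (4, 2): 32, (2, 5): 19, (5, 4): 14,
    (6, 7): 24, (7, 6): 37, (7, 3): 41, (7, 8): 52, (9, 8): 3,
    (9, 5): 5, (10, 7): 12, (11, 7): 46, (12, 8): 6, (9, 12): 7,
    (13, 10): 9, (13, 14): 8, (10, 14): 4, (14, 11): 10,
    (14, 12): 1, (14, 13): 2,
}


def average_outgoing_weight(edges):
    """Average weight of the links directed away from each initiating node."""
    outgoing = defaultdict(list)
    for (initiator, _recipient), weight in edges.items():
        outgoing[initiator].append(weight)
    return {node: sum(ws) / len(ws) for node, ws in outgoing.items()}


def preferred_nodes(edges, top_n=None, threshold=None):
    """Rank nodes by average outgoing weight, keeping the top_n nodes
    and/or the nodes whose average exceeds the threshold."""
    averages = average_outgoing_weight(edges)
    ranked = sorted(averages, key=averages.get, reverse=True)
    if threshold is not None:
        ranked = [n for n in ranked if averages[n] > threshold]
    return ranked if top_n is None else ranked[:top_n]


averages = average_outgoing_weight(EDGES)
top_three = preferred_nodes(EDGES, top_n=3)  # nodes 11, 7 and 4, in that order
```

The threshold variant may be exercised in the same way, for instance `preferred_nodes(EDGES, threshold=40)`, where the value 40 is a purely hypothetical influential user threshold.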

The cluster identifier 36 then proceeds and identifies or locates the clusters in which the preferred nodes are a part as well as the types of nodes, step 64. A list of clusters, here in the form of triads, comprising at least one of the influencers may thereby be extracted out from the sub-graph. A number of predominant or major clusters T1, T2 and T3 in telecommunication networks comprising at least one influential user may thus be extracted out from the sub-graph.

How the locating may be performed is exemplified in fig. 9A.

It can be seen that the first preferred node PN1, node 11, is in a cluster of the second type T2 together with nodes 7 and 8, where node 11 points at node 7, which in turn points at node 8. This cluster is a first primary cluster PCL1. It can be seen that the first primary cluster PCL1 comprises more than one preferred node, in this example also node 7 in addition to node 11. In a similar manner it can be seen that the second preferred node PN2, node 7, is in a cluster together with nodes 3 and 6, which cluster is a second primary cluster PCL2 that is of the third type T3. Node 7 is here a central node having a directional link to node 3 and a directional link to node 6, where node 6 has a directional link back to node 7. The third preferred node PN3, node 4, is in a cluster of the first type T1 together with node 1 and node 3. This cluster is a third primary cluster PCL3. Node 4 is here a central node having a directional link to both nodes 1 and 3. Node 4 is furthermore also in a cluster of the second type T2 together with nodes 2 and 5, where node 4 points at node 2, which in turn points at node 5. This cluster is a fourth primary cluster PCL4.

When studying the clusters it can be seen that a preferred node may be a dominant initiator in a cluster. It may have more directional links leaving itself than it receives. It can be seen that a preferred node may be a central node in both the first and third types, i.e. a node with links pointing to all the other nodes and the first in the chain in the second type.

Thereafter the cluster identifier 36 obtains secondary clusters of the same types as the primary clusters, step 66. These are thus clusters that differ from the primary clusters. If the primary clusters are provided in a list, then for each cluster in this list, an analogous (pattern-matching) cluster that does not contain any of the preferred nodes is found. The sub-graph may therefore be mined for all analogous (pattern-matching) clusters lacking preferred nodes. Secondary clusters are typically clusters having links with weights that are low. These may be clusters having the links with the lowest weights as well as the clusters where all links are below a low weight threshold. What is important though is that the clusters have to be of the same type as the primary clusters.

A number of secondary clusters of the same type as those present in the primary clusters are first selected. There may here be more candidates than there are secondary clusters being mapped. If there is more than one cluster that may be mapped to a corresponding primary cluster, the secondary cluster having the lowest weight may be selected for being mapped, step 68. When there is more than one analogous cluster, the one with the least average weighted degree value may be chosen and if there is a tie between analogous triads one may be arbitrarily selected.

If a preferred node is present in more than one primary cluster such as in two clusters, two secondary clusters may have to be selected where one corresponding node has the same positions in both secondary clusters. If a preferred node is included in more than one primary cluster, then it is thus possible that a corresponding node may be present in more than one selected secondary cluster. The mapping of secondary clusters is exemplified in fig. 9B.

The first primary cluster PCL1 is of type T2, which may be mapped to a number of clusters. One candidate is the cluster made up of nodes 8, 9 and 12, where node 9 points at node 12, which in turn points at node 8. Another is the cluster made up of nodes 8, 12 and 14, where node 14 points at node 12 and node 12 points at node 8. A further candidate is the cluster made up of nodes 10, 12 and 14, where node 10 points at node 14 and node 14 points at node 12. Here the cluster made up of nodes 10, 12 and 14 is selected to be a first secondary cluster SCL1 since the average weight of this cluster is lower than the average weight of the other candidates. In this cluster node 10 furthermore corresponds to node 11. Node 10 may here represent the second user U2.
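The selection among the three named candidates may be sketched as follows in Python. The candidate chains and link weights are those stated in the description of fig. 8; the function names are hypothetical, and the tie handling (the first candidate wins) is merely one possible arbitrary choice.

```python
# Weights of the links that occur in the three candidate chains.
WEIGHTS = {(9, 12): 7, (12, 8): 6, (14, 12): 1, (10, 14): 4}

# Each candidate is a chain of two directed links (second type T2).
CANDIDATES = [
    [(9, 12), (12, 8)],    # nodes 8, 9 and 12
    [(14, 12), (12, 8)],   # nodes 8, 12 and 14
    [(10, 14), (14, 12)],  # nodes 10, 12 and 14
]


def average_weight(cluster, weights):
    """Average weight of the links making up a cluster."""
    return sum(weights[link] for link in cluster) / len(cluster)


def select_secondary(candidates, weights):
    """Pick the candidate with the lowest average link weight.
    min() keeps the first candidate on a tie."""
    return min(candidates, key=lambda cluster: average_weight(cluster, weights))


candidate_averages = [average_weight(c, WEIGHTS) for c in CANDIDATES]
selected = select_secondary(CANDIDATES, WEIGHTS)
```

The averages come out as 6.5, 3.5 and 2.5, so the chain from node 10 via node 14 to node 12 is selected.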

The second primary cluster PCL2 is of type T3. As can be seen there is only one cluster of this third type T3 outside of the primary clusters and that is the cluster formed by nodes 10, 13 and 14. Consequently this cluster is selected to be a second secondary cluster SCL2 corresponding to the second primary cluster PCL2.

The third primary cluster PCL3 is of the first type T1. There are two further clusters of the first type T1, namely the cluster formed by nodes 5, 8 and 9 and that formed by nodes 8, 9 and 12. Since the average weight of the former is lower than that of the latter, the former cluster is mapped to the third primary cluster PCL3. This means that the cluster of nodes 5, 8 and 9 is made into a third secondary cluster SCL3 that is mapped to the third primary cluster PCL3. Here it may also be seen that node 9 has the same role as node 4, i.e. acting as a central node. It is thus assigned to node 4.

The fourth primary cluster PCL4 is of the second type T2 and may be mapped to either the cluster made up of nodes 8, 9 and 12 or the cluster made up of nodes 8, 12 and 14. However, since node 9 in the third secondary cluster has already been assigned to the third preferred node PN3, this node will have to be present also in the cluster that is to be the fourth secondary cluster. Therefore, the cluster made up of nodes 8, 9 and 12 is selected to be a fourth secondary cluster SCL4.

In the example above only clusters that were of the major type were investigated. If the need arises it is possible to consider at least some of the minor types of clusters as well. The cluster identifier 36 then informs the weight changer 38 about which secondary clusters have been mapped to which primary clusters, whereupon the weight changer 38 proceeds and changes the weights, step 70. This means that the nodes of a primary cluster receive the weights of the corresponding secondary cluster and vice versa. The link weights of the primary clusters are thus interchanged with the link weights of corresponding analogous secondary clusters containing no preferred nodes in order to distribute the effect of high weighted degrees exhibited by preferred nodes among all other nodes in the sub-graph.

The weight changer 38 may here operate from the bottom and up. It may thus start with the least preferred node and end with the most preferred node. However, it should be realized that this order is not important once the mapping has been made. Weights may be changed in any order or simultaneously for all nodes. The changing may for instance start with the mappings made in relation to the third preferred node PN3 and then perhaps with the third primary and secondary clusters.

How the changing of weights is done for the third primary and secondary clusters PCL3 and SCL3 is shown in fig. 10A and 10B. It can be seen that node 4 is linked to node 1 with a link having a weight 25 and to node 3 with a link having a weight 46, while node 9 is linked to node 8 with a weight 46 and to node 5 with a weight 25. As mentioned earlier node 9 corresponds to node 4 and both are central. Furthermore the clusters are symmetrical and therefore in the exchange of weights the link between node 4 and 3 may receive either of the weights between node 9 and 8 or 9 and 5. In this case it receives the weight of the link between node 9 and 8, since this is the lowest. It can thus be seen that the link between node 4 and 3 changes weight with the link between nodes 9 and 8, while the link between node 4 and 1 changes weight with the link between nodes 9 and 5.

The change of weights for the fourth primary and secondary clusters PCL4 and SCL4 is shown in fig. 11A and 11B. In this case an exchange of weights is made in the direction of the chain. This means that the link between node 4 and 2 changes weight with the link between node 9 and 12 and the link between nodes 2 and 5 changes weight with the link between nodes 12 and 8.

The change of weights for the second primary and secondary clusters PCL2 and SCL2 is shown in fig. 12A and 12B. It can be seen that node 7 is linked to node 6 with a weight 24 and to node 3 with a weight 41 and node 6 is linked to node 7 with a weight 24. Node 13 is linked to node 14 with a weight 8 and to node 10 with a weight 9, where node 14 is linked back to node 13 with a weight 2. As can be seen node 13 corresponds to node 7 and both are central. However, in this case the cluster is asymmetrical, which is why node 14 has to correspond to node 6 and node 10 to node 3. Therefore in the exchange of weights the link from node 7 to node 6 changes weight with the link from node 13 to 14, the link from node 6 to node 7 changes weight with the link from node 14 to node 13 and the link between node 7 and 3 changes weight with the link between node 13 and 10.

The change of weights for the first primary and secondary clusters PCL1 and SCL1 is shown in fig. 13A and 13B. As the clusters are of the same type as the fourth primary and secondary clusters PCL4 and SCL4, the same principles are used as described in relation to clusters PCL4 and SCL4. This means that the link between node 11 and 7 changes weight with the link between node 10 and 14 and the link between nodes 7 and 8 changes weight with the link between nodes 14 and 12.
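The exchange for the first primary and secondary clusters can be sketched in Python as a pairwise swap of link weights. The weights 46, 52, 4 and 1 and the link correspondence are those given for this cluster pair; the function `swap_weights` and the data layout are assumptions made for illustration.

```python
def swap_weights(weights, link_pairs):
    """Return a copy of `weights` in which each primary link has switched
    weight with its corresponding secondary link. The links themselves,
    i.e. the structure of the sub-graph, are left untouched."""
    changed = dict(weights)
    for primary_link, secondary_link in link_pairs:
        changed[primary_link] = weights[secondary_link]
        changed[secondary_link] = weights[primary_link]
    return changed


# Links of the first primary cluster (11 -> 7 -> 8) and of the first
# secondary cluster (10 -> 14 -> 12), with their original weights.
WEIGHTS = {(11, 7): 46, (7, 8): 52, (10, 14): 4, (14, 12): 1}

# Corresponding links, paired in the direction of the chain.
MAPPING = [((11, 7), (10, 14)), ((7, 8), (14, 12))]

changed = swap_weights(WEIGHTS, MAPPING)
```

After the swap the links of the primary cluster carry the weights 4 and 1 while the links of the secondary cluster carry 46 and 52. The set of links and the total weight are unchanged, since the weights are only moved between links.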

Fig. 14 shows the sub-graph after all these changes have been made. It can here be seen that the weights of the links connected to the nodes representing the influential users, i.e. nodes 7, 4 and 11, have all been completely changed compared with the original weights in fig. 8 and thereby the privacy of the influential users has been preserved. It can also be seen that no changes have been made to the node structure and no information has been deleted. Thereby also the third party will be satisfied.

After having changed the weights the weight changer 38 may then store the changed sub-graph in the graph database 23. It may also deliver the changed sub-graph as a response R to the third party 24, step 72. After the above mentioned steps have been run through, the influencers' privacy-preserving sub-graph that is depicted in fig. 14 can be released to the third party. It is to be duly noted that the effect of influence present amongst influencers is evenly distributed between all other users without making any structural changes to the sub-graph. Thus there is provided a good trade-off between privacy and utility.

The swapping of link weights between analogous clusters thus achieves a distribution of the effect of high weighted degrees exhibited by influencers among all other users in the network sub-graph. It may for instance be seen that influence of the real influential first user U1 as manifested by the weights has been transferred from the node representing this user U1 to a node representing the second user U2, which is an influenced user. At the same time, the utility of the sub-graph is well preserved since the privacy-preservation technique does not alter the overall structure of the sub-graph.

Table 1 below lists various structural properties of the sub-graph before and after the application of the influencers' privacy-preserving mechanism. It should be noted that there is no change in any of the structural properties except for the fact that nodes 9, 14 and 13 instead of 4, 7 and 11 take the role of influencers in the newly formed sub-graph. This supports the fact that the privacy of influencers and also the utility associated with the sub-graph are well preserved. Even though the solution looks simple, it satisfies the requirement without any loss in accuracy, which is one of the important criteria that industry requires from privacy research.

Structural Properties               Before weight change    After weight change
Avg. No. outgoing links             1.5                     1.5
Avg. No. Weighted Outgoing links    28.786                  28.786
Diameter                            3                       3
Average Path Length                 1.796                   1.796
Number of Shortest Paths            54                      54
Density                             0.115                   0.115
Modularity                          0.381                   0.381
Average Clustering Coeff.           0.155                   0.155
Eigen Vector Centrality             0.00287                 0.00287

Table 1
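Three of the listed properties can be checked directly against the link weights given for the example group G. The Python sketch below is illustrative only. It assumes that the weighted figure is the total link weight divided by the number of nodes and that density is the number of links divided by the number of possible directed links; these interpretations are assumptions, although they reproduce the tabulated values. The remaining properties are not recomputed here.

```python
# Directed links (initiator, recipient) -> weight, transcribed from the
# description of the example group G of fig. 8.
EDGES = {
    (4, 3): 46, (4, 1): 25, (4, 2): 32, (2, 5): 19, (5, 4): 14,
    (6, 7): 24, (7, 6): 37, (7, 3): 41, (7, 8): 52, (9, 8): 3,
    (9, 5): 5, (10, 7): 12, (11, 7): 46, (12, 8): 6, (9, 12): 7,
    (13, 10): 9, (13, 14): 8, (10, 14): 4, (14, 11): 10,
    (14, 12): 1, (14, 13): 2,
}


def summary(edges):
    """Return (avg. outgoing links, avg. weighted outgoing links, density)."""
    nodes = {node for link in edges for node in link}
    n = len(nodes)
    avg_links = len(edges) / n
    avg_weighted = sum(edges.values()) / n
    density = len(edges) / (n * (n - 1))
    return round(avg_links, 3), round(avg_weighted, 3), round(density, 3)


before = summary(EDGES)

# Swapping weights between two links moves values but adds or removes nothing.
swapped = dict(EDGES)
swapped[(11, 7)], swapped[(10, 14)] = EDGES[(10, 14)], EDGES[(11, 7)]
after = summary(swapped)
```

Both `before` and `after` evaluate to (1.5, 28.786, 0.115), which illustrates why an exchange of weights between existing links leaves these properties unchanged.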

An algorithm implementing the functionality of the influential user identifier, cluster identifier and weight changer may schematically be implemented as:

Input: A communication environment sub-graph (containing users as nodes and relationships as links)

Output: Influencers' identity-preserving environment sub-graph

S = input sub-graph
Group_Detect() /* detects groups in the environment */
for every group G in G1, G2, ..., Gn
    V = set of nodes in G
    Influencer_Identify() /* identifies influencers in a group */
    I = set containing first 'n' influencers in G obtained by a reliable influencer identification algorithm /* n is chosen by the operator */
    Cluster_Identification() /* mines the community and identifies all possible sets of clusters */
    SET1 = set of all predominant clusters in G containing at least one influencer (∈ I)
    SET2 = set of all predominant clusters in G containing none of the influencers in I
    for every cluster t ∈ SET1
        Cluster_Isomorphism() /* finds a set of pattern-matching clusters for t */
        m = an analogous cluster in SET2 with the least average degree value
        interchange link weights in clusters t and m
output influencer identity-preserving sub-graph

The various aspects of the invention have a number of advantages. The identity (and hence the privacy) of influencers is well preserved together with the utility associated with a sub-graph of individuals that may be released to third parties. This reduces the risk of losing influential customers (which in turn leads to the loss of other influenced customers) due to privacy concerns. It may in fact increase the loyalty score of the customers towards operators. The described privacy preservation scheme does not enable the third party to map the real influencers even with auxiliary information. In spite of the fact that new nodes resemble the nodes representing the influencers, the third parties will not succeed in identifying influential users. Operators may also benefit monetarily by providing the influencers' details separately to a third party, with their details masked (privacy preserving), based on the trust established. There is no change in graph structures, a change which is generally exhibited when applying other anonymizing techniques.

There are a number of different variations that are possible to make of the privacy retaining arrangement. It is for instance possible to omit the graph creator and the group identifier. In its simplest form the arrangement only comprises the functionality of the influential user identifier, the cluster identifier and the weight changer. Furthermore, as the functionality of the graph creator and the group identifier does not need to be provided together with the functionality of the influential user identifier, the cluster identifier and the weight changer, these may be provided in separate devices. The influential user identifier, the cluster identifier and the weight changer may be provided in one device, while the graph creator and the group identifier may be provided in another device.

The privacy retaining arrangement 22 may, as was mentioned initially, be provided in the form of one or more processors with associated program memories comprising computer program code with computer program instructions executable by the processor for performing the functionality of the privacy retaining arrangement.

The computer program code of a privacy retaining arrangement may also be in the form of a computer program product, for instance in the form of a data carrier, such as a CD ROM disc or a memory stick. In this case the data carrier or memory stick carries a computer program with the computer program code, which will implement the functionality of the above-described user privacy retaining arrangement. One such data carrier 74 with computer program code 76 is schematically shown in fig. 15.

Furthermore the graph creator of the privacy retaining arrangement may be considered to form means for creating a representation of at least some users of the communication environment as nodes within the communication environment, where the nodes are connected to each other with directional links representing relationships between the users, where the direction is based on which of the corresponding users has initiated the relationship, said directional links each having a weight representing the strength of the relationship.

The group identifier may in turn be considered to form means for identifying at least one group of nodes within the communication environment to which the at least some users belong and means for determining clusters of nodes in the group and types of clusters to which these nodes belong, where a cluster type is based on between which nodes links are provided and their direction.

The influential user identifier may in turn be considered to form means for obtaining a representation of at least some users as nodes in a group of nodes within the communication environment and means for determining at least one preferred node for which a user identity is to be masked.

The cluster identifier may be considered to form means for locating a primary cluster comprising the preferred node and means for selecting a corresponding secondary cluster of the same type as the located primary cluster. The cluster identifier may furthermore be considered to comprise means for obtaining more than one secondary cluster of the same type as the primary cluster and the means for selecting a corresponding secondary cluster may be considered to be means for selecting an obtained secondary cluster the links of which have the lowest weights. The weight changer may be considered to form means for replacing the weights of the links in the located primary cluster with the weights of the links of the selected secondary cluster.

The privacy retaining arrangement may furthermore comprise means for receiving a request for the group of nodes from a third party interested in data about the communication environment and means for providing the changed group of nodes as a response.

While the invention has been described in connection with what is presently considered to be most practical and preferred embodiments, it is to be understood that the invention is not to be limited to the disclosed embodiments, but on the contrary, is intended to cover various modifications and equivalent arrangements. Therefore the invention is only to be limited by the following claims.

Claims

1. A privacy retaining arrangement (22) for retaining the privacy of a user (U1) in a communication environment, the privacy retaining arrangement comprising a processor (26) acting on computer instructions whereby said privacy retaining arrangement is operative to

obtain a representation of at least some users as nodes in a group (G) of nodes within the communication environment, where the nodes are connected to each other with directional links representing relationships between the users, where the direction is based on which of the corresponding users is initiator of the relationship, said directional links each having a weight representing the strength of the relationship, the nodes of the group being provided in clusters of different types (T1, T2, T3), and a cluster type being based on between which nodes links are provided and their direction,

determine at least one preferred node (PN1, PN2, PN3) for which the user identity is to be masked, said at least one preferred node (PN1, PN2, PN3) having a link directed away from itself with a weight that is among the highest in the weights of the group (G),

locate a primary cluster (PCL1, PCL2, PCL3, PCL4) comprising the preferred node,

select a corresponding secondary cluster (SCL1, SCL2, SCL3, SCL4) of the same type as the located primary cluster (PCL1, PCL2, PCL3, PCL4), and

replace the weights of the links in the located primary cluster (PCL1, PCL2, PCL3, PCL4) with the weights of the links of the selected secondary cluster (SCL1, SCL2, SCL3, SCL4) in order to obtain a changed group of nodes that masks the identity of the user corresponding to the preferred node.

2. The privacy retaining arrangement according to claim 1, being further configured to obtain more than one secondary cluster of the same type as the primary cluster and select an obtained secondary cluster the links of which have the lowest weights.

3. The privacy retaining arrangement according to claim 1 or 2, wherein a preferred node may be included in more than one primary cluster and a corresponding node in a secondary cluster may be present in more than one selected secondary cluster.

4. The privacy retaining arrangement according to any previous claim, wherein the types of clusters comprise three major types: a first type (T1) where one central node is linked to and points at all the other nodes in the cluster, a second type (T2) where the nodes follow each other in a chain and the links point in the same direction, and a third type (T3) where one central node is linked to and points at all the other nodes and a limited number of these other nodes have a link pointing back to the central node.

5. The privacy retaining arrangement according to claim 4, wherein the number of nodes in the clusters is three.

6. The privacy retaining arrangement according to any previous claim, being further operative to receive a request (RQ) for the group of nodes from a third party (24) interested in data about the communication environment and provide the changed group of nodes as a response (RS).

7. The privacy retaining arrangement according to any previous claim, wherein the communication environment comprises a telecommunication network (20).

8. The privacy retaining arrangement according to any previous claim, wherein the communication environment comprises more than one group of nodes, where the strength of the relationship between groups is lower than the strength of the relationship within a group.

9. A communication environment comprising a privacy retaining arrangement (22), the privacy retaining arrangement comprising a processor (26) acting on computer instructions whereby said privacy retaining arrangement (22) is operative to

obtain a representation of at least some users as nodes in a group (G) of nodes within the communication environment,

determine at least one preferred node (PN1, PN2, PN3) for which the user identity is to be masked, said at least one preferred node having a link directed away from itself with a weight that is among the highest in the weights of the group (G),

locate a primary cluster (PCL1, PCL2, PCL3, PCL4) comprising the preferred node (PN1, PN2, PN3),

select a corresponding secondary cluster (SCL1, SCL2, SCL3, SCL4) of the same type as the located primary cluster, and

replace the weights of the links in the located primary cluster (PCL1, PCL2, PCL3, PCL4) with the weights of the links of the selected secondary cluster (SCL1, SCL2, SCL3, SCL4) in order to obtain a changed group of nodes that masks the identity of the user corresponding to the preferred node.

10. A method for retaining the privacy of a user (U1) in a communication environment, the method being performed in a privacy retaining arrangement (22) in the communication environment and comprising

obtaining (50) a representation of at least some users as nodes in a group (G) of nodes within the communication environment,

determining (52; 62) at least one preferred node (PN1, PN2, PN3) for which the user identity is to be masked, said at least one preferred node (PN1, PN2, PN3) having a link directed away from itself with a weight that is among the highest in the weights of the group (G),

locating (54; 64) a primary cluster (PCL1, PCL2, PCL3, PCL4) comprising the preferred node (PN1, PN2, PN3),

selecting (56; 68) a corresponding secondary cluster (SCL1, SCL2, SCL3, SCL4) of the same type as the located primary cluster (PCL1, PCL2, PCL3, PCL4), and

replacing (58; 70) the weights of the links in the located primary cluster (PCL1, PCL2, PCL3, PCL4) with the weights of the links of the selected secondary cluster (SCL1, SCL2, SCL3, SCL4) in order to obtain a changed group of nodes that masks the identity of the user corresponding to the preferred node.

11. The method according to claim 10, further comprising obtaining (66) more than one secondary cluster of the same type as the primary cluster, where the selecting of a corresponding secondary cluster comprises selecting (68) an obtained secondary cluster the links of which have the lowest weights.

12. The method according to claim 10 or 11, wherein a preferred node may be included in more than one primary cluster and a corresponding node in a secondary cluster may be present in more than one selected secondary cluster.

13. The method according to any of claims 10-12, wherein the types of clusters comprise three major types: a first type (T1) where one central node is linked to and points at all the other nodes in the cluster, a second type (T2) where the nodes follow each other in a chain and the links point in the same direction, and a third type (T3) where one central node is linked to and points at all the other nodes and a limited number of these other nodes have a link pointing back to the central node.

14. The method according to claim 13, wherein the number of nodes in the clusters is three.

15. The method according to any of claims 10-14, further comprising receiving (60) a request for the group of nodes from a third party (24) interested in data about the communication environment and providing (72) the changed group of nodes as a response (RS).

16. The method according to any of claims 10-15, wherein the communication environment is a telecommunication network (20).

17. The method according to any of claims 10-16, wherein the communication environment comprises more than one group of nodes, where the strength of the relationship between groups is lower than the strength of the relationship within a group.

18. A computer program for retaining the privacy of a user (U1) in a communication environment, the computer program comprising computer program code (76) which, when run in a privacy retaining arrangement (22) in the communication environment, causes the privacy retaining arrangement to:

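obtain a representation of at least some users as nodes in a group (G) of nodes within the communication environment, where the nodes are connected to each other with directional links representing relationships between the users, where the direction is based on which of the corresponding users is initiator of the relationship, said directional links each having a weight representing the strength of the relationship, the nodes of the group being provided in clusters of different types (T1, T2, T3), and a cluster type being based on between which nodes links are provided and their direction,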

determine at least one preferred node (PN1, PN2, PN3) for which the user identity is to be masked, said at least one preferred node (PN1, PN2, PN3) having a link directed away from itself with a weight that is among the highest in the weights of the group (G),

locate a primary cluster (PCL1, PCL2, PCL3, PCL4) comprising the preferred node (PN1, PN2, PN3),

select a corresponding secondary cluster (SCL1, SCL2, SCL3, SCL4) of the same type as the located primary cluster (PCL1, PCL2, PCL3, PCL4), and

replace the weights of the links in the located primary cluster (PCL1, PCL2, PCL3, PCL4) with the weights of the links of the selected secondary cluster (SCL1, SCL2, SCL3, SCL4) in order to obtain a changed group of nodes that masks the identity of the user corresponding to the preferred node.

19. A computer program product for retaining the privacy of a user in a communication environment, said computer program product being provided on a data carrier (74) and comprising said computer program code (76) according to claim 18.
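
As an informal illustration only, the steps recited in claims 10 and 11 might be sketched in Python for the three-node clusters of claims 13 and 14. Everything in the sketch is an assumption made for illustration: the dictionary representation of the group, the function names `find_clusters` and `mask_group`, the `top_k` parameter, the node labels, the choice to exclude preferred nodes from secondary clusters, and the reading of "lowest weights" as the lowest sum of link weights. The example weights 46, 52, 4 and 1 are the ones quoted in the abstract; their assignment to particular links is hypothetical.

```python
from itertools import permutations


def _templates(a, b, c):
    """Link patterns of the three cluster types for an ordered triple (a, b, c)."""
    return {
        "T1": ((a, b), (a, c)),          # central node a points at b and c
        "T2": ((a, b), (b, c)),          # chain a -> b -> c
        "T3": ((a, b), (a, c), (b, a)),  # like T1, plus one link back to a
    }


def find_clusters(weights):
    """Return (type, links) for every three-node cluster found in the group.

    `weights` maps a directed link (initiator, receiver) to its weight.
    """
    nodes = sorted({n for link in weights for n in link})
    seen, clusters = set(), []
    for a, b, c in permutations(nodes, 3):
        # All links that actually exist among these three nodes.
        present = {l for l in permutations((a, b, c), 2) if l in weights}
        for ctype, links in _templates(a, b, c).items():
            if present == set(links) and frozenset(links) not in seen:
                seen.add(frozenset(links))
                clusters.append((ctype, links))
    return clusters


def mask_group(weights, top_k=1):
    """Return a changed copy of the group with primary cluster weights replaced."""
    original = dict(weights)
    changed = dict(weights)

    # Preferred nodes: initiators of the top_k heaviest links (assumed criterion).
    heaviest = sorted(original, key=original.get, reverse=True)[:top_k]
    preferred = {src for src, _dst in heaviest}

    def nodes_of(links):
        return {n for link in links for n in link}

    clusters = find_clusters(original)
    for ctype, links in clusters:
        if not nodes_of(links) & preferred:
            continue  # not a primary cluster
        # Secondary candidates: same type, containing no preferred node.
        candidates = [s for t, s in clusters
                      if t == ctype and not nodes_of(s) & preferred]
        if not candidates:
            continue
        # Pick the candidate with the lowest total link weight.
        secondary = min(candidates, key=lambda s: sum(original[l] for l in s))
        # Replace primary weights with the secondary weights, link by link.
        for p_link, s_link in zip(links, secondary):
            changed[p_link] = original[s_link]
    return changed


group = {("u1", "u2"): 46, ("u1", "u3"): 52, ("u4", "u5"): 4, ("u4", "u6"): 1}
masked = mask_group(group)
print(masked[("u1", "u2")], masked[("u1", "u3")])  # prints: 4 1
```

In this sketch the input group is left untouched and only the returned copy carries the replaced weights, which matches the idea of handing a changed group of nodes to a requesting third party. The exhaustive search over node triples is chosen for readability and would need replacing for groups of realistic size.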