CN109063173A - A kind of semi-supervised overlapping community discovery method based on partial tag information - Google Patents
- Publication number
- CN109063173A
- Authority
- CN
- China
- Prior art keywords
- node
- label
- community
- network
- prior information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/01—Social networking
Abstract
The invention discloses a semi-supervised overlapping community discovery method based on partial tag information. The method makes full use of the positive and negative label information in the network and adopts an asynchronous propagation mechanism. It converts part of the Must-link prior information into Must-in information and Cannot-link prior information into Cannot-in information, and uses this to guide the community discovery process. The time complexity of the method is O(mn log(m)), so it improves the accuracy of community discovery without increasing time complexity.
Description
Technical field
The present invention relates to the technical field of community discovery methods, and in particular to a semi-supervised overlapping community discovery method based on partial tag information.
Background technique
In the real world, many complex systems can be described as complex networks. Community structure, one of the key properties of complex networks, plays an important role in people's lives. Discovering the community structure hidden in a network promptly and accurately, and then analyzing the internal features of the complex system, not only guides people's production activities but also greatly helps in understanding and controlling complex systems.
Traditional community discovery algorithms cannot be widely applied because of their high time complexity, the low accuracy of their partition results, and the need to specify the community size in advance.
Traditional overlapping community discovery methods include the COPRA algorithm. In traditional COPRA, label propagation depends only on neighbor nodes, and every neighbor propagates its label to a given node with the same probability. In reality, however, the more similar a neighbor is to the node, the higher the probability that its label propagates to that node. In overlapping community discovery, traditional COPRA also considers only the topology of the network. Studies have shown, however, that real networks often contain a small amount of prior information; because this information is ignored, the accuracy of community discovery is low.
Summary of the invention
In view of the above deficiencies in the prior art, the present invention provides a semi-supervised overlapping community discovery method based on partial tag information, which solves the problem that the traditional COPRA algorithm has low accuracy in community discovery.
To achieve the above object, the technical solution adopted by the present invention is a semi-supervised overlapping community discovery method based on partial tag information, comprising the following steps:
S1. Initialize the node membership matrix according to the Must-in and Must-link positive label prior information, where n is the number of nodes in the network and C is the number of communities in the network;
S2. Convert the Must-in and Must-link positive label prior information and the Cannot-in and Cannot-link negative label prior information in the network into the membership matrix F_{n×C} of the network and the label existence matrix S_{n×C} of the network;
If node v_i in the network has a Must-in relationship with community c, then f_ic = 1; otherwise f_ic = 0. The label existence matrix S_{n×C} takes the following values: if there is Must-in or Cannot-in information between node v_i and community c, then s_ic = a_l; otherwise s_ic = a_u, where 0 ≤ a_l, a_u ≤ 1, a_l is approximately equal to 1, and a_u is approximately equal to 0; f_ic ∈ F_{n×C}, s_ic ∈ S_{n×C}, i = 1, 2, ..., n, c = 1, 2, ..., C;
S3. Iteratively execute the label propagation process according to the network membership matrix. When the label membership between node v_i and community c is already determined, i.e. s_ic = a_l, the label assignment of node v_i is no longer computed; otherwise, the label assignment of node v_i is obtained from the label sets of its neighbor nodes;
S4. Compute the number of community labels in the network from the label assignments of the nodes v_i; if the number has changed, go to step S5, otherwise go to step S6;
S5. Assign the minimum size value of each community label to m_t, and return to step S2;
S6. Assign the minimum size value of all community labels to m_t;
S7. When m_t = m_{t-1}, terminate the algorithm; otherwise return to step S2.
Further, the formula of the iterative process in step S3 is:
[formula not recoverable from the source text]
The formula is expressed in terms of the following quantities: the label membership between node v_i and community c after t+1 iterations and after t iterations; w_ij, the label transition probability between node v_i and node v_j; the label membership between node v_i and node v_j after t+1 iterations; w_ik, the label transition probability between node v_i and node set v_k; the label membership between node v_i and node set v_k after t iterations; and N(i), the neighbor nodes of node v_i.
Further, the label propagation process in step S3 is as follows: for elements whose relationship between node v_i and community c is already determined, label propagation is no longer performed; node pairs that carry Must-link prior information but in which neither node has Must-in prior information follow an open label propagation strategy; node pairs that carry Cannot-link prior information but in which neither node has Cannot-in prior information follow a strict label propagation strategy; and for nodes with no known prior information, the probability of belonging to community c is computed from their neighbor nodes.
Further, the number of community labels in step S4 is computed as follows: if node v_i and node v_j have Must-link prior information but no Must-in prior information, a one-way Must-link prior information channel is established between node v_i and node v_j; if there is Cannot-link prior information between node v_i and node v_j, then node v_j is removed from the candidate label set of node v_i when the label of node v_i is determined.
The beneficial effects of the invention are as follows: the invention makes full use of the positive and negative label information in the network and adopts an asynchronous propagation mechanism. It converts part of the Must-link prior information into Must-in information and Cannot-link prior information into Cannot-in information, and uses this to guide the community discovery process. The time complexity of the invention is O(mn log(m)), so the invention improves the accuracy of community discovery without increasing time complexity.
Brief description of the drawings
Fig. 1 is a flowchart of the present invention.
Specific embodiment
A specific embodiment of the invention is described below so that those skilled in the art can understand the invention. It should be clear, however, that the invention is not limited to the scope of the specific embodiment. To those of ordinary skill in the art, variations are obvious as long as they fall within the spirit and scope of the invention as defined by the appended claims, and all inventions and creations that make use of the inventive concept are protected.
As shown in Fig. 1, a semi-supervised overlapping community discovery method based on partial tag information comprises the following steps:
S1. Initialize the node membership matrix according to the Must-in (must be included) and Must-link (must be linked) positive label prior information, where n is the number of nodes in the network and C is the number of communities in the network.
First, part of the Cannot-link and Cannot-in information in the network is preprocessed:
If (v_i, v_j) ∈ D_T and (v_j, v_k) ∈ D_T, then (v_i, v_k) ∈ D_T;
If (v_i, v_j) ∈ D_T and (v_j, v_k) ∈ D_N, then (v_i, v_k) ∈ D_N;
If (v_i, v_j) ∈ D_T and v_i ∈ N_T, then v_j ∈ N_T, and nodes v_i and v_j carry the same label;
If (v_i, v_j) ∈ D_T and v_i ∈ N_N, then v_j ∈ N_N, and nodes v_i and v_j carry the same negative label;
If (v_i, v_j) ∈ D_N and v_i ∈ N_T, then v_j ∈ N_N, and the positive label on node v_i is the same as the negative label on node v_j.
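The five preprocessing rules can be sketched in Python as follows. This is only an illustration under stated assumptions: the text does not define the sets, so D_T is read here as the Must-link pairs, D_N as the Cannot-link pairs, N_T as the nodes with Must-in labels, and N_N as the nodes with Cannot-in labels. The function name `preprocess` and its data layout are hypothetical, and both link relations are treated as symmetric.

```python
def preprocess(must_link, cannot_link, must_in, cannot_in):
    """Expand pairwise and node-level constraints (hypothetical helper).

    must_link, cannot_link: lists of (node, node) pairs, read as D_T and D_N.
    must_in, cannot_in: dicts node -> set of community labels (N_T and N_N).
    Returns (must_in, cannot_in, cannot_link) after applying the five rules.
    """
    parent = {}

    def find(x):
        # Union-find root lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Rule 1: Must-link is transitive, so linked nodes form one group.
    for a, b in must_link:
        parent[find(a)] = find(b)

    nodes = set(parent) | set(must_in) | set(cannot_in)
    for a, b in cannot_link:
        nodes.update((a, b))

    groups = {}
    for v in nodes:
        groups.setdefault(find(v), set()).add(v)

    # Rules 3 and 4: labels are shared by every node in a Must-link group.
    pos = {r: set() for r in groups}
    neg = {r: set() for r in groups}
    for v in nodes:
        pos[find(v)] |= must_in.get(v, set())
        neg[find(v)] |= cannot_in.get(v, set())

    # Rule 2: a Cannot-link between two nodes holds between their groups.
    # Rule 5: a positive label on one side is a negative label on the other.
    expanded = set()
    for a, b in cannot_link:
        ra, rb = find(a), find(b)
        if ra == rb:
            continue  # contradictory constraint; ignored in this sketch
        neg[ra] |= pos[rb]
        neg[rb] |= pos[ra]
        expanded |= {frozenset((x, y)) for x in groups[ra] for y in groups[rb]}

    new_must_in = {v: set(pos[find(v)]) for v in nodes}
    new_cannot_in = {v: set(neg[find(v)]) for v in nodes}
    return new_must_in, new_cannot_in, expanded
```

Grouping nodes with a union-find structure is a design choice of this sketch, not something the text prescribes; it applies the transitive rules once per group instead of repeatedly per pair.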
S2. Convert the Must-in and Must-link positive label prior information and the Cannot-in (must not be included) and Cannot-link (must not be linked) negative label prior information in the network into the membership matrix F_{n×C} of the network and the label existence matrix S_{n×C} of the network.
If node v_i in the network has a Must-in relationship with community c, then f_ic = 1; otherwise f_ic = 0. The label existence matrix S_{n×C} takes the following values: if there is Must-in or Cannot-in information between node v_i and community c, then s_ic = a_l; otherwise s_ic = a_u, where 0 ≤ a_l, a_u ≤ 1, a_l is approximately equal to 1, and a_u is approximately equal to 0. Here f_ic ∈ F_{n×C}, s_ic ∈ S_{n×C}, i = 1, 2, ..., n, c = 1, 2, ..., C.
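A minimal sketch of how the two matrices of step S2 might be built is given below. The function name `build_matrices`, the dictionary inputs, and the zero-based indices are assumptions made for illustration. The defaults a_l = 1.0 and a_u = 0.0 are a simplification of "approximately equal to 1" and "approximately equal to 0".

```python
def build_matrices(n, C, must_in, cannot_in, a_l=1.0, a_u=0.0):
    """Build the membership matrix F and label existence matrix S.

    must_in, cannot_in: dicts node index -> set of community indices
    (hypothetical input layout).
    """
    F = [[0.0] * C for _ in range(n)]   # f_ic = 0 unless Must-in holds
    S = [[a_u] * C for _ in range(n)]   # s_ic = a_u when nothing is known
    for i, communities in must_in.items():
        for c in communities:
            F[i][c] = 1.0               # Must-in: membership is certain
            S[i][c] = a_l               # relationship is determined
    for i, communities in cannot_in.items():
        for c in communities:
            S[i][c] = a_l               # Cannot-in: determined, f_ic stays 0
    return F, S
```

For example, `build_matrices(3, 2, {0: {0}}, {1: {0}})` marks the entries for node 0 and node 1 in community 0 as determined, with only node 0 actually belonging to it.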
S3. Iteratively execute the label propagation process according to the network membership matrix. When the label membership between node v_i and community c is already determined, i.e. s_ic = a_l, the label assignment of node v_i is no longer computed; otherwise, the label assignment of node v_i is obtained from the label sets of its neighbor nodes. The formula of the iterative process is:
[formula not recoverable from the source text]
The formula is expressed in terms of the following quantities: the label membership between node v_i and community c after t+1 iterations and after t iterations; w_ij, the label transition probability between node v_i and node v_j; the label membership between node v_i and node v_j after t+1 iterations; w_ik, the label transition probability between node v_i and node set v_k; the label membership between node v_i and node set v_k after t iterations; and N(i), the neighbor nodes of node v_i.
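Since the update formula itself is not available in this text, the following Python sketch shows only one plausible reading of an asynchronous propagation sweep. It assumes that the new membership of v_i in community c is a weighted sum over its neighbors, using iteration t+1 values for neighbors already updated in the current sweep and iteration t values for the rest. The function name `async_sweep` and the dict-of-dicts weight layout `W` are hypothetical, and no normalization step is included.

```python
def async_sweep(F, S, W, a_l=1.0):
    """One in-place asynchronous sweep over all nodes (assumed update rule).

    F: membership matrix as a list of rows, modified in place.
    S: label existence matrix; entries equal to a_l are treated as fixed.
    W: dict i -> {j: w_ij} of label transition probabilities to neighbors.
    """
    n = len(F)
    C = len(F[0])
    for i in range(n):
        for c in range(C):
            if S[i][c] == a_l:
                continue  # relationship already determined, skip it
            # Neighbors with a smaller index were already updated in this
            # sweep (iteration t+1); the others still hold iteration t values.
            F[i][c] = sum(w * F[j][c] for j, w in W.get(i, {}).items())
    return F
```

Updating F in place is what makes the sweep asynchronous: a node later in the order immediately sees the values its earlier neighbors just received.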
The label propagation process is as follows: for elements whose relationship between node v_i and community c is already determined, label propagation is no longer performed; node pairs that carry Must-link prior information but in which neither node has Must-in prior information follow an open label propagation strategy; node pairs that carry Cannot-link prior information but in which neither node has Cannot-in prior information follow a strict label propagation strategy; and for nodes with no known prior information, the probability of belonging to community c is computed from their neighbor nodes.
S4. Compute the number of community labels in the network from the label assignments of the nodes v_i; if the number has changed, go to step S5, otherwise go to step S6. The number of community labels is computed as follows: if node v_i and node v_j have Must-link prior information but no Must-in prior information, a one-way Must-link prior information channel is established between node v_i and node v_j; if there is Cannot-link prior information between node v_i and node v_j, then node v_j is removed from the candidate label set of node v_i when the label of node v_i is determined.
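The candidate selection described in step S4 can be illustrated with a short Python sketch. The text does not say in which direction the one-way Must-link channel runs, so the sketch assumes that a channel (j, i) lets labels flow from v_j to v_i. The function name `label_sources` and the argument layout are hypothetical.

```python
def label_sources(i, neighbors, channels, cannot_link):
    """Return the nodes whose labels node i may adopt (illustrative only).

    neighbors: dict node -> iterable of adjacent nodes.
    channels: list of (source, target) one-way Must-link channels (assumed).
    cannot_link: list of (node, node) Cannot-link pairs.
    """
    sources = set(neighbors.get(i, ()))
    # A one-way channel (j, i) adds j as a label source for i.
    sources |= {src for src, dst in channels if dst == i}
    # Cannot-link partners are removed from the candidate set of i.
    blocked = {b if a == i else a for a, b in cannot_link if i in (a, b)}
    return sources - blocked
```

With neighbors `{0: [1, 2]}`, a channel `(3, 0)`, and a Cannot-link pair `(0, 2)`, node 0 may take labels from nodes 1 and 3 but not from node 2.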
S5. Assign the minimum size value of each community label to m_t, and return to step S2.
S6. Assign the minimum size value of all community labels to m_t.
S7. When m_t = m_{t-1}, terminate the algorithm; otherwise return to step S2.
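Steps S4 to S7 resemble the stopping test of the original COPRA algorithm, and the Python sketch below follows that reading. It is an interpretation, not the method as claimed: m_t is assumed to map each community label to the smallest number of nodes that have carried it since the label set last changed, and a change in the label set stands in for a change in the number of labels. The function names and the `threshold` parameter are hypothetical.

```python
def label_sizes(F, threshold=0.0):
    """Count, for each community index, the nodes whose membership exceeds threshold."""
    sizes = {}
    for row in F:
        for c, value in enumerate(row):
            if value > threshold:
                sizes[c] = sizes.get(c, 0) + 1
    return sizes


def update_minima(prev_min, sizes):
    """Return m_t given m_{t-1} (or None on the first iteration)."""
    if prev_min is None or set(prev_min) != set(sizes):
        return dict(sizes)  # label set changed: restart the minima (S5)
    # Label set unchanged: keep the smaller size per label (S6).
    return {c: min(prev_min[c], sizes[c]) for c in sizes}


def converged(prev_min, new_min):
    """Stopping test of step S7: m_t equals m_{t-1}."""
    return prev_min == new_min
```

A driver loop would call `label_sizes` after each propagation sweep, pass the result to `update_minima`, and stop once `converged` returns True.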
In this embodiment of the invention, the positive label prior knowledge that exists in practice plays an important role in guiding community discovery. In addition, some negative label prior information in practice plays an equally important role. For example, Cannot-link and Cannot-in information is also ubiquitous in networks and plays an indispensable role in overlapping community discovery.
Claims (4)
1. A semi-supervised overlapping community discovery method based on partial tag information, characterized by comprising the following steps:
S1. Initialize the node membership matrix according to the Must-in and Must-link positive label prior information, where n is the number of nodes in the network and C is the number of communities in the network;
S2. Convert the Must-in and Must-link positive label prior information and the Cannot-in and Cannot-link negative label prior information in the network into the membership matrix F_{n×C} of the network and the label existence matrix S_{n×C} of the network;
If node v_i in the network has a Must-in relationship with community c, then f_ic = 1; otherwise f_ic = 0. The label existence matrix S_{n×C} takes the following values: if there is Must-in or Cannot-in information between node v_i and community c, then s_ic = a_l; otherwise s_ic = a_u, where 0 ≤ a_l, a_u ≤ 1, a_l is approximately equal to 1, and a_u is approximately equal to 0; f_ic ∈ F_{n×C}, s_ic ∈ S_{n×C}, i = 1, 2, ..., n, c = 1, 2, ..., C;
S3. Iteratively execute the label propagation process according to the network membership matrix. When the label membership between node v_i and community c is already determined, i.e. s_ic = a_l, the label assignment of node v_i is no longer computed; otherwise, the label assignment of node v_i is obtained from the label sets of its neighbor nodes;
S4. Compute the number of community labels in the network from the label assignments of the nodes v_i; if the number has changed, go to step S5, otherwise go to step S6;
S5. Assign the minimum size value of each community label to m_t, and return to step S2;
S6. Assign the minimum size value of all community labels to m_t;
S7. When m_t = m_{t-1}, terminate the algorithm; otherwise return to step S2.
2. The semi-supervised overlapping community discovery method based on partial tag information according to claim 1, characterized in that the formula of the iterative process in step S3 is:
[formula not recoverable from the source text]
The formula is expressed in terms of the following quantities: the label membership between node v_i and community c after t+1 iterations and after t iterations; w_ij, the attraction between node v_i and node v_j; the label membership between node v_i and node v_j after t+1 iterations; w_ik, the attraction between node v_i and node set v_k; the label membership between node v_i and node set v_k after t iterations; and N(i), the neighbor nodes of node v_i.
3. The semi-supervised overlapping community discovery method based on partial tag information according to claim 1, characterized in that the label propagation process in step S3 is as follows: for elements whose relationship between node v_i and community c is already determined, label propagation is no longer performed; node pairs that carry Must-link prior information but in which neither node has Must-in prior information follow an open label propagation strategy; node pairs that carry Cannot-link prior information but in which neither node has Cannot-in prior information follow a strict label propagation strategy; and for nodes with no known prior information, the probability of belonging to community c is computed from their neighbor nodes.
4. The semi-supervised overlapping community discovery method based on partial tag information according to claim 1, characterized in that the number of community labels in step S4 is computed as follows: if node v_i and node v_j have Must-link prior information but no Must-in prior information, a one-way Must-link prior information channel is established between node v_i and node v_j; if there is Cannot-link prior information between node v_i and node v_j, then node v_j is removed from the candidate label set of node v_i when the label of node v_i is determined.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810953974.XA CN109063173A (en) | 2018-08-21 | 2018-08-21 | A kind of semi-supervised overlapping community discovery method based on partial tag information |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810953974.XA CN109063173A (en) | 2018-08-21 | 2018-08-21 | A kind of semi-supervised overlapping community discovery method based on partial tag information |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109063173A | 2018-12-21 |
Family
ID=64687569
Family Applications (1)
Application Number | Status | Publication Number | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN201810953974.XA | Pending | CN109063173A (en) | 2018-08-21 | 2018-08-21 | A kind of semi-supervised overlapping community discovery method based on partial tag information |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109063173A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113298774A (en) * | 2021-05-20 | 2021-08-24 | 复旦大学 | Image segmentation method and device based on dual condition compatible neural network |
CN113434815A (en) * | 2021-07-02 | 2021-09-24 | 中国计量大学 | Community detection method based on similar and dissimilar constraint semi-supervised nonnegative matrix factorization |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9418142B2 (en) * | 2013-05-24 | 2016-08-16 | Google Inc. | Overlapping community detection in weighted graphs |
CN103729475B (en) * | 2014-01-24 | 2016-10-26 | 福州大学 | Multi-tag in a kind of social networks propagates overlapping community discovery method |
CN108073944A (en) * | 2017-10-18 | 2018-05-25 | 南京邮电大学 | A kind of label based on local influence power propagates community discovery method |
Non-Patent Citations (2)
Title |
---|
睢世凯: "基于局部标签信息的半监督社区发现算法研究", 《中国优秀硕士学位论文全文数据库科技信息辑》 * |
陈俊宇: "一种半监督的局部扩展式重叠社区发现方法", 《计算机研究与发展》 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Shu et al. | Expansion-squeeze-excitation fusion network for elderly activity recognition | |
Thakoor et al. | Bootstrapped representation learning on graphs | |
CN103678431B (en) | A kind of recommendation method to be scored based on standard label and project | |
Shen et al. | Shape clustering: Common structure discovery | |
CN112199462A (en) | Cross-modal data processing method and device, storage medium and electronic device | |
EP3136293A1 (en) | Method and device for processing an image of pixels, corresponding computer program product and computer readable medium | |
Xu et al. | Weakly supervised deep semantic segmentation using CNN and ELM with semantic candidate regions | |
Li et al. | Patch alignment manifold matting | |
Xu et al. | Instance-level coupled subspace learning for fine-grained sketch-based image retrieval | |
CN104966052A (en) | Attributive characteristic representation-based group behavior identification method | |
CN109063173A (en) | A kind of semi-supervised overlapping community discovery method based on partial tag information | |
Shang et al. | Cattle behavior recognition based on feature fusion under a dual attention mechanism | |
Wang et al. | Salient object detection by robust foreground and background seed selection | |
CN112906517B (en) | Self-supervision power law distribution crowd counting method and device and electronic equipment | |
Mund et al. | Active online confidence boosting for efficient object classification | |
Zhang et al. | A fast object tracker based on integrated multiple features and dynamic learning rate | |
CN117315090A (en) | Cross-modal style learning-based image generation method and device | |
CN103942779A (en) | Image segmentation method based on combination of graph theory and semi-supervised learning | |
Wibowo et al. | Convolutional shallow features for performance improvement of histogram of oriented gradients in visual object tracking | |
Bi et al. | C^ 2 C 2 Net: a complementary co-saliency detection network | |
Kawewong et al. | A speeded-up online incremental vision-based loop-closure detection for long-term SLAM | |
Zhang et al. | Search-based depth estimation via coupled dictionary learning with large-margin structure inference | |
Lv et al. | A challenge of deep‐learning‐based object detection for hair follicle dataset | |
CN115168609A (en) | Text matching method and device, computer equipment and storage medium | |
Yao et al. | Multi‐scale feature learning and temporal probing strategy for one‐stage temporal action localization |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20181221 |