CN108334580A - A community discovery method combining link and attribute information - Google Patents

A community discovery method combining link and attribute information

Info

Publication number
CN108334580A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810071418.XA
Other languages
Chinese (zh)
Inventor
黄海辉
王欣
禹果
余浩
周秀秀
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications
Priority to CN201810071418.XA
Publication of CN108334580A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to the fields of social networks and data mining and proposes a community discovery method that combines link and attribute information. The method mainly comprises the following steps: inputting social network data containing link and attribute information, and constructing an adjacency matrix based on the link relationships and an attribute matrix based on the attribute information; constructing a joint Bayesian probability model from the two data matrices; computing the maximum membership degree of each node by non-negative matrix factorization to obtain a preliminary community partition; and computing the absolute membership degree of each node according to its membership to obtain the overlapping community structure. By combining the link information in a social network with the attribute information, the invention makes better use of the data in community detection, can improve the accuracy and efficiency of community discovery, and is suited to discovering topic communities that have both attribute and link information.

Description

A community discovery method combining link and attribute information
Technical field
The present invention relates to the fields of social networks and data mining, and specifically to a community discovery method combining link and attribute information.
Background art
As the main carrier of information transmission, social networks contain information that is of great research significance to today's society. From individuals to groups and from small worlds to society at large, implicit connections link people together in real life. Community discovery is commonly used to analyze the structural features of social groups.
The topological structure and the node attributes of the graph both have great reference value in a social network; studying the structural features of communities from only one of them is clearly unconvincing. A community usually represents a set of nodes sharing the same attributes. If the nodes are directly connected, the social relationship between two adjacent nodes is closer. If two nodes joined by an edge belong to different communities, the edge can serve as a "tie" linking two communities with different attributes; the more such ties there are, the closer the link between the two communities, which provides better conditions for studying overlapping-community discovery and community evolution. To address the lack of an effective method for balancing network structure and node features, Wu et al. proposed SAGL, a method that explores community structure using global structure and local neighborhood features. It first evaluates the global importance of each node and computes the similarity of each node pair by combining edge strength and node similarity; it then uses the K-Medoids clustering algorithm to partition the node attribute graph and the topology graph, and assigns nodes according to the post-partition node similarity and the centroid of each cluster in order to divide communities. This method can effectively combine node topology with attribute information and thus detect more meaningful topic communities, but the clustering quality is low and the clustering time complexity is high, so it performs poorly on real complex networks. Chang et al. proposed CDJMF, a community detection method for multi-attribute node networks based on joint matrix factorization. It uses non-negative matrix factorization to relate nodes through their cosine similarity and converts the constrained least-mean-square-error problem into a classical convex quadratic programming problem. The convergence and non-increasing property of the objective function are proved theoretically, and the algorithm is shown to have higher community discovery efficiency under the same evaluation metrics. However, the algorithm relies on too many parameters, so its accuracy is not high.
Non-negative matrix factorization (NMF) has been widely studied in fields such as data analysis, pattern recognition, signal processing, and machine learning. Its clustering-like behaviour makes it well suited to community discovery in social networks: factorizing the matrix of a social network's topological structure usually yields satisfactory community discovery results. Applying NMF to social network data that contains both link information and attribute information therefore offers a new approach to community analysis.
Li Yafang et al. proposed a non-negative matrix method (see Li Yafang, Jia Caiyan, Yu Jian. A survey of community discovery methods applying non-negative matrix factorization models [J]. Journal of Frontiers of Computer Science and Technology, 2016, 10(01): 1-13.) that can perform community analysis on network data containing link information and attribute information, but the algorithmic complexity of the method is high, which is not conducive to computation.
Summary of the invention
In view of this, the present invention proposes a community discovery method based on the NMF algorithm, specifically a community discovery method combining link and attribute information. The invention jointly factorizes the attribute matrix and the adjacency matrix in the form of non-negative matrix factorization, which can reduce the complexity of the algorithm. The method adopted by the invention comprises:
S1. Input social network data, and construct an adjacency matrix based on the link information and an attribute matrix based on the attribute information;
S2. Design a joint Bayesian probability model using the adjacency matrix and the attribute matrix;
S3. According to the joint Bayesian probability model, compute the latent node membership matrix of the adjacency matrix and the attribute matrix by joint matrix factorization, compute the maximum membership degree of each node in the membership matrix, and perform a preliminary non-overlapping community partition;
S4. Compute the absolute membership degree of each node in the membership matrix to obtain the overlapping community structure of the social network.
Further, the method for the structure adjacency matrix includes the following steps:
S201, the social network data according to input construct social networks topology diagram, count number of nodes and mark For N;
S202, complete zero symmetry square matrix X is constructed by exponent number of N0, iteration update X0;Judge that two nodes are in social network diagram It is no to be connected directly, by matrix X if being connected0In corresponding row and column be set to 1, be otherwise set to 0;
S203, by symmetry square matrix X0Diagonal line is set as 0, i.e. node and node itself acquiescence is non-conterminous;
Updated X is exported when S204, iteration ends0, it is labeled as adjacency matrix X.
Further, the method of constructing the attribute matrix comprises:
Count the number of attributes M and label the attributes from 1 to M; construct an all-zero attribute matrix Y0 with the M attributes as rows and the N nodes as columns, and update it iteratively: if a node has an attribute, set the corresponding row-and-column entry to 1, otherwise leave it at 0, thereby obtaining the attribute matrix Y.
Further, the joint Bayesian probability model of step S2 is designed by considering that the adjacency matrix and the attribute matrix both contain latent community structure features. The joint Bayesian probability model is:
p(X, Y, W1, W2, H, β) = p(X | W1, H) p(Y | W2, H) p(W1 | β) p(W2 | β) p(H | β) p(β)
where W1 denotes the association-strength matrix between nodes and communities; W2 denotes the association-strength matrix between attributes and communities; H denotes the membership matrix of nodes and communities, and is determined jointly by the adjacency matrix and the attribute matrix; β is the hyperparameter; p(X | W1, H) denotes the conditional probability of X given the matrices W1 and H; p(Y | W2, H) denotes the conditional probability of Y given the matrices W2 and H; p(W1 | β) denotes the conditional probability of W1 given β; p(W2 | β) denotes the conditional probability of W2 given β; p(H | β) denotes the conditional probability of H given β; and p(β) denotes the probability of the parameter β.
Further, preferably, the preliminary non-overlapping community partition of step S3 comprises:
S301. According to the joint Bayesian probability model, compute the negative log-likelihood function of each prior probability in the model, and obtain the objective function of the joint-probability NMF model;
S302. According to the non-negative matrix factorization algorithm, optimize the likelihood function to obtain the iterative update formulas;
S303. Iterate according to the update formulas until the convergence condition is met;
S304. From the membership matrix output when the iteration terminates, compute the maximum membership degree of each node and output the non-overlapping community structure.
Further, preferably, the negative log-likelihood function of each prior probability in the joint Bayesian model in step S301 is computed as follows: take the negative logarithm of the joint Bayesian probability model and, under half-normal priors, obtain the negative log-likelihood functions of W1, W2, and H, respectively:
where -log p(w1 | β) denotes the negative log-likelihood function of W1, -log p(w2 | β) denotes the negative log-likelihood function of W2, and -log p(h | β) denotes the negative log-likelihood function of H; W1 denotes the association-strength matrix between nodes and communities, W2 denotes the association-strength matrix between attributes and communities, and H denotes the membership matrix of nodes and communities; w1,ik denotes the element in row i, column k of W1, w2,ik denotes the element in row i, column k of W2, and hkj denotes the element in row k, column j of H; μ is a constant, N is the number of nodes, M is the number of attributes, K is the number of communities, and β is the hyperparameter.
In the above negative log-likelihood functions, the number of communities is controlled by β: the larger β is, the closer the row vectors in W and the column vectors in H are to 0, that is, the less correlated the communities are. Each β follows a Gamma distribution, i.e.:
where p(β | ak, bk) denotes the Gamma distribution; ak is the shape parameter, bk is the scale parameter, Γ(ak) is the gamma function, and β ≥ 0.
Further, preferably, the iterative update formulas in step S302 are computed by minimizing the negative log-likelihood function in combination with the joint non-negative matrix factorization algorithm. Specifically: the standard form of non-negative matrix factorization gives the simplified form of the adjacency matrix, X ≈ HH^T, and the simplified form of the attribute matrix, Y ≈ WH^T; H, W, and H^T are substituted for W1, W2, and H, respectively, in the negative log-likelihood function; the gradients with respect to H, W, and β are computed; and the iterative update formulas are obtained by gradient descent. Here W denotes the matrix obtained after the substitution (it replaces W2).
Optionally, the iterative update formulas are as follows:
where ← denotes assignment, Y denotes the attribute matrix, I denotes the identity matrix, B denotes the diagonal matrix formed from the hyperparameters β, and H denotes the membership matrix of nodes and communities; βK denotes the iterated value of the hyperparameter β; a matrix with superscript T denotes the transpose of that matrix; a denotes the shape constant and b denotes the scale parameter, both of which are constants.
The overlapping community structure is obtained as follows: compute the absolute membership degree of each node from the node membership matrix and determine whether it exceeds a given threshold; if it does, assign the node to the existing community, thereby obtaining the overlapping community partition. The absolute membership degree is computed as:
where the minimum and maximum terms in the formula denote, respectively, the smallest and largest components of the node's membership-strength vector; hkj is a sub-vector of vij; and vij denotes an element of the adjacency matrix formed by the N nodes.
Compared with the prior art, the invention has the following advantages:
1) The invention proposes a new community discovery method that combines attribute and link information. It retains the advantages of community discovery based on node link information while incorporating the attribute information of the social network for collaborative community discovery, so it better reflects the multi-attribute nature of social networks and can discover topic communities.
2) The invention exploits the advantage of non-negative matrix factorization, namely that its "fuzzy clustering" idea provides the necessary theoretical basis for attribute-aware community discovery; derived with mathematical rigor, the resulting community discovery method is more representative and more stable.
3) The invention can discover both non-overlapping and overlapping communities, with higher accuracy and lower complexity, and is suitable for large-scale overlapping community mining.
Brief description of the drawings
Fig. 1 is the overall flow chart of an implementation of the invention;
Fig. 2 is the flow chart of adjacency matrix generation;
Fig. 3 is the flow chart of attribute matrix generation;
Fig. 4 is a diagram of the joint Bayesian probability model;
Fig. 5 is the flow chart of membership matrix computation.
Detailed description of the embodiments
The community discovery method combining link and attribute information of the invention is further described below with reference to specific embodiments and specific experimental data sets. As shown in Fig. 1, the invention comprises the following steps:
S1. Input social network data, and construct an adjacency matrix based on the link information and an attribute matrix based on the attribute information;
S2. Design a joint Bayesian probability model using the adjacency matrix and the attribute matrix;
S3. According to the joint Bayesian probability model, compute the latent node membership matrix of the adjacency matrix and the attribute matrix by joint matrix factorization, compute the maximum membership degree of each node in the membership matrix, and perform a preliminary non-overlapping community partition;
S4. Compute the absolute membership degree of each node in the membership matrix to obtain the overlapping community structure of the social network.
Further, in step S1, inputting the social network data and constructing the adjacency matrix based on the link information and the attribute matrix based on the attribute information comprises the following steps:
S201. Construct the topology graph of the social network from the input social network data, count the nodes, and denote the number of nodes by N;
S202. Construct an all-zero symmetric square matrix X0 of order N and update it iteratively: determine whether two nodes are directly connected in the social network graph; if they are, set the corresponding row-and-column entries of X0 to 1, otherwise set them to 0;
S203. Set the diagonal of the symmetric matrix X0 to 0, i.e., by default a node is not adjacent to itself;
S204. When the iteration terminates, output the updated X0 and denote it as the adjacency matrix X. The construction of the adjacency matrix is shown in Fig. 2.
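Steps S201 to S204 can be illustrated with a minimal Python sketch. It assumes the network arrives as an undirected edge list over nodes numbered 0 to N-1; the function name `build_adjacency` and the toy data are hypothetical and are not taken from the patent.

```python
def build_adjacency(num_nodes, edges):
    """Build the N x N adjacency matrix X from an undirected edge list."""
    # S202: start from an all-zero square matrix X0 of order N.
    x = [[0] * num_nodes for _ in range(num_nodes)]
    for u, v in edges:
        if u == v:
            continue  # S203: the diagonal stays 0 (no self-adjacency).
        x[u][v] = 1  # mark both directions so X stays symmetric
        x[v][u] = 1
    return x  # S204: the updated X0 is the adjacency matrix X.


# Toy network: a triangle 0-1-2, a pendant node 3, and one self-loop to ignore.
edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 3)]
X = build_adjacency(4, edges)
```

For large networks a sparse representation would normally be preferred over a dense list of lists; the dense form is used here only to mirror the wording of the steps.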
Further, the attribute matrix is constructed as shown in Fig. 3, with the following steps: count the number of attributes M and label the attributes from 1 to M; construct an all-zero attribute matrix Y0 with the M attributes as rows and the N nodes as columns, and update it iteratively: if a node has an attribute, set the corresponding row-and-column entry to 1, otherwise leave it at 0, thereby obtaining the attribute matrix Y.
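The attribute matrix construction can be sketched in the same way. The example assumes each node carries a set of attribute labels; the name `build_attribute_matrix` and the sample attributes are hypothetical, and rows are indexed from 0 rather than from 1 as in the text.

```python
def build_attribute_matrix(node_attrs, num_nodes):
    """Return (attribute labels, M x N 0/1 matrix Y) for a node -> attributes map."""
    # Collect the M distinct attributes and give each one a fixed row index.
    labels = sorted({a for attrs in node_attrs.values() for a in attrs})
    row_of = {label: r for r, label in enumerate(labels)}
    # All-zero Y0 with M attribute rows and N node columns.
    y = [[0] * num_nodes for _ in labels]
    for node, attrs in node_attrs.items():
        for a in attrs:
            y[row_of[a]][node] = 1  # node has attribute a
    return labels, y


node_attrs = {0: {"music"}, 1: {"music", "sports"}, 2: {"sports"}, 3: set()}
labels, Y = build_attribute_matrix(node_attrs, 4)
```

Sorting the labels is only a convenience that makes the row order reproducible; any fixed labelling of the M attributes would serve.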
The joint Bayesian probability model of step S2 is designed by considering that the adjacency matrix and the attribute matrix both contain latent community structure features. The joint Bayesian probability model is:
p(X, Y, W1, W2, H, β) = p(X | W1, H) p(Y | W2, H) p(W1 | β) p(W2 | β) p(H | β) p(β)
where W1 denotes the association-strength matrix between nodes and communities; W2 denotes the association-strength matrix between attributes and communities; H denotes the membership matrix of nodes and communities, and is determined jointly by the adjacency matrix and the attribute matrix; β is the hyperparameter; p(X | W1, H) denotes the conditional probability of X given the matrices W1 and H; p(Y | W2, H) denotes the conditional probability of Y given the matrices W2 and H; p(W1 | β) denotes the conditional probability of W1 given β; p(W2 | β) denotes the conditional probability of W2 given β; p(H | β) denotes the conditional probability of H given β; and p(β) denotes the probability of the hyperparameter β.
The joint Bayesian probability model is shown in Fig. 4, where vij denotes the interaction information of nodes and follows a Poisson distribution. According to Fig. 4, the model can be described as follows: in the matrix formed by the N nodes, each element vij can be represented by three latent sub-vectors hkj, w1,ik, and w2,ik. The interaction of these three sub-vectors jointly determines node membership, and the three share a common factor βk, which determines how community k is divided among them. The interaction between hkj and w1,ik is expressed as the node adjacency matrix X; the interaction between hkj and w2,ik is expressed as the node attribute matrix Y.
Further, preferably, the preliminary non-overlapping community partition of step S3 comprises:
S301. According to the joint Bayesian probability model, compute the negative log-likelihood function of each prior probability in the model, and obtain the objective function of the joint-probability NMF model;
S302. According to the non-negative matrix factorization algorithm, optimize the likelihood function to obtain the iterative update formulas;
S303. Iterate according to the update formulas until the convergence condition is met;
S304. From the membership matrix output when the iteration terminates, compute the maximum membership degree of each node and output the non-overlapping community structure.
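Step S304 amounts to assigning each node to the community in which its membership is largest. A minimal sketch follows, assuming the membership matrix is stored with one row per node and one column per community (the N x K orientation of H in X ≈ HH^T); the name `hard_partition` and the numbers are illustrative only.

```python
def hard_partition(h):
    """Group nodes by the community with the largest membership value."""
    communities = {}
    for node, row in enumerate(h):
        # Index of the maximum membership; ties go to the lowest index.
        best = max(range(len(row)), key=lambda k: row[k])
        communities.setdefault(best, []).append(node)
    return communities


# Hypothetical converged membership matrix for 4 nodes and 2 communities.
H = [[0.9, 0.1], [0.8, 0.3], [0.2, 0.7], [0.45, 0.55]]
partition = hard_partition(H)
```

Every node lands in exactly one community here, which is why this step yields only the preliminary, non-overlapping partition.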
Further, preferably, the negative log-likelihood function of each prior probability in the joint Bayesian model in step S301 is computed as follows: take the negative logarithm of the joint Bayesian probability model and, under half-normal priors, obtain the negative log-likelihood functions of W1, W2, and H, respectively:
where -log p(w1 | β) denotes the negative log-likelihood function of W1, -log p(w2 | β) denotes the negative log-likelihood function of W2, and -log p(h | β) denotes the negative log-likelihood function of H; W1 denotes the association-strength matrix between nodes and communities, W2 denotes the association-strength matrix between attributes and communities, and H denotes the membership matrix of nodes and communities; w1,ik denotes the element in row i, column k of W1, w2,ik denotes the element in row i, column k of W2, and hkj denotes the element in row k, column j of H; μ is a constant, N is the number of nodes, M is the number of attributes, K is the number of communities, and β is the hyperparameter.
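The formulas themselves are not reproduced in this text. As a hedged sketch only, the following is the form such terms would take under the common assumption that each column k of W1, W2 and each row k of H has independent half-normal entries with precision β_k; it uses only the symbols defined above, and the constant μ absorbs the terms that do not depend on the matrices or on β. It should not be read as the patent's exact expressions.

```latex
\begin{aligned}
-\log p(W_1 \mid \beta) &= \sum_{k=1}^{K}\left(\frac{\beta_k}{2}\sum_{i=1}^{N} w_{1,ik}^{2} - \frac{N}{2}\log\beta_k\right) + \mu,\\
-\log p(W_2 \mid \beta) &= \sum_{k=1}^{K}\left(\frac{\beta_k}{2}\sum_{i=1}^{M} w_{2,ik}^{2} - \frac{M}{2}\log\beta_k\right) + \mu,\\
-\log p(H \mid \beta) &= \sum_{k=1}^{K}\left(\frac{\beta_k}{2}\sum_{j=1}^{N} h_{kj}^{2} - \frac{N}{2}\log\beta_k\right) + \mu.
\end{aligned}
```

Each line follows from the half-normal density $p(w) = \sqrt{2\beta_k/\pi}\,\exp(-\beta_k w^2/2)$ for $w \ge 0$, summed over the entries that share the same $\beta_k$.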
Further, in the above negative log-likelihood functions, the number of communities is controlled by β: the larger β is, the closer the column vectors in W and the column vectors in H are to 0, that is, the less correlated the communities are. Each β follows a Gamma distribution, i.e.:
where p(β | ak, bk) denotes the Gamma distribution; ak is the shape parameter, bk is the scale parameter, Γ(ak) is the gamma function, and β ≥ 0.
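The density is not reproduced in this text. For reference, the standard Gamma density in the rate parameterization is sketched below using the symbols a_k, b_k, and Γ(a_k) defined above. Whether the patent intends b_k as a rate or as a true scale parameter cannot be determined from the text; if it is a scale parameter, b_k would be replaced by 1/b_k.

```latex
p(\beta_k \mid a_k, b_k) = \frac{b_k^{a_k}}{\Gamma(a_k)}\,\beta_k^{a_k - 1}\,e^{-b_k \beta_k}, \qquad \beta_k \ge 0 .
```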
The iterative update formulas in step S302 are computed by minimizing the negative log-likelihood function in combination with the joint non-negative matrix factorization algorithm. Specifically: the tri-factorization form of non-negative matrix factorization gives the simplified form of the adjacency matrix, X ≈ HH^T, and the standard form of non-negative matrix factorization gives the simplified form of the attribute matrix, Y ≈ WH^T; H, W, and H^T are substituted for W1, W2, and H, respectively, in the negative log-likelihood function; the gradients with respect to H, W, and β are computed; and the iterative update formulas are obtained by gradient descent. Here W denotes the matrix obtained after the substitution (it replaces W2).
Optionally, the iterative update formulas are as follows:
where ← denotes assignment, W denotes the matrix obtained after the substitution, Y denotes the attribute matrix, I denotes the identity matrix, B denotes the diagonal matrix formed from the hyperparameters β, and H denotes the membership matrix of nodes and communities; a matrix with superscript T denotes the transpose of that matrix; a denotes the shape constant and b denotes the scale parameter, both of which are constants.
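Since the update formulas are not reproduced in this text, the sketch below does not implement them. It swaps in a simpler stand-in that keeps the same structure: projected gradient descent on a squared-error objective, ||X - HH^T||² + ||Y - WH^T||², with a single fixed penalty weight `beta` in place of the learned hyperparameters and in place of the Poisson likelihood mentioned for Fig. 4. The names `joint_nmf`, `lr`, `steps`, and `seed` are hypothetical.

```python
import random


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def residual(approx, target):
    """Element-wise approx - target."""
    return [[p - q for p, q in zip(ra, rt)] for ra, rt in zip(approx, target)]


def joint_nmf(x, y, k, beta=0.1, lr=0.005, steps=300, seed=0):
    """Fit X ~ H H^T and Y ~ W H^T with a shared non-negative H (N x K)."""
    rng = random.Random(seed)
    n, m = len(x), len(y)
    h = [[rng.random() for _ in range(k)] for _ in range(n)]
    w = [[rng.random() for _ in range(k)] for _ in range(m)]
    for _ in range(steps):
        ht = transpose(h)
        r1 = residual(matmul(h, ht), x)       # N x N, symmetric when X is
        r2 = residual(matmul(w, ht), y)       # M x N
        gh_link = matmul(r1, h)               # gradient part from the links
        gh_attr = matmul(transpose(r2), w)    # gradient part from the attributes
        gw = matmul(r2, h)
        # Gradient step, then projection onto the non-negative orthant.
        new_h = [[max(0.0, h[i][c] - lr * (4 * gh_link[i][c]
                                           + 2 * gh_attr[i][c]
                                           + beta * h[i][c]))
                  for c in range(k)] for i in range(n)]
        w = [[max(0.0, w[a][c] - lr * (2 * gw[a][c] + beta * w[a][c]))
              for c in range(k)] for a in range(m)]
        h = new_h
    return h, w


# Toy data: two triangles joined by the edge 2-3, each with its own attribute.
X = [[0, 1, 1, 0, 0, 0], [1, 0, 1, 0, 0, 0], [1, 1, 0, 1, 0, 0],
     [0, 0, 1, 0, 1, 1], [0, 0, 0, 1, 0, 1], [0, 0, 0, 1, 1, 0]]
Y = [[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]]
H_fit, W_fit = joint_nmf(X, Y, k=2)
```

The point of the sketch is the coupling: the same H appears in both reconstruction terms, so link and attribute information shape the memberships together. The step size and iteration count are untuned guesses for this toy input.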
The overlapping community structure is obtained as follows: compute the absolute membership degree of each node from the node membership matrix and determine whether it exceeds a given threshold; if it does, assign the node to the existing community, thereby obtaining the overlapping community partition. The computation of the membership matrix is shown in Fig. 5. The absolute membership degree is computed as:
where the minimum and maximum terms in the formula denote, respectively, the smallest and largest components of the node's membership-strength vector; hkj is a sub-vector of vij; and vij denotes an element of the adjacency matrix formed by the N nodes.
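The formula for the absolute membership degree is not reproduced in this text. Because it is described in terms of the minimum and maximum components of a node's membership vector, the sketch below assumes a min-max normalization of each node's row of H; that definition, the name `overlapping_communities`, and the threshold value are assumptions made for illustration.

```python
def overlapping_communities(h, threshold=0.8):
    """Return, for each node, every community whose normalized membership exceeds the threshold."""
    result = []
    for row in h:
        lo, hi = min(row), max(row)
        span = hi - lo
        if span == 0:
            # Assumed convention: equal memberships count fully for every community.
            scores = [1.0] * len(row)
        else:
            # Assumed "absolute membership": min-max normalized membership.
            scores = [(v - lo) / span for v in row]
        result.append([k for k, s in enumerate(scores) if s > threshold])
    return result


# Hypothetical membership matrix: 3 nodes (rows) by 3 communities (columns).
H = [[0.9, 0.1, 0.0], [0.5, 0.45, 0.05], [0.1, 0.2, 0.7]]
overlap = overlapping_communities(H, threshold=0.8)
```

In this example node 1 is placed in both community 0 and community 1, because its second-largest membership is close to its largest; nodes 0 and 2 each stay in a single community.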
The present invention provides a community discovery method combining link and attribute information. It first computes the adjacency matrix and the attribute matrix from the data features of the social network and, on this basis, designs a joint Bayesian probability model over the two matrices. The model is then optimized, and the iterative update formulas are derived using the non-negative matrix factorization algorithm. Next, the node membership matrix is computed from the update formulas, and the maximum membership degree is used to find the non-overlapping community structure. Finally, the absolute membership degree of each node is computed iteratively to find the overlapping communities. The method is based on a joint Bayesian probabilistic non-negative matrix factorization model, has a rigorous mathematical foundation, and gives better community partition results when applied to community discovery in real social networks.
The embodiments above further describe the objects, technical solutions, and advantages of the invention in detail. It should be understood that they are only preferred embodiments and are not intended to limit the invention; any modification, equivalent substitution, or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims (10)

1. A community discovery method combining link and attribute information, characterized by comprising:
S1. Input social network data, and construct an adjacency matrix based on the link information and an attribute matrix based on the attribute information;
S2. Design a joint Bayesian probability model using the adjacency matrix based on the link information and the attribute matrix based on the attribute information;
S3. According to the joint Bayesian probability model, compute the latent node membership matrix of the adjacency matrix and the attribute matrix by joint matrix factorization, compute the maximum membership degree of each node in the membership matrix, and perform a preliminary non-overlapping community partition;
S4. Compute the absolute membership degree of each node in the membership matrix to obtain the overlapping community structure of the social network.
2. The community discovery method combining link and attribute information according to claim 1, characterized in that the adjacency matrix is constructed as follows:
S201. Construct the topology graph of the social network from the input social network data, count the nodes, and denote the number of nodes by N;
S202. Construct an all-zero symmetric square matrix X0 of order N and update it iteratively: determine whether two nodes are directly connected in the social network graph; if they are, set the corresponding row-and-column entries of X0 to 1, otherwise set them to 0;
S203. Set the diagonal of the symmetric matrix X0 to 0, i.e., by default a node is not adjacent to itself;
S204. When the iteration terminates, output the updated X0 and denote it as the adjacency matrix X.
3. The community discovery method combining link and attribute information according to claim 2, characterized in that the attribute matrix is constructed as follows: count the number of attributes M and label the attributes from 1 to M; construct an all-zero attribute matrix Y0 with the M attributes as rows and the N nodes as columns, and update it iteratively: if a node has an attribute, set the corresponding row-and-column entry to 1, otherwise leave it at 0, thereby obtaining the attribute matrix Y.
4. The community discovery method combining link and attribute information according to claim 3, characterized in that the joint Bayesian probability model of step S2 is:
p(X, Y, W1, W2, H, β) = p(X | W1, H) p(Y | W2, H) p(W1 | β) p(W2 | β) p(H | β) p(β)
where W1 denotes the association-strength matrix between nodes and communities; W2 denotes the association-strength matrix between attributes and communities; H denotes the membership matrix of nodes and communities, and is determined jointly by the adjacency matrix and the attribute matrix; β is the hyperparameter; p(X | W1, H) denotes the conditional probability of X given the matrices W1 and H; p(Y | W2, H) denotes the conditional probability of Y given the matrices W2 and H; p(W1 | β) denotes the conditional probability of W1 given β; p(W2 | β) denotes the conditional probability of W2 given β; p(H | β) denotes the conditional probability of H given β; and p(β) denotes the probability of the hyperparameter β.
5. The community discovery method combining link and attribute information according to any one of claims 1-4, characterized in that the preliminary non-overlapping community division described in step S3 comprises:
S301: according to the joint Bayesian probability model, compute the negative log-likelihood function of each prior probability in the model, and obtain the objective function of the joint-probability NMF model;
S302: according to the non-negative matrix factorization algorithm, optimize and decompose the likelihood function to obtain the iterative update formulas;
S303: iterate according to the iterative update formulas until the convergence condition is met;
S304: from the membership matrix output when the iteration terminates, compute the maximum membership of each node and output the non-overlapping community structure.
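Step S304 amounts to assigning each node to the community in which its membership is largest. The sketch below illustrates only that final step; it assumes the membership matrix is stored as a list of K rows (communities) with N entries each (nodes), and the function name `assign_non_overlapping` is hypothetical.

```python
def assign_non_overlapping(h):
    """Assign every node to the community with its maximum membership.

    h: membership matrix as a list of K rows; h[k][j] is the membership of
    node j in community k (storage layout assumed for this sketch).
    Returns a list of length N with the chosen community index per node.
    Ties are resolved in favor of the lowest community index.
    """
    if not h:
        return []
    community_count = len(h)
    node_count = len(h[0])
    return [
        max(range(community_count), key=lambda k: h[k][j])
        for j in range(node_count)
    ]
```

Because every node receives exactly one label, the result is a partition of the nodes, which is what makes this division non-overlapping.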
6. The community discovery method combining link and attribute information according to claim 5, characterized in that the negative log-likelihood function of each prior probability in the joint Bayesian model in step S301 is computed as follows: take the negative logarithm of the joint Bayesian probability model and, under a half-normal prior, obtain the negative log-likelihood functions of W1, W2 and H, respectively:
where -log p(w1 | β) denotes the negative log-likelihood function of W1, -log p(w2 | β) denotes the negative log-likelihood function of W2, and -log p(h | β) denotes the negative log-likelihood function of H; w1,ik denotes the element in row i, column k of the matrix W1; w2,ik denotes the element in row i, column k of the matrix W2; hkj denotes the element in row k, column j of the matrix H; μ is a constant; N denotes the number of nodes; M denotes the number of attributes; K denotes the number of communities; and β denotes the hyperparameter.
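The formulas of claim 6 are not reproduced in this text, so the following is only a sketch of what a half-normal negative log-likelihood looks like in general. It assumes that each community k has its own precision β_k and that the prior density of a non-negative entry w is sqrt(2β_k/π)·exp(-β_k·w²/2); this parameterization and the function name `half_normal_nll` are assumptions, not taken from the claims. The additive constant in the code plays the role that the constant μ plays in the claim.

```python
import math


def half_normal_nll(matrix, beta):
    """Negative log-likelihood of a non-negative matrix under half-normal priors.

    matrix: list of rows, each with K non-negative entries.
    beta:   list of K positive precisions, one per column (assumed
            parameterization; the patent's exact form is not shown here).
    """
    row_count = len(matrix)
    k_count = len(beta)

    # Quadratic term: sum over all entries of 0.5 * beta_k * w_ik^2.
    quadratic = sum(
        0.5 * beta[k] * row[k] ** 2 for row in matrix for k in range(k_count)
    )
    # Normalization term depending on beta: -(rows / 2) * sum_k log(beta_k).
    log_term = -0.5 * row_count * sum(math.log(b) for b in beta)
    # Constant independent of the matrix and of beta.
    constant = -0.5 * row_count * k_count * math.log(2.0 / math.pi)
    return quadratic + log_term + constant
```

Under this assumed form, larger β_k penalizes large entries in column k more strongly, which is how such a prior can shrink unneeded communities.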
7. The community discovery method combining link and attribute information according to claim 5, characterized in that the iterative update formulas in step S302 are computed as follows: minimize the negative log-likelihood function and derive the iterative update formulas with the joint non-negative matrix factorization algorithm, specifically: obtain the reduced form of the adjacency matrix using the tri-factorization form of non-negative matrix factorization, i.e. X ≈ HH^T; obtain the reduced form of the attribute matrix using the standard form of non-negative matrix factorization, i.e. Y ≈ WH^T; substitute H, W and H^T for W1, W2 and H, respectively, in the negative log-likelihood function; compute the gradients with respect to H, W and β; and obtain the iterative update formulas by gradient descent; where W denotes the adjacency matrix after substitution.
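To illustrate how gradients arise from the two approximations X ≈ HH^T and Y ≈ WH^T, the derivation below uses a squared Frobenius reconstruction loss and omits the prior terms. Both choices are assumptions made for this sketch; the loss actually used in the patent and its update formulas are not reproduced in this text. Here X is N x N and symmetric, Y is M x N, H is N x K, W is M x K, and η is a hypothetical step size.

```latex
f(H, W) = \tfrac{1}{2}\,\lVert X - H H^{T} \rVert_F^{2}
        + \tfrac{1}{2}\,\lVert Y - W H^{T} \rVert_F^{2}

\nabla_H f = -2\,(X - H H^{T})\,H - (Y - W H^{T})^{T}\,W

\nabla_W f = -(Y - W H^{T})\,H

H \leftarrow \max\bigl(0,\; H - \eta\,\nabla_H f\bigr), \qquad
W \leftarrow \max\bigl(0,\; W - \eta\,\nabla_W f\bigr)
```

The element-wise maximum with zero keeps both factors non-negative after each gradient step.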
8. The community discovery method combining link and attribute information according to claim 7, characterized in that the iterative update formulas are as follows:
where ← denotes assignment; I denotes the identity matrix; B denotes the diagonal matrix formed from the hyperparameter β; H denotes the membership matrix of nodes and communities; β_K denotes the iterated value of the hyperparameter β; a denotes the shape constant; and b denotes the scale parameter.
9. The community discovery method combining link and attribute information according to any one of claims 4 or 6-9, characterized in that the values of β all follow a gamma distribution, i.e.:
where p(β | ak, bk) denotes the gamma distribution followed by β; ak denotes the shape parameter; bk denotes the scale parameter; Γ(ak) denotes the gamma function; and β ≥ 0.
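The density formula of claim 9 is not reproduced in this text. As a hedged illustration, the sketch below evaluates a gamma density with shape a and with b treated as a rate, i.e. p(β | a, b) = b^a / Γ(a) · β^(a-1) · exp(-bβ). Treating b as a rate rather than a scale is an assumption of this sketch, and the function name `gamma_density` is hypothetical.

```python
import math


def gamma_density(beta, a, b):
    """Gamma density at beta with shape a > 0 and rate b > 0.

    The rate parameterization is an assumption of this sketch; with a scale
    parameter s instead, use b = 1 / s.
    """
    if beta < 0:
        return 0.0  # the distribution is supported on beta >= 0
    if beta == 0 and a < 1:
        return math.inf  # the density diverges at zero when the shape is below 1
    return (b ** a) / math.gamma(a) * beta ** (a - 1) * math.exp(-b * beta)
```

With a = 1 the expression reduces to the exponential density b·exp(-bβ), which gives a quick sanity check.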
10. The community discovery method combining link and attribute information according to claim 1, characterized in that the overlapping community structure of the social network is divided as follows: compute the absolute membership of each node from the node membership matrix, and judge whether the absolute membership exceeds a given threshold; if it does, assign the node to the existing community, thereby obtaining the overlapping community division;
where the absolute membership is computed as:
where the formula uses the minimum component and the maximum component of the node's membership-strength vector; hkj denotes the element in row k, column j of the matrix H; and H denotes the membership matrix of nodes and communities.
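The thresholding step of claim 10 can be sketched as below. The formula for the absolute membership is not reproduced in this text, so the default used here, min-max normalization (h - min) / (max - min) of each node's membership vector, is only a hypothetical stand-in chosen because the claim mentions the minimum and maximum components. The function names and the K x N storage layout are likewise assumptions.

```python
def overlapping_communities(h, threshold, absolute_membership=None):
    """Return, for each node, the list of communities it is assigned to.

    h: membership matrix as a list of K rows; h[k][j] is the membership of
       node j in community k (layout assumed for this sketch).
    threshold: a node joins community k when its score exceeds this value.
    absolute_membership: optional callable (value, lowest, highest) -> score;
       defaults to min-max normalization, a hypothetical stand-in for the
       patent's formula.
    """
    def min_max(value, lowest, highest):
        # A flat membership vector gives every community the same top score.
        if highest == lowest:
            return 1.0
        return (value - lowest) / (highest - lowest)

    score = absolute_membership or min_max
    community_count = len(h)
    node_count = len(h[0]) if h else 0

    result = []
    for j in range(node_count):
        column = [h[k][j] for k in range(community_count)]
        lowest, highest = min(column), max(column)
        result.append(
            [k for k in range(community_count)
             if score(column[k], lowest, highest) > threshold]
        )
    return result
```

A node whose two largest memberships are close together ends up in both communities, which is what produces overlap; raising the threshold makes the division approach the non-overlapping one.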

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810071418.XA CN108334580A (en) 2018-01-25 2018-01-25 A kind of community discovery method of combination link and attribute information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810071418.XA CN108334580A (en) 2018-01-25 2018-01-25 A kind of community discovery method of combination link and attribute information

Publications (1)

Publication Number Publication Date
CN108334580A true CN108334580A (en) 2018-07-27

Family

ID=62926629

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810071418.XA Pending CN108334580A (en) 2018-01-25 2018-01-25 A kind of community discovery method of combination link and attribute information

Country Status (1)

Country Link
CN (1) CN108334580A (en)

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109508455A (en) * 2018-10-18 2019-03-22 山西大学 A kind of GloVe hyper parameter tuning method
CN109508455B (en) * 2018-10-18 2021-11-19 山西大学 GloVe super-parameter tuning method
CN109859063A (en) * 2019-01-18 2019-06-07 河北工业大学 A kind of community discovery method, device, storage medium and terminal device
CN109859063B (en) * 2019-01-18 2023-05-05 河北工业大学 Community discovery method and device, storage medium and terminal equipment
CN109949176A (en) * 2019-03-28 2019-06-28 南京邮电大学 It is a kind of based on figure insertion social networks in abnormal user detection method
CN109949176B (en) * 2019-03-28 2022-07-15 南京邮电大学 Graph embedding-based method for detecting abnormal users in social network
CN110334285B (en) * 2019-07-04 2021-08-06 仲恺农业工程学院 Symbolic network community discovery method based on structural balance constraint
CN110334285A (en) * 2019-07-04 2019-10-15 仲恺农业工程学院 A kind of symbolic network community discovery method based on constitutional balance constraint
CN110851732A (en) * 2019-10-28 2020-02-28 天津大学 Attribute network semi-supervised community discovery method based on non-negative matrix three-factor decomposition
CN111047453A (en) * 2019-12-04 2020-04-21 兰州交通大学 Detection method and device for decomposing large-scale social network community based on high-order tensor
CN112084418A (en) * 2020-07-29 2020-12-15 浙江工业大学 Microblog user community discovery method based on neighbor information and attribute network representation learning
CN112084418B (en) * 2020-07-29 2023-07-28 浙江工业大学 Microblog user community discovery method based on neighbor information and attribute network characterization learning
CN112464107A (en) * 2020-11-26 2021-03-09 重庆邮电大学 Social network overlapping community discovery method and device based on multi-label propagation
CN112487110A (en) * 2020-12-07 2021-03-12 中国船舶重工集团公司第七一六研究所 Overlapped community evolution analysis method and system based on network structure and node content
CN113158080A (en) * 2021-04-27 2021-07-23 华南师范大学 Community discovery method, system and device based on fusion attribute and storage medium
CN113158080B (en) * 2021-04-27 2023-07-11 华南师范大学 Community discovery method, system, device and storage medium based on fusion attribute
CN113516562A (en) * 2021-07-28 2021-10-19 中移(杭州)信息技术有限公司 Family social network construction method, device, equipment and storage medium
CN113516562B (en) * 2021-07-28 2023-09-19 中移(杭州)信息技术有限公司 Method, device, equipment and storage medium for constructing family social network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180727