CN106934494B - Symbol prediction method and device - Google Patents


Info

Publication number
CN106934494B
Authority
CN
China
Prior art keywords
model
node
network model
network
block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710110775.8A
Other languages
Chinese (zh)
Other versions
CN106934494A (en)
Inventor
赵学华
陈慧灵
韩丽屏
李晓堂
詹峰
刘学艳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Institute of Information Technology
Original Assignee
Shenzhen Institute of Information Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Institute of Information Technology
Priority to CN201710110775.8A
Publication of CN106934494A
Application granted
Publication of CN106934494B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G06Q50/00: Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/01: Social networking

Abstract

The invention is applicable to the field of social networks, and provides a symbol prediction method and a symbol prediction device. The method comprises the following steps: defining an adjacency matrix to represent a social network, and constructing and initializing a network model of the social network; fitting the network model with the adjacency matrix, and calculating posterior approximate distribution of model parameters of the network model; selecting an optimal model based on a model selection standard and posterior approximate distribution of the model parameters; and performing symbol prediction according to a predefined algorithm based on the optimal model. The method can improve the accuracy of symbol prediction.

Description

Symbol prediction method and device
Technical Field
The embodiment of the invention belongs to the field of social networks, and particularly relates to a symbol prediction method and a symbol prediction device.
Background
In recent years, symbol prediction has become an important topic in social network research. In a social network, positive links generally represent relationships such as friendship, liking and trust, while negative links represent relationships such as hostility, dislike and distrust. Symbol prediction, that is, predicting whether a link in a signed social network is positive or negative, aims to predict the relationships, including antagonistic ones, that may arise between individuals. It has important application value for personalized recommendation in social networks, identification of abnormal nodes in the network, user clustering and the like.
In the prior art, existing symbol prediction algorithms are mainly based on belief propagation models, on matrix decomposition or matrix filling, or on network structure and network context. Specifically, Guha et al. observed that each user in a signed social network provides only a small number of trust or distrust relationships, and derived 4 propagation models for predicting positive and negative relationships, thereby predicting whether any two users in the network trust each other; Hsieh et al. transformed the symbol prediction problem into a low-rank matrix filling problem and used a low-rank matrix filling algorithm to predict the unknown edge symbols in the network. However, these two symbol prediction algorithms have low calculation accuracy and are not suitable for the field of complex networks.
Therefore, a new technical solution is needed to solve the above technical problems.
Disclosure of Invention
In view of this, embodiments of the present invention provide a symbol prediction method and apparatus, which are used to solve the problem that the existing symbol prediction method is low in computation efficiency and is not suitable for being applied to the field of complex networks.
The embodiment of the invention is realized in such a way that a symbol prediction method comprises the following steps:
defining an adjacency matrix to represent a social network, and constructing and initializing a network model of the social network;
fitting the network model with the adjacency matrix, and calculating posterior approximate distribution of model parameters of the network model;
selecting an optimal model based on a model selection standard and posterior approximate distribution of the model parameters;
and performing symbol prediction according to a predefined algorithm based on the optimal model.
Another object of an embodiment of the present invention is to provide a symbol prediction apparatus, including:
the construction unit is used for defining an adjacency matrix to represent a social network, constructing a network model of the social network and initializing the network model;
a fitting unit configured to fit the network model to the adjacency matrix and calculate posterior approximate distribution of model parameters of the network model;
the optimal model selecting unit is used for selecting an optimal model based on model selection standards and posterior approximate distribution of the model parameters;
and the symbol prediction unit is used for performing symbol prediction according to a predefined algorithm based on the optimal model.
In the embodiment of the invention, an adjacency matrix is defined to represent a social network, a network model of the social network is constructed and initialized, the network model is fitted with the adjacency matrix, the posterior approximate distribution of model parameters of the network model is calculated, an optimal model is selected based on a model selection standard and the posterior approximate distribution of the model parameters, and finally symbol prediction is carried out according to a predefined algorithm based on the optimal model.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments or of the prior art are briefly described below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a flow chart of a symbol prediction method according to a first embodiment of the present invention;
FIG. 2 is a diagram illustrating a symbol prediction method according to a first embodiment of the present invention;
FIG. 3 is a block diagram of a symbol prediction apparatus according to a second embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the specification of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
In order to explain the technical means of the present invention, the following description will be given by way of specific examples.
The first embodiment is as follows:
FIG. 1 shows a flowchart of a symbol prediction method according to a first embodiment of the present invention, which is detailed as follows:
Step S11, defining an adjacency matrix to represent the social network, and constructing and initializing a network model of the social network.
In the present embodiment, an adjacency matrix $A$ is defined to represent the social network N, and the element $a_{ij}$ of the adjacency matrix $A$ represents the link between node $i$ and node $j$ in the social network N, where $i$ and $j$ denote node $i$ and node $j$, respectively. $a_{ij} = 1$ indicates that there is a positive link between node $i$ and node $j$, $a_{ij} = -1$ indicates that there is a negative link between node $i$ and node $j$, and $a_{ij} = 0$ indicates that there is no link between node $i$ and node $j$.
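The encoding of the adjacency matrix described above can be sketched as follows. This is only an illustration: the function name `build_signed_adjacency`, the edge-list input format and the treatment of links as undirected are assumptions made for the example and are not stated in the original text.

```python
def build_signed_adjacency(n, signed_edges):
    """Return an n x n matrix (list of lists) with entries in {1, -1, 0}.

    signed_edges is a list of (i, j, sign) tuples, with sign = +1 for a
    positive link and sign = -1 for a negative link.
    """
    A = [[0] * n for _ in range(n)]  # 0 means "no link"
    for i, j, sign in signed_edges:
        if sign not in (1, -1):
            raise ValueError("sign must be +1 or -1")
        A[i][j] = sign
        A[j][i] = sign  # assumption: links are undirected
    return A


# Hypothetical 4-node network: 0 and 1 are friends, 1 and 2 are hostile.
A = build_signed_adjacency(4, [(0, 1, 1), (1, 2, -1)])
```

Node 3 has no links in this toy network, so its row and column contain only zeros.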
A network model NM of the social network N is constructed and initialized, where $NM = (n, K, Z, \pi, \Omega)$ and $n$, $K$, $Z$, $\pi$ and $\Omega$ are all model parameters of the network model NM. $n$ represents the number of nodes in the network model; $K$ represents the number of blocks in the network model, that is, the number of communities contained in the social network, with $K \in [1, n]$; $Z$ is an $n \times K$-dimensional indicator matrix indicating the block to which each node belongs; $\pi$ is a $K \times K \times 3$-dimensional array representing the probabilities of connection between the blocks in the network model; and $\Omega$ is a $K$-dimensional vector representing the probabilities that a node belongs to each block in the network model.
Specifically, in the present embodiment, in the network model $NM = (n, K, Z, \pi, \Omega)$, the parameter $\pi$ satisfies the following relationship:
Figure GDA0002874369540000041
and $\pi_{lq}$ has the following prior distribution:
Figure GDA0002874369540000042
where $\Gamma(x)$ denotes the gamma function and $h \in \{1, 2, 3\}$: $h = 1$ denotes that a positive link exists between node $i$ in block $l$ and node $j$ in block $q$, $h = 2$ denotes that no link exists between node $i$ in block $l$ and node $j$ in block $q$, and $h = 3$ denotes that a negative link exists between node $i$ in block $l$ and node $j$ in block $q$. $\pi_{lq1}$ indicates the probability that there is a positive link between the nodes of block $l$ and block $q$, $\pi_{lq2}$ indicates the probability that the nodes of block $l$ and block $q$ are not linked, and $\pi_{lq3}$ indicates the probability that there is a negative link between the nodes of block $l$ and block $q$.
Figure GDA0002874369540000043
denotes the parameter of the prior distribution of the parameter $\pi$, whose value range is $[0, +\infty)$.
The parameter Ω satisfies the following relationship:
Figure GDA0002874369540000044
and $\Omega$ has the following prior distribution:
Figure GDA0002874369540000051
where $\Gamma(x)$ denotes the gamma function, $K$ represents the number of blocks of the network model, $K$ is a positive integer, $K \in [K_{min}, K_{max}]$, $K_{min} \in [1, n]$ and $K_{max} \in [1, n]$. $\rho_0$ is a parameter in the prior distribution $p(\Omega)$ of the parameter $\Omega$, whose value range is $(0, +\infty)$.
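A minimal sketch of initializing the network model NM = (n, K, Z, π, Ω) is given below. The prior formulas themselves are not reproduced in this text; the sketch assumes, from the gamma-function normalizers and the sum-to-one constraints, that both priors are Dirichlet distributions, with symmetric parameters `eta0` and `rho0`. The function names, the dictionary layout and the default values are hypothetical.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one sample from Dirichlet(alpha) via normalized gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def init_network_model(n, K, eta0=1.0, rho0=1.0, seed=0):
    """Initialize NM = (n, K, Z, pi, omega) under the assumed Dirichlet priors."""
    rng = random.Random(seed)
    # omega[k]: probability that a node belongs to block k (sums to 1).
    omega = sample_dirichlet([rho0] * K, rng)
    # pi[l][q] = [P(positive), P(no link), P(negative)] between blocks l and q.
    pi = [[sample_dirichlet([eta0] * 3, rng) for _ in range(K)] for _ in range(K)]
    # Z[i][k] = 1 if node i is assigned to block k, else 0.
    Z = [[0] * K for _ in range(n)]
    for i in range(n):
        k = rng.choices(range(K), weights=omega)[0]
        Z[i][k] = 1
    return {"n": n, "K": K, "Z": Z, "pi": pi, "omega": omega}


model = init_network_model(n=5, K=3)
```

Each `pi[l][q]` triple and the vector `omega` sum to 1 by construction, which matches the constraints stated for π and Ω.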
Step S12, fitting the network model to the adjacency matrix, and calculating a posterior approximate distribution of model parameters of the network model.
In this embodiment, the network model NM is fitted to a social network N, that is, the network model NM is fitted to the adjacency matrix a, and the step S12 specifically includes:
A1, fitting the network model NM with the adjacency matrix $A$, wherein each value of $K$ corresponds to one network model NM, and estimating the posterior approximate distribution of the model parameters $\pi$, $\Omega$ and $Z$ in the network model NM.
Specifically, in the calculation process, $K \in [1, n]$, and each value of $K$ corresponds to one network model NM, that is, to one posterior approximate distribution of the parameters $\pi$, $\Omega$ and $Z$. The posterior approximate distribution of the model parameters $\pi$, $\Omega$ and $Z$ is calculated as follows:
Let the posterior approximate distribution of $(Z, \pi, \Omega)$ be $q(Z, \pi, \Omega)$, which has the following expressions:
Figure GDA0002874369540000052
Figure GDA0002874369540000053
Figure GDA0002874369540000054
Figure GDA0002874369540000055
where $Z_{ik}$ is an indicator variable indicating whether node $i$ belongs to block $k$: if $Z_{ik} = 1$, node $i$ belongs to block $k$; if $Z_{ik} = 0$, node $i$ does not belong to block $k$. $\tau_{ik}$ is a parameter in the posterior approximate distribution of the indicator vector $Z$, whose value range is $[0, 1]$, and $\eta_{lq}$ is a parameter of the posterior approximate distribution of the parameter $\pi$, whose value range is $[0, +\infty)$.
In the 1990s, variational inference developed rapidly for probabilistic models, and general variational methods under the Bayesian framework can be applied to hidden Markov models, mixtures of factor analyzers, linear dynamical systems, graphical models and the like. Variational Bayes is a class of techniques for approximating intractable integrals in Bayesian estimation and machine learning, and is mainly applied to complex statistical models. The posterior approximate distributions of the model parameters $\pi$, $\Omega$ and $Z$ are given by equations (5), (6), (7) and (8), so that statistical inference can be made about these parameters.
Step S13, selecting an optimal model based on the model selection criterion and the posterior approximate distribution of the model parameters.
In this embodiment, the step S13 specifically includes:
B1, obtaining a model selection criterion $H$ by inference based on the variational Bayesian method in combination with the network model NM. The specific form is as follows:
Figure GDA0002874369540000061
B2, calculating the evidence value $H_K$ of the network model NM corresponding to each value of $K$, based on the model selection criterion $H$ and the posterior approximate distribution of the model parameters $\pi$, $\Omega$ and $Z$.
Specifically, to obtain the evidence value $H_K$ (also called the marginal likelihood estimate) of the network model NM corresponding to each value of $K$, the optimal values of the parameters $(\tau, \eta, \rho)$ of the posterior approximate distribution of the model parameters $\pi$, $\Omega$ and $Z$ are determined first.
The parameters $(\tau, \eta, \rho, \eta_0, \rho_0)$ are initialized, where $\eta_0$ is a parameter in the prior distribution of the parameter $\pi$, $\rho_0$ is a parameter in the prior distribution of the parameter $\Omega$, $K \in [K_{min}, K_{max}]$, $K_{min} \in [1, n]$ and $K_{max} \in [1, n]$. The specific expressions shown in formulas (9), (10) and (11) are obtained by inference based on the variational Bayesian method and formulas (5), (6), (7) and (8) in step S12.
The parameter values of the approximate distribution of $Z$ are calculated according to the following formula (9), where $\tau_{il}$ represents the probability that node $i$ is assigned to block $l$.
Figure GDA0002874369540000071
where $\psi(\cdot)$ is the digamma function (the derivative of the logarithm of the gamma function), $\propto$ denotes "proportional to", and $\tau$ is normalized after calculation to satisfy
Figure GDA0002874369540000072
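The normalization step can be sketched as below. Since formula (9) and the normalization constraint are not reproduced in this text, the sketch makes two assumptions: that the right-hand side of formula (9) is available for each block as an unnormalized log-weight, and that normalization means the entries of τ for a given node sum to 1 over the blocks. The function name and the input values are hypothetical.

```python
import math


def normalize_log_weights(log_w):
    """Turn unnormalized log-weights for one node into probabilities.

    Subtracting the maximum before exponentiating (the log-sum-exp trick)
    avoids underflow when the log-weights are very negative.
    """
    m = max(log_w)
    exps = [math.exp(v - m) for v in log_w]
    s = sum(exps)
    return [e / s for e in exps]


# Made-up log-weights of one node i for K = 3 blocks.
tau_i = normalize_log_weights([-1050.0, -1051.0, -1049.0])
```

Exponentiating values such as -1050 directly would underflow to zero and make the division fail, which is why the maximum is subtracted first.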
The parameter values of the approximate distributions of the parameters pi and omega are calculated according to the following equations (10) (11), respectively:
Figure GDA0002874369540000073
Figure GDA0002874369540000074
At this point, a value of $K$ is fixed, where $K \in [K_{min}, K_{max}]$. A set of values of the parameters $(\tau, \eta, \rho)$ is calculated according to equations (9), (10) and (11), and the corresponding network model evidence value $H_{front}$ is calculated from this set of values and equation (12). The next set of values of the parameters $(\tau, \eta, \rho)$ is obtained by iterating equations (9), (10) and (11), and the corresponding network model evidence value $H_{next}$ is calculated according to equation (12). $H_{next}$ is then compared with $H_{front}$: when the difference between $H_{next}$ and $H_{front}$ is greater than the preset value $\delta$, the iterative calculation of the parameter values continues; when the difference between $H_{next}$ and $H_{front}$ is less than the preset value $\delta$, the iteration ends, the parameter values corresponding to the evidence value $H_{next}$ are taken as the optimal solution and output, and the evidence value of the network model corresponding to this value of $K$ is determined to be $H_{next}$. The parameter values of $(\tau, \eta, \rho)$ output for each value of $K$ and the corresponding network model evidence value $H_K$ are calculated according to this method.
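The stopping rule described above can be sketched as a generic loop. The callables `update` (one pass of equations (9) to (11)) and `evidence` (equation (12)) are hypothetical stand-ins, because the equations themselves are not reproduced in this text; the `max_iter` cap and the use of the absolute difference are added safeguards, not part of the original description.

```python
def fit_until_converged(params, update, evidence, delta=1e-6, max_iter=1000):
    """Iterate `update` until the evidence changes by less than `delta`."""
    h_front = evidence(params)
    for _ in range(max_iter):
        params = update(params)          # next set of parameter values
        h_next = evidence(params)        # its evidence value
        if abs(h_next - h_front) < delta:
            return params, h_next        # converged: output params and H_next
        h_front = h_next
    return params, h_front               # iteration cap reached


# Toy stand-ins: Newton's iteration for sqrt(2), scored by a concave objective.
x_opt, h_opt = fit_until_converged(
    1.0,
    update=lambda x: (x + 2.0 / x) / 2.0,
    evidence=lambda x: -(x * x - 2.0) ** 2,
)
```

In the toy run the loop stops once two successive objective values differ by less than `delta`, which mirrors the comparison of H_next with H_front.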
B3, selecting the network model NM corresponding to the maximum evidence value $H_K$ as the optimal model $NM_{optim}$.
In particular, for a given network model NM, the higher its evidence value, the better the network model NM fits the social network; therefore, the network model NM with the largest evidence value $H_K$ is selected as the optimal model $NM_{optim}$.
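Selecting the model with the largest evidence value amounts to an argmax over the candidate values of K. The sketch below assumes the evidence values have already been computed and stored in a dictionary keyed by K; the function name is hypothetical and the numbers are made up purely for illustration.

```python
def select_optimal_model(evidence_by_k):
    """Return (K, H_K) for the block count with the largest evidence value."""
    best_k = max(evidence_by_k, key=evidence_by_k.get)
    return best_k, evidence_by_k[best_k]


# Made-up evidence values for K = 2, 3, 4 (not results from the patent).
evidence_by_k = {2: -1520.4, 3: -1498.7, 4: -1503.2}
best_k, best_h = select_optimal_model(evidence_by_k)
```

With these illustrative numbers, K = 3 is selected because its evidence value is the least negative.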
Step S14, performing symbol prediction according to a predefined algorithm based on the optimal model.
In the present embodiment, the optimal model $NM_{optim}$ is obtained according to step S13, and step S14 specifically includes:
performing symbol prediction according to a predefined formula, based on the posterior approximate distribution of the parameter $\pi$ in the optimal model $NM_{optim}$, wherein the posterior approximate distribution $q(\pi)$ of the parameter $\pi$ is obtained according to equation (7) in step S12.
Specifically, the predefined algorithm is as follows:
the link sign between node i and node j is determined according to the following equation (13):
Figure GDA0002874369540000081
specifically, l and q denote block l and block q, respectively, where node i belongs to block l, node j belongs to block q,
Figure GDA0002874369540000082
indicating the probability that there is a positive link between node i in block l and node j in block q,
Figure GDA0002874369540000083
indicating the probability that there is a negative link between node i in block l and node j in block q,
Figure GDA0002874369540000084
Figure GDA0002874369540000085
$\eta_{lq}$ represents a parameter value of the posterior approximate distribution of the parameter $\pi$.
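One possible reading of this prediction rule is sketched below. Equation (13) and the accompanying probability formulas are not reproduced in this text, so the sketch rests on assumptions: the posterior approximate distribution of π_lq is taken to be a Dirichlet distribution with parameters η_lq, the link probabilities are taken as its posterior means, each node is assigned to its most probable block according to τ, and a tie is resolved as a positive link. The function name and the toy values are hypothetical.

```python
def predict_sign(i, j, tau, eta):
    """Predict +1 or -1 for the link between nodes i and j.

    tau[i][k]  : probability that node i belongs to block k.
    eta[l][q]  : [eta_lq1, eta_lq2, eta_lq3] for positive / no link / negative.
    """
    # Hard block assignment: most probable block of each node.
    l = max(range(len(tau[i])), key=lambda k: tau[i][k])
    q = max(range(len(tau[j])), key=lambda k: tau[j][k])
    total = sum(eta[l][q])
    p_pos = eta[l][q][0] / total  # assumed posterior mean of pi_lq1
    p_neg = eta[l][q][2] / total  # assumed posterior mean of pi_lq3
    return 1 if p_pos >= p_neg else -1  # assumption: ties count as positive


# Toy values for 3 nodes and K = 2 blocks (illustrative only).
tau = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3]]
eta = [
    [[8.0, 1.0, 1.0], [1.0, 2.0, 7.0]],
    [[1.0, 2.0, 7.0], [6.0, 3.0, 1.0]],
]
sign_01 = predict_sign(0, 1, tau, eta)
sign_02 = predict_sign(0, 2, tau, eta)
```

In the toy data, nodes 0 and 2 fall in the same block, where positive links dominate, while nodes 0 and 1 fall in different blocks, where negative links dominate.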
Specifically, taking an application example as shown in FIG. 2: an adjacency matrix $A$ is defined to represent the social network N, and the range of the number of communities, that is, the number of blocks $K$, is defined as $[K_{min}, K_{max}]$. Let $K = K_{min}$. A network model $NM_K$ corresponding to the social network is built and initialized, the posterior approximate distribution of the parameters $\pi$, $\Omega$ and $Z$ of the network model $NM_K$ is estimated, and the evidence value $H_K$ of the network model $NM_K$ is calculated. It is then judged whether $K$ equals $K_{max}$: if not, let $K = K + 1$ and repeat the above steps; if $K = K_{max}$, the network model $NM_K$ with the largest evidence value $H_K$ is selected as the optimal network model $NM_{optim}$. Finally, symbols are predicted based on the posterior approximate distribution of the parameter $\pi$ in the optimal network model $NM_{optim}$.
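The flow of FIG. 2 can be sketched as a sweep over the block count. The callable `fit` is a hypothetical stand-in for building, initializing and fitting the model for one value of K and returning its fitted parameters together with its evidence value; the toy `fit` used in the example is made up so that the sketch can run on its own.

```python
def sweep_block_counts(k_min, k_max, fit):
    """Fit one model per K in [k_min, k_max] and keep the best-scoring one."""
    best = None
    K = k_min
    while K <= k_max:
        params, h_k = fit(K)                   # model NM_K and its evidence H_K
        if best is None or h_k > best[2]:
            best = (K, params, h_k)            # track the largest evidence so far
        K += 1                                 # K = K + 1, as in the flow chart
    return best


def toy_fit(K):
    """Made-up stand-in whose evidence peaks at K = 3."""
    return {"K": K}, -float((K - 3) ** 2)


best_k, best_params, best_h = sweep_block_counts(2, 5, toy_fit)
```

Tracking the running best inside the loop avoids storing every fitted model, which may matter when n is large and many values of K are tried.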
In the first embodiment of the invention, a social network is represented by defining an adjacency matrix, a network model of the social network is constructed and initialized, the network model is fitted with the adjacency matrix, the posterior approximate distribution of model parameters of the network model is calculated, then the optimal model parameters are selected based on a model selection standard and the posterior approximate distribution of the model parameters, an optimal model is determined at the same time, and finally symbol prediction is carried out according to a predefined algorithm based on the optimal model.
It should be understood that, in the embodiment of the present invention, the sequence numbers of the above-mentioned processes do not mean the execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiment of the present invention.
The second embodiment is as follows:
FIG. 3 is a block diagram illustrating a symbol prediction apparatus according to a second embodiment of the present invention. For convenience of explanation, only portions related to the embodiments of the present invention are shown.
The symbol prediction device includes: the device comprises a construction unit 21, a fitting unit 22, an optimal model selecting unit 23 and a symbol predicting unit 24, wherein:
The construction unit 21 is configured to define an adjacency matrix representing a social network, build a network model of the social network, and initialize the network model.
Further, the construction unit 21 specifically includes:
a definition module, configured to define an adjacency matrix $A$ representing the social network, where the element $a_{ij}$ of the adjacency matrix $A$ represents the link between node $i$ and node $j$ in the social network, and $i$ and $j$ denote node $i$ and node $j$, respectively; $a_{ij} = 1$ indicates that there is a positive link between node $i$ and node $j$, $a_{ij} = -1$ indicates that there is a negative link between node $i$ and node $j$, and $a_{ij} = 0$ indicates that there is no link between node $i$ and node $j$;
a building module, configured to build and initialize a network model NM of the social network, where $NM = (n, K, Z, \pi, \Omega)$ and $n$, $K$, $Z$, $\pi$ and $\Omega$ are model parameters of the network model NM. $n$ represents the number of nodes in the network model; $K$ represents the number of blocks in the network model, that is, the number of communities contained in the social network, with $K \in [1, n]$; $Z$ is an $n \times K$-dimensional indicator matrix indicating the block to which each node belongs; $\pi$ is a $K \times K \times 3$-dimensional array representing the probabilities of connection between the blocks in the network model; and $\Omega$ is a $K$-dimensional vector representing the probabilities that a node belongs to each block in the network model.
Specifically, in the present embodiment, in the network model $NM = (n, K, Z, \pi, \Omega)$, the parameter $\pi$ satisfies the following relationship:
Figure GDA0002874369540000101
and $\pi_{lq}$ has the following prior distribution:
Figure GDA0002874369540000102
where $\Gamma(x)$ denotes the gamma function and $h \in \{1, 2, 3\}$: $h = 1$ denotes that a positive link exists between node $i$ in block $l$ and node $j$ in block $q$, $h = 2$ denotes that no link exists between node $i$ in block $l$ and node $j$ in block $q$, and $h = 3$ denotes that a negative link exists between node $i$ in block $l$ and node $j$ in block $q$. $\pi_{lq1}$ indicates the probability that there is a positive link between the nodes of block $l$ and block $q$, $\pi_{lq2}$ indicates the probability that the nodes of block $l$ and block $q$ are not linked, and $\pi_{lq3}$ indicates the probability that there is a negative link between the nodes of block $l$ and block $q$.
Figure GDA0002874369540000103
the parameter representing the prior distribution of the parameter pi has a value range of [0, + ∞).
The parameter Ω satisfies the following relationship:
Figure GDA0002874369540000104
and $\Omega$ has the following prior distribution:
Figure GDA0002874369540000105
where $\Gamma(x)$ denotes the gamma function, $K$ represents the number of blocks of the network model, $K \in [1, n]$, $K \in [K_{min}, K_{max}]$, $K_{min} \in [1, n]$ and $K_{max} \in [1, n]$. $\rho_0$ is a parameter in the prior distribution $p(\Omega)$ of the parameter $\Omega$, whose value range is $(0, +\infty)$.
The fitting unit 22 is configured to fit the network model to the adjacency matrix and calculate the posterior approximate distribution of the model parameters of the network model.
In this embodiment, the network model NM is fitted to a social network N, i.e. the network model NM is fitted to the adjacency matrix a.
Further, the fitting unit 22 specifically includes:
a fitting module, configured to fit the network model NM with the adjacency matrix $A$, where $K \in [1, n]$ and each value of $K$ corresponds to one network model NM, and to estimate the posterior approximate distribution of the model parameters $\pi$, $\Omega$ and $Z$ in the network model NM.
Specifically, in the calculation process, $K \in [1, n]$, and each value of $K$ corresponds to one network model NM, that is, to one posterior approximate distribution of the parameters $\pi$, $\Omega$ and $Z$. The posterior approximate distribution of the model parameters $\pi$, $\Omega$ and $Z$ is calculated as follows:
Let the posterior approximate distribution of $(Z, \pi, \Omega)$ be $q(Z, \pi, \Omega)$, which has the following expressions:
Figure GDA0002874369540000111
Figure GDA0002874369540000112
Figure GDA0002874369540000113
Figure GDA0002874369540000114
where $Z_{ik}$ is an indicator variable indicating whether node $i$ belongs to block $k$: if $Z_{ik} = 1$, node $i$ belongs to block $k$; if $Z_{ik} = 0$, node $i$ does not belong to block $k$. $\tau_{ik}$ is a parameter in the posterior approximate distribution of the indicator vector $Z$, whose value range is $[0, 1]$, and $\eta_{lq}$ is a parameter of the posterior approximate distribution of the parameter $\pi$, whose value range is $[0, +\infty)$.
In the 1990s, variational inference developed rapidly for probabilistic models, and general variational methods under the Bayesian framework can be applied to hidden Markov models, mixtures of factor analyzers, linear dynamical systems, graphical models and the like. Variational Bayes is a class of techniques for approximating intractable integrals in Bayesian estimation and machine learning, and is mainly applied to complex statistical models. The posterior approximate distributions of the model parameters $\pi$, $\Omega$ and $Z$ are given by equations (5), (6), (7) and (8), so that statistical inference can be made about these parameters.
The optimal model selecting unit 23 is configured to select an optimal model based on a model selection criterion and the posterior approximate distribution of the model parameters.
Further, the optimal model selecting unit 23 specifically includes:
a criterion acquisition module, configured to obtain a model selection criterion $H$ by inference based on the variational Bayesian method in combination with the network model NM. The specific form is as follows:
Figure GDA0002874369540000121
an evidence value calculation module, configured to calculate the evidence value of the network model NM corresponding to each value of $K$, based on the model selection criterion $H$ and the posterior approximate distribution of the model parameters $\pi$, $\Omega$ and $Z$.
Specifically, to obtain the evidence value $H_K$ (also called the marginal likelihood estimate) of the network model NM corresponding to each value of $K$, the optimal values of the parameters $(\tau, \eta, \rho)$ of the posterior approximate distribution of the model parameters $\pi$, $\Omega$ and $Z$ are determined first.
The parameters $(\tau, \eta, \rho, \eta_0, \rho_0)$ are initialized, where $\eta_0$ is a parameter in the prior distribution of the parameter $\pi$, $\rho_0$ is a parameter in the prior distribution of the parameter $\Omega$, $K \in [K_{min}, K_{max}]$, $K_{min} \in [1, n]$ and $K_{max} \in [1, n]$. The specific expressions shown in formulas (9), (10) and (11) are obtained by inference based on the variational Bayesian method and formulas (5), (6), (7) and (8) in step S12.
The parameter values of the approximate distribution of $Z$ are calculated according to the following formula (9), where $\tau_{il}$ represents the probability that node $i$ is assigned to block $l$.
Figure GDA0002874369540000122
where $\psi(\cdot)$ is the digamma function (the derivative of the logarithm of the gamma function), $\propto$ denotes "proportional to", and $\tau$ is normalized after calculation to satisfy
Figure GDA0002874369540000123
The parameter values of the approximate distributions of the parameters pi and omega are calculated according to the following equations (10) (11), respectively:
Figure GDA0002874369540000124
Figure GDA0002874369540000125
At this point, a value of $K$ is fixed, where $K \in [K_{min}, K_{max}]$. A set of values of the parameters $(\tau, \eta, \rho)$ is calculated according to equations (9), (10) and (11), and the corresponding network model evidence value $H_{front}$ is calculated from this set of values and equation (12). The next set of values of the parameters $(\tau, \eta, \rho)$ is obtained by iterating equations (9), (10) and (11), and the corresponding network model evidence value $H_{next}$ is calculated according to equation (12). $H_{next}$ is then compared with $H_{front}$: when the difference between $H_{next}$ and $H_{front}$ is greater than the preset value $\delta$, the iterative calculation of the parameter values continues; when the difference between $H_{next}$ and $H_{front}$ is less than the preset value $\delta$, the iteration ends, the parameter values corresponding to the evidence value $H_{next}$ are taken as the optimal solution and output, and the evidence value of the network model corresponding to this value of $K$ is determined to be $H_{next}$. The parameter values of $(\tau, \eta, \rho)$ output for each value of $K$ and the corresponding network model evidence value $H_K$ are calculated according to this method.
a selection module, configured to select the network model NM corresponding to the maximum evidence value $H_K$ as the optimal model $NM_{optim}$.
In particular, for a given network model NM, the higher its evidence value, the better the network model NM fits the social network; therefore, the network model NM with the largest evidence value $H_K$ is selected as the optimal model $NM_{optim}$.
The symbol prediction unit 24 is configured to perform symbol prediction according to a predefined algorithm based on the optimal model.
Further, the symbol prediction unit 24 specifically includes:
a symbol prediction module for performing symbol prediction according to a predefined algorithm, based on the posterior approximate distribution of the parameter pi in the optimal model NM_optim. The posterior approximate distribution q(pi) of the parameter pi is obtained according to formula (7).
Wherein the predefined algorithm is as follows:
the link sign between node i and node j is determined according to the following equation (13):
Figure GDA0002874369540000131
specifically, l and q denote block l and block q, respectively, where node i belongs to block l, node j belongs to block q,
Figure GDA0002874369540000141
indicating the probability that there is a positive link between node i in block l and node j in block q,
Figure GDA0002874369540000142
indicating the probability that there is a negative link between node i in block l and node j in block q,
Figure GDA0002874369540000143
Figure GDA0002874369540000144
η_lq denotes a parameter value of the posterior approximate distribution of the parameter pi.
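The decision rule described for equation (13) compares the probability of a positive link with the probability of a negative link between the blocks of the two nodes. Because the equations themselves appear only as images here, the sketch below rests on several assumptions that are not taken from the source: each `eta[l][q]` holds three positive values ordered (positive, negative, none), the link probabilities are estimated by normalising those three values, and a tie is resolved as a positive link. The names `link_probabilities`, `predict_sign` and `block_of` are hypothetical.

```python
from typing import List, Sequence

POS, NEG, NONE = 0, 1, 2  # assumed ordering of the three link types in eta[l][q]


def link_probabilities(eta_lq: Sequence[float]) -> List[float]:
    """Normalise the three values for one block pair so that they sum to 1."""
    total = sum(eta_lq)
    if total <= 0:
        raise ValueError("eta values must have a positive sum")
    return [v / total for v in eta_lq]


def predict_sign(eta, block_of: Sequence[int], i: int, j: int) -> int:
    """Return +1 (positive link) or -1 (negative link) for nodes i and j."""
    l, q = block_of[i], block_of[j]      # node i is in block l, node j in block q
    probs = link_probabilities(eta[l][q])
    return 1 if probs[POS] >= probs[NEG] else -1  # tie -> positive (assumption)


# Two blocks with made-up values: links inside a block lean positive,
# links between the two blocks lean negative.
eta = [
    [[8.0, 1.0, 1.0], [1.0, 6.0, 3.0]],
    [[1.0, 6.0, 3.0], [7.0, 2.0, 1.0]],
]
block_of = [0, 0, 1, 1]
print(predict_sign(eta, block_of, 0, 1))  # same block      -> 1
print(predict_sign(eta, block_of, 0, 2))  # different blocks -> -1
```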
In the second embodiment of the invention, a social network is represented by defining an adjacency matrix, a network model of the social network is constructed and initialized, the network model is fitted with the adjacency matrix, posterior approximate distribution of model parameters of the network model is calculated, optimal model parameters are selected based on model selection criteria and the posterior approximate distribution of the model parameters, an optimal model is determined at the same time, and finally symbol prediction is carried out according to a predefined algorithm based on the optimal model.
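The first two steps of this summary, representing the network as an adjacency matrix and initialising the model, can be illustrated with plain Python lists. The sketch uses the shapes stated in the claims (Z is n × K, pi is K × K × 3, omega has K entries) and the encoding 1 / -1 / 0 for positive, negative and absent links. Everything else is an assumption made for illustration: the function names, the symmetric (undirected) matrix, the random one-hot block assignment, and the uniform starting values for pi and omega.

```python
import random
from typing import Dict, List, Tuple


def build_adjacency(n: int, signed_edges: Dict[Tuple[int, int], int]) -> List[List[int]]:
    """Build an n x n matrix with entries 1 (positive), -1 (negative), 0 (no link)."""
    a = [[0] * n for _ in range(n)]
    for (i, j), sign in signed_edges.items():
        if sign not in (1, -1):
            raise ValueError("link sign must be 1 or -1")
        a[i][j] = sign
        a[j][i] = sign  # assumes an undirected network
    return a


def init_model(n: int, k: int, seed: int = 0) -> dict:
    """Initialise NM = (n, K, Z, pi, omega) with placeholder starting values."""
    rng = random.Random(seed)
    z = [[0] * k for _ in range(n)]      # Z: n x K, one block per node
    for row in z:
        row[rng.randrange(k)] = 1        # random one-hot assignment (assumption)
    pi = [[[1.0 / 3] * 3 for _ in range(k)] for _ in range(k)]  # K x K x 3
    omega = [1.0 / k] * k                # K block-membership probabilities
    return {"n": n, "K": k, "Z": z, "pi": pi, "omega": omega}


adjacency = build_adjacency(3, {(0, 1): 1, (1, 2): -1})
model = init_model(n=3, k=2)
print(adjacency)
print(model["Z"])
```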
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a logical division, and other divisions are possible in practice: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or in another form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes any medium capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (2)

1. A symbolic prediction method for a symbolic social network, the symbolic prediction method comprising:
defining an adjacency matrix to represent a social network, and constructing and initializing a network model of the social network, specifically comprising: defining an adjacency matrix A representing the social network, wherein the element a_ij of the adjacency matrix A represents the link between node i and node j in the social network, and i and j denote node i and node j respectively; constructing and initializing a network model NM of the social network, wherein the network model NM = (n, K, Z, pi, omega), and n, K, Z, pi and omega are all model parameters of the network model NM: n represents the number of nodes in the network model; K represents the number of blocks in the network model and is used for representing the number of communities in the social network; Z is an n × K dimensional vector used for indicating the block to which each node belongs; pi is a K × K × 3 dimensional vector used for representing the probability of connection between blocks in the network model; and omega is a K-dimensional vector used for representing the probability that a node belongs to a block in the network model; a_ij = 1 indicates that there is a positive link between node i and node j, a_ij = -1 indicates that there is a negative link between node i and node j, and a_ij = 0 indicates that there is no link between node i and node j; positive links represent friendly, liked and trusted relationships, and negative links represent hostile, disliked and untrusted relationships;
fitting the network model with the adjacency matrix, and calculating posterior approximate distribution of model parameters of the network model;
selecting an optimal model based on a model selection standard and posterior approximate distribution of the model parameters;
performing symbol prediction according to a predefined algorithm based on the optimal model;
fitting the network model to the adjacency matrix, and calculating posterior approximate distribution of model parameters of the network model specifically includes:
fitting the network model NM with the adjacency matrix A, wherein each K value corresponds to one network model NM, and estimating the posterior approximate distribution of model parameters pi, omega and Z in the network model NM;
the selecting of the optimal model based on the model selection standard and the posterior approximate distribution of the model parameters specifically comprises:
obtaining a model selection standard H based on a variational Bayesian method in combination with the network model NM;
calculating an evidence value H_K of the network model NM corresponding to each K value based on the model selection standard H and the posterior approximate distribution of the model parameters pi, omega and Z;
selecting the network model NM corresponding to the largest evidence value H_K as the optimal model NM_optim;
The symbol prediction is performed according to a predefined algorithm based on the optimal model, and specifically includes:
performing symbol prediction according to the predefined algorithm, based on the posterior approximate distribution of the parameter pi in the optimal model NM_optim;
wherein the predefined algorithm is as follows:
the link sign between node i and node j is determined according to the following formula:
Figure FDA0002874369530000021
where l and q represent block l and block q, respectively, where node i belongs to block l, node j belongs to block q,
Figure FDA0002874369530000022
indicating the probability that there is a positive link between node i in block l and node j in block q,
Figure FDA0002874369530000023
indicating the probability that a negative link exists between node i in block l and node j in block q.
2. A symbolic prediction apparatus for a symbolic social network, the symbolic prediction apparatus comprising:
a construction unit for defining an adjacency matrix to represent a social network, and constructing and initializing a network model of the social network; the construction unit specifically comprises: a definition module for defining an adjacency matrix A representing the social network, wherein the element a_ij of the adjacency matrix A represents the link between node i and node j in the social network, and i and j denote node i and node j respectively; a building module for constructing and initializing a network model NM of the social network, wherein the network model NM = (n, K, Z, pi, omega), and n, K, Z, pi and omega are all model parameters of the network model NM: n represents the number of nodes in the network model; K represents the number of blocks in the network model and is used for representing the number of communities in the social network; Z is an n × K dimensional vector used for indicating the block to which each node belongs; pi is a K × K × 3 dimensional vector used for representing the probability of connection between blocks in the network model; and omega is a K-dimensional vector used for representing the probability that a node belongs to a block in the network model; a_ij = 1 indicates that there is a positive link between node i and node j, a_ij = -1 indicates that there is a negative link between node i and node j, and a_ij = 0 indicates that there is no link between node i and node j; positive links represent friendly, liked and trusted relationships, and negative links represent hostile, disliked and untrusted relationships;
a fitting unit configured to fit the network model to the adjacency matrix and calculate posterior approximate distribution of model parameters of the network model;
the optimal model selecting unit is used for selecting an optimal model based on model selection standards and posterior approximate distribution of the model parameters;
the symbol prediction unit is used for performing symbol prediction according to a predefined algorithm based on the optimal model;
the fitting unit specifically includes:
a fitting module for fitting the network model NM to the adjacency matrix A, wherein each K value corresponds to one network model NM, and estimating the posterior approximate distribution of the model parameters pi, omega and Z in the network model NM;
the optimal model selecting unit specifically comprises:
the standard acquisition module is used for acquiring a model selection standard H by combining the network model NM based on a variational Bayesian method;
an evidence value calculation module for calculating the evidence value H_K of the network model NM corresponding to each K value based on the model selection standard H and the posterior approximate distribution of the model parameters pi, omega and Z;
a selection module for selecting the network model NM corresponding to the largest evidence value H_K as the optimal model NM_optim;
The symbol prediction unit specifically includes:
a symbol prediction module for performing symbol prediction according to a predefined algorithm, based on the posterior approximate distribution of the parameter pi in the optimal model NM_optim;
wherein the predefined algorithm is as follows:
the link sign between node i and node j is determined according to the following formula:
Figure FDA0002874369530000041
where l and q represent block l and block q, respectively, where node i belongs to block l, node j belongs to block q,
Figure FDA0002874369530000042
indicating the probability that there is a positive link between node i in block l and node j in block q,
Figure FDA0002874369530000043
indicating the probability that a negative link exists between node i in block l and node j in block q.
CN201710110775.8A 2017-02-28 2017-02-28 Symbol prediction method and device Active CN106934494B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710110775.8A CN106934494B (en) 2017-02-28 2017-02-28 Symbol prediction method and device

Publications (2)

Publication Number Publication Date
CN106934494A CN106934494A (en) 2017-07-07
CN106934494B true CN106934494B (en) 2021-04-06

Family

ID=59423473

Country Status (1)

Country Link
CN (1) CN106934494B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109523012B (en) * 2018-10-11 2021-06-04 上海交通大学 Expression learning method for symbol directed network based on variational decoupling mode

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103942614A (en) * 2014-04-09 2014-07-23 清华大学 Method and system for predicting heterogeneous network linking relation
CN105160580A (en) * 2015-07-13 2015-12-16 西安电子科技大学 Symbol network structure balance of multi-objective particle swarm optimization based on decomposition
CN105893637A (en) * 2016-06-24 2016-08-24 四川大学 Link prediction method in large-scale microblog heterogeneous information network
CN106204299A (en) * 2016-07-20 2016-12-07 深圳信息职业技术学院 Community mining method and device based on symbolic network model

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10120979B2 (en) * 2014-12-23 2018-11-06 Cerner Innovation, Inc. Predicting glucose trends for population management

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
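Bayesian Approach to Modeling and Detecting Communities in Signed Network; Bo Yang et al.; Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence; 20151231; pp. 1952-1958 *
Predicting positive and negative links in signed social networks by transfer learning; Jihang Ye et al.; Proceedings of the 22nd international conference on World Wide Web; 20130517; pp. 1477-1487 *
An efficient learning algorithm for stochastic block models (一种高效的随机块模型学习算法); Zhao Xuehua (赵学华) et al.; Journal of Software (软件学报); 20160504; Vol. 27, No. 9; pp. 2248-2264 *
Research on generalized community detection methods for large-scale networks based on generative models (基于生成模型的大规模网络广义社区发现方法研究); Chai Bianfang (柴变芳); China Doctoral Dissertations Full-text Database, Information Science and Technology (中国博士学位论文全文数据库信息科技辑); 20151015; Vol. 2015, No. 10; pp. I138-8 *
A survey of algorithms for predicting positive and negative relationships in signed social networks (符号社会网络中正负关系预测算法研究综述); Lan Mengwei (蓝梦微) et al.; Journal of Computer Research and Development (计算机研究与发展); 20150215; Vol. 52, No. 2; pp. 410-422 *
Research on several key problems of statistical network models (统计网络模型若干关键问题研究); Zhao Xuehua (赵学华); China Doctoral Dissertations Full-text Database, Information Science and Technology (中国博士学位论文全文数据库信息科技辑); 20150315; Vol. 2015, No. 03; pp. I140-16 *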

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant