CN103020221A - Social search method based on multi-modal adaptive social relationship strength mining - Google Patents

Publication number
CN103020221A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN 201210535907
Other languages
Chinese (zh)
Inventor
徐常胜
桑基韬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Automation of Chinese Academy of Science
Original Assignee
Institute of Automation of Chinese Academy of Science
Application filed by Institute of Automation of Chinese Academy of Science
Priority to CN 201210535907
Publication of CN103020221A

Abstract

The invention discloses a method for mining adaptive social relationship strength based on a multi-modal generative model. The method comprises the following steps: collecting the image information uploaded by users and the users who have social relations with them, so that each user corresponds to a triple consisting of the set of uploaded images, the set of image tags and the social network; given the input triples, reversing the generative process of image content and tags with the multi-modal generative model to infer a topic space that can describe the distribution of user interests, together with the topic distributions of the users; and computing, from the obtained topic space and user topic distributions, the topic-sensitive relationship strength between users. The method can be applied to adaptive multimedia retrieval and similar tasks.

Description

A social search method based on multi-modal adaptive social relationship strength mining
Technical field
The present invention relates to the field of multimedia search, and in particular to a social search method based on multi-modal adaptive social relationship strength mining.
Background art
Social media has greatly changed the way users share and obtain information. In social media services, users inevitably interact with other users and form communities, the so-called social networks. A social network contains bidirectional social relations, such as "Connect" on LinkedIn and "Add Friend" on Facebook, and unidirectional social relations, such as "Follow" on Twitter and "Subscribe" on YouTube. These social relations are considered to influence user behavior and the dynamic development of the social network. For example, colleagues on LinkedIn can influence a person's career choices, while friends on Facebook can influence behavior and needs in personal life. Analyzing and mining these social relations enables many important applications, such as viral marketing, collaborative recommendation and collaborative information search. Take multimedia collaborative search based on unidirectional social relations as an example: its basic assumption and starting point is that, by analyzing the behavior of other users who have an influence relation with the searching user, the real need of the searching user can be predicted and the search results adjusted accordingly.
Current methods for mining social relations mainly study two problems: predicting whether a social relation exists, and predicting its strength. In many problems, neither binary nor continuous-valued social relations can meet the needs of the application. In multimedia search, for example, the social relations between users differ for different query words. Suppose a user searches for photos of "Hawaii" for a honeymoon trip: friends with expertise in travel can help most, so the social relations with those friends should become stronger. When the same user searches for photos of "fashion show", friends who are knowledgeable about fashion should influence the search results more, that is, the social relations with those friends should become stronger. We call such problem-dependent social relations adaptive social relationship strength, and the present invention introduces a method for mining adaptive social relationship strength based on a multi-modal generative model.
Summary of the invention
(1) Technical problem to be solved
The purpose of the present invention is to perform image search according to the mined adaptive social relationship strength, so that for different needs a user automatically obtains assistance from different users. This helps to understand and predict the user's real need, and thus to accurately locate search results that match it.
(2) Technical solution
To solve the above technical problem, the invention provides a social search method based on multi-modal adaptive social relationship strength mining, the method comprising the following steps:
Step 1: collect the image information uploaded by each user and the users who have a unidirectional social relation with that user; each user corresponds to a triple consisting of the set of uploaded images, the set of image tags, and the set of related users connected to the user through unidirectional social relations;
Step 2: according to the input triples, build a multi-modal probabilistic generative model and infer the generative process of the image content in the image set and of the tag information in the image tag set;
Step 3: from the inference result, compute the topic space and the user topic distributions, and compute the topic-sensitive social relationship strength between users;
Step 4: rank the search results according to the obtained topic space, user topic distributions and social relationship strength.
(3) Beneficial effects
The present invention adopts a multi-modal generative model, reasons backward from the observed social network, uploaded images and tags of the users, and proposes a topic-sensitive method for mining social relationship strength. The invention solves the problem of adaptively adjusting social relationship strength for different problems. Because it considers text tag data and visual image features together, it can better analyze social relationship strength in multimedia applications. In addition, the method simultaneously yields the topic space, the topic distributions of the users, and the strength of the relations between users on different topics.
Description of drawings
Fig. 1 is a flowchart of the adaptive social relationship strength mining method based on a multi-modal generative model according to the present invention;
Fig. 2 is a schematic diagram of the multi-modal generative topic model according to the present invention;
Fig. 3 is a schematic diagram of results obtained with the method of the present invention on the Flickr data set;
Fig. 4 is another schematic diagram of results obtained with the method of the present invention on the Flickr data set.
Detailed description of the embodiments
To make the purpose, technical solution and advantages of the present invention clearer, the invention is described in further detail below with reference to specific embodiments and the accompanying drawings.
The present invention provides a method for analyzing topic-sensitive social relationship strength in a social multimedia environment, in which the social relationship strength can be adjusted adaptively for different applications. Compared with existing methods for analyzing social relationship strength, the method on the one hand obtains a topic-sensitive social relationship strength that can be adjusted adaptively to the problem at hand; on the other hand, by considering both text information and visual information, it serves multimedia applications better.
Fig. 1 is a flowchart of the adaptive social relationship strength mining method based on a multi-modal generative model according to the present invention. As shown in Fig. 1, the method comprises the following steps:
Step 1: input preprocessing, namely collecting the image information uploaded by each user and the users who have a unidirectional social relation with that user. A unidirectional social relation here refers to a directed social action on a social media sharing website, such as a contact on Flickr or a follow on Twitter. Each user corresponds to a triple consisting of the set of related users who have a unidirectional social relation with the user, the set of images uploaded by the user, and the set of image tags, where image tags are the original tag information that the user provided to describe the images;
Step 2: multi-modal topic-sensitive social relation mining, namely, according to the input triples, using a multi-modal generative model to reverse the generative process of the image content in the image set and of the tag information in the image tag set, and inferring a topic space that can describe the distribution of user interests, together with the user topic distributions;
Step 3: output parameter computation, namely computing, from the obtained topic space and user topic distributions, the topic-sensitive relationship strength between users.
Each step is described in detail below. The following table lists the key symbols used in the invention and their descriptions.
[Table: key symbols and their descriptions (table images not available in this text)]
Step 1: input preprocessing.
The problem of mining topic-sensitive social relationship strength is first stated mathematically:
Definition 1: Given the user set U of a social media site (such as Flickr), each user u ∈ U corresponds to a triple $\{C_u, D_u, T_u\}$, where $C_u$, $D_u$ and $T_u$ denote, respectively, the set of related users who have a unidirectional social relation with user u, the set of images uploaded by user u, and the set of image tags added by user u.
The goal of topic-sensitive social relationship strength mining is to learn:
(1) the topic space $\Phi^w \in \mathbb{R}^{K \times |W|}$ and $\Phi^v \in \mathbb{R}^{K \times |V|}$, where $\Phi^w$ and $\Phi^v$ are the tag-word distributions and the visual-descriptor distributions of the topics, w and v are a user's tag-word vector and visual-descriptor vector, $\mathbb{R}$ is the field of real numbers, K is the total number of topics, and k denotes the k-th topic. The tag words of all users form the tag dictionary, and all visual descriptors form the visual-descriptor dictionary; |W| is the size of the tag dictionary and |V| is the size of the visual-descriptor dictionary. The tag-word vector is the response of the tags used by a user on the tag dictionary, such as w = {landscape, travel, landscape, ...}; the visual-descriptor vector is the response of an image uploaded by a user on the visual-descriptor dictionary, such as v = {descriptor 1, descriptor 5, descriptor 1, ...};
(2) the topic distribution $\Omega_u \in \mathbb{R}^{K}$ of each user u, which gives the probability that the user uses each topic and can be regarded as the user's interest distribution;
(3) the topic-sensitive social relationship strength $\Psi_{u_1,u_2}(k)$, $k \in \{1, \ldots, K\}$, which records the social relationship strength from user $u_2$ to user $u_1$ on the k-th topic.
Note: the social relationship strength here is unidirectional, that is, $\Psi_{u_1,u_2}(k)$ and $\Psi_{u_2,u_1}(k)$ are different.
Preprocessing consists of collecting and representing the three elements of the input triple.
Step 11: collection and preprocessing of the related-user set $C_u$.
For each user u, collect from the user's social network the users who have a unidirectional social relation with u, forming the set $C_u$.
Step 12: collection and preprocessing of the uploaded image set $D_u$.
For each user u, collect the images the user uploaded, represent each image by its response vector on the visual-descriptor dictionary, and form the image set $D_u$. Any bag-of-words feature can serve as the visual descriptor. In the test experiments, the invention used the Maximally Stable Extremal Region (MSER) feature to describe the visual content of images. Compared with keypoint-based features, MSER features describe locally homogeneous regions of an image, have higher consistency, and are better suited to the problem setting of the invention.
Step 13: collection and preprocessing of the image tag set $T_u$.
On a media sharing website, users can add tags to the images they upload for management and description; these tags are the annotation information. For each user u, collect the tags the user added. The tags of each user are likewise represented by their response vector on the tag dictionary and form the set $T_u$.
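The triple built in steps 11 to 13 can be sketched as a small data structure. The following is a minimal illustration only, not the patent's implementation: the class name `UserTriple`, the helper `response_vector`, the user identifiers and both toy dictionaries are hypothetical, and the "response vector" is assumed to be a plain occurrence count over the dictionary.

```python
from collections import Counter
from dataclasses import dataclass, field


def response_vector(items, dictionary):
    """Count how often each dictionary entry occurs in `items` (bag-of-words)."""
    counts = Counter(items)
    return [counts[entry] for entry in dictionary]


@dataclass
class UserTriple:
    """Hypothetical container for the triple {C_u, D_u, T_u} of one user u."""
    related_users: set = field(default_factory=set)  # C_u
    images: list = field(default_factory=list)       # D_u: one count vector per image
    tags: list = field(default_factory=list)         # T_u: count vector over the tag dictionary


# Toy dictionaries, assumed for illustration only.
tag_dictionary = ["landscape", "travel", "fashion"]
descriptor_dictionary = ["desc1", "desc2", "desc3", "desc4", "desc5"]

user = UserTriple(
    related_users={"u7", "u9"},
    images=[response_vector(["desc1", "desc5", "desc1"], descriptor_dictionary)],
    tags=response_vector(["landscape", "travel", "landscape"], tag_dictionary),
)
print(user.tags)       # [2, 1, 0]
print(user.images[0])  # [2, 0, 0, 0, 1]
```

Storing counts rather than raw tokens keeps the representation independent of how the visual descriptors were extracted, which matches the statement that any bag-of-words feature can be used.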
Step 2: multi-modal topic-sensitive social relation mining.
A user's online behavior is considered to be strongly influenced by the users with whom he or she has social relations. On an image sharing website, uploading images and tagging images can be regarded as observations of the user's online behavior. Based on this, the invention proposes a multi-modal probabilistic generative model that infers the underlying social relation structure by simulating the generative process of images and tags. The idea is as follows: for each user, the uploaded images and tags are assumed to be produced in one of two ways, either from the user's own interests or under the influence of other users. Based on this assumption, the invention proposes a multi-modal topic-sensitive social relation model, which comprises the following steps:
Step 21: build the topic-sensitive multi-modal probabilistic generative model.
Fig. 2 shows the structure of the generative model proposed under this assumption. In the generative model, arrows denote conditional dependence assumptions and correspond to sampling from a distribution; circles denote variables. Solid circles denote observed variables, mainly the related-user set $C_u$, the uploaded image set $D_u$ and the added image tag set $T_u$. Hollow circles denote latent variables, mainly the switch latent variable s, the topic latent variable z, the sampled-user latent variable c, the topic space variable Φ and the user topic distribution variable Ω. The switch latent variable s controls the generative process of the observed variables, as detailed below; the topic latent variable z records the topic sampled from a user's topic distribution; the sampled-user latent variable c records the sampled unidirectionally related user.
For the visual content of images, a visual-descriptor dictionary is built first, and each image is then represented by the visual-descriptor vector v formed by its response on that dictionary. The model introduces binary switch latent variables $s^w$ and $s^v$ to control and record whether a given tag word or visual descriptor is generated spontaneously by user u or under the influence of other users. When $s^w = 1$, the tag word is generated from the user's own topic distribution $\Omega_u$; when $s^w = 0$, the tag word is influenced by some user $c^w$ in the related-user list $C_u$ and is generated from the topic distribution $\Omega_{c^w}$ of that influencing user.
The generative process of a tag word of user u is therefore as follows:
(1) Sample the switch variable from a Bernoulli distribution: $s^w \sim \mathrm{Bernoulli}(\lambda)$, where λ controls the shape of the Bernoulli distribution;
(2) If $s^w = 0$, sample an influencing user from the related-user list of user u according to a multinomial distribution: $c^w \sim \mathrm{Multinomial}(\gamma)$, where γ controls the shape of the multinomial distribution; then sample a topic from the topic distribution $\Omega_{c^w}$ of the influencing user $c^w$ and record it as the variable $z^w$;
(3) If $s^w = 1$, sample a topic from user u's own topic distribution $\Omega_u$ and record it as the variable $z^w$;
(4) Sample the tag word $w_{u,i}$ from the tag-word distribution $\Phi^w_{z^w}$ of the topic.
The generative process of visual descriptors is analogous: the visual descriptor $v_{u,i}$ is sampled from the visual-descriptor distribution $\Phi^v_{z^v}$ of the topic. With the generative model of Fig. 2 and the generative process assumed above in place, the Gibbs sampling of step 22 can be carried out to sample a value for each latent variable and thereby solve the generative model.
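The generative process for a single tag word can be simulated directly, which may help in checking one's reading of the switch variable. The sketch below is an illustration under assumptions: the function and parameter names (`generate_tag_word`, `lam`, `gamma`, `omega`, `phi_w`, `related`) are hypothetical, tag words and topics are represented by integer indices, and λ is taken to be the probability that $s^w = 1$.

```python
import random


def sample_categorical(probs, rng):
    """Draw an index from a discrete distribution given as a list of probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_tag_word(u, lam, gamma, omega, phi_w, related, rng):
    """Generate one tag word for user u and return (s, c, z, w).

    lam        -- assumed P(s = 1): the word comes from u's own interests
    gamma[u]   -- probabilities over related[u] (who influences u)
    omega[x]   -- topic distribution of user x
    phi_w[k]   -- tag-word distribution of topic k
    related[u] -- list of users related to u (the set C_u)
    """
    s = 1 if rng.random() < lam else 0                      # s ~ Bernoulli(lam)
    if s == 0:
        c = related[u][sample_categorical(gamma[u], rng)]   # influencing user
        z = sample_categorical(omega[c], rng)               # topic from c's distribution
    else:
        c = None                                            # no influencing user
        z = sample_categorical(omega[u], rng)               # topic from u's own distribution
    w = sample_categorical(phi_w[z], rng)                   # word index drawn from topic z
    return s, c, z, w


# Toy parameters, assumed for illustration only.
related = {"u": ["a", "b"]}
gamma = {"u": [0.3, 0.7]}
omega = {"u": [0.5, 0.5], "a": [0.9, 0.1], "b": [0.2, 0.8]}
phi_w = [[0.6, 0.3, 0.1], [0.1, 0.2, 0.7]]

rng = random.Random(0)
print(generate_tag_word("u", 0.5, gamma, omega, phi_w, related, rng))
```

Generating a visual descriptor would follow the same steps with $s^v$, $c^v$, $z^v$ and a visual-descriptor distribution in place of `phi_w`.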
Step 22: solve the topic-sensitive multi-modal probabilistic generative model.
As described in step 1, the input to the model is a set of users, each corresponding to a triple $\{C_u, D_u, T_u\}$. As described in step 21, a generative process is assumed for this input and a series of latent variables is introduced; solving the model means inferring the values of these latent variables by sampling, given the observed input and the assumed generative process. Finally, the output parameters are computed from the sampled values of the latent variables, as described in step 3. The proposed generative model contains three classes of latent variables: the switch latent variables $s^w$, $s^v$, the sampled-user latent variables $c^w$, $c^v$, and the topic latent variables $z^w$, $z^v$, where $s^v$, $c^v$ and $z^v$ denote, respectively, the switch, sampled-user and topic latent variables sampled when generating a visual descriptor. The invention uses Gibbs sampling to infer the latent variables and to solve for the model parameters, namely the topic space, the user topic distributions and the topic-sensitive relationship strength.
When Gibbs sampling is used to solve the generative model, each iteration samples one value for each latent variable, and the values sampled in one iteration update the sampling distributions used in the next iteration. Each latent variable is updated iteratively with all other variables held fixed. The update rules for the latent variables related to tag words are as follows:
$$p(s_i^w = 0 \mid s_{-i}^w, u_i^w, c_i^w, z_i^w, \cdot) \propto \frac{N_{U,S}^w(u_i^w, 0) + \alpha_\lambda - 1}{N_U^w(u_i^w) + 2\alpha_\lambda - 1} \cdot \frac{N_{U,Z}^w(c_i^w, z_i^w) + \alpha_\Omega - 1}{N_U^w(c_i^w) + K\alpha_\Omega - 1}$$

$$p(s_i^w = 1 \mid s_{-i}^w, u_i^w, c_i^w, z_i^w, \cdot) \propto \frac{N_{U,S}^w(u_i^w, 1) + \alpha_\lambda - 1}{N_U^w(u_i^w) + 2\alpha_\lambda - 1} \cdot \frac{N_{U,S,Z}^w(c_i^w, 1, z_i^w) + \alpha_\Omega - 1}{N_{U,S}^w(c_i^w, 1) + K\alpha_\Omega - 1}$$

$$p(c_i^w \mid c_{-i}^w, s_i^w = 0, u_i^w, z_i^w, C_{u_i^w}, \cdot) \propto \frac{N_{U,C,S,Z}^w(u_i^w, c_i^w, 0, z_i^w) + \alpha_\gamma}{N_{U,S,Z}^w(u_i^w, 0, z_i^w) + |C_{u_i^w}|\,\alpha_\gamma} \cdot \frac{N_{U,Z}^w(c_i^w, z_i^w) + \alpha_\Omega - 1}{N_U^w(c_i^w) + K\alpha_\Omega - 1}$$

$$p(z_i^w \mid z_{-i}^w, s_i^w = 0, w_i, \cdot) \propto \frac{N_{U,Z}^w(c_i^w, z_i^w) + \alpha_\Omega - 1}{N_U^w(c_i^w) + K\alpha_\Omega - 1} \cdot \frac{N_{Z,W}^w(z_i^w, w_i) + \alpha_{\Phi^w}}{N_Z^w(z_i^w) + |W|\,\alpha_{\Phi^w}}$$

$$p(z_i^w \mid z_{-i}^w, s_i^w = 1, w_i, \cdot) \propto \frac{N_{U,S,Z}^w(c_i^w, 1, z_i^w) + \alpha_\Omega - 1}{N_{U,S}^w(c_i^w, 1) + K\alpha_\Omega - 1} \cdot \frac{N_{Z,W}^w(z_i^w, w_i) + \alpha_{\Phi^w}}{N_Z^w(z_i^w) + |W|\,\alpha_{\Phi^w}} \qquad (1)$$
where $u_i^w$ denotes the user to whom the i-th tag word belongs and $z_i^w$ denotes the topic assigned to the i-th tag word. To keep the model general and to reduce the complexity of learning, all priors are assumed to follow symmetric Dirichlet distributions during learning. The parameters of these prior distributions, i.e. the hyperparameters, are written as α with the related variable as subscript: $\alpha_\Omega$, $\alpha_{\Phi^w}$, $\alpha_{\Phi^v}$, $\alpha_\lambda$ and $\alpha_\gamma$ are the symmetric hyperparameters of the corresponding Dirichlet priors, and they are specified and tuned manually in an implementation. N(·) denotes a counter that gives the number of samples satisfying a given condition during iterative sampling. For example, $N_{U,C,S,Z}^w(u_i^w, c_i^w, 0, z_i^w)$ is the number of tag words of user $u_i^w$ that were generated from topic $z_i^w$ under the influence of related user $c_i^w$; $N_{U,S,Z}^w(u_i^w, 0, z_i^w)$ is the number of tag words of user $u_i^w$ that were generated from topic $z_i^w$ under the influence of related users (S = 0); $N_{U,S}^w(u_i^w, 0)$ is the number of tag words of user $u_i^w$ that were influenced by related users (S = 0); and $N_U^w(u_i^w)$ is the number of tag words of user $u_i^w$. Every counter is maintained by the Gibbs sampling process: whenever a sampled latent variable satisfies the condition of a counter, that counter is incremented by one. The update rules for the variables related to visual descriptors are analogous. With the conditional probability distributions of the switch latent variables $s^w$, $s^v$, the sampled-user latent variables $c^w$, $c^v$ and the topic latent variables $z^w$, $z^v$ given by the formulas above, a sample of each latent variable is drawn at each iteration and the variables are updated iteratively.
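As an illustration of how the counters drive one sampling step, the sketch below evaluates the conditional distribution of the topic of one tag word in the $s^w = 0$ case. It is a simplified sketch and not the patent's implementation: it follows the common collapsed Gibbs convention of removing the current word from the counters before evaluating the ratios, which is assumed here to be what the "-1" terms in the update rules express, and all names (`topic_conditional`, `n_uz`, `n_zw`) are hypothetical.

```python
def topic_conditional(c, w, n_uz, n_zw, alpha_omega, alpha_phi):
    """Normalised p(z = k | ...) for one tag word w attributed to related user c.

    n_uz[c][k] -- tag words of user c assigned to topic k (current word excluded)
    n_zw[k][j] -- times tag word j is assigned to topic k (current word excluded)
    """
    num_topics = len(n_zw)
    vocab_size = len(n_zw[0])
    n_u = sum(n_uz[c])
    weights = []
    for k in range(num_topics):
        # How much user c favours topic k, smoothed by the Dirichlet prior.
        user_part = (n_uz[c][k] + alpha_omega) / (n_u + num_topics * alpha_omega)
        # How much topic k favours word w, smoothed by the Dirichlet prior.
        word_part = (n_zw[k][w] + alpha_phi) / (sum(n_zw[k]) + vocab_size * alpha_phi)
        weights.append(user_part * word_part)
    total = sum(weights)
    return [x / total for x in weights]


# Toy counters, assumed for illustration only.
n_uz = {"c": [3, 1]}       # user c: 3 words in topic 0, 1 word in topic 1
n_zw = [[4, 0], [0, 2]]    # topic 0 mostly emits word 0, topic 1 emits word 1
probs = topic_conditional("c", 0, n_uz, n_zw, alpha_omega=0.5, alpha_phi=0.5)
print(probs)
```

A new topic would then be drawn from `probs` and the counters incremented for the drawn value, mirroring the statement that each counter is increased by one whenever a sample meets its condition.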
Step 3: output parameter computation.
The input of this step is the sampled values of the latent variables obtained during the Gibbs sampling of step 2. The output is the three kinds of parameters sought by the topic-sensitive social relationship strength mining problem defined in step 1: the topic space $\Phi^w$ and $\Phi^v$, the topic distribution $\Omega_u$ of each user u, and the topic-sensitive social relationship strength Ψ(k).
Step 31: compute the topic space Φ and the user topic distributions Ω.
Gibbs sampling yields sampled values of the latent variables. When the joint distribution of the latent variables with the observed tags and uploaded images becomes stable, that is, when the update rules computed from the sampled values fit the observed data best, the iterative updating has converged. This process is similar to maximum likelihood estimation. At that point, the sampled values of the latent variables after convergence are tallied and the counters updated, and the topic space Φ and the user topic distributions Ω can be computed directly. The tag-word distributions $\Phi^w$ and the visual-descriptor distributions $\Phi^v$ of the topics represent the learned subspaces and can be computed from the sampled topic assignments. Because $\Phi^w_{k,j}$ describes the probability of generating the j-th tag word in the k-th topic, it can be obtained by normalizing the counter $N_{Z,W}(\cdot)$; $\Phi^v$ is computed analogously, that is:

$$\Phi_{k,j}^w = \frac{N_{Z,W}^w(Z_k, w_j) + \alpha_{\Phi^w}}{N_Z^w(Z_k) + |W|\,\alpha_{\Phi^w}} \qquad (2)$$

$$\Phi_{k,j}^v = \frac{N_{Z,W}^v(Z_k, v_j) + \alpha_{\Phi^v}}{N_Z^v(Z_k) + |V|\,\alpha_{\Phi^v}} \qquad (3)$$

where $Z_k$ denotes the k-th topic. The topic distribution of the m-th user $U_m$ can be computed as follows:

$$\Omega_{m,k} = \frac{N_{U,S,Z}^w(U_m, 1, Z_k) + N_{U,S,Z}^v(U_m, 1, Z_k) + \alpha_\Omega}{N_{U,S}^w(U_m, 1) + N_{U,S}^v(U_m, 1) + K\alpha_\Omega} \qquad (4)$$
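Formulas (2) to (4) are smoothed normalizations of the counters, which the following sketch makes concrete. The function names and the toy counts are hypothetical; the counters are assumed to be available as plain nested lists after sampling has converged.

```python
def topic_word_distribution(n_zw, alpha_phi):
    """Formulas (2)/(3): smoothed, row-normalised topic-by-word counts.

    n_zw[k][j] -- number of times dictionary entry j was assigned to topic k
    """
    size = len(n_zw[0])  # |W| for tag words, |V| for visual descriptors
    return [
        [(count + alpha_phi) / (sum(row) + size * alpha_phi) for count in row]
        for row in n_zw
    ]


def user_topic_distribution(n_w, n_v, alpha_omega):
    """Formula (4) for one user.

    n_w[k], n_v[k] -- tag words / visual descriptors of the user that were
                      generated with s = 1 from topic k
    """
    num_topics = len(n_w)
    denom = sum(n_w) + sum(n_v) + num_topics * alpha_omega
    return [(n_w[k] + n_v[k] + alpha_omega) / denom for k in range(num_topics)]


# Toy counters, assumed for illustration only.
phi_w = topic_word_distribution([[3, 1], [0, 4]], alpha_phi=1.0)
omega_u = user_topic_distribution([2, 0], [1, 1], alpha_omega=0.5)
print(phi_w)
print(omega_u)
```

Because the hyperparameter is added to every cell and its multiple to the denominator, each row of `phi_w` and the vector `omega_u` sum to one even when some counts are zero.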
Step 32: compute the topic-sensitive relationship strength Ψ between users.
The social relationship strength $\Psi_{m_1,m_2}(k)$ from user $U_{m_1}$ to user $U_{m_2}$ on the k-th topic can be computed from the number of tag words and visual descriptors of user $U_{m_2}$ that are influenced by user $U_{m_1}$ on the k-th topic, i.e. $N_{U,C,S,Z}(U_{m_2}, U_{m_1}, 0, Z_k)$:

$$\Psi_{m_1,m_2}(k) = \frac{N_{U,C,S,Z}^w(U_{m_2}, U_{m_1}, 0, Z_k) + N_{U,C,S,Z}^v(U_{m_2}, U_{m_1}, 0, Z_k) + \alpha_\gamma}{N_{U,S,Z}^w(U_{m_2}, 0, Z_k) + N_{U,S,Z}^v(U_{m_2}, 0, Z_k) + |C_{U_{m_2}}|\,\alpha_\gamma} \qquad (5)$$

where $|C_{U_{m_2}}|$ is the size of the set of users with whom user $U_{m_2}$ has a unidirectional social relation. The practical meaning of this formula is that if many of the tag words or images of user $U_{m_2}$ from topic $Z_k$ are influenced by related user $U_{m_1}$, then the social relation from $U_{m_1}$ to $U_{m_2}$ on the k-th topic is stronger. Here $\alpha_\Omega$, $\alpha_{\Phi^w}$, $\alpha_{\Phi^v}$, $\alpha_\lambda$ and $\alpha_\gamma$ are the symmetric hyperparameters of the corresponding Dirichlet priors; they also smooth the formulas when a counter N(·) in the denominator is zero. Their values need to be specified and tuned manually in an implementation.
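Formula (5) can be read as a smoothed share of influence. The sketch below computes it for one pair of users and one topic; the function name, argument names and the toy counts are hypothetical and only illustrate the arithmetic.

```python
def relation_strength(n_w_from, n_v_from, n_w_all, n_v_all, num_related, alpha_gamma):
    """Formula (5) for one topic k and one ordered pair (U_m1, U_m2).

    n_w_from, n_v_from -- tag words / visual descriptors of U_m2 on topic k
                          that were influenced by U_m1
    n_w_all, n_v_all   -- tag words / visual descriptors of U_m2 on topic k
                          influenced by any related user (s = 0)
    num_related        -- |C_{U_m2}|, the number of users related to U_m2
    """
    numer = n_w_from + n_v_from + alpha_gamma
    denom = n_w_all + n_v_all + num_related * alpha_gamma
    return numer / denom


# Toy counts, assumed for illustration: U_m2 has three related users, and on
# topic k its 15 influenced samples split 9 / 4 / 2 between them.
alpha = 0.1
strengths = [
    relation_strength(6, 3, 10, 5, 3, alpha),
    relation_strength(3, 1, 10, 5, 3, alpha),
    relation_strength(1, 1, 10, 5, 3, alpha),
]
print(strengths)

# With no influenced samples at all, the smoothing term gives a uniform value.
print(relation_strength(0, 0, 0, 0, 3, alpha))
```

When the per-user counts partition the influenced samples, the strengths over all related users of $U_{m_2}$ sum to one for each topic, so the value can be interpreted as the fraction of topic-k influence attributed to $U_{m_1}$.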
Step 4: rank the search results according to the obtained social relationship strength.
Taking image search as an example, the user's query can be analyzed. The topic distribution of query q can be computed by the following formula:

$$p(Z_k \mid q) = \prod_{w_i \in q} p(w_i \mid Z_k)$$

where ∏ is the product symbol and $p(w_i \mid Z_k)$ is the probability of tag word $w_i$ in the k-th topic, obtained from the topic space. Suppose the current searching user is u and c is one of the users related to u by a unidirectional relation; the adaptive social relationship strength $\Psi_{u,c}(k)\,p(Z_k \mid q)$ is then obtained from the topic distribution of the query. Using this social relationship strength as a weight, a relevance score is computed for each retrieved image and used for the final ranking. The relevance score of image d can be computed as follows:

$$\hat{R}(q, u, d) = R(q, u, d) + \sum_{c \in C_u} \sum_{k} \Psi_{u,c}(k)\, p(Z_k \mid q)\, R(q, c, d)$$

where q denotes the query, R(q, u, d) denotes the relevance of image d to query q for the searching user u, and R(q, c, d) denotes the relevance of image d to query q for a user c who has a unidirectional social relation with the searching user u. This relevance can be computed by any image indexing method or distance metric. Here, the relevance R(q, u, d) is computed as follows:

$$R(q, u, d) = \sum_{k} p(Z_k \mid u)\, p(Z_k \mid q)\, p(Z_k \mid d),$$

where $p(Z_k \mid u)$ is the topic distribution of user u, i.e. the probability that user u uses the k-th topic, which is the $\Omega_{u,k}$ computed in step 31, and $p(Z_k \mid d)$ is the topic distribution of the image, computed as follows:

$$p(Z_k \mid d) = \prod_{v_i \in d} p(v_i \mid Z_k),$$

where $p(v_i \mid Z_k)$ is the probability of visual descriptor $v_i$ in the k-th topic.
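The ranking formulas of step 4 can be combined into one scoring function. The sketch below is a minimal illustration under assumptions: queries and images are lists of integer indices into the tag and visual-descriptor dictionaries, `psi[(u, c)]` holds $\Psi_{u,c}(k)$ for every topic k, and all function names and toy values are hypothetical.

```python
from math import prod


def topic_likelihood(items, phi):
    """p(Z_k | items) for every topic k, as the product of p(item | Z_k)."""
    return [prod(phi_k[i] for i in items) for phi_k in phi]


def relevance(omega_user, p_zq, p_zd):
    """R(q, user, d) = sum_k p(Z_k | user) p(Z_k | q) p(Z_k | d)."""
    return sum(a * b * c for a, b, c in zip(omega_user, p_zq, p_zd))


def social_relevance(u, query, image, omega, phi_w, phi_v, related, psi):
    """Adjusted score: own relevance plus strength-weighted relevance of related users."""
    p_zq = topic_likelihood(query, phi_w)   # topic distribution of the query
    p_zd = topic_likelihood(image, phi_v)   # topic distribution of the image
    score = relevance(omega[u], p_zq, p_zd)
    for c in related[u]:
        # Query-adaptive strength: sum_k Psi_{u,c}(k) p(Z_k | q)
        weight = sum(psi[(u, c)][k] * p_zq[k] for k in range(len(p_zq)))
        score += weight * relevance(omega[c], p_zq, p_zd)
    return score


# Toy model with two topics, assumed for illustration only.
phi_w = [[0.9, 0.1], [0.2, 0.8]]
phi_v = [[0.7, 0.3], [0.4, 0.6]]
omega = {"u": [0.5, 0.5], "c": [1.0, 0.0]}
related = {"u": ["c"]}
psi = {("u", "c"): [0.8, 0.1]}

score = social_relevance("u", [0], [0], omega, phi_w, phi_v, related, psi)
print(round(score, 4))  # 0.8212
```

Since R(q, c, d) does not depend on k, the inner sum over topics is factored out as a single query-dependent weight per related user; images would then be sorted by this score in descending order.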
The following presents results obtained with the method of the invention.
To evaluate the invention, the images, tags and related-user network information of 3,372 users were crawled from the image sharing website Flickr, yielding 124,099 images and 30,108 tag words in total.
Fig. 3 shows two of the topics in the 20-topic space obtained with the method of the invention; for each topic, the five highest-ranked tag words and the five most relevant images are shown. It can be seen that, by considering text tag words and visual image content together, the topics extracted by the method remain largely consistent in both semantic concept and visual theme, which provides favorable conditions for the subsequent topic-sensitive social relation analysis.
Fig. 4 shows two test users and, for topics #2 and #13, information on the users who influence them most. The length of each gray bar corresponds to the strength of the user's topic distribution and reflects the user's interest in the corresponding topic. A user's preferences can be gauged from the favorite images shown below the bars. For each related user, Fig. 4 gives the number of followers, together with sample uploaded images and a tag cloud. The number of followers reflects the social influence of these users, while the uploaded images and tag cloud reflect their topic-sensitive expertise.
It can be seen that the method of the invention analyzes topic-sensitive social relationship strength well. The users with strong social relations found by the method have more followers and show stronger expertise on the corresponding topics. For example, user "95386698@N00" has a large weight on topic #2 in its topic distribution, and its uploaded images and tag cloud show that it has carried out many activities related to topic #2. On the other hand, judging from the large number of followers of user "26324110@N00" and the professionalism of its uploaded images, it can be roughly inferred that this user is an expert in popular fashion, i.e. topic #13.
The specific embodiments described above further explain the purpose, technical solution and beneficial effects of the invention. It should be understood that the above are only specific embodiments of the invention and are not intended to limit it; any modification, equivalent replacement or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims (9)

1. A social search method based on multi-modal adaptive social relationship strength mining, the method comprising the following steps:
Step 1: collecting the image information uploaded by users and the users who have a one-dimensional social relationship with them, each user corresponding to a triple consisting of the set of uploaded images, the set of image annotations, and the set of related users connected to that user by a one-dimensional social relationship;
Step 2: according to the input triples, establishing a multi-modal probabilistic generative model and inferring the generative process of the image content in the image set and of the image annotation information in the image annotation set;
Step 3: calculating the user theme space and the user theme distributions from the inference results, and calculating the theme-sensitive social relationship strength between users;
Step 4: ranking the search results according to the obtained user theme space, user theme distributions, and social relationship strength.
2. The method of claim 1, characterized in that step 1 comprises:
Step 11: for each user $u$, collecting from the user's social network the users who have a one-dimensional social relationship with $u$, forming the set $C_u$;
Step 12: for each user $u$, collecting the images uploaded by $u$, forming the set $D_u$;
Step 13: for each user $u$, collecting the annotations that $u$ added to the uploaded images, forming the set $T_u$.
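Steps 11 to 13 can be pictured as building one record per user. The following is a minimal sketch only: the class name `UserTriple`, the function `build_triples`, and the input tuple layouts are hypothetical and are not specified by the claims.

```python
from dataclasses import dataclass, field


@dataclass
class UserTriple:
    """Per-user triple (D_u, T_u, C_u); all field names are illustrative."""
    user_id: str
    images: list = field(default_factory=list)       # D_u: uploaded images
    annotations: list = field(default_factory=list)  # T_u: annotation words
    contacts: set = field(default_factory=set)       # C_u: related users


def build_triples(uploads, social_edges):
    """Group raw records into one triple per user.

    ASSUMPTION: `uploads` yields (user_id, image_id, [annotation words]) and
    `social_edges` yields (user_id, related_user_id); the source does not
    fix these formats.
    """
    triples = {}

    def get(user_id):
        # Create the triple the first time a user is seen.
        if user_id not in triples:
            triples[user_id] = UserTriple(user_id)
        return triples[user_id]

    for user_id, image_id, words in uploads:
        triple = get(user_id)
        triple.images.append(image_id)     # step 12
        triple.annotations.extend(words)   # step 13
    for user_id, related in social_edges:
        get(user_id).contacts.add(related) # step 11
    return triples
```

Keying the result by user identifier keeps the three sets of each user together, which is the form the later steps take as input.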
3. The method of claim 1, characterized in that step 2 comprises:
Step 21: establishing a theme-sensitive multi-modal probabilistic generative model that simulates the generative process of images and annotations, the model being described by means of hidden variables, wherein the hidden variables comprise a switch hidden variable $s$, a theme hidden variable $z$, and a sampled-user hidden variable $c$; the switch hidden variable $s$ indicates whether an annotation word or image was produced spontaneously by the user or under the influence of a related user; the theme hidden variable $z$ records the sampled theme; and the user hidden variable $c$ records the sampled related user;
Step 22: solving the multi-modal probabilistic generative model, wherein the values of the hidden variables are inferred by Gibbs sampling.
4. The method of claim 3, characterized in that step 3 comprises:
Step 31: calculating the theme space $\Phi$ and the user theme distribution $\Omega$ from the obtained values of the hidden variables;
Step 32: calculating the theme-sensitive social relationship strength $\Psi$ between users from the theme space $\Phi$ and the user theme distribution $\Omega$.
5. The method of claim 1, characterized in that step 4 comprises:
calculating a relevance score for each retrieved image according to the obtained social relationship strength, the relevance score being used to rank the final results, wherein the relevance score is calculated as follows:
$$\hat{R}(q,u,d) = R(q,u,d) + \sum_{c \in C_u} \sum_{k} \Psi_{u,c}(k)\, p(Z_k \mid q)\, R(q,c,d)$$
wherein $q$ denotes the query, $R(q,u,d)$ denotes the relevance of image $d$ to query $q$ for the searching user $u$, $R(q,c,d)$ denotes the relevance of image $d$ to query $q$ for a related user $c$ who has a one-dimensional social relationship with the searching user $u$, $k$ indexes the $k$-th theme, and $C_u$ denotes the set of related users who have a one-dimensional social relationship with user $u$; $\Psi_{u,c}(k)$ denotes the social relationship strength of related user $c$ on user $u$, and $p(Z_k \mid q)$ denotes the theme distribution of the query $q$, calculated as follows:
$$p(Z_k \mid q) = \prod_{w_i \in q} p(w_i \mid Z_k)$$
wherein $p(w_i \mid Z_k)$ denotes the probability of annotation word $w_i$ in the $k$-th theme, obtained from the user theme space.
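The relevance score of claim 5 can be sketched in a few lines. This is an illustration under stated assumptions, not the patented implementation: the function names, the dictionary layouts for `contacts`, `strength`, and `word_given_theme`, and the treatment of unseen words are all hypothetical, and the base relevance $R$ is passed in as a callable because the claim leaves its computation to other claims.

```python
import math


def query_theme_scores(query_words, word_given_theme):
    """p(Z_k | q) = product over w_i in q of p(w_i | Z_k), one value per theme.

    word_given_theme[k] is a dict mapping word -> p(word | Z_k).
    ASSUMPTION: a word unseen in a theme gets probability 0.0 (no smoothing).
    """
    return [math.prod(theme.get(w, 0.0) for w in query_words)
            for theme in word_given_theme]


def social_relevance(query_words, u, d, base_relevance, contacts,
                     strength, word_given_theme):
    """R_hat(q,u,d) = R(q,u,d) + sum_c sum_k Psi_{u,c}(k) p(Z_k|q) R(q,c,d)."""
    p_zq = query_theme_scores(query_words, word_given_theme)
    score = base_relevance(query_words, u, d)
    for c in contacts.get(u, ()):
        psi = strength[(u, c)]  # list of Psi_{u,c}(k), one entry per theme
        weight = sum(psi_k * p_k for psi_k, p_k in zip(psi, p_zq))
        score += weight * base_relevance(query_words, c, d)
    return score
```

The product is computed exactly as written in the claim, so the values of $p(Z_k \mid q)$ are not renormalized across themes; whether a normalization step is intended is not stated in the source.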
6. The method of claim 5, characterized in that the social relationship strength of related user $U_{m1}$ on user $U_{m2}$ is calculated as follows:
$$\Psi_{m1,m2}(k) = \frac{N^{w}_{U,C,S,Z}(U_{m2},U_{m1},0,Z_k) + N^{v}_{U,C,S,Z}(U_{m2},U_{m1},0,Z_k) + \alpha_\gamma}{N^{w}_{U,S,Z}(U_{m2},0,Z_k) + N^{v}_{U,S,Z}(U_{m2},0,Z_k) + |C_{U_{m2}}|\,\alpha_\gamma}$$
wherein $|C_{U_{m2}}|$ is the size of the set of related users who have a one-dimensional social relationship with user $U_{m2}$; $N^{w}_{U,C,S,Z}(U_{m2},U_{m1},0,Z_k)$ denotes the number of samples among the annotation words of user $U_{m2}$ that were generated from theme $Z_k$ under the influence of related user $U_{m1}$; $N^{v}_{U,C,S,Z}(U_{m2},U_{m1},0,Z_k)$ denotes the number of samples among the visual descriptors of the images uploaded by user $U_{m2}$ that were generated from theme $Z_k$ under the influence of related user $U_{m1}$; $N^{w}_{U,S,Z}(U_{m2},0,Z_k)$ denotes the number of samples among the annotation words of user $U_{m2}$ that were generated from theme $Z_k$ under the influence of any related user; $N^{v}_{U,S,Z}(U_{m2},0,Z_k)$ denotes the number of samples among the visual descriptors of the images uploaded by user $U_{m2}$ that were generated from theme $Z_k$ under the influence of any related user; and $\alpha_\gamma$ is the symmetric hyperparameter controlling the corresponding Dirichlet prior distribution. The social relationship strength expresses that the more of user $U_{m2}$'s annotation words or images from theme $Z_k$ are influenced by related user $U_{m1}$, the stronger the social relationship of related user $U_{m1}$ on user $U_{m2}$ on the $k$-th theme.
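As a worked illustration of the strength formula in claim 6, the sketch below takes the four sample counts as plain numbers. The function name `relation_strength` and its parameter names are hypothetical; how the counts are gathered from the Gibbs samples is not shown here.

```python
def relation_strength(n_w_from, n_v_from, n_w_all, n_v_all,
                      num_contacts, alpha_gamma):
    """Smoothed ratio Psi_{m1,m2}(k) for one theme Z_k.

    n_w_from / n_v_from: annotation-word and visual-descriptor samples of
        U_m2 generated from Z_k under the influence of U_m1.
    n_w_all / n_v_all: the same counts summed over all related users.
    num_contacts: |C_{U_m2}|, the number of related users of U_m2.
    alpha_gamma: symmetric Dirichlet hyperparameter.
    """
    numerator = n_w_from + n_v_from + alpha_gamma
    denominator = n_w_all + n_v_all + num_contacts * alpha_gamma
    return numerator / denominator


# Example with two related users: 4 of the 5 influenced samples come from
# the first related user and 1 from the second.
psi_first = relation_strength(3, 1, 4, 1, num_contacts=2, alpha_gamma=0.5)
psi_second = relation_strength(1, 0, 4, 1, num_contacts=2, alpha_gamma=0.5)
```

When the per-user counts add up to the totals, as in this example, the strengths over all related users sum to one for the theme, which is what the $|C_{U_{m2}}|\,\alpha_\gamma$ term in the denominator provides.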
7. The method of claim 5, characterized in that the relevance $R(q,u,d)$ of image $d$ to query $q$ for the searching user $u$ is calculated as follows:
$$R(q,u,d) = \sum_{k} p(Z_k \mid u)\, p(Z_k \mid q)\, p(Z_k \mid d),$$
wherein $p(Z_k \mid u)$ is the theme distribution of user $u$ and denotes the probability that user $u$ produces the $k$-th theme, and $p(Z_k \mid d)$ is the theme distribution of the image, calculated as follows:
$$p(Z_k \mid d) = \prod_{v_i \in d} p(v_i \mid Z_k),$$
wherein $p(v_i \mid Z_k)$ denotes the probability of visual descriptor $v_i$ in the $k$-th theme.
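The base relevance of claim 7 is a sum over themes of three per-theme terms. The sketch below is only an illustration: the function names and the list and dictionary layouts are hypothetical, and giving unseen visual descriptors probability 0.0 is an assumption the source does not make.

```python
import math


def image_theme_scores(descriptors, visual_given_theme):
    """p(Z_k | d) = product over v_i in d of p(v_i | Z_k), one value per theme.

    visual_given_theme[k] maps visual descriptor -> p(descriptor | Z_k).
    ASSUMPTION: descriptors unseen in a theme get probability 0.0.
    """
    return [math.prod(theme.get(v, 0.0) for v in descriptors)
            for theme in visual_given_theme]


def base_relevance(p_z_given_u, p_z_given_q, p_z_given_d):
    """R(q,u,d) = sum over k of p(Z_k|u) * p(Z_k|q) * p(Z_k|d)."""
    return sum(pu * pq * pd
               for pu, pq, pd in zip(p_z_given_u, p_z_given_q, p_z_given_d))
```

For images with many visual descriptors the raw product can underflow in floating point; summing logarithms instead is a common workaround, though the claim itself only states the product form.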
8. The method of claim 3, characterized in that the theme-sensitive multi-modal probabilistic generative model comprises a generative process for annotation words and a generative process for visual descriptors, wherein the generative process for annotation words is as follows:
first, a switch variable is sampled from a Bernoulli distribution: $s_w \sim \mathrm{Bernoulli}(\lambda)$;
if $s_w = 0$, a related user is sampled from the set of related users of user $u$:
Figure FDA00002572908500031
and a theme is sampled from the theme distribution of the related user
Figure FDA00002572908500032
and recorded as the theme variable;
if $s_w = 1$, a theme is sampled from user $u$'s own theme distribution $\Omega_u$ and recorded as the variable
Figure FDA00002572908500035
the annotation word $w_{u,i}$ is then sampled from the annotation-word distribution of the theme
Figure FDA00002572908500036
the generative process for visual descriptors is carried out in the same way, wherein the visual descriptor $v_{u,i}$ is sampled from the visual-descriptor distribution of the theme.
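The generative steps for one annotation word can be sketched as a forward sampler. This is a sketch under assumptions: the function name and data layouts are hypothetical, and because the formula for sampling the related user is not reproduced in the text, the code simply picks a related user uniformly at random, which may differ from the distribution used in the patent.

```python
import random


def sample_annotation_word(u, contacts, theme_dist, word_dist, lam, rng):
    """Draw (s, c, z, w) for one annotation word of user u.

    contacts[u]: list of related users of u.
    theme_dist[x]: list of theme probabilities for user x.
    word_dist[k]: dict mapping word -> probability under theme k.
    lam: Bernoulli parameter; s = 1 is drawn with probability lam.
    """
    s = 1 if rng.random() < lam else 0  # s_w ~ Bernoulli(lam)
    if s == 0:
        # ASSUMPTION: uniform choice over related users (formula not given).
        c = rng.choice(contacts[u])
        source = c   # theme comes from the related user's distribution
    else:
        c = None
        source = u   # theme comes from the user's own distribution
    themes = range(len(theme_dist[source]))
    z = rng.choices(themes, weights=theme_dist[source])[0]
    vocab = list(word_dist[z])
    w = rng.choices(vocab, weights=[word_dist[z][x] for x in vocab])[0]
    return s, c, z, w
```

Visual descriptors would be drawn the same way with a per-theme visual-descriptor distribution in place of `word_dist`. Gibbs sampling, as named in claim 3, runs in the opposite direction: it infers $s$, $c$, and $z$ from the observed words and descriptors.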
9. The method of claim 5, characterized in that the relevance $R(q,c,d)$ of image $d$ to query $q$ for a related user $c$ who has a one-dimensional social relationship with the searching user $u$ is calculated by an image indexing method or a distance metric method.
CN 201210535907 2012-12-12 2012-12-12 Social search method based on multi-mode self-adaptive social relation strength excavation Pending CN103020221A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201210535907 CN103020221A (en) 2012-12-12 2012-12-12 Social search method based on multi-mode self-adaptive social relation strength excavation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 201210535907 CN103020221A (en) 2012-12-12 2012-12-12 Social search method based on multi-mode self-adaptive social relation strength excavation

Publications (1)

Publication Number Publication Date
CN103020221A true CN103020221A (en) 2013-04-03

Family

ID=47968825

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201210535907 Pending CN103020221A (en) 2012-12-12 2012-12-12 Social search method based on multi-mode self-adaptive social relation strength excavation

Country Status (1)

Country Link
CN (1) CN103020221A (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103412872A (en) * 2013-07-08 2013-11-27 西安交通大学 Micro-blog social network information recommendation method based on limited node drive
CN103412872B (en) * 2013-07-08 2017-04-26 西安交通大学 Micro-blog social network information recommendation method based on limited node drive
CN103488769A (en) * 2013-09-27 2014-01-01 中国科学院自动化研究所 Search method of landmark information mined based on multimedia data
CN103886011B (en) * 2013-12-30 2017-04-12 讯飞智元信息科技有限公司 Social-relation network creation and retrieval system and method based on index files
CN103886011A (en) * 2013-12-30 2014-06-25 安徽讯飞智元信息科技有限公司 Social-relation network creation and retrieval system and method based on index files
CN104133807A (en) * 2014-07-29 2014-11-05 中国科学院自动化研究所 Method and device for learning cross-platform multi-mode media data common feature representation
CN104133807B (en) * 2014-07-29 2017-06-23 中国科学院自动化研究所 Learn the method and device that cross-platform multi-modal media data common trait is represented
CN105740327A (en) * 2016-01-22 2016-07-06 天津中科智能识别产业技术研究院有限公司 Self-adaptive sampling method based on user preferences
CN105740327B (en) * 2016-01-22 2019-04-19 天津中科智能识别产业技术研究院有限公司 A kind of adaptively sampled method based on user preference
CN107480194A (en) * 2017-07-13 2017-12-15 中国科学院自动化研究所 The construction method and system of the multi-modal automatic learning model of the representation of knowledge
CN107480194B (en) * 2017-07-13 2020-03-13 中国科学院自动化研究所 Method and system for constructing multi-mode knowledge representation automatic learning model
CN110895561A (en) * 2019-11-13 2020-03-20 中国科学院自动化研究所 Medical question and answer retrieval method, system and device based on multi-mode knowledge perception
CN110895561B (en) * 2019-11-13 2022-04-01 中国科学院自动化研究所 Medical question and answer retrieval method, system and device based on multi-mode knowledge perception

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20130403