CN103020221A - Social search method based on multi-modal adaptive social relationship strength mining - Google Patents

Social search method based on multi-modal adaptive social relationship strength mining

Info

Publication number
CN103020221A
Authority
CN
China
Prior art keywords
user
social
distribution
users
subject
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN 201210535907
Other languages
Chinese (zh)
Inventor
徐常胜 (Changsheng Xu)
桑基韬 (Jitao Sang)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Automation of Chinese Academy of Science
Original Assignee
Institute of Automation of Chinese Academy of Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Automation of Chinese Academy of Science filed Critical Institute of Automation of Chinese Academy of Science
Priority to CN 201210535907 priority Critical patent/CN103020221A/en
Publication of CN103020221A publication Critical patent/CN103020221A/en
Pending legal-status Critical Current

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an adaptive social relationship strength mining method based on a multi-modal generative model. The method comprises the following steps: collecting the picture information uploaded by users and the users having social relationships with them, so that each user corresponds to a triplet consisting of an uploaded image set, an image annotation set and a social network; reversely inferring, through the multi-modal generative model, the generation process of the picture content and annotations from the input triplet, to obtain a topic space that describes the user interest distributions together with the topic distributions of the users; and computing, from the obtained topic space and user topic distributions, the topic-sensitive relationship strengths among users. The method can be applied to adaptive multimedia retrieval and the like.

Description

Social search method based on multi-mode self-adaptive social relation strength mining
Technical Field
The invention relates to the field of multimedia search, in particular to a social search method based on multi-modal adaptive social relationship strength mining.
Background
Social media has greatly changed the way and habits in which users share and obtain information. In social media services, users inevitably interact with other users and form communities, the so-called social networks. Social networks include bidirectional social relationships, such as "Connect" in LinkedIn and "Add Friend" in Facebook, and unidirectional social relationships, such as "Follow" in Twitter and "Subscribe" in YouTube. These social relationships are believed to affect the behavior of users and the dynamic evolution of social networks. For example, colleagues on LinkedIn may influence a person's job choices, and friends on Facebook may influence a person's activities and needs in daily life. Analyzing and mining these social relationships enables many important applications, such as viral marketing, collaborative recommendation and collaborative information search. Taking multimedia collaborative search based on unidirectional social relationships as an example, the basic assumption and starting point are as follows: by analyzing the behaviors of other users who influence the searching user, the real needs of the searching user can be predicted and the search results adjusted accordingly.
At present, methods for mining social relationships mainly focus on detecting whether a social relationship exists and predicting its strength. In many cases, however, such binary or continuous social relationships cannot meet the requirements of the application. In the multimedia search problem, for instance, the effective social relationships between users differ for different search terms. Suppose a user wants to search for photos of Hawaii for his honeymoon trip: a friend who is experienced in travel will help him the most, and the social relationship between them should be strengthened. When the same user searches for photos of a fashion show, he may instead want friends who are knowledgeable about fashion to influence the search results more, i.e., the social relationship between them becomes stronger. We call this query-dependent social relationship the adaptive social relationship strength, and the present invention introduces an adaptive social relationship strength mining method based on a multi-modal generative model.
Disclosure of Invention
Technical problem to be solved
The invention aims to provide a method for searching pictures according to the mined adaptive social relationship strength. The method can automatically obtain assistance from different users when the user has different needs, which helps to understand and predict the user's real needs and thus to accurately locate search results that meet them.
(II) technical scheme
In order to solve the technical problem, the invention provides a social search method based on multi-modal adaptive social relationship strength mining, which comprises the following steps:
step 1: collecting the picture information uploaded by users and the users having a one-way social relationship with them, wherein each user corresponds to a triplet consisting of an uploaded image set, an image annotation set and a set of relation users having a one-way social relationship with that user;
step 2: establishing a multi-modal probabilistic generative model from the input triplet, and inferring the generation process of the picture content in the image set and the image annotation information in the image annotation set;
step 3: calculating the user topic space and user topic distributions according to the inference results, and calculating the topic-sensitive social relationship strengths between users;
step 4: ranking the search results according to the obtained user topic space, user topic distributions and social relationship strengths.
(III) advantageous effects
The invention adopts a multi-modal generative model to reversely infer, from the observed social network of a user, the images the user uploaded and the tags the user provided, a topic-sensitive social relationship strength. The invention solves the problem of adaptively adjusting the social relationship strength across different queries; by simultaneously considering textual annotation data and visual image features, the social relationship strength in multimedia applications can be analyzed better. In addition, the method simultaneously obtains the topic space, the topic distributions of the users, and the strengths of the relationships among users on different topics.
Drawings
FIG. 1 is a flow chart of an adaptive social relationship strength mining method based on a multi-modal generative model according to the present invention;
FIG. 2 is a schematic diagram of a multi-modal generative topic model in accordance with the present invention;
FIG. 3 is a schematic diagram of the effect of the method provided by the present invention on a Flickr data set;
fig. 4 is another schematic diagram of the implementation effect of the method provided by the invention on the Flickr data set.
Detailed Description
In order that the objects, technical solutions and advantages of the present invention will become more apparent, the present invention will be further described in detail with reference to the accompanying drawings in conjunction with the following specific embodiments.
The invention realizes a method for analyzing the social relationship strength with sensitive subjects in the social multimedia environment, and can adaptively adjust the social relationship strength according to different applications. Compared with the existing social relationship strength analysis method, on one hand, the social relationship strength with sensitive subject is obtained, and the social relationship strength can be adjusted according to the problem in a self-adaptive manner; on the other hand, by comprehensively considering text information and visual information, multimedia applications can be better served.
FIG. 1 is a flow chart of an adaptive social relationship strength mining method based on a multi-modal generative model according to the present invention. As shown in fig. 1, the method provided by the present invention comprises the following steps:
step 1: an input preprocessing step, i.e., collecting the picture information uploaded by each user and the users having a one-way social relationship with that user, wherein a one-way social relationship refers to a directed social behavior in a social media sharing website, such as "contact" in Flickr or "follow" in Twitter. Each user corresponds to a triplet consisting of the set of relation users having a one-way social relationship with the user, the set of images uploaded by the user and the set of image annotations, where an image annotation refers to the original tag information provided by the user to describe an image;
step 2: a step of mining a multi-modal subject sensitive social relationship, namely reversely deducing the generation process of the picture content in the image set and the image annotation information in the image annotation set through a multi-modal generating model according to the input triple to obtain a subject space and user subject distribution which can describe the user interest distribution;
and step 3: and calculating output parameters, namely calculating the obtained theme space and the user theme distribution to obtain the theme sensitive relationship strength between the users.
The steps are described in detail below. The following table gives a list of key symbols used in the present invention and their corresponding descriptions.
(Table of key symbols rendered as an image in the original document.)
Step 1: inputting a preprocessing step.
First, the topic-sensitive social relationship strength mining problem is described in mathematical language:

Definition 1. Given a set of users U in a social media service (e.g., Flickr), each user u ∈ U corresponds to a triplet {C_u, D_u, T_u}, where C_u, D_u and T_u denote, respectively, the set of relation users having a one-way social relationship with user u, the set of images uploaded by user u, and the set of image annotations added by user u.

The purpose of topic-sensitive social relationship strength mining is to learn:

(1) the topic spaces

$$\Phi^w \in \mathbb{R}^{K \times |W|} \quad\text{and}\quad \Phi^v \in \mathbb{R}^{K \times |V|}$$

where Φ^w and Φ^v denote the annotation-word distributions and the visual-descriptor distributions of the topics; w and v are the annotation-word and visual-descriptor vectors of a user; R is the real number field; K is the total number of topics, and k indexes the k-th topic. The annotation words of all users form an annotation dictionary and all visual descriptors form a visual-descriptor dictionary; |W| is the size of the annotation dictionary and |V| the size of the visual-descriptor dictionary. The annotation-word vector is the response of a user's annotations over the annotation dictionary, e.g. w = {landscape, travel, landscape, ...}; the visual-descriptor vector is the response of a user's uploaded pictures over the visual-descriptor dictionary, e.g. v = {descriptor 1, descriptor 5, descriptor 1, ...};

(2) the topic distribution of each user u,

$$\Omega_u \in \mathbb{R}^{K}$$

i.e., the probability that the user uses each topic, which can be understood as the user's interest distribution;

(3) the topic-sensitive social relationship strength

$$\Psi_{u_1,u_2}(k), \quad k = 1, \dots, K,$$

which records the strength of the social relationship from user u1 to user u2 on the k-th topic. Note that the social relationship strength here is unidirectional, i.e., Ψ_{u1,u2}(k) and Ψ_{u2,u1}(k) are different.
The preprocessing operation is to acquire and represent the three elements of the inputted triplet.
Step 11: collection and preprocessing of the relation user set C_u.

For each user u, the users having a one-way social relationship with u are collected according to u's social network, forming the set C_u.
Step 12: collection and preprocessing of the uploaded picture set D_u.

For each user u, the uploaded pictures are collected and each picture is represented by its response vector over the visual-descriptor dictionary, forming the image set D_u. The visual descriptors may use any bag-of-words feature; in the test experiments, the invention adopts the Maximally Stable Extremal Regions (MSER) feature to describe the visual content of the images. Compared with keypoint-based features, the MSER feature describes locally homogeneous regions in an image, has higher consistency, and is better suited to the problem setting of the invention.
Step 13: collection and preprocessing of the user image annotation set T_u.

In media sharing websites, users can tag the images they upload for ease of management and description, i.e., annotation information. The annotation information added by each user u is collected. Each user's annotation information is likewise represented by its response vector over the annotation dictionary, forming the set T_u.
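The three collection steps above can be sketched as follows; the data structure, field names and sample data are illustrative assumptions, not part of the patent:

```python
from dataclasses import dataclass

@dataclass
class UserTriplet:
    contacts: set   # C_u: users this user follows (one-way relationship)
    images: list    # D_u: visual-descriptor response vectors of uploads
    tags: list      # T_u: annotation-word response vectors

def build_triplets(follow_edges, uploads, annotations):
    """Assemble the per-user triplet {C_u, D_u, T_u} from raw collected data."""
    users = set(uploads) | set(annotations) | {a for a, _ in follow_edges}
    return {
        u: UserTriplet(
            contacts={b for a, b in follow_edges if a == u},
            images=uploads.get(u, []),
            tags=annotations.get(u, []),
        )
        for u in users
    }

# Hypothetical raw data for one user
edges = [("alice", "bob"), ("alice", "carol")]
trip = build_triplets(edges, {"alice": [[1, 0]]}, {"alice": [[0, 2]]})
print(sorted(trip["alice"].contacts))  # ['bob', 'carol']
```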
Step 2: and mining the multi-modal topic sensitive social relationship.
The online behavior of a user is considered to be strongly influenced by the users having social relationships with him, and uploading and annotating images in an image sharing website can be regarded as observations of this online behavior. Based on this, the invention proposes a multi-modal probabilistic generative model that infers the underlying social relationship structure by simulating the generation process of images and annotations. The specific idea is as follows: it is assumed that, for each user, the uploaded images and annotations are generated in two ways, either according to the user's own interests or under the influence of other users. Based on this assumption, the invention proposes a multi-modal topic-sensitive social relationship model, whose construction comprises the following steps:
step 21: and establishing a multi-modal probability generating model sensitive to the subject.
Fig. 2 shows a schematic diagram of the proposed generative model. In the model, arrows represent conditional-dependence assumptions, corresponding to sampling from a distribution, and circles represent variables. Filled circles are observed variables, mainly the relation user set C_u, the uploaded image set D_u and the added image annotation set T_u. Hollow circles are hidden variables, mainly the switch variables s, the topic variables z, the sampled-user variables c, the topic-space variables Φ and the user topic distributions Ω. The switch variable s controls the generation process of the observed variables, as detailed below; the topic variable z records the topic sampled from a user topic distribution; the sampled-user variable c records the sampled one-way relation user.

For the visual content of images, a visual-descriptor dictionary is first constructed, and each image is then represented by the vector v of its responses over that dictionary. The model introduces binary switch variables s_w and s_v to control and record whether a given annotation word or visual descriptor is generated spontaneously by user u or under the influence of another user. When s_w = 1, the annotation word is generated according to user u's own topic distribution Ω_u; when s_w = 0, the annotation word is generated under the influence of some user c_w in the relation user list C_u, according to that user's topic distribution Ω_{c_w}.
The generation process of user u's annotation words is therefore as follows:

Sample a switch variable from the Bernoulli distribution: s_w ~ Bernoulli(λ), where λ controls the shape of the Bernoulli distribution.

If s_w = 0, sample an influencing user c_w from the relation user list of user u according to a multinomial distribution whose shape is controlled by γ, and then sample a topic from the influencing user's topic distribution Ω_{c_w}, recording it in the variable z.

If s_w = 1, sample a topic from user u's own topic distribution Ω_u, recording it in the variable z.

Sample the annotation word w_{u,i} from the annotation-word distribution Φ^w_z of the sampled topic.

The generation process of the visual descriptors is similar: a visual descriptor v_{u,i} is sampled from the visual-descriptor distribution Φ^v_z of the sampled topic. With the generative model established in this step (Fig. 2) and the assumed generation process, the Gibbs sampling of step 22 can be performed to obtain the values of the sampled hidden variables and thereby solve the model.
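The assumed generation process can be sketched for a single annotation word as follows; all parameter values (λ, the Ω distributions and Φ^w) are made up for illustration:

```python
import random

def generate_word(omega_u, omega_contacts, phi_w, lam, rng):
    """Generate one annotation word via the switch / topic sampling steps.

    omega_u: the user's own topic distribution (length K);
    omega_contacts: {contact: topic distribution} for the relation user list;
    phi_w: per-topic annotation-word distributions (K x |W|);
    lam: Bernoulli parameter for the switch variable s_w.
    """
    if rng.random() < lam:                       # s_w = 1: user's own interests
        topic = rng.choices(range(len(omega_u)), weights=omega_u)[0]
        influencer = None
    else:                                        # s_w = 0: influenced by a contact
        influencer = rng.choice(sorted(omega_contacts))
        omega_c = omega_contacts[influencer]
        topic = rng.choices(range(len(omega_c)), weights=omega_c)[0]
    word = rng.choices(range(len(phi_w[topic])), weights=phi_w[topic])[0]
    return word, topic, influencer

rng = random.Random(0)
phi_w = [[0.9, 0.1], [0.1, 0.9]]                 # K=2 topics over |W|=2 words
w, z, c = generate_word([1.0, 0.0], {"c1": [0.0, 1.0]}, phi_w, lam=1.0, rng=rng)
print(w, z, c)
```

With lam = 1.0 the word always comes from the user's own distribution (s_w = 1); with lam = 0.0 it is always attributed to an influencing contact (s_w = 0).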
Step 22: solving the topic-sensitive multi-modal probabilistic generative model.
As described in step 1, the input to the model is a set of users, each corresponding to a triplet {C_u, D_u, T_u}. As described in step 21, a generation process is assumed for this input and a series of hidden variables are introduced; solving the model requires sampling and inferring the values of these hidden variables from the observed input under the assumed generation process. Finally, the output parameters are computed from the sampled hidden-variable values, as stated in step 3. The proposed generative model contains three classes of hidden variables: the switch variables s_w, s_v, the sampled-user variables c_w, c_v and the topic variables z_w, z_v, where s_v, c_v and z_v are associated with the visual descriptors. The invention uses Gibbs sampling to infer the hidden variables and to solve for the model parameters, including the topic space, the user topic distributions and the topic-sensitive relationship strengths.

When Gibbs sampling is used to solve the generative model, each iteration draws a value for every hidden variable, and the values sampled in one iteration are used to update the sampling in the next. Each hidden variable is updated in turn with the other variables held fixed; the update rules for the variables associated with the annotation words are:
$$p(s_i^w = 0 \mid s_{-i}^w, u_i^w, c_i^w, z_i^w, \cdot) \propto \frac{N_{U,S}^w(u_i^w, 0) + \alpha_\lambda - 1}{N_U^w(u_i^w) + 2\alpha_\lambda - 1} \cdot \frac{N_{U,Z}^w(c_i^w, z_i^w) + \alpha_\Omega - 1}{N_U^w(c_i^w) + K\alpha_\Omega - 1}$$

$$p(s_i^w = 1 \mid s_{-i}^w, u_i^w, c_i^w, z_i^w, \cdot) \propto \frac{N_{U,S}^w(u_i^w, 1) + \alpha_\lambda - 1}{N_U^w(u_i^w) + 2\alpha_\lambda - 1} \cdot \frac{N_{U,S,Z}^w(c_i^w, 1, z_i^w) + \alpha_\Omega - 1}{N_{U,S}^w(c_i^w, 1) + K\alpha_\Omega - 1}$$

$$p(c_i^w \mid c_{-i}^w, s_i^w = 0, u_i^w, z_i^w, C_{u_i^w}, \cdot) \propto \frac{N_{U,C,S,Z}^w(u_i^w, c_i^w, 0, z_i^w) + \alpha_\gamma}{N_{U,S,Z}^w(u_i^w, 0, z_i^w) + |C_{u_i^w}|\,\alpha_\gamma} \cdot \frac{N_{U,Z}^w(c_i^w, z_i^w) + \alpha_\Omega - 1}{N_U^w(c_i^w) + K\alpha_\Omega - 1}$$

$$p(z_i^w \mid z_{-i}^w, s_i^w = 0, w_i, \cdot) \propto \frac{N_{U,Z}^{w+}(c_i^w, z_i^w) + \alpha_\Omega - 1}{N_U^w(c_i^w) + K\alpha_\Omega - 1} \cdot \frac{N_{Z,W}^w(z_i^w, w_i) + \alpha_{\Phi^w}}{N_Z^w(z_i^w) + |W|\,\alpha_{\Phi^w}}$$

$$p(z_i^w \mid z_{-i}^w, s_i^w = 1, w_i, \cdot) \propto \frac{N_{U,S,Z}^{w+}(c_i^w, 1, z_i^w) + \alpha_\Omega - 1}{N_{U,S}^w(c_i^w, 1) + K\alpha_\Omega - 1} \cdot \frac{N_{Z,W}^w(z_i^w, w_i) + \alpha_{\Phi^w}}{N_Z^w(z_i^w) + |W|\,\alpha_{\Phi^w}} \qquad (1)$$
where u_i^w denotes the user to whom the i-th annotation word belongs and z_i^w denotes the topic assignment of the i-th annotation word. To ensure the generality of the model and reduce its learning complexity, all priors are assumed to follow symmetric Dirichlet distributions; the parameters of these prior distributions, i.e., the hyper-parameters, are written as α with the corresponding variable as subscript, where α_Ω, α_{Φ^w}, α_{Φ^v}, α_λ, α_γ are the symmetric hyper-parameters controlling the corresponding Dirichlet priors, which are specified and tuned manually at implementation time. N(·) denotes a counter recording the number of samples satisfying certain conditions during iterative sampling. For example, N_{U,C,S,Z}^w(u_i^w, c_i^w, 0, z_i^w) is the number of annotation-word samples of user u_i^w that are influenced (s = 0) by relation user c_i^w and arise from topic z_i^w; N_{U,S,Z}^w(u_i^w, 0, z_i^w) is the number of annotation-word samples of user u_i^w influenced by relation users (s = 0) that arise from topic z_i^w; N_{U,S}^w(u_i^w, 0) is the number of annotation-word samples of user u_i^w influenced by relation users (s = 0); N_U^w(u_i^w) is the number of annotation words of user u_i^w. Each counter comes from the Gibbs sampling process: whenever a sampled hidden-variable value satisfies the condition of a counter, that counter is incremented by one. The update rules for the variables associated with the visual descriptors are similar. With the above conditional probability distributions for the switch variables s_w, s_v, the sampled-user variables c_w, c_v and the topic variables z_w, z_v, samples of all hidden variables are obtained at each iteration, and the iterations are repeated for updating.
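A minimal sketch of the switch-variable update in equation (1): the function below evaluates the two unnormalized conditionals and normalizes them; the counter values passed in are illustrative placeholders, not output of a trained model.

```python
def switch_posterior(n_us0, n_us1, n_u, n_uz_c, n_u_c, n_usz_c1, n_us_c1,
                     alpha_lam, alpha_omega, K):
    """Normalized conditionals p(s_i^w = 0) and p(s_i^w = 1).

    n_us0 / n_us1: user's words with s=0 / s=1; n_u: user's word total;
    n_uz_c, n_u_c: topic counters of the sampled contact (s=0 branch);
    n_usz_c1, n_us_c1: self-generated topic counters (s=1 branch).
    """
    p0 = ((n_us0 + alpha_lam - 1) / (n_u + 2 * alpha_lam - 1)
          * (n_uz_c + alpha_omega - 1) / (n_u_c + K * alpha_omega - 1))
    p1 = ((n_us1 + alpha_lam - 1) / (n_u + 2 * alpha_lam - 1)
          * (n_usz_c1 + alpha_omega - 1) / (n_us_c1 + K * alpha_omega - 1))
    total = p0 + p1
    return p0 / total, p1 / total

# Illustrative counter values only
p0, p1 = switch_posterior(n_us0=3, n_us1=7, n_u=10, n_uz_c=2, n_u_c=20,
                          n_usz_c1=5, n_us_c1=7, alpha_lam=1.0,
                          alpha_omega=1.0, K=4)
print(round(p0 + p1, 6))  # 1.0
```

In a full sampler this draw would be repeated for every token of every user in each Gibbs iteration, with the counters decremented for the current token before evaluation and incremented afterwards.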
Step 3: calculating the output parameters.
The input of this step is the sampled hidden-variable values obtained by the Gibbs sampling of step 2; the output is the three parameters sought by the topic-sensitive social relationship strength mining problem defined in step 1: the topic spaces Φ^w and Φ^v, the topic distribution Ω_u of each user u, and the topic-sensitive social relationship strengths Ψ(k).
Step 31: calculating the topic space Φ and the user topic distributions Ω.

Gibbs sampling yields sampled values of the hidden variables s, c and z. The iterative updating converges when the joint distribution of the hidden variables and the observed annotations and uploaded images becomes stable, i.e., when the update rules computed from the sampled values best fit the observed data; this process is similar to maximum-likelihood estimation. The sampled values of all hidden variables after convergence are then counted and the counters updated, from which the topic space Φ and the user topic distributions Ω can be computed directly. The annotation-word distributions Φ^w and visual-descriptor distributions Φ^v of the topics represent the learned subspaces and can be computed from the sampled topic assignments. Since Φ^w_{k,j} is in fact the probability of producing the j-th annotation word in the k-th topic, it can be obtained by normalizing the counter N_{Z,W}(·); computing Φ^v is analogous:

$$\Phi_{k,j}^w = \frac{N_{Z,W}^w(Z_k, w_j) + \alpha_{\Phi^w}}{N_Z^w(Z_k) + |W|\,\alpha_{\Phi^w}} \qquad (2)$$

$$\Phi_{k,j}^v = \frac{N_{Z,W}^v(Z_k, v_j) + \alpha_{\Phi^v}}{N_Z^v(Z_k) + |V|\,\alpha_{\Phi^v}} \qquad (3)$$

where Z_k denotes the k-th topic. The topic distribution of the m-th user U_m is computed as

$$\Omega_{m,k} = \frac{N_{U,S,Z}^w(U_m, 1, Z_k) + N_{U,S,Z}^v(U_m, 1, Z_k) + \alpha_\Omega}{N_{U,S}^w(U_m, 1) + N_{U,S}^v(U_m, 1) + K\alpha_\Omega} \qquad (4)$$
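The normalizations in equations (2)-(4) can be sketched as follows; the counter matrices are illustrative placeholders, not real sampler output:

```python
def topic_word_dist(n_zw, alpha_phi):
    """Eq. (2)/(3): normalize topic-word counters row-wise into Phi (K x |W|)."""
    W = len(n_zw[0])
    return [[(c + alpha_phi) / (sum(row) + W * alpha_phi) for c in row]
            for row in n_zw]

def user_topic_dist(n_w, n_v, alpha_omega):
    """Eq. (4): merge word- and visual-modality counters into Omega (M x K)."""
    K = len(n_w[0])
    result = []
    for row_w, row_v in zip(n_w, n_v):
        merged = [a + b for a, b in zip(row_w, row_v)]
        total = sum(merged)
        result.append([(c + alpha_omega) / (total + K * alpha_omega)
                       for c in merged])
    return result

# Illustrative counters: K=2 topics, |W|=3 annotation words, one user
phi = topic_word_dist([[8, 1, 1], [0, 5, 5]], alpha_phi=0.1)
omega = user_topic_dist([[3, 1]], [[1, 1]], alpha_omega=0.5)
print([round(sum(row), 6) for row in phi])  # [1.0, 1.0]
```

Each row of Phi and Omega is a proper probability distribution by construction, since the denominator equals the sum of the smoothed numerators.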
step 32: calculation of the strength Ψ of the subject-sensitive relationship between the users.
User U under k topicm1To Um2Strength of social relationship Ψm1,m2(k) Under the k theme, the user Um2The label word/visual descriptor of is controlled by the user Um1By number of influences, i.e. NU,C,S,Z(Um2,Um1,0,Zk):
ψ m 1 , m 2 ( k ) = N U , C , S , Z w ( U m 2 , U m 1 , 0 , Z k ) + N U , C , S , Z v ( U m 2 , U m 1 , 0 , Z k ) + α γ N U , S , Z w ( U m 2 , 0 , Z k ) + N U , S , Z v ( U m 2 , 0 , Z k ) + | C U m 2 | α γ - - - ( 5 )
Wherein
Figure BDA00002572908600098
Is a user Um2The practical significance of this formula is if the user U has a one-way social relationshipm2Much from topic ZkThe annotation word or image is a related user Um1Influence, then Um1To Um2The social relationship on the kth topic is stronger. Wherein alpha isΩ
Figure BDA00002572908600101
αλ,αγThe symmetric hyper-parameters corresponding to Dirichlet prior distribution are controlled, and meanwhile smoothing can be carried out when the value of a denominator recorder N (-) of each formula is zero, and the values need to be manually specified and adjusted during implementation.
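Equation (5) reduces to a single smoothed ratio once the counters are available; the following sketch uses illustrative counts only:

```python
def topic_strength(n_w, n_v, n_w_tot, n_v_tot, n_contacts, alpha_gamma):
    """Eq. (5): strength of U_m1 -> U_m2 on topic k from influence counters.

    n_w / n_v: counts of U_m2's words / descriptors on topic k influenced
    by U_m1; n_w_tot / n_v_tot: counts of U_m2's words / descriptors on
    topic k influenced by any relation user; n_contacts: |C_{U_m2}|.
    """
    return ((n_w + n_v + alpha_gamma)
            / (n_w_tot + n_v_tot + n_contacts * alpha_gamma))

# Illustrative counts: most of U_m2's topic-k content is influenced by U_m1
psi = topic_strength(n_w=6, n_v=4, n_w_tot=8, n_v_tot=6, n_contacts=5,
                     alpha_gamma=0.2)
print(round(psi, 4))  # 0.68
```

The larger the share of U_m2's topic-k tokens attributed to U_m1, the closer Ψ gets to 1, matching the interpretation given after equation (5).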
Step 4: ranking the search results according to the obtained social relationship strengths.

Taking image search as an example, the user's query can be analyzed and the topic distribution of query term q computed by

$$p(Z_k \mid q) = \prod_{w_i \in q} p(w_i \mid Z_k)$$

where ∏ denotes a product over the query words and p(w_i | Z_k) is the probability of annotation word w_i in the k-th topic, obtained from the topic space. Let u be the current searching user and c one of its one-way relation users; the adaptive social relationship strength Ψ_{u,c}(k) p(Z_k | q) is then obtained according to the topic distribution of the query. The obtained social relationship strengths are used as weights in computing the relevance score of each retrieved picture for the final ranking. For example, the relevance score of picture d can be computed as

$$\hat{R}(q, u, d) = R(q, u, d) + \sum_{c \in C_u} \sum_k \Psi_{u,c}(k)\, p(Z_k \mid q)\, R(q, c, d)$$

where q is the query term, R(q, u, d) is the relevance of picture d to query q for the searching user u, and R(q, c, d) is the relevance of picture d to query q for a user c having a one-way social relationship with u; the relevance can be computed by any picture indexing or distance-measurement method. The relevance R(q, u, d) is computed as

$$R(q, u, d) = \sum_k p(Z_k \mid u)\, p(Z_k \mid q)\, p(Z_k \mid d)$$

where p(Z_k | u) is the topic distribution of user u, i.e., the probability that user u uses the k-th topic, which is the Ω_{u,k} computed in step 31, and p(Z_k | d) is the topic distribution of the image:

$$p(Z_k \mid d) = \prod_{v_i \in d} p(v_i \mid Z_k)$$

where p(v_i | Z_k) is the probability of visual descriptor v_i in the k-th topic.
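The ranking formulas above can be sketched as follows; the topic-model values (the p(w|Z) table, the Ψ strengths and the base relevance scores) are hypothetical illustrations:

```python
from math import prod

def topic_dist_of_query(query_words, p_w_given_z, K):
    """p(Z_k|q) ∝ ∏_i p(w_i|Z_k), normalized over the K topics."""
    scores = [prod(p_w_given_z[w][k] for w in query_words) for k in range(K)]
    total = sum(scores)
    return [s / total for s in scores]

def rerank_score(base, contact_base, psi, p_z_q):
    """R_hat(q,u,d) = R(q,u,d) + sum_c sum_k Psi_{u,c}(k) p(Z_k|q) R(q,c,d)."""
    score = base
    for c, r_c in contact_base.items():
        score += sum(psi[c][k] * p_z_q[k] for k in range(len(p_z_q))) * r_c
    return score

# Hypothetical two-topic model: "beach" is mostly a topic-0 word
p_w_given_z = {"beach": [0.8, 0.2]}
p_z_q = topic_dist_of_query(["beach"], p_w_given_z, K=2)
score = rerank_score(base=0.5, contact_base={"c1": 0.4},
                     psi={"c1": [0.9, 0.1]}, p_z_q=p_z_q)
print(round(score, 4))  # 0.796
```

A contact who is strong on the query's dominant topic (here c1 on topic 0) contributes most of the boost, which is exactly the adaptive behavior motivated in the background section.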
The following illustrates the implementation effects of the method provided by the invention.

To evaluate the invention, images, tags and relation-user network information of 3,372 users were crawled from the picture sharing website Flickr, yielding 124,099 images and 30,108 annotation words.
Fig. 3 shows two examples from the 20-topic space obtained by the method provided by the invention; each topic is shown with its five highest-ranked annotation words and its five most relevant images. It can be seen that, by considering both textual annotation words and visual image content, the topics extracted by the method of the invention maintain good consistency between semantic concepts and visual themes, which benefits the further topic-sensitive social relationship analysis.
Fig. 4 shows two test users and the information of the users who most influence them on topics #2 and #13. The length of a gray block corresponds to the strength of the user's topic distribution, which reflects the user's interest in the corresponding topic. The user's preferences can also be gauged from the favorite pictures displayed below. For each relation user, Fig. 4 gives the number of followers, examples of uploaded images and the tag cloud. The number of followers reflects social influence, while the uploaded images and tag cloud reflect topic-sensitive expertise.
It can therefore be seen that the method provided by the invention analyzes topic-sensitive social relationship strength well: the strong-relationship users it finds have more followers and show stronger expertise in the corresponding topics. For example, user "95386698@N00" has a large weight on topic #2, and its uploaded images and tag cloud show that it has performed many activities related to topic #2; on the other hand, from the large number of followers of user "26324110@N00" and the professional quality of its uploaded images, it can be roughly inferred that this user is an expert in fashion, i.e., topic #13.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are only exemplary embodiments of the present invention and are not intended to limit the present invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (9)

1. A social search method based on multi-modal adaptive social relationship strength mining comprises the following steps:
step 1: collecting the picture information uploaded by users and the users having a one-way social relationship with them, wherein each user corresponds to a triplet consisting of an uploaded image set, an image annotation set and a set of relation users having a one-way social relationship with that user;
step 2: establishing a multi-modal probabilistic generative model from the input triplet, and inferring the generation process of the picture content in the image set and the image annotation information in the image annotation set;
step 3: calculating the user topic space and user topic distributions according to the inference results, and calculating the topic-sensitive social relationship strengths between users;
step 4: ranking the search results according to the obtained user topic space, user topic distributions and social relationship strengths.
2. The method of claim 1, wherein step 1 comprises:
step 11: for each user u, collecting the users having a one-way social relation with u according to u's social network, forming the set C_u;
step 12: for each user u, collecting the pictures uploaded by u, forming the set D_u;
step 13: for each user u, collecting the annotation words that u has added to the uploaded pictures, forming the set T_u.
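The triple assembled in steps 11 to 13 can be held in a small container; a minimal Python sketch, in which the field names and the raw-collection inputs (`follow_graph`, `uploads`, `tag_index`) are illustrative assumptions, not part of the claim:

```python
from dataclasses import dataclass


@dataclass
class UserTriple:
    """Per-user input triple from step 1: related users C_u (one-way
    social links), uploaded images D_u, and annotation words T_u."""
    user_id: str
    related_users: set   # C_u
    images: list         # D_u, e.g. lists of visual-word ids per image
    tags: list           # T_u, annotation words across uploaded images


def collect_triple(user_id, follow_graph, uploads, tag_index):
    """Assemble the triple for one user from the raw collections,
    defaulting to empty sets for users with no recorded data."""
    return UserTriple(
        user_id=user_id,
        related_users=set(follow_graph.get(user_id, ())),
        images=list(uploads.get(user_id, ())),
        tags=list(tag_index.get(user_id, ())),
    )
```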
3. The method of claim 1, wherein the step 2 comprises:
step 21: establishing a topic-sensitive multi-modal probabilistic generative model that simulates the generation process of pictures and annotations; the model is described by hidden variables, comprising a switch hidden variable s, a topic hidden variable z and a sampled-user hidden variable c, where the switch variable s indicates whether an annotation word or image is generated by the user himself or under the influence of a related user, the topic variable z records the sampled topic, and the user variable c records the sampled related user;
step 22: solving the multi-modal probabilistic generative model, wherein the values of the hidden variables are obtained by Gibbs sampling inference.
4. The method of claim 3, wherein step 3 comprises:
step 31: calculating the topic space Φ and the user topic distribution Ω from the obtained values of the hidden variables;
step 32: calculating the topic-sensitive social relationship strength Ψ between users from the topic space Φ and the user topic distribution Ω.
5. The method of claim 1, wherein step 4 comprises:
calculating a relevance score for each retrieved picture from the obtained social relationship strengths, the score being used to rank the final results; the relevance score is calculated as:

\hat{R}(q,u,d) = R(q,u,d) + \sum_{c \in C_u} \sum_k \Psi_{u,c}(k)\, p(Z_k \mid q)\, R(q,c,d)

where q denotes the query word, R(q,u,d) denotes the relevance of picture d to query word q for the searching user u, R(q,c,d) denotes the relevance of picture d to query word q for a related user c having a one-way social relation with the searching user u, k indexes the k-th topic, C_u denotes the set of related users having a one-way social relation with user u, Ψ_{u,c}(k) denotes the social relationship strength of related user c to user u on topic k, and p(Z_k|q) denotes the topic distribution of the query word q, calculated as:

p(Z_k \mid q) = \prod_{w_i \in q} p(w_i \mid Z_k)

where p(w_i|Z_k) denotes the probability of annotation word w_i under the k-th topic, obtained from the user topic space.
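As a rough illustration of this scoring rule, the following Python sketch combines the base relevances with the topic-weighted strengths. The dictionary shapes (`doc_rel`, `strengths`) are assumptions for illustration, and p(Z_k|q) is normalised over topics here for convenience even though the claim states only the product form:

```python
def query_topic_dist(query_words, p_w_given_z, num_topics):
    """p(Z_k|q) ∝ Π_{w in q} p(w|Z_k), normalised over topics."""
    scores = [1.0] * num_topics
    for w in query_words:
        for k in range(num_topics):
            scores[k] *= p_w_given_z[w][k]
    total = sum(scores)
    return [s / total for s in scores] if total > 0 else scores


def social_rank_score(user, doc_rel, strengths, q_topics):
    """R̂(q,u,d) = R(q,u,d) + Σ_{c∈C_u} Σ_k Ψ_{u,c}(k) p(Z_k|q) R(q,c,d).

    doc_rel maps a user id to R(q, that user, d); strengths maps each
    related user c of `user` to the vector [Ψ_{u,c}(k)] over topics k."""
    score = doc_rel[user]
    for c, psi in strengths.items():
        score += doc_rel[c] * sum(p * q for p, q in zip(psi, q_topics))
    return score
```

Pictures are then sorted by this score in descending order to produce the final result list.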
6. The method of claim 5, wherein the social relationship strength of a related user U_{m_1} to a user U_{m_2} is calculated as follows:

\Psi_{m_1,m_2}(k) = \frac{N^w_{U,C,S,Z}(U_{m_2}, U_{m_1}, 0, Z_k) + N^v_{U,C,S,Z}(U_{m_2}, U_{m_1}, 0, Z_k) + \alpha_\gamma}{N^w_{U,S,Z}(U_{m_2}, 0, Z_k) + N^v_{U,S,Z}(U_{m_2}, 0, Z_k) + |C_{U_{m_2}}| \, \alpha_\gamma}

where |C_{U_{m_2}}| is the size of the set of related users having a one-way social relation with user U_{m_2}; N^w_{U,C,S,Z}(U_{m_2}, U_{m_1}, 0, Z_k) denotes the number of samples in which an annotation word of user U_{m_2} is influenced by related user U_{m_1} and generated from topic Z_k; N^v_{U,C,S,Z}(U_{m_2}, U_{m_1}, 0, Z_k) denotes the corresponding count for the visual descriptors of the uploaded images; N^w_{U,S,Z}(U_{m_2}, 0, Z_k) denotes the number of samples in which an annotation word of user U_{m_2} is influenced by any related user and generated from topic Z_k; N^v_{U,S,Z}(U_{m_2}, 0, Z_k) denotes the corresponding count for visual descriptors; and α_γ is a symmetric hyper-parameter controlling the corresponding Dirichlet prior distribution. Intuitively, if many of the topic-Z_k annotation words or images of user U_{m_2} are influenced by related user U_{m_1}, then the social relationship of U_{m_1} to U_{m_2} on the k-th topic is strong.
7. The method of claim 5, wherein the relevance R(q,u,d) of picture d to query word q for the searching user u is calculated as follows:

R(q,u,d) = \sum_k p(Z_k \mid u)\, p(Z_k \mid q)\, p(Z_k \mid d)

where p(Z_k|u) is the topic distribution of user u, representing the probability that user u generates the k-th topic, and p(Z_k|d) is the topic distribution of the image, calculated as:

p(Z_k \mid d) = \prod_{v_i \in d} p(v_i \mid Z_k)

where p(v_i|Z_k) denotes the probability of visual descriptor v_i under the k-th topic.
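A minimal sketch of this base relevance, assuming the topic distributions are available as plain lists indexed by topic (names illustrative):

```python
def doc_topic_scores(visual_words, p_v_given_z, num_topics):
    """p(Z_k|d) ∝ Π_{v in d} p(v|Z_k), the product form from the claim."""
    scores = [1.0] * num_topics
    for v in visual_words:
        for k in range(num_topics):
            scores[k] *= p_v_given_z[v][k]
    return scores


def base_relevance(user_topics, query_topics, doc_topics):
    """R(q,u,d) = Σ_k p(Z_k|u) p(Z_k|q) p(Z_k|d): the relevance is high
    when the user, the query, and the picture concentrate on the same topics."""
    return sum(u * q * d for u, q, d in zip(user_topics, query_topics, doc_topics))
```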
8. The method of claim 3, wherein the topic-sensitive multi-modal probabilistic generative model comprises a generation process for annotation words and a generation process for visual descriptors, the generation process for an annotation word being as follows:

first, a switch variable is sampled from a Bernoulli distribution: s_w ~ Bernoulli(λ);

if s_w = 0, a related user c_w is sampled from the related-user set C_u of user u, and a topic z_w is then sampled from that related user's topic distribution Ω_{c_w};

if s_w = 1, a topic z_w is sampled from user u's own topic distribution Ω_u;

finally, the annotation word w_{u,i} is sampled from the topic's annotation-word distribution Φ^w_{z_w}.

The visual descriptor v_{u,i} is generated in the same manner, being sampled from the topic's visual-descriptor distribution Φ^v.
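The sampling steps of claim 8 can be sketched as ancestral sampling in Python. The related-user draw is taken as uniform here because the claim leaves its distribution implicit, and all names are illustrative:

```python
import random


def draw(dist, rng):
    """Sample an index from a discrete distribution given as a list of probs."""
    r, acc = rng.random(), 0.0
    for i, p in enumerate(dist):
        acc += p
        if r < acc:
            return i
    return len(dist) - 1


def generate_annotation_word(lam, omega_u, omega_related, phi_w, rng=None):
    """One annotation-word draw: s ~ Bernoulli(λ); if s = 0, a related user c
    and then a topic z from that user's Ω_c; if s = 1, a topic z from the
    user's own Ω_u; finally a word w from the topic's distribution Φ^w_z."""
    rng = rng or random.Random()
    s = 1 if rng.random() < lam else 0
    if s == 0:
        c = rng.randrange(len(omega_related))  # related user (assumed uniform)
        z = draw(omega_related[c], rng)        # topic from Ω_c
    else:
        c = None                               # generated by the user himself
        z = draw(omega_u, rng)                 # topic from own Ω_u
    w = draw(phi_w[z], rng)                    # word from Φ^w_z
    return s, c, z, w
```

Visual descriptors are generated identically, substituting the topic's visual-descriptor distribution Φ^v for Φ^w.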
9. The method according to claim 5, wherein the relevance R (q, c, d) of the picture d to the query word q for the user c who has a one-way social relationship with the searching user u is calculated by a picture indexing method or a distance measurement method.
CN 201210535907 2012-12-12 2012-12-12 Social search method based on multi-mode self-adaptive social relation strength excavation Pending CN103020221A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201210535907 CN103020221A (en) 2012-12-12 2012-12-12 Social search method based on multi-mode self-adaptive social relation strength excavation


Publications (1)

Publication Number Publication Date
CN103020221A true CN103020221A (en) 2013-04-03

Family

ID=47968825


Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103412872A (en) * 2013-07-08 2013-11-27 西安交通大学 Micro-blog social network information recommendation method based on limited node drive
CN103488769A (en) * 2013-09-27 2014-01-01 中国科学院自动化研究所 Search method of landmark information mined based on multimedia data
CN103886011A (en) * 2013-12-30 2014-06-25 安徽讯飞智元信息科技有限公司 Social-relation network creation and retrieval system and method based on index files
CN104133807A (en) * 2014-07-29 2014-11-05 中国科学院自动化研究所 Method and device for learning cross-platform multi-mode media data common feature representation
CN105740327A (en) * 2016-01-22 2016-07-06 天津中科智能识别产业技术研究院有限公司 Self-adaptive sampling method based on user preferences
CN107480194A (en) * 2017-07-13 2017-12-15 中国科学院自动化研究所 The construction method and system of the multi-modal automatic learning model of the representation of knowledge
CN110895561A (en) * 2019-11-13 2020-03-20 中国科学院自动化研究所 Medical question and answer retrieval method, system and device based on multi-mode knowledge perception



Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20130403