CN103577549B - Crowd portrayal system and method based on microblog label - Google Patents

Info

Publication number
CN103577549B
Authority
CN
China
Prior art keywords
label
user
microblog
crowd
method based
Prior art date
Application number
CN201310481674.3A
Other languages
Chinese (zh)
Other versions
CN103577549A (en)
Inventor
阳德青
肖仰华
汪卫
Original Assignee
复旦大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 复旦大学
Priority to CN201310481674.3A
Publication of CN103577549A
Application granted
Publication of CN103577549B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/955: Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]

Abstract

The invention belongs to the technical field of wireless communication networks and discloses a crowd portrayal system and method based on microblog labels. The system comprises two main modules: a microblog label recommendation module and a label topic clustering module. The first module uses a three-step label recommendation algorithm: the first step is label recommendation based on homogeneity; the second step is co-occurrence label expansion; in the third step, a semantic network is built from a Chinese knowledge graph, the semantic similarity between labels is measured using the topological properties of the network, and labels with identical or similar semantics are removed, so that the labels used to portray a user remain concise. The system and method exploit the broad commercial application value of microblog user labels and point out a research direction for label mining algorithms for Internet users and for applications of the Chinese knowledge graph.

Description

Crowd portrayal system and method based on microblog labels

Technical field

The invention belongs to the technical field of wireless communication networks, and in particular relates to a crowd portrayal system and method based on microblog labels.

Background

Microblogging is currently the most popular social medium, and the user base of domestic microblog sites, represented by Sina and Tencent, has grown rapidly in recent years. Taking Sina Weibo as an example, by the end of 2012 its registered users exceeded 500 million and its daily active users had passed 46.2 million. With the rapid development of microblogging, its related applications and services are creating growing commercial value.

Letting users attach personal labels to themselves is a key service offered by mainstream microblog sites such as Sina and Tencent. Users can use these labels to describe their identity, occupation, interests, religious beliefs and other personal attributes, or even to express their opinions. As an important supplement to a user's attribute description, the labels, together with the other content of the profile page, serve as an important source of information for viewers who want to understand the user in detail. A deep understanding of a user in turn helps many practical applications, such as friend recommendation, online advertising, targeted user search and enterprise customer relationship management. It is therefore of great value to recommend to each microblog user a group of labels that accurately describe the user's attributes and to use them to portray the characteristics of different user groups, which is called user crowd portrayal.

Existing social labeling systems mostly target the annotation of web objects, such as pictures on Flickr or URL-linked resources. These systems provide labels for users to annotate objects, not the users themselves. Many recommendation algorithms built on collaborative filtering [1] rest on the following assumption: if user A and user B have previously applied identical or similar labels to many objects, then A and B perceive things in very similar ways, so when annotating a new object A is likely to use the same label that B used for it. This assumption does not hold when labeling microblog users: a user can only label himself, and many users do not use any label at all. The methods used by existing social label recommendation systems therefore cannot be applied directly to labeling microblog users.

Social annotation recommends labels according to the collective preference of most people for a given object, whereas labeling a microblog user must consider how to faithfully portray the personal preferences of the user being labeled. Mining the personal characteristics and preferences of one user is clearly harder than finding the preferences of the public, because every person has a unique personality.

Summary of the invention

The object of the invention is to provide a crowd portrayal system and method based on microblog labels by designing an effective label recommendation system for microblog users. The following problems need to be solved.

1. Label recommendation must face the cold start problem that collaborative filtering algorithms often encounter, especially since nearly half of Sina Weibo users have no labels at all. Cold start refers to the situation where a newly appearing object (such as a product) is to be recommended to a user but, because no historical recommendation record exists for the object, the recommendation algorithm cannot work.

2. The recommended labels must be diverse enough to portray the many aspects of a person. A real person is far more complex than an object and may use many labels to describe different aspects of himself, such as educational background, hobbies, or even an idolized celebrity. Finding such a diverse group of labels is very challenging.

3. Label recommendation must pay attention to semantic redundancy among the recommended labels. Sina Weibo allows a user at most 10 labels, so each user naturally wants every label to describe himself as much as possible and will generally not put synonyms or near-synonyms into his label group. By contrast, using synonymous or near-synonymous labels to describe the same object is very common. Label recommendation systems for annotating objects can therefore disregard semantic redundancy.

To address these technical problems, the invention builds on prior research, introduces new algorithmic ideas, and uses massive Internet semantic entity information (a Chinese knowledge graph) to achieve accurate crowd portrayal of microblog users.

The invention provides a crowd portrayal system based on microblog labels, consisting of two main modules, microblog user label recommendation and label topic clustering, wherein:

The microblog user label recommendation module uses the homogeneity of microblog users and the co-occurrence associations of labels, respectively, to produce candidate labels, then uses the Chinese knowledge graph to identify semantic redundancy among labels and remove redundant candidates, thereby realizing label recommendation for microblog users;

The label topic clustering module performs LDA topic clustering analysis on the labels recommended to each microblog user to obtain each user's topic distribution vector, from which it judges the crowd a user belongs to and measures the dissimilarity distance between users, so as to portray the characteristics of different user groups and realize user crowd portrayal.

Fig. 1 shows the overall technical framework of the invention.

In the invention, the recommendation algorithm in the microblog user label recommendation module comprises three main steps. Each step addresses one of the challenges listed above.

The three steps of the label recommendation algorithm are summarized as follows:

1. Homogeneity-based recommendation: This step solves the cold start problem. The core idea is to recommend to a microblog user the labels most frequently used by his microblog friends. Candidate labels can be filtered and ranked with three scoring mechanisms: frequency (the most frequently used labels), tf-idf (term frequency-inverse document frequency) and tf-rw. Empirical study favors the tf-rw mechanism, which is likewise based on the tf-idf idea and further removes labels that are used too widely across all users while retaining labels that can portray the individuality of the target user. Besides these scoring mechanisms, the invention also includes a local multi-label propagation algorithm for generating the labels recommended to the target user.

2. Co-occurrence expansion: On top of the labels recommended in the first step, this step uses the co-occurrence relations between labels to expand the candidate labels, so that the labels finally recommended to the user are diverse enough to portray as many of the user's attributes as possible.

3. Eliminating semantic redundancy: To eliminate semantic redundancy in the candidate label group, a large Chinese knowledge graph is built from semantic entity data crawled from online encyclopedia websites. By mapping microblog labels to entities in the graph, the semantic distance between two labels, i.e. their degree of similarity, can be measured. Finally, the system uses this semantic similarity to identify synonymous or near-synonymous labels in the recommended label list.

In the invention, the label topic clustering module applies the LDA tool (a generative text topic model) [2] to cluster the microblog user labels produced by the previous module. Each cluster represents a topic or a user group, so that the cluster (crowd) each user belongs to can be determined.

The invention also provides a crowd portrayal method based on microblog labels, which precisely portrays users' attribute characteristics by recommending labels to microblog users and accurately determines the crowd a user belongs to by analyzing the topic distribution of the user's labels with the LDA tool. The specific steps are as follows:

(1) Use the homogeneity of microblog users and the co-occurrence associations of labels, respectively, to produce candidate labels; then use the Chinese knowledge graph to identify semantic redundancy among labels and remove redundant candidates, thereby realizing label recommendation for microblog users;

(2) Perform LDA topic clustering analysis on the labels recommended to each microblog user to obtain each user's topic distribution vector, from which the crowd a user belongs to is judged and the dissimilarity distance between users is measured, so as to portray the characteristics of different user groups and realize user crowd portrayal.

In step (1), labels are recommended using the homogeneity of microblog users as follows: mine the labels most used by a microblog user's friends, recommend to each microblog user the labels widely used by his friends, and remove labels that are used excessively often.

In step (1), labels are recommended using label co-occurrence as follows: starting from the labels recommended by homogeneity, add the labels most often used together with them, so that the recommended labels portray the user's attributes more richly.

In step (1), redundant candidate labels are eliminated as follows: a Chinese knowledge graph is built in which entries of encyclopedia websites are mapped to nodes and hyperlinks between entries are mapped to edges, so that the network topology can be used to measure label semantics and thus to judge whether semantic redundancy exists among the candidate labels.

The friends of a microblog user are the user's fans (followers), followees, or mutual followers; the algorithm preferably uses mutual followers as microblog friends.

The dissimilarity distance between users is the Cosine distance, Pearson distance or Jensen-Shannon distance.

The beneficial effects of the invention are:

1. Microblog user labels are used for the first time to perform crowd portrayal of Internet users.

2. A Chinese knowledge graph is used for the first time to identify semantic redundancy among labels.

3. The tf-idf (term frequency-inverse document frequency) scoring mechanism for keywords in information retrieval is applied for the first time to mining microblog labels, and the algorithm is improved accordingly to raise label recommendation accuracy.

4. The LDA text topic model is applied for the first time to the label sets of microblog users, depicting the topic distribution of each microblog user in support of accurate crowd portrayal.

The invention is a successful case of realizing a concrete application service with microblog user labels. It not only has broad commercial application value, but also points out a research direction for label mining algorithms for Internet users and for applications of the Chinese knowledge graph.

Brief description of the drawings

Fig. 1 is the overall technical framework of the invention.

Fig. 2 is an example of the Chinese knowledge graph for labels (the area inside the dashed rectangle on the right of the figure).

Fig. 3 is a specific example of an embodiment of the invention.

Detailed description of the embodiments

The invention is described in further detail below with reference to the drawings and embodiments.

The invention provides a crowd portrayal system based on microblog labels, comprising two core modules: a microblog user label recommendation module and a label topic clustering module. The invention is introduced below module by module.

Module one: Microblog user label recommendation

1. Label recommendation based on homogeneity

Homogeneity (homophily) means that people with the same or similar attributes are more inclined than others to engage in social behavior with each other, such as becoming friends or following the same topics. Homogeneity has been observed as a widespread phenomenon in all kinds of social media, including social networks formed by Twitter users; for example, Twitter users who follow each other show more similar hobbies, geographic locations and influence. Experiments show that in microblog social networks, the labels used by users with close social relations (such as microblog friends) are significantly similar. This result provides the factual basis for a homogeneity-based label scoring mechanism: according to some ranking mechanism, select frequently used labels from the friends of the target user u and recommend them as the candidate label group of u (assumed to contain k labels). The candidate label set produced by this step is denoted C and serves as the input to the next recommendation step. A score function s(t) is needed to rank each candidate label t, after which the top k are selected. The selected labels must also be reasonably descriptive, i.e. they must not be labels that most users use too widely. The invention computes s(t) with a scoring mechanism named tf-rw, i.e. s(t) = tf(t) × rw(t). tf(t) and rw(t) are computed according to Formulas 1 and 2 below; the core idea is consistent with the tf-idf idea used in document keyword retrieval. In Formula 1, Ngb(u) denotes the neighbor set of microblog user u (the user's mutual followers), and |Ngb(u)| denotes the size of this set. r(t) is the number of times users in the neighbor set use label t, and T(Ngb(u)) denotes the set of all labels used by the neighbors of u. In Formula 2, n(t) is the number of users among all users who use label t, and N is the total number of users.

Formula 1:

Formula 2:
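As a rough illustration of the tf-rw idea, the sketch below scores the labels used by a target user's mutual followers. Formulas 1 and 2 are not given explicitly here, so the concrete forms used in the code, tf(t) = r(t) / |Ngb(u)| and rw(t) = log(N / n(t)), are assumptions made in the spirit of tf-idf and should not be read as the formulas of the invention. The function and parameter names are likewise hypothetical.

```python
import math
from collections import Counter


def tf_rw_scores(neighbor_labels, global_label_counts, total_users):
    """Score candidate labels for one target user u.

    neighbor_labels: list of label sets, one per member of Ngb(u).
    global_label_counts: dict mapping label t to n(t), the number of
        users overall who use t.
    total_users: N, the total number of users.
    """
    if not neighbor_labels:
        return {}
    # r(t): how many neighbors use label t.
    r = Counter(label for labels in neighbor_labels for label in set(labels))
    scores = {}
    for label, count in r.items():
        # ASSUMED form of tf(t): share of neighbors using the label.
        tf = count / len(neighbor_labels)
        # ASSUMED form of rw(t): idf-style penalty for globally common labels.
        rw = math.log(total_users / global_label_counts[label])
        scores[label] = tf * rw
    return scores


def top_k_labels(scores, k):
    """Return the k best labels, ties broken alphabetically."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


if __name__ == "__main__":
    neighbors = [{"tourism", "photography"}, {"tourism", "music"}, {"tourism"}]
    counts = {"tourism": 900, "photography": 10, "music": 100}
    print(top_k_labels(tf_rw_scores(neighbors, counts, 1000), 2))
```

Under these assumed forms, a label that every neighbor uses can still rank low when almost all users share it, which is the behavior the tf-rw mechanism is described as aiming for.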

Besides ranking the candidate labels of the target user with the tf-rw scoring mechanism above, the invention also proposes an algorithm based on the classical label propagation algorithm (LPA) [3] to generate the candidate label group. It is an iterative algorithm whose basic procedure is as follows:

1) For a target user u, first generate the ego network Gu of u. The nodes of this network are u and all of his mutual-follower friends, and its edges are the relations among these nodes. Subsequent label propagation is confined to this ego network;

2) For u and every other node (user) in Gu without true labels, generate a label set using the tf-rw method above; the remaining users, who have true labels, keep their labels. This step constitutes one iteration;

3) Repeat step 2) until the label set of user u (containing k candidate labels) no longer changes, i.e. the iteration converges.

Prior research has shown that this algorithm converges within a finite number of iterations, so it is certain to terminate within a bounded time. Because the improved LPA algorithm proposed by the invention is confined to the ego network of the target user and can produce multiple labels, it is called the local multi-label propagation algorithm.
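The iteration described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the invention's implementation: unlabeled nodes are scored by plain neighbor frequency instead of tf-rw to keep the example self-contained, updates are applied synchronously, and a `max_iters` cap is added as a safeguard. All names are hypothetical.

```python
from collections import Counter


def local_multi_label_propagation(ego_edges, true_labels, target, k=3, max_iters=50):
    """Propagate labels inside the ego network of `target`.

    ego_edges: iterable of (user_a, user_b) pairs, all inside the ego network.
    true_labels: dict mapping a user to the labels the user really has.
    Returns the label set assigned to `target` after convergence.
    """
    # Build an undirected adjacency map of the ego network.
    adjacency = {target: set()}
    for a, b in ego_edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    labels = {node: set(true_labels.get(node, ())) for node in adjacency}
    unlabeled = [node for node in adjacency if not true_labels.get(node)]

    for _ in range(max_iters):
        previous = set(labels[target])
        updated = {}
        for node in unlabeled:
            # ASSUMPTION: frequency among neighbors stands in for tf-rw.
            counts = Counter(t for nb in adjacency[node] for t in labels[nb])
            ranked = sorted(counts, key=lambda t: (-counts[t], t))
            updated[node] = set(ranked[:k])
        labels.update(updated)  # users with true labels are never touched
        if labels[target] == previous:
            break  # the target's label set no longer changes
    return labels[target]


if __name__ == "__main__":
    edges = [("u", "v1"), ("u", "v2"), ("u", "v3"), ("u", "w"), ("w", "v1")]
    known = {
        "v1": {"tourism", "photography"},
        "v2": {"tourism", "cuisine"},
        "v3": {"photography"},
    }
    print(local_multi_label_propagation(edges, known, "u", k=2))
```

Restricting propagation to the ego network keeps each run small, since only the target user and his mutual followers are visited.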

2. Label recommendation based on co-occurrence expansion

For each label t in C (k in total), co-occurrence association mining is performed, and the q labels with the strongest co-occurrence relation to t are selected, each denoted ti. Here the score st(ti) denotes the co-occurrence strength between label t and a co-occurring label ti; its calculation again follows Formula 2. Among all labels that co-occur with t, those whose st(ti) score ranks in the top q are added to the expansion list of t. Here t is called the parent label of ti, written p(ti). If an expanded label ti is already in C, it is simply ignored. This step adds at most k × q labels to C. Let C′ denote the candidate label set obtained when this step finishes; then C′ \ C is the set of labels newly added in this step. After C′ is produced, every label in C′ must be re-ranked, because C′ contains labels obtained by two different ranking mechanisms. The core idea of re-ranking is to ensure that the ranking scores of the labels newly added in C′ \ C are competitive with those of the labels originally in C, yet each stays below the ranking score of its parent label. A new ranking score function is therefore defined for each ti ∈ C′.

Formula 3:

In the formula, λ is the attenuation coefficient, typically set to 0.8, and Z is a normalization factor obtained by summation.
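A minimal sketch of the expansion and re-ranking step is given below. Formula 3 is not given explicitly here, so the scoring rule in the code is an assumption: a new label receives λ times its parent's score, weighted by its share of the co-occurrence counts among the parent's selected expansions (that sum plays the role of Z). Raw co-occurrence counts are also an assumed stand-in for st(ti). This choice guarantees that a child never outranks its parent, which is the property the text requires; all names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(user_label_sets):
    """Count, for every ordered label pair, how many users use both."""
    co = Counter()
    for labels in user_label_sets:
        for a, b in combinations(sorted(set(labels)), 2):
            co[(a, b)] += 1
            co[(b, a)] += 1
    return co


def expand_candidates(candidate_scores, co, q=2, decay=0.8):
    """Add up to q co-occurring labels per candidate and score them.

    candidate_scores: dict label -> score from the homogeneity step (set C).
    co: output of cooccurrence_counts.
    decay: the attenuation coefficient (lambda), 0.8 as suggested in the text.
    """
    expanded = dict(candidate_scores)
    for parent, parent_score in candidate_scores.items():
        partners = [
            (count, other)
            for (label, other), count in co.items()
            if label == parent and other not in candidate_scores  # skip labels already in C
        ]
        partners.sort(key=lambda pair: (-pair[0], pair[1]))
        top = partners[:q]
        z = sum(count for count, _ in top)  # ASSUMED normalization factor
        for count, child in top:
            # ASSUMED scoring rule: always below the parent's score.
            child_score = decay * parent_score * count / z
            expanded[child] = max(expanded.get(child, 0.0), child_score)
    return expanded


if __name__ == "__main__":
    users = [{"tourism", "cuisine"}, {"tourism", "cuisine"},
             {"tourism", "hiking"}, {"photography", "camera"}]
    co = cooccurrence_counts(users)
    print(expand_candidates({"tourism": 1.0, "photography": 0.5}, co, q=1))
```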

3. Eliminating the semantic redundancy of labels

In short, this step first builds a Chinese knowledge graph (which also contains many English entries) from the entries and inter-entry links obtained from online Chinese encyclopedia websites. The graph can be regarded as a semantic network: each node represents the semantic entity described by an entry and corresponds to a label, and each edge represents a hyperlink between entries (see the dashed rectangle on the right of Fig. 2). The neighbors of the node corresponding to an entry therefore reflect the semantic content of that entry to a large extent. Using the topology of this semantic network, the invention includes a method for accurately measuring the semantic distance between two labels (nodes), which determines whether two labels are semantically close enough that one of them should be removed from the recommended candidate label list. The semantic distance sim(u, v) between two nodes u and v in the graph is determined by the Jaccard coefficient of their neighbor sets, that is,

Formula 4:

sim(u,v)=|Nu∩Nv|/|Nu∪Nv|

where Nu denotes the neighbor set of node u, and |Nu ∩ Nv| denotes the number of common neighbors of u and v.

A suitable threshold τ can be determined from a training data set (0.028 was found experimentally). If sim(u, v) ≥ τ, u and v are regarded as labels with the same or highly similar semantics, and only one of them should be kept. If the score of u in the co-occurrence label expansion step above is greater than that of v, v is removed from the final recommended label group and u is retained. For example, this algorithm identifies "tourism" and "travelling", and "Christ" and "Jesus", as synonymous or near-synonymous labels.
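The redundancy check can be sketched directly from Formula 4 and the threshold rule. In the example below the knowledge graph is a small hand-made adjacency map, which is purely illustrative; the greedy pass over labels in descending score order is one straightforward way to apply the "keep the higher-scoring label" rule, and the function names are hypothetical.

```python
def jaccard_similarity(neighbors_u, neighbors_v):
    """sim(u, v) = |Nu ∩ Nv| / |Nu ∪ Nv| (Formula 4)."""
    union = neighbors_u | neighbors_v
    if not union:
        return 0.0
    return len(neighbors_u & neighbors_v) / len(union)


def remove_redundant_labels(label_scores, graph, threshold=0.028):
    """Drop a label when a higher-scoring kept label is semantically close.

    label_scores: dict label -> ranking score from the previous steps.
    graph: dict label -> set of neighboring entries in the knowledge graph.
    threshold: tau; 0.028 is the experimentally found value quoted in the text.
    """
    kept = []
    for label in sorted(label_scores, key=lambda t: (-label_scores[t], t)):
        neighbors = graph.get(label, set())
        if any(
            jaccard_similarity(neighbors, graph.get(other, set())) >= threshold
            for other in kept
        ):
            continue  # redundant with a label that scored higher
        kept.append(label)
    return kept


if __name__ == "__main__":
    toy_graph = {
        "tourism": {"hotel", "scenery", "airline"},
        "travelling": {"hotel", "scenery", "backpack"},
        "photography": {"camera", "lens"},
        "cuisine": {"restaurant"},
    }
    scores = {"tourism": 0.9, "travelling": 0.7, "photography": 0.6, "cuisine": 0.5}
    print(remove_redundant_labels(scores, toy_graph))
```

Labels that have no node in the graph get an empty neighbor set here and are therefore never treated as redundant; a real system might handle unmapped labels differently.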

The three steps that produce the recommended candidate labels are illustrated below with the specific example in Fig. 3. Initially, user u has no labels, while his three neighbors (mutual followers) v1, v2 and v3 have labels of their own. Following the basic idea of step 1, the labels "tourism", "travelling" and "photography" are used frequently among the neighbors, so step 1 takes these three labels as the candidate label group of u. In step 2, because the "cuisine" label is often used together with the "tourism" label (many travellers like to taste the food of the places they visit), the "cuisine" label is also added to the candidate label group of u. Finally, the semantic judgment in step 3 finds that "tourism" and "travelling" are synonyms, and the "travelling" label is removed because its score is lower.

Module two: Label topic clustering

Since the labels recommended by the steps above can accurately and richly portray the attribute characteristics of each microblog user, topic analysis of the label sets of all users makes it possible to determine the user group to which a microblog user belongs. The specific algorithm used here applies the LDA tool to cluster the topic distribution in the label sets, producing for each microblog user a corresponding topic distribution vector [v1, v2, ..., vk], where k is the total number of topics and each component satisfies 0 <= vi <= 1 and represents the probability that the user belongs to topic i. With a user's topic distribution vector, the crowd the user belongs to and characteristics such as hobbies can be judged, and the dissimilarity distance between any two users can be computed quantitatively, thereby achieving the goal of crowd portrayal. For the dissimilarity distance between users, the Cosine distance, Pearson distance or Jensen-Shannon distance can be used.
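Given two topic distribution vectors, the three distances named above can be computed as in the following sketch. The definitions used are the common textbook ones (1 minus cosine similarity, 1 minus the Pearson correlation coefficient, and the square root of the base-2 Jensen-Shannon divergence); the text does not specify exact variants, so these choices are assumptions, and the topic vectors in the example are made up rather than produced by an actual LDA run.

```python
import math


def cosine_distance(p, q):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return 1.0 - dot / norm


def pearson_distance(p, q):
    """1 - Pearson correlation; undefined for a constant vector."""
    n = len(p)
    mean_p, mean_q = sum(p) / n, sum(q) / n
    cov = sum((a - mean_p) * (b - mean_q) for a, b in zip(p, q))
    std_p = math.sqrt(sum((a - mean_p) ** 2 for a in p))
    std_q = math.sqrt(sum((b - mean_q) ** 2 for b in q))
    if std_p == 0 or std_q == 0:
        raise ValueError("Pearson distance is undefined for constant vectors")
    return 1.0 - cov / (std_p * std_q)


def jensen_shannon_distance(p, q):
    """Square root of the Jensen-Shannon divergence (log base 2, range 0..1)."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Kullback-Leibler divergence; terms with x_i = 0 contribute nothing.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    divergence = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return math.sqrt(max(divergence, 0.0))


if __name__ == "__main__":
    user_a = [0.7, 0.2, 0.1]  # hypothetical topic distributions
    user_b = [0.1, 0.3, 0.6]
    for distance in (cosine_distance, pearson_distance, jensen_shannon_distance):
        print(distance.__name__, round(distance(user_a, user_b), 4))
```

Of the three, only the Jensen-Shannon distance treats the vectors explicitly as probability distributions, which may make it a natural fit for LDA output; the other two are generic vector measures.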

List of references

[1] T. Hofmann. Collaborative filtering via Gaussian probabilistic latent semantic analysis. In Proc. of SIGIR, 2003.

[2] D. M. Blei, A. Y. Ng, and M. I. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 3:993-1022, Jan. 2003.

[3] X. Zhu and Z. Ghahramani. Learning from labeled and unlabeled data with label propagation. Technical Report, 2002.

Claims (6)

1. A user crowd portrayal method based on microblog labels, characterized in that it precisely portrays users' attribute characteristics by recommending labels to microblog users, and accurately determines the crowd a user belongs to by analyzing the topic distribution of the user's labels with the LDA tool; the specific steps are as follows:
(1) use the homogeneity of microblog users and the co-occurrence associations of labels, respectively, to produce recommended candidate labels, then use a Chinese knowledge graph to identify semantic redundancy among labels and remove the redundant labels among the candidates, thereby realizing label recommendation for microblog users;
(2) perform LDA topic clustering analysis on the labels recommended to each microblog user to obtain each user's topic distribution vector, from which the crowd a user belongs to is judged and the dissimilarity distance between users is measured, so as to portray the characteristics of different user groups and realize user crowd portrayal; wherein the redundant labels among the candidates in step (1) are eliminated as follows: a Chinese knowledge graph is built in which entries of encyclopedia websites are mapped to nodes of a semantic network and hyperlinks between entries are mapped to network edges, so that the network topology can be used to measure label semantics and thus to judge whether semantic redundancy exists among the recommended candidate labels.
2. The user crowd portrayal method based on microblog labels according to claim 1, characterized in that in step (1), labels are recommended using the homogeneity of microblog users as follows: mine the labels most used by a microblog user's friends, recommend to each microblog user the labels widely used by his friends, and remove labels that are used excessively often.
3. The user crowd portrayal method based on microblog labels according to claim 1, characterized in that in step (1), labels are recommended using label co-occurrence as follows: starting from the labels recommended by homogeneity, add the labels most often used together with them, so that the recommended labels portray the user's attributes more richly.
4. The user crowd portrayal method based on microblog labels according to claim 2, characterized in that the friends of a microblog user are the user's fans (followers), followees, or mutual followers.
5. The user crowd portrayal method based on microblog labels according to claim 2, characterized in that the algorithms used for mining include the local multi-label propagation algorithm and three label scoring mechanisms: frequency, tf-idf and tf-rw.
6. The user crowd portrayal method based on microblog labels according to claim 1, characterized in that the dissimilarity distance between users is the Cosine distance, Pearson distance or Jensen-Shannon distance.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310481674.3A CN103577549B (en) 2013-10-16 2013-10-16 Crowd portrayal system and method based on microblog label

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310481674.3A CN103577549B (en) 2013-10-16 2013-10-16 Crowd portrayal system and method based on microblog label

Publications (2)

Publication Number Publication Date
CN103577549A CN103577549A (en) 2014-02-12
CN103577549B true CN103577549B (en) 2017-02-15

Family

ID=50049325

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310481674.3A CN103577549B (en) 2013-10-16 2013-10-16 Crowd portrayal system and method based on microblog label

Country Status (1)

Country Link
CN (1) CN103577549B (en)

Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103995820B (en) * 2014-03-06 2019-04-16 吉林大学 Individual subscriber moral character multiple labeling prediction technique based on lowest threshold
CN103970863B (en) * 2014-05-08 2017-12-19 清华大学 The method for digging and system of microblog users interest based on LDA topic models
CN104199838B (en) * 2014-08-04 2017-09-29 浙江工商大学 A kind of user model constructing method based on label disambiguation
CN104598588B (en) * 2015-01-19 2017-08-11 河海大学 Microblog users label automatic generating calculation based on double focusing class
CN104778605B (en) * 2015-04-09 2019-05-03 北京京东尚科信息技术有限公司 The classification method and device of electric business client
CN106407239A (en) * 2015-08-03 2017-02-15 阿里巴巴集团控股有限公司 Methods and apparatuses used for recommending information and assisting in recommending information
CN105117449B (en) * 2015-08-14 2019-08-16 百度在线网络技术(北京)有限公司 A kind of method and apparatus for generating the label of content item
CN105893406A (en) * 2015-11-12 2016-08-24 乐视云计算有限公司 Group user profiling method and system
CN105574098B (en) * 2015-12-11 2019-02-12 百度在线网络技术(北京)有限公司 The generation method and device of knowledge mapping, entity control methods and device
CN105719189B (en) * 2016-01-15 2019-12-27 天津大学 Label recommendation method for effectively improving label diversity in social network
CN107402932A (en) * 2016-05-20 2017-11-28 腾讯科技(深圳)有限公司 Extension processing method, the text of user tag recommend method and apparatus
CN106484764A (en) * 2016-08-30 2017-03-08 江苏名通信息科技有限公司 User's similarity calculating method based on crowd portrayal technology
CN106649730A (en) * 2016-12-23 2017-05-10 中山大学 User clustering and short text clustering method based on social network short text stream
CN107038261B (en) * 2017-05-28 2019-09-20 海南大学 A kind of processing framework resource based on data map, Information Atlas and knowledge mapping can Dynamic and Abstract Semantic Modeling Method
CN107330001A (en) * 2017-06-09 2017-11-07 国政通科技股份有限公司 The creation method and system of a kind of diversification label
CN107463703A (en) * 2017-08-16 2017-12-12 电子科技大学 English social media account number classification method based on information gain
CN107562917A (en) * 2017-09-12 2018-01-09 广州酷狗计算机科技有限公司 User recommendation method and device
CN110431585A (en) * 2018-01-22 2019-11-08 华为技术有限公司 A method and device for generating a user portrait

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008020663A1 (en) * 2006-08-17 2008-02-21 Olaworks, Inc. Methods for tagging person identification information to digital data and recommending additional tag by using decision fusion
CN101751448A (en) * 2009-07-22 2010-06-23 中国科学院自动化研究所 Recommendation method of personalized resource information based on scene information

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
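Qi Qi. User collaborative filtering based on tag data. China Master's Theses Full-text Database, Information Science and Technology. 2012, *
Ge Yanyan. Research on recommendation techniques based on social tagging systems. China Master's Theses Full-text Database, Information Science and Technology. 2012, *
Xu Zhenliang, Guo Xiaochuan. Knowledge map analysis of research fronts in international technological innovation: the perspective of author co-citation networks and cluster analysis. Studies in Science of Science. 2011, Vol. 29 (No. 11), *
Chen Yuan, Lin Lei, Sun Chengjie, Liu Bingquan. A tag recommendation method for microblog users. Intelligent Computer and Applications. 2011, Vol. 1 (No. 3), *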

Also Published As

Publication number Publication date
CN103577549A (en) 2014-02-12

Similar Documents

Publication Publication Date Title
Marine-Roig et al. Tourism analytics with massive user-generated content: A case study of Barcelona
Bozzon et al. Choosing the right crowd: expert finding in social networks
Sun et al. Mining heterogeneous information networks: principles and methodologies
JP5230751B2 (en) A recommendation system using social behavior analysis and vocabulary classification
US8086605B2 (en) Search engine with augmented relevance ranking by community participation
Moghaddam et al. On the design of LDA models for aspect-based opinion mining
AU2010330720B2 (en) System and method for attentive clustering and related analytics and visualizations
CN105378764B (en) Interactive concept editor in computer-human's interactive learning
US20110060983A1 (en) Producing a visual summarization of text documents
Zhao et al. Connecting social media to e-commerce: Cold-start product recommendation using microblogging information
Liu et al. Analyzing changes in hotel customers’ expectations by trip mode
He et al. Trirank: Review-aware explainable recommendation by modeling aspects
Liu et al. Learning geographical preferences for point-of-interest recommendation
US8676732B2 (en) Methods and apparatus for providing information of interest to one or more users
Serdyukov et al. Modeling documents as mixtures of persons for expert finding
Szpektor et al. Improving recommendation for long-tail queries via templates
Sharma et al. A comparative analysis of web page ranking algorithms
Murata et al. Link prediction based on structural properties of online social networks
Au Yeung et al. Contextualising tags in collaborative tagging systems
US7519588B2 (en) Keyword characterization and application
CN101436186A (en) Method and system for providing related searches
US9171081B2 (en) Entity augmentation service from latent relational data
Fujimura et al. Topigraphy: visualization for large-scale tag clouds
Bouadjenek et al. Social networks and information retrieval, how are they converging? A survey, a taxonomy and an analysis of social information retrieval approaches and platforms
Yang et al. Fine-grained preference-aware location search leveraging crowdsourced digital footprints from LBSNs

Legal Events

Date Code Title Description
PB01 Publication
C06 Publication
SE01 Entry into force of request for substantive examination
C10 Entry into substantive examination
GR01 Patent grant
C14 Grant of patent or utility model