CN105869058A - Method for user portrait extraction based on multilayer latent variable model - Google Patents

Method for user portrait extraction based on multilayer latent variable model

Info

Publication number
CN105869058A
Authority
CN
China
Prior art keywords
user
collection
theme
entry
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610250016.7A
Other languages
Chinese (zh)
Other versions
CN105869058B (en
Inventor
毋立芳
王丹
刘爽
张磊
刘海英
张岱
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology
Priority to CN201610250016.7A
Publication of CN105869058A
Application granted
Publication of CN105869058B
Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Business, Economics & Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Economics (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a method for user portrait extraction based on a multilayer latent variable model, in the field of data mining and recommender systems. User portraits are extracted from a social curation network using data of two modalities: the text descriptions of collection entries and user behavior on forwarding chains. A latent Dirichlet allocation (LDA) model is applied to the text descriptions to obtain the users' latent topic distribution, and the interest distribution of the topics is obtained on the basis of these latent topics. The users' interest distribution is then obtained by combining the users' latent topic distribution with the topics' interest distribution. User social communities are discovered with the multilayer latent variable model, and user recommendations are obtained by sorting Jensen-Shannon divergence values in ascending order. The method thus uses both the users' text descriptions and their behavior on forwarding chains to discover user communities and recommend users.

Description

A method for user portrait extraction based on a multilayer latent variable model
Technical field
The present invention relates to the fields of data mining and recommender systems, and specifically to the research and implementation of a user portrait extraction method based on a multilayer latent variable model.
Background
Social media refers to a family of network applications built on the technology and ideology of Web 2.0 that allow users to create and exchange their own content. Starting in 2009, a number of dedicated social curation networks (such as Pinterest, Snip.it, Scoopit, and Huaban) formally appeared. "Social curation" is shorthand for the activities of collecting, organizing, and sharing information on the network. Traditional social networks are user-centered, whereas social curation networks are content-centered and driven by user interest. A user can create original content, or collect content of interest from other websites into their own boards by linking or copying. Users can also sort their collections into categories according to their own preferences. This gives users convenient functions for linking to, bookmarking, and sharing network resources. Other users can comment on, like, and repin a user's collections. These convenient publishing functions let users express themselves with a single click and share their views easily. Pinterest was one of the first websites to combine one-click content creation and structured curation into managed content collections (called "Boards"). Pinterest's rapid development shows how attractive this kind of social curation network is to the general public. Since 2012, more than ten Pinterest-like social curation networks have appeared in China, such as Huaban, Meilishuo, and Mogujie.
Extracting personalized user portraits on social networks is a key technology. Many methods exist for extracting user portraits from social networks, but most focus on predicting user attributes such as demographics, gender, age, and religion. Such predictive models are usually built on a set of features derived from user-generated content (often natural-language text) rather than on a systematic model intended for recommendation or community discovery. User portrait extraction based on latent variable models has been applied to text content analysis, traditional recommender systems, social recommender systems, and social network analysis.
A social curation network contains different types of information: collection entries (pins), boards and collection categories, the short text descriptions that users write for collection entries, forwarding (repin) chain information, and "follow" information. A user is made up of a set of collection entries organized into boards. Each collection entry carries rich information, such as its description, information about the board it belongs to, and the source from which it was forwarded. The multimodal nature of this information makes user modeling challenging. The focus of the present invention is how to build a latent variable model over the description text of collection entries and users' forwarding-chain behavior, discover latent user communities, and recommend users whose interests are similar to those of a target user.
Summary of the invention
The present invention targets social curation networks. Given the many modalities of user data, it studies a user portrait extraction method based on multimodal latent variable modeling, used for community discovery and user recommendation.
To solve the above problem, the present invention proposes a two-layer Bayesian latent variable model for describing users. The method includes:
A. Build a dictionary and a stop-word dictionary, and use the word segmentation tool ICTCLAS to segment the text descriptions of all of the users' collection entries.
B. In a social curation network, any collection entry of a user can be repinned by other users. When a user repins a collection entry, all of the entry's information, except items such as comments and likes, is copied into one of that user's own boards.
Collect the forwarding-chain data of all collection entries of the target user set. Each collection entry records whom it was forwarded from; using this record, obtain the path from each collection entry back to its original entry. Crawling starts at the current collection entry and moves to its parent, following the "forwarded from" record until the original collection entry is reached. Every node visited during this trace is a copy of the original entry, and together the nodes form a chain-shaped path called a "forwarding chain". Each forwarding chain therefore consists of a set of collection entries, and each node on the chain is represented by the user ID of the creator of the corresponding forwarded entry.
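The trace-back in step B can be sketched as follows. This is a minimal illustration, not the patented crawler: the record layout (`PINS`, mapping a pin ID to its creator and parent pin) and the choice to list the chain from the original entry to the current one are assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical records: pin id -> (creator user id, parent pin id or None).
PINS: Dict[str, Tuple[str, Optional[str]]] = {
    "pin_a": ("u1", None),      # original collection entry
    "pin_b": ("u2", "pin_a"),   # u2 repinned pin_a
    "pin_c": ("u3", "pin_b"),   # u3 repinned pin_b
}

def forwarding_chain(pin_id: str,
                     pins: Dict[str, Tuple[str, Optional[str]]]) -> List[str]:
    """Follow 'forwarded from' links back to the original entry and
    return the creator user IDs, ordered from original to current."""
    chain: List[str] = []
    seen = set()                      # guards against malformed cyclic data
    current: Optional[str] = pin_id
    while current is not None and current not in seen:
        seen.add(current)
        creator, parent = pins[current]
        chain.append(creator)
        current = parent
    chain.reverse()
    return chain

print(forwarding_chain("pin_c", PINS))
```

Each node is stored as a creator user ID rather than a pin ID because the later steps treat the user IDs on a chain as the "words" of a document.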
C. The multilayer latent variable model is based on the popular Bayesian latent variable model Latent Dirichlet Allocation (LDA). A user interest model is extracted from the set formed by all text information and forwarding-chain data of the target user set (users are represented by user IDs and are referred to here as "pinners").
Further, step C specifically includes:
C1. The first-layer model computes the latent topics of the text descriptions of the users' collection entries.
C2. Compute the probability that each collection entry belongs to each topic.
C3. The second-layer model computes the interest distribution of the latent topics.
C4. Compute the users' interest distribution.
Further, step C1 includes:
C11. LDA is a popular Bayesian latent variable model that is widely used in machine learning and natural language processing. Its basic idea is that a document is a mixture of topics and a topic is a probability distribution over words. LDA relies on the "bag of words" assumption, meaning that the order of words in a document can be ignored. Passing the users' text through the LDA model links users and words through latent topics, producing a three-layer user-topic-word Bayesian model.
C12. Perplexity is used to evaluate the LDA model; the lower the perplexity, the better the model. The perplexity of the first-layer LDA model is computed, and the optimal number of topics $N_{T1}$ is chosen for the next step.
$$\mathrm{Perplexity}(U_{test}) = \exp\left\{ -\frac{\sum_{u=1}^{U_t} \log p(w_u)}{\sum_{u=1}^{U_t} N_u} \right\}$$
$$p(w_u) = \prod_{n=1}^{N_m} \sum_{k=1}^{K} p(w_n \mid z_k)\, p(z_k \mid u)$$
where $U_{test}$ is the set of test users, $U_t$ is the number of users in the test set, $w_u$ is the set of words in the collection-entry descriptions of user $u$, $p(w_u)$ is the probability of generating that set of words under the user model, and $N_u$ is the total number of words in the collection-entry descriptions of user $u$. $K$ is the total number of topics, and $N_m$ is the set of words in the collection-entry descriptions of all users.
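The perplexity formula can be evaluated directly once the two distributions are available. The sketch below assumes they are already estimated and stored as plain Python structures (`word_given_topic` for $p(w \mid z_k)$ and `topic_given_user` for $p(z_k \mid u)$); these names and the toy values are illustrative only, and the product is taken over the words of each test user.

```python
import math
from typing import Dict, List, Sequence

def log_likelihood(words: Sequence[str],
                   word_given_topic: List[Dict[str, float]],
                   topic_given_user: Sequence[float]) -> float:
    """log p(w_u) = sum over words of log sum_k p(w | z_k) p(z_k | u)."""
    total = 0.0
    for w in words:
        p_w = sum(p_z * word_given_topic[k][w]
                  for k, p_z in enumerate(topic_given_user))
        total += math.log(p_w)
    return total

def perplexity(test_words: Dict[str, List[str]],
               word_given_topic: List[Dict[str, float]],
               topic_given_user: Dict[str, Sequence[float]]) -> float:
    """exp(-(sum_u log p(w_u)) / (sum_u N_u)) over the test users."""
    log_lik = sum(log_likelihood(ws, word_given_topic, topic_given_user[u])
                  for u, ws in test_words.items())
    n_words = sum(len(ws) for ws in test_words.values())
    return math.exp(-log_lik / n_words)

# Toy model: two topics that are both uniform over two words.
word_given_topic = [{"bag": 0.5, "shoe": 0.5}, {"bag": 0.5, "shoe": 0.5}]
topic_given_user = {"u1": [0.3, 0.7], "u2": [0.9, 0.1]}
test_words = {"u1": ["bag", "shoe", "bag"], "u2": ["shoe"]}

print(perplexity(test_words, word_given_topic, topic_given_user))
```

With every word equally likely under every topic, the model is as uncertain as a fair choice between two words, so the toy perplexity comes out at about 2. Repeating the computation for several candidate topic counts and picking the point where the value stops falling is one way to choose $N_{T1}$.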
Step C2 includes:
C21. The description of a collection entry may have been written by the user or carried over by repinning. It consists of a group of words and can be written as $w_{pin} = \{w_1, w_2, \ldots, w_i, \ldots, w_N\}$, where $w_i$ is the $i$-th word of the description and $N$ is the total number of words in it. Each topic in the topic set $\{z_1, z_2, \ldots, z_k, \ldots, z_{N_{T1}}\}$ is a probability distribution over words, where $z_k$ is the $k$-th topic and $N_{T1}$ is the optimal number of topics. $p_{z_k}^{pin}$ denotes the probability that collection entry pin belongs to topic $z_k$.
$$p_{z_k}^{pin} = \frac{1}{N} \sum_{i=1}^{N} p(w_i \mid z_k)$$
C22. From the result of the first-layer LDA model, compute the probability $p_{z_k}^{pin}$ that each collection entry pin belongs to topic $z_k$. Following the Pareto principle, 0.2 is chosen as the threshold: if $p_{z_k}^{pin}$ passes this threshold, the collection entry is taken to belong to the topic. This yields, for each of the $N_{T1}$ topics, the set of collection entries belonging to it.
Step C3 includes:
C31. A collection entry typically starts at one user and ends at another. It can also be forwarded further by other users; a forwarding chain traces how it propagates among users of the social curation network. Each collection entry is represented by the user IDs on its forwarding chain.
C32. Because LDA rests on the "bag of words" assumption, user IDs can be treated as independent words. Applying LDA to the $N_{T1}$ latent topics therefore gives a three-layer topic-interest-user ID Bayesian model, from which the interest distribution of each latent description topic is obtained.
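One way to prepare the input for the second-layer model is to turn every topic into a "document" whose words are the user IDs found on the forwarding chains of the entries assigned to that topic. The sketch below shows only this data preparation; the function name, the input layout, and the toy values are assumptions, and the resulting documents would then be passed to whichever LDA implementation is used.

```python
from collections import defaultdict
from typing import Dict, List

def topic_documents(pin_topics: Dict[str, List[int]],
                    pin_chains: Dict[str, List[str]]) -> Dict[int, List[str]]:
    """Build one bag of user IDs per topic from the forwarding chains
    of the collection entries assigned to that topic."""
    docs: Dict[int, List[str]] = defaultdict(list)
    for pin, topics in pin_topics.items():
        for topic in topics:
            docs[topic].extend(pin_chains[pin])
    return dict(docs)

# Toy assignment of entries to topics (an entry may belong to several).
pin_topics = {"pin1": [0], "pin2": [0, 1], "pin3": [1]}
# Toy forwarding chains, written as creator user IDs.
pin_chains = {
    "pin1": ["86804", "60952"],
    "pin2": ["60952", "310115"],
    "pin3": ["86588"],
}

for topic, user_ids in topic_documents(pin_topics, pin_chains).items():
    print(topic, user_ids)
```

Repeated user IDs are kept on purpose: under the bag-of-words assumption, how often an ID occurs is exactly the signal the topic model counts.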
Step C4 includes:
C41. Combining the users' latent topic distribution with the topics' interest distribution by matrix multiplication gives the users' interest distribution. In other words, a user is a mixture of interests, and an interest is a probability distribution over user IDs.
D. User recommendation based on the multilayer latent variable model
Step D includes:
D1. The similarity between user $u_1$ and user $u_2$ should be the same as that between $u_2$ and $u_1$. Jensen-Shannon divergence is a measure of the distance (degree of similarity) between probability distributions. It is more balanced than Kullback-Leibler divergence because it is symmetric in its arguments: the result does not depend on their order. The smaller the Jensen-Shannon divergence, the greater the similarity. The Jensen-Shannon divergence between the target user and each user on all forwarding chains is computed and used as the similarity between two users P and Q.
$$D_{JS}(P \| Q) = \frac{1}{2} D_{KL}(P \| M) + \frac{1}{2} D_{KL}(Q \| M) = \frac{1}{2} \sum_i P(i) \ln \frac{2P(i)}{P(i) + Q(i)} + \frac{1}{2} \sum_i Q(i) \ln \frac{2Q(i)}{P(i) + Q(i)}$$
where $M = \frac{1}{2}(P + Q)$.
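The symmetry argument can be checked numerically. The following sketch implements Kullback-Leibler and Jensen-Shannon divergence for discrete distributions given as lists of probabilities; the two example distributions are made up for illustration.

```python
import math
from typing import Sequence

def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """D_KL(P || Q) with natural logarithm; terms with P(i) = 0 contribute 0.
    Assumes Q(i) > 0 wherever P(i) > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """D_JS(P || Q) = 1/2 D_KL(P || M) + 1/2 D_KL(Q || M), M = (P + Q) / 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Made-up interest distributions of two users over three interests.
p = [0.7, 0.2, 0.1]
q = [0.1, 0.3, 0.6]

print("KL(P||Q):", kl_divergence(p, q), "KL(Q||P):", kl_divergence(q, p))
print("JS(P||Q):", js_divergence(p, q), "JS(Q||P):", js_divergence(q, p))
```

The two Kullback-Leibler values differ, while the two Jensen-Shannon values agree. Jensen-Shannon divergence is also always finite (at most $\ln 2$ with the natural logarithm), which makes it safe to use when one user has zero probability for an interest.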
The users on the forwarding chains of all collection entries are collectively called the other-user set $U_{rec} = \{u_{R1}, u_{R2}, \ldots, u_{Ri}, \ldots, u_{RN}\}$, where $u_{Ri}$ is the user ID of the $Ri$-th node on the forwarding chains and $RN$ is the number of nodes on the forwarding chains of the collection-entry set. For each target user $u_i$ in the target user set, the Jensen-Shannon divergence to each of the $RN$ users in $U_{rec}$ is computed as the similarity between the two users. The result is a set of $RN$ values, one per user $u_{Ri}$, each being the Jensen-Shannon divergence from target user $u_i$ to user $u_{Ri}$.
D2. Sort the Jensen-Shannon divergence values computed for target user $u_i$ in ascending order; the smaller the value, the more similar the user's interests. The Top-N users are recommended to the target user.
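Step D2 is an ascending sort followed by a cut-off. A minimal sketch, assuming the divergence values for one target user have already been computed and stored in a dictionary keyed by candidate user ID (the IDs and values below are invented):

```python
from typing import Dict, List

def recommend_top_n(divergences: Dict[str, float], n: int) -> List[str]:
    """Return the n candidate users with the smallest divergence,
    i.e. the users whose interest distribution is closest to the target's."""
    ranked = sorted(divergences.items(), key=lambda item: item[1])
    return [user for user, _ in ranked[:n]]

# Invented Jensen-Shannon divergence values from one target user
# to three candidate users.
divergences = {"u_R1": 0.30, "u_R2": 0.05, "u_R3": 0.12}

print(recommend_top_n(divergences, 2))
```

Because smaller divergence means greater similarity, the sort is ascending; sorting in descending order would recommend the least similar users.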
Brief description of the drawings:
Fig. 1 is a schematic diagram of a forwarding chain in this example.
Fig. 2 is a schematic diagram of the multilayer latent variable model framework in this example.
Fig. 3 shows the perplexity results of this example.
Fig. 4 shows the community discovery results of this example.
Fig. 5 shows the MAP results of this example.
Detailed description:
The technical solution is described in detail below with reference to the drawings and an embodiment.
This embodiment uses real data from a social curation network. The 100 target users in the example are real users of the network and come from three categories: No. 1 to No. 35 belong to category one, No. 36 to No. 75 to category two, and No. 76 to No. 100 to category three. In total the data contains 633,337 collection entries and their corresponding forwarding chains.
A. Build a new-word dictionary of about 300,000 new words covering common and popular keywords. Build a stop-word dictionary of 1,433 stop words, which carry no concrete meaning in a sentence.
The descriptions of all collection entries of the 100 target users are segmented with the word segmentation tool ICTCLAS. After segmentation, the descriptions are split into individual words and meaningless words are removed. For example, one collection entry in this example has the description "What qualities must a Taobao product description have?". Segmentation yields Taobao, commodity, description, possess, quality, necessary, which; "necessary" and "which" are stop words and are removed.
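The stop-word filtering in this example can be sketched as below. The segmentation itself is done by ICTCLAS in the patent and is not reproduced here; the sketch starts from the already segmented tokens, written as English glosses, and uses a two-word stop list standing in for the 1,433-word dictionary.

```python
from typing import Iterable, List, Set

def remove_stop_words(tokens: Iterable[str], stop_words: Set[str]) -> List[str]:
    """Keep only the tokens that are not in the stop-word dictionary."""
    return [token for token in tokens if token not in stop_words]

# English glosses of the segmented description from the example.
tokens = ["Taobao", "commodity", "description", "possess",
          "quality", "necessary", "which"]
# Stand-in for the stop-word dictionary.
stop_words = {"necessary", "which"}

print(remove_stop_words(tokens, stop_words))
```

Holding the stop words in a set keeps each membership test constant-time, which matters when the filter runs over several hundred thousand descriptions.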
B. Read the forwarding-chain data of the target users' collection entries. Each forwarding chain is labeled by the user IDs of the creators of the collection entries at its nodes and written as $R = \{p_1, p_2, \ldots, p_n\}$. The last two nodes of every forwarding chain, $p_{n-1}$ and $p_n$, are removed. In this example, the forwarding chain of one collection entry of a target user is, in user IDs, {38450, 115078, 86804, 60952, 310115, 86588, 269584, 280741, 298423, 15278, 31028, 256217, 271691}; after removing nodes $p_{n-1}$ and $p_n$, the chain is {86804, 60952, 310115, 86588, 269584, 280741, 298423, 15278, 31028, 256217, 271691}.
C. Extract the user interest model from the text information and forwarding-chain data.
Step C specifically includes:
C1. The first-layer model computes the latent topics of the text descriptions of the users' collection entries and chooses the optimal number of topics.
C2. Compute the probability that each collection entry belongs to each topic.
C3. The second-layer model computes the interest distribution of the latent topics.
C4. Compute the users' interest distribution.
Step C1 specifically includes:
C11. LDA modeling is applied to the descriptions of the 100 target users, giving a three-layer user-topic-word Bayesian model that links users and words through topics.
Here $p(w \mid u)$ denotes the word distribution of the 100 target users, $p(w \mid t)$ denotes the word distribution of each topic, and $p(t \mid u)$, written $\theta_u$, denotes the probability distribution of topics for each user.
C12. In the experiment, 10% of the data set, i.e. 10 users, is held out as the test set. Perplexity is computed; once the number of topics $N_{T1} \geq 30$, the perplexity levels off, indicating that from this point the model is best in terms of both quality and computational complexity. $N_{T1}$ is therefore set to 30. Each user $u_i$ among the 100 target users is then described by a probability distribution over the 30 topics.
Step C2 specifically includes:
C21. Based on the first-layer user-topic-word Bayesian model, compute the probability $p_{z_k}^{pin}$ that each collection entry pin belongs to topic $z_k$. For example, the description of one collection entry segments into Taobao, commodity, description, possess, quality, and the probabilities of these words under a topic $z_k$ are {0.805, 0.456, 0.771, 0.002, 0.002}; averaging them gives $p_{z_k}^{pin} = 2.036 / 5 \approx 0.407$.
C22. For all collection entries of the 100 target users, compute the probability $p_{z_k}^{pin}$ of belonging to each of the $N_{T1} = 30$ topics. If $p_{z_k}^{pin}$ passes the threshold of 0.2, the collection entry is considered to belong to that topic. This yields the set of collection entries belonging to each of the 30 topics.
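The worked example in C21 and the threshold test in C22 can be reproduced with a few lines. The word probabilities are the ones given in the example; treating "passes the threshold" as greater than or equal to 0.2 is an assumption, since the exact inequality is not legible in this text.

```python
from typing import Dict, Sequence

THRESHOLD = 0.2  # threshold from the Pareto principle, as stated in the text

def pin_topic_probability(words: Sequence[str],
                          word_given_topic: Dict[str, float]) -> float:
    """Average of p(w_i | z_k) over the N words of one description."""
    return sum(word_given_topic[w] for w in words) / len(words)

# Words of the example description and their probabilities under topic z_k.
words = ["Taobao", "commodity", "description", "possess", "quality"]
p_word_given_zk = {
    "Taobao": 0.805,
    "commodity": 0.456,
    "description": 0.771,
    "possess": 0.002,
    "quality": 0.002,
}

p_pin = pin_topic_probability(words, p_word_given_zk)
belongs = p_pin >= THRESHOLD  # assumed inequality direction

print(round(p_pin, 4), belongs)
```

Repeating the same test for each of the 30 topics assigns an entry to every topic whose average word probability is high enough, so one entry can end up in several topic sets.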
Step C3 includes:
C31. Each collection entry is represented by the user IDs on its forwarding chain. Each of the $N_{T1} = 30$ topics is thus represented as a set of related forwarding chains, or equivalently as the set of user IDs at the nodes of those chains.
C32. Applying LDA to the $N_{T1} = 30$ topics gives a three-layer topic-interest-user ID Bayesian model and the interest distribution of the latent description topics; topics and user IDs are linked through interests.
Here $p(uid \mid t)$ denotes the user ID distribution of the $N_{T1} = 30$ topics, $p(uid \mid int)$ denotes the user ID distribution of each interest, and $p(int \mid t)$, written $\theta_t$, denotes the probability distribution of interests for each topic.
Step C4 includes:
C41. Combine the users' latent topic distribution with the topics' interest distribution. The latent topics of the 100 target users can be written as a matrix of 100 users × 30 topics, and the interest distribution of the topics as a matrix of 30 topics × interests. Matrix multiplication gives a matrix of 100 users × interests, i.e. the users' interest probability distribution. Equivalently, a user is a mixture of interests, and an interest is a probability distribution over user IDs.
$$p(int \mid u) = p(int \mid t)\, p(t \mid u) = \theta_t \theta_u$$
where $p(int \mid u)$ denotes the probability distribution of interests for the 100 users.
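The matrix product in C41 can be illustrated on a small case. The sketch below uses 2 users, 2 topics, and 2 interests instead of the 100 × 30 case, stores each distribution as a row, and multiplies the users × topics matrix by the topics × interests matrix; the values are invented and the row-wise layout is an assumption about how the matrices are arranged.

```python
from typing import List

Matrix = List[List[float]]

def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product: (rows of a) x (columns of b)."""
    inner = len(b)
    cols = len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]

# p(t | u): one row per user, one column per topic (rows sum to 1).
user_topic = [[0.5, 0.5],
              [1.0, 0.0]]
# p(int | t): one row per topic, one column per interest (rows sum to 1).
topic_interest = [[0.2, 0.8],
                  [0.6, 0.4]]

# p(int | u): one row per user, one column per interest.
user_interest = matmul(user_topic, topic_interest)
for row in user_interest:
    print(row)
```

Each row of the result is again a probability distribution, because it is a weighted average of the topic rows with weights that sum to 1. The second user, who is entirely in topic 0, simply inherits that topic's interest distribution.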
One advantage of the multimodal latent variable model is that it can discover user communities from both descriptions and forwarding chains. In this example it is compared with the classical LDA model; the multimodal latent variable model describes users' social communities better than the traditional LDA model.
D. User recommendation based on the multilayer latent variable model
Step D includes:
D1. For the 100 users in the target user set, compute the Jensen-Shannon divergence between each target user $u_i$ and each of the $RN$ users in the other-user set $U_{rec}$ as the similarity between the two users. The result is a set of $RN$ values, each being the Jensen-Shannon divergence from target user $u_i$ to a user $u_{Ri}$.
D2. Sort the Jensen-Shannon divergence values computed for target user $u_i$ in ascending order; the smaller the value, the more similar the user's interest distribution. The Top-N users are recommended to the target user. In this example the algorithm is compared with five baseline algorithms: (1) a recommendation algorithm based on the multimodal latent variable model and Kullback-Leibler divergence, (2) a recommendation algorithm based on the LDA model and Jensen-Shannon divergence, (3) a recommendation algorithm based on the LDA model and Kullback-Leibler divergence, (4) the URSRP algorithm, and (5) a popularity recommendation algorithm based on how often user IDs occur. On the mean average precision (MAP) metric, the recommendation quality improves significantly, with MLLDA-JSD outperforming the other baseline algorithms. The comparison also confirms that Jensen-Shannon divergence measures similarity between users better than Kullback-Leibler divergence.
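For reference, mean average precision over a set of target users can be computed as sketched below. The text does not say how relevant users were determined or which normalization was used, so both are assumptions here: relevance is given as a set of user IDs per target user, and average precision is normalized by the number of relevant users.

```python
from typing import Dict, List, Set

def average_precision(recommended: List[str], relevant: Set[str]) -> float:
    """Mean of precision@rank taken at each rank where a relevant user appears,
    normalized by the number of relevant users (one common convention)."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, user in enumerate(recommended, start=1):
        if user in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)

def mean_average_precision(recommendations: Dict[str, List[str]],
                           ground_truth: Dict[str, Set[str]]) -> float:
    """Average of the per-target-user average precision."""
    scores = [average_precision(recommendations[u], ground_truth[u])
              for u in recommendations]
    return sum(scores) / len(scores)

# Invented Top-N lists and relevance sets for two target users.
recommendations = {"t1": ["a", "b", "c"], "t2": ["x", "y"]}
ground_truth = {"t1": {"a", "c"}, "t2": {"y"}}

print(mean_average_precision(recommendations, ground_truth))
```

MAP rewards lists that place relevant users early, so two algorithms that recommend the same users in a different order can score differently.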

Claims (3)

1. A method for user portrait extraction based on a multilayer latent variable model, characterized in that the method includes:
A. building a dictionary and a stop-word dictionary, and using the word segmentation tool ICTCLAS to segment the text descriptions of all of the users' collection entries;
B. in a social curation network, any collection entry of a user can be repinned by other users; when a user repins a collection entry, all of the entry's own information is copied into one of that user's boards;
collecting the forwarding-chain data of all collection entries of the target user set; each collection entry records whom it was forwarded from, and this record is used to obtain the path from each collection entry back to its original entry; crawling starts at the current collection entry and moves to its parent, following the "forwarded from" record until the original collection entry is reached; every node visited during this trace is a copy of the original entry, and together the nodes form a chain-shaped path called a "forwarding chain"; each forwarding chain consists of a set of collection entries; each node on the chain is represented by the user ID of the creator of the corresponding forwarded entry;
C. extracting a user interest model from the set formed by all text information and forwarding-chain data of the target user set;
step C specifically includes:
C1. the first-layer model computes the latent topics of the text descriptions of the users' collection entries;
C2. computing the probability that each collection entry belongs to each topic;
C3. the second-layer model computes the interest distribution of the latent topics;
C4. computing the users' interest distribution;
D. user recommendation based on the multilayer latent variable model.
2. The method according to claim 1, characterized in that:
step C1 includes:
C11. passing the users' text through the LDA model, which links users and words through latent topics to produce a three-layer user-topic-word Bayesian model;
C12. evaluating the first-layer LDA model with perplexity, computing the model perplexity, and choosing the optimal number of topics $N_{T1}$ for the next step;
$$\mathrm{Perplexity}(U_{test}) = \exp\left\{ -\frac{\sum_{u=1}^{U_t} \log p(w_u)}{\sum_{u=1}^{U_t} N_u} \right\}$$
$$p(w_u) = \prod_{n=1}^{N_m} \sum_{k=1}^{K} p(w_n \mid z_k)\, p(z_k \mid u)$$
where $U_{test}$ is the set of test users, $U_t$ is the number of users in the test set, $w_u$ is the set of words in the collection-entry descriptions of user $u$, $p(w_u)$ is the probability of generating that set of words under the user model, and $N_u$ is the total number of words in the collection-entry descriptions of user $u$; $K$ is the total number of topics, and $N_m$ is the set of words in the collection-entry descriptions of all users;
step C2 includes:
C21. the description of a collection entry may have been written by the user or carried over by repinning; it consists of a group of words and is written as $w_{pin} = \{w_1, w_2, \ldots, w_i, \ldots, w_N\}$, where $w_i$ is the $i$-th word of the description and $N$ is the total number of words in it; each topic in the topic set $\{z_1, z_2, \ldots, z_k, \ldots, z_{N_{T1}}\}$ is a probability distribution over words, where $z_k$ is the $k$-th topic and $N_{T1}$ is the optimal number of topics; $p_{z_k}^{pin}$ denotes the probability that collection entry pin belongs to topic $z_k$;
$$p_{z_k}^{pin} = \frac{1}{N} \sum_{i=1}^{N} p(w_i \mid z_k)$$
C22. from the result of the first-layer LDA model, computing the probability $p_{z_k}^{pin}$ that each collection entry pin belongs to topic $z_k$; 0.2 is chosen as the threshold, and if $p_{z_k}^{pin}$ passes this threshold, the collection entry belongs to the topic; this yields the set of collection entries belonging to each of the $N_{T1}$ topics;
step C3 includes:
C31. each collection entry is represented by the user IDs on its forwarding chain;
C32. applying LDA to the $N_{T1}$ latent topics to obtain a three-layer topic-interest-user ID Bayesian model and the interest distribution of the latent description topics;
step C4 includes:
C41. combining the users' latent topic distribution with the topics' interest distribution by matrix multiplication to obtain the users' interest distribution.
3. The method according to claim 1, characterized in that:
step D includes:
D1. computing the Jensen-Shannon divergence between the target user and each user on all forwarding chains as the similarity between two users P and Q;
$$D_{JS}(P \| Q) = \frac{1}{2} D_{KL}(P \| M) + \frac{1}{2} D_{KL}(Q \| M) = \frac{1}{2} \sum_i P(i) \ln \frac{2P(i)}{P(i) + Q(i)} + \frac{1}{2} \sum_i Q(i) \ln \frac{2Q(i)}{P(i) + Q(i)}$$
where $M = \frac{1}{2}(P + Q)$;
the users on the forwarding chains of all collection entries are collectively called the other-user set $U_{rec} = \{u_{R1}, u_{R2}, \ldots, u_{Ri}, \ldots, u_{RN}\}$, where $RN$ is the number of nodes on the forwarding chains of the collection-entry set; for each target user $u_i$ in the target user set, the Jensen-Shannon divergence to each of the $RN$ users in $U_{rec}$ is computed as the similarity between the two users; the result is a set of $RN$ values, each being the Jensen-Shannon divergence from target user $u_i$ to a user $u_{Ri}$;
D2. sorting the Jensen-Shannon divergence values computed for target user $u_i$ in ascending order, where a smaller value means more similar interests; the Top-N users are recommended to the target user.
CN201610250016.7A 2016-04-21 2016-04-21 Method for user portrait extraction based on multilayer latent variable model Expired - Fee Related CN105869058B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610250016.7A CN105869058B (en) 2016-04-21 2016-04-21 Method for user portrait extraction based on multilayer latent variable model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610250016.7A CN105869058B (en) 2016-04-21 2016-04-21 Method for user portrait extraction based on multilayer latent variable model

Publications (2)

Publication Number Publication Date
CN105869058A true CN105869058A (en) 2016-08-17
CN105869058B CN105869058B (en) 2019-10-29

Family

ID=56632428

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610250016.7A Expired - Fee Related CN105869058B (en) 2016-04-21 2016-04-21 Method for user portrait extraction based on multilayer latent variable model

Country Status (1)

Country Link
CN (1) CN105869058B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107357835A (en) * 2017-06-22 2017-11-17 电子科技大学 Interest prediction mining method and system based on topic model and forgetting rule
CN108876643A (en) * 2018-05-24 2018-11-23 北京工业大学 Multimodal representation method for collection entries (pins) on a social curation network
CN110209875A (en) * 2018-07-03 2019-09-06 腾讯科技(深圳)有限公司 User content portrait determination method, access object recommendation method and related device
CN112836507A (en) * 2021-01-13 2021-05-25 哈尔滨工程大学 Method for extracting domain text theme
CN116383521A (en) * 2023-05-19 2023-07-04 苏州浪潮智能科技有限公司 Subject word mining method and device, computer equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060136589A1 (en) * 1999-12-28 2006-06-22 Utopy, Inc. Automatic, personalized online information and product services
CN103064917A (en) * 2012-12-20 2013-04-24 中国科学院深圳先进技术研究院 Specific-tendency high-influence user group discovering method orienting microblog
CN103500340A (en) * 2013-09-13 2014-01-08 南京邮电大学 Human body behavior identification method based on thematic knowledge transfer
CN103886067A (en) * 2014-03-20 2014-06-25 浙江大学 Method for recommending books through label implied topic
CN104991956A (en) * 2015-07-21 2015-10-21 中国人民解放军信息工程大学 Microblog propagation group division and account activeness evaluation method based on topic probability model
CN105069003A (en) * 2015-06-15 2015-11-18 北京工业大学 User focus object recommendation calculation method based on forward chain similarity

Patent Citations (6)

Publication number Priority date Publication date Assignee Title
US20060136589A1 (en) * 1999-12-28 2006-06-22 Utopy, Inc. Automatic, personalized online information and product services
CN103064917A (en) * 2012-12-20 2013-04-24 中国科学院深圳先进技术研究院 Microblog-oriented method for discovering high-influence user groups with specific tendencies
CN103500340A (en) * 2013-09-13 2014-01-08 南京邮电大学 Human body behavior identification method based on thematic knowledge transfer
CN103886067A (en) * 2014-03-20 2014-06-25 浙江大学 Method for recommending books through label implied topic
CN105069003A (en) * 2015-06-15 2015-11-18 北京工业大学 User focus object recommendation calculation method based on forward chain similarity
CN104991956A (en) * 2015-07-21 2015-10-21 中国人民解放军信息工程大学 Microblog propagation group division and account activeness evaluation method based on topic probability model

Cited By (9)

Publication number Priority date Publication date Assignee Title
CN107357835A (en) * 2017-06-22 2017-11-17 电子科技大学 Interest prediction mining method and system based on topic model and forgetting rule
CN107357835B (en) * 2017-06-22 2020-11-03 电子科技大学 Interest prediction mining method and system based on topic model and forgetting rule
CN108876643A (en) * 2018-05-24 2018-11-23 北京工业大学 Multimodal representation method for collected items (Pins) on social curation networks
CN110209875A (en) * 2018-07-03 2019-09-06 腾讯科技(深圳)有限公司 User content portrait determination method, access object recommendation method and related device
CN110209875B (en) * 2018-07-03 2022-09-06 腾讯科技(深圳)有限公司 User content portrait determination method, access object recommendation method and related device
CN112836507A (en) * 2021-01-13 2021-05-25 哈尔滨工程大学 Method for extracting domain text theme
CN112836507B (en) * 2021-01-13 2022-12-09 哈尔滨工程大学 Method for extracting domain text theme
CN116383521A (en) * 2023-05-19 2023-07-04 苏州浪潮智能科技有限公司 Subject word mining method and device, computer equipment and storage medium
CN116383521B (en) * 2023-05-19 2023-08-29 苏州浪潮智能科技有限公司 Subject word mining method and device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN105869058B (en) 2019-10-29

Similar Documents

Publication Publication Date Title
CN106250412B (en) Knowledge graph construction method based on multi-source entity fusion
Tu et al. Rumor2vec: a rumor detection framework with joint text and propagation structure representation learning
CN103678670B (en) Micro-blog hot word and hot topic mining system and method
Bholat et al. Text mining for central banks
CN104820629B (en) Intelligent system and method for emergency handling of public opinion incidents
CN104778209B (en) Opinion mining method for million-scale news comments
CN104615608B (en) Data mining processing system and method
CN109829166B (en) People and host customer opinion mining method based on character-level convolutional neural network
CN103699521B (en) Text analyzing method and device
CN103631859A (en) Intelligent review expert recommending method for science and technology projects
CN105528437B (en) Question answering system construction method based on structured text knowledge extraction
CN105869058A (en) Method for user portrait extraction based on multilayer latent variable model
CN107766585A (en) Specific event extraction method for social networks
CN109446331A (en) Method for building a text emotion classification model and text emotion classification method
CN103793503A (en) Opinion mining and classification method based on web texts
Pong-Inwong et al. Improved sentiment analysis for teaching evaluation using feature selection and voting ensemble learning integration
CN105354305A (en) Online-rumor identification method and apparatus
CN114444516B (en) Cantonese rumor detection method based on deep semantic-aware graph convolutional network
CN103488637B (en) Expert finding method based on dynamic community mining
CN103412878A (en) Document theme partitioning method based on domain knowledge map community structure
Ishfaq et al. Identifying the influential bloggers: a modular approach based on sentiment analysis
Liu et al. Identifying experts in community question answering website based on graph convolutional neural network
Borah Detecting covid-19 vaccine hesitancy in india: a multimodal transformer based approach
Shehnepoor et al. ScoreGAN: A fraud review detector based on multi task learning of regulated GAN with data augmentation
CN106156192A (en) Public sentiment data clustering method and public sentiment data clustering system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20191029