CN101547162A - Method and device for tagging user based on user state information - Google Patents

Publication number
CN101547162A
Authority
CN
China
Prior art keywords
user
label
state
unit
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN200810086931A
Other languages
Chinese (zh)
Inventor
舒芳蕊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp
Priority to CN200810086931A
Publication of CN101547162A

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method and a device for tagging a user based on user state information. In the method for tagging the user based on the user state information, the user state information is collected first, then text mining is carried out on the text information in the collected user state information so as to extract keywords, and the keywords are used as tags for tagging the user. The method for tagging the user based on the user state information of the invention obtains a user tag by analyzing the user state information, makes the user tag better reflect interest and state of the user, and can automatically update the user tag, thereby reflecting the changes of the interest and state of the user in time.

Description

Method and device for tagging a user based on user state information
Technical field
The present invention relates to the field of computer communication, for example instant messaging systems and BBS systems, and more specifically to a method and a device for tagging a user based on user state information in such systems.
Background
In a system that provides an instant messaging service, every user of the service has state information. A user's state information can be presented automatically to the users in that user's contact list, so that those users can learn the user's current state from it.
Originally, state information was supplied by the instant messaging service provider, for example online, offline, busy, or away. At present, many instant messaging tools, such as MSN, QQ, and Google Talk, let users customize their state information. State information can therefore represent not only a user's online/offline state (for example busy or away) but also the user's current mood (for example happy or sad), the user's current location (for example in Beijing or in New York), the user's interests (for example Harry Potter or badminton), and so on. To simplify the description, this specification uses "user's interest" as a collective term for the user-related information, such as state, mood, location, and interests, that can be extracted from user state information.
Tagging is a convenient classification function: with tags, related information can be managed or searched easily. A user's tags are usually related to the user's interests, and they can help the user find information of interest or other users with the same interests.
At present there are two main ways of tagging users: the personalized search provided by Google, and the tagging function provided by dating websites such as Consumating.
Google personalized search is an improvement on Google search that orders search results according to the user's preferences. First, the user's interest information is extracted from the user's search history; this interest information can serve as the user's tags. When the user performs a new search, the search results are re-ranked according to these tags, with higher weight given to results close to the user's interests. In most cases, however, the keywords a user searches for reflect the user's interests only indirectly. Moreover, because of privacy concerns, the user may not want the service provider to use tags extracted in this way for further applications, such as searching for users with the same interests.
The Consumating website provides users with a tag input function: users enter tags expressing their own preferences when they register. Users can add, modify, or delete tags, and can use them to search for other users with the same interests. However, tags can only be updated manually in this approach, and users often neglect to update their own tags, so the tags are not kept up to date.
Summary of the invention
The present invention is proposed in view of the above technical problems. Its purpose is to provide a method and a device for tagging a user based on user state information that make the user's tags reflect the user's interests and their changes more accurately, and that can update the user's tags automatically.
According to a first aspect of the invention, a method for tagging a user based on user state information is provided, comprising: collecting the state information of the user; performing text mining on the text information in the collected state information to extract keywords; and tagging the user using the keywords as tags.
According to a second aspect of the invention, a device for tagging a user based on user state information is provided, comprising: a collection unit for collecting the state information of the user; a text mining unit for performing text mining on the text information in the state information collected by the collection unit, to extract keywords; and a tagging unit for tagging the user using the keywords as tags.
Brief description of the drawings
Fig. 1 is a flowchart of a method for tagging a user based on user state information according to an embodiment of the invention;
Fig. 2 is a flowchart of a method for tagging a user based on user state information according to another embodiment of the invention;
Fig. 3 is a flowchart of grouping a plurality of users, as one application of user tags;
Fig. 4 is a schematic diagram illustrating an example of an associated user network;
Fig. 5 is a flowchart of delivering personalized advertisements to a user, as another application of user tags;
Fig. 6 is a flowchart of inferring trend information, as another application of user tags;
Fig. 7 is a flowchart of obtaining a user's historical state information, as another application of user tags;
Fig. 8 is a block diagram of a device for tagging a user based on user state information according to a first embodiment of the invention;
Fig. 9 is a block diagram of a device for tagging a user based on user state information according to a second embodiment of the invention;
Fig. 10 is a block diagram of a device for tagging a user based on user state information according to a third embodiment of the invention;
Fig. 11 is a block diagram of a device for tagging a user based on user state information according to a fourth embodiment of the invention;
Fig. 12 is a block diagram of a device for tagging a user based on user state information according to a fifth embodiment of the invention;
Fig. 13 is a block diagram of a device for tagging a user based on user state information according to a sixth embodiment of the invention.
Detailed description of embodiments
The above and other objects, features, and advantages of the invention will become more apparent from the following detailed description of specific embodiments, taken in conjunction with the accompanying drawings.
Fig. 1 is a flowchart of a method for tagging a user based on user state information according to an embodiment of the invention. This embodiment is described in detail below with reference to the drawing.
This embodiment and the later embodiments are described using the users of an instant messaging system as an example. They are of course also applicable to other systems that provide users with a state information function.
As shown in Fig. 1, first, at step S100, the state information of a user is collected. As mentioned above, state information can be information entered by the user, including text information and non-text information such as pictures. Information entered by the user can reflect the user's current mood (for example happy or sad), the user's activities (for example shopping, or visiting an exhibition at the weekend), the user's current location (for example in Beijing), the user's hobbies (for example games or badminton), and so on. State information can also be information that the system generates for the user, for example the time at which the state information was produced, the song the user is listening to, or system-generated activity information for the user (for example busy or away). In an instant messaging system, part of a user's state information, for example the information entered by the user and the system-generated activity information, can be presented to the users in that user's contact list.
Further, after the user's state information has been collected, it is stored (step S105). In this embodiment, state information is stored in the format "user ID + state information + timestamp of the state information". Those of ordinary skill in the art will readily understand that other storage formats can also be used.
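The storage format of step S105 can be illustrated with a minimal Python sketch. The class names `StateRecord` and `StateStore`, the in-memory list, and the optional `picture` field for non-text information are assumptions made for illustration; the specification only fixes the three fields user ID, state information, and timestamp.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StateRecord:
    """One stored entry: user ID + state information + timestamp."""
    user_id: str
    state: str                      # text entered by the user or generated by the system
    timestamp: float                # time at which the state information was produced
    picture: Optional[str] = None   # hypothetical slot for non-text information


class StateStore:
    """Hypothetical in-memory store standing in for the storage of step S105."""

    def __init__(self):
        self._records = []

    def add(self, record):
        self._records.append(record)

    def for_user(self, user_id):
        """Return the records of one user, oldest first."""
        matches = [r for r in self._records if r.user_id == user_id]
        return sorted(matches, key=lambda r: r.timestamp)
```

Keeping the timestamp on every record is what later allows text and non-text information to be matched up and ordered chronologically.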
Next, at step S110, text mining is performed on the text information in the collected state information to extract keywords. This embodiment uses a text mining technique based on the N-gram language model. Specifically, for each piece of text information, the stop words are removed, the remaining words are reduced to their roots (stemmed), and the text is then split into N-grams to obtain the keywords of that text information.
For details of text mining based on the N-gram language model, see C. Y. Suen, "N-Gram Statistics for Natural Language Understanding and Text Processing", IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 1, no. 2, April 1979, pp. 164-172.
Of course, those of ordinary skill in the art will readily recognize that other text mining techniques can also be used to process the text information.
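The three stages of step S110 (stop word removal, root reduction, N-gram splitting) can be sketched as follows. This is only an illustration under stated assumptions: the stop word list is a tiny hypothetical sample, `stem` is a crude suffix stripper standing in for a real stemmer, and the N-grams are taken over words with N up to a hypothetical `max_n`; the specification does not prescribe any of these details.

```python
import re

# Hypothetical, deliberately tiny stop word list; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "the", "is", "are", "am", "i", "in", "at", "to", "of", "and", "my"}


def stem(word):
    """Crude suffix stripping, used here only as a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def extract_keywords(text, max_n=2):
    """Remove stop words, stem the remaining words, and emit word n-grams up to max_n."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    stems = [stem(w) for w in words if w not in STOP_WORDS]
    keywords = []
    for n in range(1, max_n + 1):
        # N-grams are built over the filtered words, so they may span a removed stop word.
        for i in range(len(stems) - n + 1):
            keywords.append(" ".join(stems[i : i + n]))
    return keywords
```

For a status such as "Playing basketball in Shanghai", the sketch yields the single words first, followed by the two-word combinations.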
Then, at step S120, the user is tagged using the keywords obtained in step S110 as tags, which yields the user's tag data. Tags entered manually by the user automatically become tags of the user and are added to the user's tag data.
In addition, to handle the case in which the user updates his or her state information, the method of this embodiment further comprises checking whether the user has new state information (step S125). This check can be performed periodically, or immediately each time the user updates the state information. If new state information is found in step S125, the method returns to and executes steps S100 to S120.
As the above description shows, the method of this embodiment obtains a user's tags by analyzing the user's state information, so that the tags better reflect the user's interests and state, and it can update the tags automatically, so that changes in the user's interests and state are reflected in a timely manner.
Fig. 2 is a flowchart of a method for tagging a user based on user state information according to another embodiment of the invention, in which parts identical to those of the previous embodiment use the same reference signs and their description is omitted where appropriate. This embodiment is described in detail below with reference to the drawing.
In this embodiment, as shown in Fig. 2, after step S110, that is, after the keywords of the state information have been obtained, the number of occurrences of each keyword, i.e. the frequency with which each keyword appears in all the collected state information, is calculated at step S112, which reveals the user's main interests. Then, at step S115, a predetermined number (for example n) of the keywords with the most occurrences are selected as the user's tags. These tags are then used to tag the user (step S120).
As the above description shows, the method of this embodiment additionally selects the keywords that represent the user's main interests as the user's tags, and thereby reflects the user's interests better.
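Steps S112 and S115 amount to counting keywords and keeping the n most frequent ones. A minimal sketch in Python, assuming the keywords of each piece of state information are already available as lists (the function name and input shape are illustrative, not taken from the specification):

```python
from collections import Counter


def select_top_keywords(keywords_per_state, n):
    """Count keyword occurrences over all collected state information (S112)
    and return the n most frequent keywords as tags (S115)."""
    counts = Counter()
    for keywords in keywords_per_state:
        counts.update(keywords)
    return [keyword for keyword, _ in counts.most_common(n)]
```

How ties between equally frequent keywords are broken is not addressed by the specification; here it simply follows the behaviour of `Counter.most_common`.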
Fig. 3 is a flowchart of grouping a plurality of users, as one application of user tags. The application is described in detail below with reference to the drawing; description of parts identical to the previous embodiments is omitted where appropriate.
In this application, the tags of the plurality of users are obtained by the method for tagging a user based on user state information of the embodiment shown in Fig. 1 or Fig. 2, and each user's tags constitute that user's tag data.
As shown in Fig. 3a, at step S300, for each of the plurality of users, the similarity between that user's tag data and the tag data of every other user is calculated. The similarity describes how close two sets of tag data are. An example of how the similarity can be calculated is given below.
Suppose the instant messaging system maintains a large vocabulary that covers most of the common words in natural language. This vocabulary is represented by an N-dimensional vector $W$, for example:
$$W = \begin{bmatrix} \text{Harry Potter} \\ \text{Desperate Housewife} \\ \text{Basketball} \\ \vdots \end{bmatrix}_{N \times 1}$$
Suppose the instant messaging system has M users, whose tags are obtained by the method for tagging a user based on user state information of the embodiment shown in Fig. 1 or Fig. 2. For example, the tag data of user i is as follows, where each item represents a tag:
basketball, Huang Tao, afternoon wonderful, Harry Potter 7 beautiful autumn, in Shanghai, ...
Then an N-dimensional vector $X_i$ is used to represent the number of occurrences, in the tag data of user i, of each word in the vocabulary $W$. Each element of $X_i$ is the number of occurrences in user i's tag data of the word at the corresponding position of $W$, and the initial value of $X_i$ is 0. For example, if the tag data of user i contains "Harry Potter" and "Basketball", the vector $X_i$ is
$$X_i = \begin{bmatrix} 1 \\ 0 \\ 1 \\ \vdots \end{bmatrix}_{N \times 1}$$
In this way, a corresponding vector $X_i$ is obtained for every user i.
Next, the similarity between tag data is calculated using each user's vector $X_i$. First, each vector $X_i$ is normalized to obtain the normalized vector $X_i'$, that is,
$$X_i' = \frac{X_i}{\operatorname{sum}(X_i)}, \quad i = 1, 2, \ldots, M$$
Then the Euclidean distance between each user's normalized vector $X_i'$ and the normalized vector $X_j'$ of every other user j ($j \neq i$) is calculated,
$$d_{ij} = \lVert X_i' - X_j' \rVert_2$$
and is used as the similarity between the tag data of these two users.
Then, at step S310, for each user, a predetermined number (for example k) of users having the greatest similarity to that user are selected to form a set of users with interests similar to that user's (hereinafter the "similar-interest user set"); that is, k users have interests similar to this user's. Because the similarity here is measured as a distance, the greatest similarity corresponds to the smallest distance d_ij.
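The similarity calculation of step S300 and the selection of step S310 can be sketched in plain Python. The function names are illustrative, tag data is assumed to be a list of lower-case tags per user, and returning an all-zero vector unchanged when a user has no tag in the vocabulary is an assumption, since the specification does not cover that case.

```python
import math


def tag_vector(vocabulary, tags):
    """Build X_i: occurrences of each vocabulary word in a user's tag data."""
    return [tags.count(word) for word in vocabulary]


def normalize(vector):
    """Build X_i' = X_i / sum(X_i); an all-zero vector is returned unchanged (assumption)."""
    total = sum(vector)
    return [value / total for value in vector] if total else list(vector)


def distance(a, b):
    """Euclidean distance d_ij between two normalized vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def most_similar(user, tag_data, vocabulary, k):
    """Return the k users closest to `user`, i.e. the similar-interest user set."""
    vectors = {u: normalize(tag_vector(vocabulary, t)) for u, t in tag_data.items()}
    others = [u for u in vectors if u != user]
    others.sort(key=lambda u: distance(vectors[user], vectors[u]))
    return others[:k]
```

Sorting by ascending distance reflects the reading that the most similar users are those with the smallest d_ij.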
Then, at step S320, for each user, the tags shared between that user and each user in the similar-interest user set are extracted, and at step S330 the users among the plurality of users who have a tag identical to a shared tag are gathered into a user group, with the shared tag used as the group name. In this way the plurality of users can be grouped. Each user can, of course, be assigned to several user groups.
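Steps S320 and S330 can be sketched as two passes over the tag data: one to collect the shared tags and one to gather the users holding each of them. The function name and the dictionary-of-sets representation are assumptions made for illustration.

```python
from collections import defaultdict


def build_groups(tag_data, similar_sets):
    """tag_data: user -> set of tags.
    similar_sets: user -> users in that user's similar-interest user set.
    Returns a mapping from group name (a shared tag) to the set of member users."""
    shared = set()
    for user, neighbours in similar_sets.items():
        for other in neighbours:
            # S320: tags common to the user and a member of the similar-interest set.
            shared |= tag_data[user] & tag_data[other]
    groups = defaultdict(set)
    for user, tags in tag_data.items():
        for tag in tags & shared:
            # S330: every user holding a shared tag joins the group named after it.
            groups[tag].add(user)
    return dict(groups)
```

Because a user can hold several shared tags, the same user may appear in more than one group.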
In this application, the plurality of users can be all users of the instant messaging system. The number of users of an instant messaging system is usually very large, which makes the amount of computation very large. To address this, the plurality of users can instead consist of the users associated with a particular user. For example, in an instant messaging system, suppose the users in the contact list of user i form user group G1, and the distance (degree of association) between each user in G1 and user i is defined as 1. The users in the contact lists of the users in G1 then form user group G2, and the distance between each user in G2 and user i is 2, and so on. This yields an associated user network centered on user i. Thus, in this embodiment, the plurality of users can also be all the users in the associated user network of a particular user. Fig. 4 shows an associated user network centered on the user Rachel.
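The associated user network can be built with a breadth-first traversal of contact lists. In the sketch below, the `max_distance` cutoff and the rule that a user reachable by several paths keeps the shortest distance are assumptions; the specification only defines the distances 1, 2, and so on. Apart from Rachel, the user names in the usage example are hypothetical.

```python
from collections import deque


def associated_network(contacts, center, max_distance):
    """contacts: user -> list of users in that user's contact list.
    Returns a mapping from each associated user to its distance from `center`."""
    distances = {center: 0}
    queue = deque([center])
    while queue:
        user = queue.popleft()
        if distances[user] == max_distance:
            continue  # do not expand beyond the chosen radius
        for friend in contacts.get(user, ()):
            if friend not in distances:
                distances[friend] = distances[user] + 1
                queue.append(friend)
    del distances[center]  # the center itself is not part of the result
    return distances


contacts = {"rachel": ["ann", "bob"], "ann": ["rachel", "carl"], "carl": ["dave"]}
network = associated_network(contacts, "rachel", 2)
```

Restricting the grouping to such a network bounds the number of pairwise similarity calculations.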
As the above description shows, this application can use the users' tags to divide the users into user groups with different interests.
Further, as shown in Fig. 3b, after the user groups have been established (step S330), step S340 checks whether the tag data of any of the plurality of users contains a new tag. If so, at step S342, for the user with the new tag, the user groups whose group name contains the new tag are searched for, and then at step S345 the user is added to the user groups found.
Further, as shown in Fig. 3c, after the user groups have been established (step S330), step S350 checks whether there is a new user. If so, at step S352, for each tag in the new user's tag data, the user groups whose group name contains that tag are searched for, and then at step S355 the new user is added to each user group found.
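Both maintenance cases, a new tag for an existing user (Fig. 3b) and a new user (Fig. 3c), reduce to looking up groups by tag and adding the user. The sketch below assumes that a group name is exactly one tag, so "the group name contains the tag" becomes a dictionary lookup; a system whose group names combine several tags would need a different match.

```python
def add_user_to_groups(groups, user, tags):
    """groups: group name -> set of member users (modified in place).
    For a new tag, pass tags=[new_tag]; for a new user, pass all of the user's tags.
    Returns the names of the groups the user was added to."""
    joined = []
    for tag in tags:
        if tag in groups:          # S342 / S352: search for the matching group
            groups[tag].add(user)  # S345 / S355: add the user to the group found
            joined.append(tag)
    return joined
```

Tags that match no existing group are simply ignored here; whether such tags should create new groups is not stated in the specification.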
Further, as shown in Fig. 3d, at step S360 a request to find users with similar interests is received from an inquiring user. Then, at step S362, the corresponding user groups, that is, the user groups whose group name contains a tag of the inquiring user, are searched for according to the inquiring user's tag data, and at step S365 the user groups found are returned to the inquiring user.
Fig. 5 is a flowchart of delivering personalized advertisements to a user, as another application of user tags. The application is described in detail below with reference to the drawing; description of parts identical to the previous embodiments is omitted where appropriate.
In this application, the user's tags are obtained by the method for tagging a user based on user state information of the embodiment shown in Fig. 1 or Fig. 2, and the user's tags constitute the user's tag data.
As shown in Fig. 5, first, at step S500, advertisements that match the user's tags are selected according to the user's tag data as the user's personalized advertisements. Then, at step S510, the selected personalized advertisements are pushed to the user.
In addition, for the user groups described above, advertisements matching the group name of a user group can be selected as the personalized advertisements of that group and pushed to all users in the group.
As the above description shows, this application can deliver personalized advertisements in a targeted manner according to the users' tags.
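The selection in step S500 can be sketched as a keyword match between tags and advertisements. The specification does not say how an advertisement is judged to match a tag; the sketch assumes a hypothetical inventory in which each advertisement carries a set of keywords, and treats an advertisement as matching when it shares at least one keyword with the user's tags.

```python
def select_ads(user_tags, ads):
    """user_tags: iterable of the user's tags.
    ads: list of (ad_id, set_of_keywords) pairs (hypothetical inventory format).
    Returns the ids of advertisements sharing at least one keyword with the tags."""
    tags = set(user_tags)
    return [ad_id for ad_id, keywords in ads if keywords & tags]
```

The same function can serve the group case by passing the group name as the only tag.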
Fig. 6 is a flowchart of inferring trend information, as another application of user tags. The application is described in detail below with reference to the drawing; description of parts identical to the previous embodiments is omitted where appropriate.
In this application, the user's tags are obtained by the method for tagging a user based on user state information of the embodiment shown in Fig. 1 or Fig. 2, and the user's tags constitute the user's tag data.
As shown in Fig. 6, first, at step S600, a scope of users is specified. In this embodiment, the scope can be all users of the instant messaging system, or the associated user network of a particular user as described above.
Then, at step S610, the tag data of all users within the specified scope is collected, and at step S620 the number of occurrences of each tag in the collected tag data is calculated. In this application, the number of occurrences of a tag reflects how popular that tag is within the specified scope of users. Then, at step S630, a predetermined number (for example m) of the tags with the highest numbers of occurrences are determined as the trend information within the specified scope of users.
As the above description shows, this application can obtain the most popular tags within a certain scope of users according to the users' tags.
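Steps S610 to S630 can be sketched with a counter over the tag data of the users in scope. Counting each tag at most once per user is an assumption of this sketch; the specification only speaks of the number of occurrences of each tag in the collected tag data.

```python
from collections import Counter


def trend_tags(tag_data, scope, m):
    """tag_data: user -> collection of tags; scope: users to consider (S600).
    Returns the m most frequent tags with their counts (S610 to S630)."""
    counts = Counter()
    for user in scope:
        # A tag is counted once per user, so the count is the number of users holding it.
        counts.update(set(tag_data.get(user, ())))
    return counts.most_common(m)
```

The scope argument can be the full user list or the keys of an associated user network.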
Fig. 7 is a flowchart of obtaining a user's historical state information, as another application of user tags. The application is described in detail below with reference to the drawing; description of parts identical to the previous embodiments is omitted where appropriate.
In this application, the user's tags are obtained by the method for tagging a user based on user state information of the embodiment shown in Fig. 1 or Fig. 2.
As shown in Fig. 7, first, at step S700, the user specifies at least one tag. Then, at step S710, for each specified tag, the corresponding text information, that is, the text information containing that tag, is searched for in the stored state information of the user. As mentioned above, the user's state information can be stored in the format "user ID + state information + timestamp of the state information". Accordingly, at step S720, the non-text information having the same timestamp can be obtained from the timestamps of the text information found; preferably, the non-text information is a picture. Then, at step S730, the text information and non-text information are arranged in chronological order, which yields the user's historical state information relating to that tag.
Further, the user can also specify a start timestamp and an end timestamp for the state information; the corresponding text information and non-text information are then searched for within the period from the start timestamp to the end timestamp and arranged in chronological order.
As the above description shows, with the tag-based method of obtaining a user's historical state information of this embodiment, the user can customize his or her historical state information.
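The history lookup of steps S700 to S730, including the optional time window, can be sketched as a filter followed by a sort. The record layout `(user_id, timestamp, text, picture)` is an assumption: it models non-text information that shares a timestamp with the text by keeping both in one record, and the case-insensitive substring match is likewise illustrative.

```python
def tag_history(records, user_id, tag, start=None, end=None):
    """records: list of (user_id, timestamp, text, picture) tuples (hypothetical layout).
    Returns (timestamp, text, picture) entries of one user whose text contains the tag,
    optionally limited to start <= timestamp <= end, in chronological order."""
    hits = []
    for uid, timestamp, text, picture in records:
        if uid != user_id or tag.lower() not in text.lower():
            continue  # S710: keep only this user's text information containing the tag
        if start is not None and timestamp < start:
            continue
        if end is not None and timestamp > end:
            continue
        hits.append((timestamp, text, picture))  # S720: picture with the same timestamp
    return sorted(hits, key=lambda hit: hit[0])  # S730: chronological order
```

Sorting on the timestamp alone avoids comparing picture values, which may be missing.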
Under the same inventive concept, Fig. 8 is a block diagram of a device 800 for tagging a user based on user state information according to the first embodiment of the invention. This embodiment is described in detail below with reference to the drawing.
As shown in Fig. 8, the device 800 for tagging a user based on user state information of this embodiment comprises: a state information collection unit 801, which collects the state information of a user; a text mining unit 802, which performs text mining on the text information in the state information collected by the state information collection unit 801, to extract keywords; and a tagging unit 803, which tags the user using the keywords obtained by the text mining unit 802 as tags.
Further, after the state information collection unit 801 has collected the user's state information, the collected state information is stored by a storage unit 804. In the storage unit 804, the state information is stored in the format "user ID + state information + timestamp of the state information". Of course, the storage unit 804 can also use other storage formats.
After the state information collection unit 801 has collected the user's state information, the text mining unit 802 performs text mining on each piece of text information in the state information to obtain keywords. In this embodiment, the text mining unit 802 uses a text mining technique based on the N-gram language model. Specifically, in the text mining unit 802, a stop word removal unit first removes the stop words from the text information, a root reduction (stemming) unit then reduces the remaining words to their roots, and an N-gram splitting unit then splits the text information into N-grams, thereby obtaining the keywords of the text information.
The keywords obtained in the text mining unit 802 are provided to the tagging unit 803, which uses them to tag the user. The user's identifier and tags are then stored in a corresponding tag memory 806, and the user's tags constitute the user's tag data.
In addition, the device 800 of this embodiment further comprises a state information checking unit 805, which checks whether the user has new state information and, when new state information is detected, controls the state information collection unit 801 to collect it. The new state information is then provided to the text mining unit 802 to extract keywords, and the tagging unit 803 uses these keywords to tag the user.
It should be noted that, in operation, the device 800 of this embodiment can implement the method for tagging a user based on user state information shown in Fig. 1.
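How the units of device 800 cooperate can be sketched as a single Python class. This is an illustrative software analogue, not the device itself: the class and method names are hypothetical, the keyword extractor is passed in as a function playing the role of text mining unit 802, and tracking how many stored states have already been mined is an assumed way of detecting new state information.

```python
class TaggingDevice:
    """Illustrative software analogue of device 800; all names are hypothetical."""

    def __init__(self, extract_keywords):
        self.extract_keywords = extract_keywords  # role of text mining unit 802
        self.state_store = {}   # role of storage unit 804: user_id -> [(timestamp, text)]
        self.tag_store = {}     # role of tag memory 806: user_id -> set of tags
        self._processed = {}    # user_id -> number of stored states already mined

    def collect(self, user_id, text, timestamp):
        """Roles of collection unit 801 and storage unit 804."""
        self.state_store.setdefault(user_id, []).append((timestamp, text))

    def refresh(self, user_id):
        """Role of checking unit 805: mine only new state information, then tag (803)."""
        states = self.state_store.get(user_id, [])
        done = self._processed.get(user_id, 0)
        tags = self.tag_store.setdefault(user_id, set())
        for _, text in states[done:]:
            tags.update(self.extract_keywords(text))
        self._processed[user_id] = len(states)
        return tags
```

Calling `refresh` periodically, or right after each `collect`, corresponds to the two checking schedules mentioned for the method of Fig. 1.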
Fig. 9 is a block diagram of a device 900 for tagging a user based on user state information according to the second embodiment of the invention, in which parts identical to those of the previous embodiment use the same reference signs and their description is omitted where appropriate. This embodiment is described in detail below with reference to the drawing.
As shown in Fig. 9, the device 900 for tagging a user based on user state information of this embodiment, in addition to the state information collection unit 801, the text mining unit 802, the tagging unit 803, the storage unit 804, the state information checking unit 805, and the tag memory 806, further comprises: a first calculation unit 901, which calculates the number of occurrences of the keywords obtained by the text mining unit 802; and a selection unit 902, which selects a predetermined number (for example n) of the keywords with the most occurrences as the user's tags and provides them to the tagging unit 803.
It should be noted that, in operation, the device 900 of this embodiment can implement the method for tagging a user based on user state information shown in Fig. 2.
Fig. 10 is a block diagram of a device 1000 for tagging a user based on user state information according to the third embodiment of the invention; description of parts identical to the previous embodiments is omitted where appropriate. This embodiment is described in detail below with reference to the drawing.
As shown in Fig. 10, the device 1000 for tagging a user based on user state information of this embodiment, on the basis of the device 900 shown in Fig. 9, further comprises: a similarity calculation unit 1001, which, for each user, calculates the similarity between that user's tag data and the tag data of every other user; a similar-interest user selection unit 1002, which selects a predetermined number (for example k) of users having the greatest similarity, to form a set of users with interests similar to that user's (hereinafter the "similar-interest user set"); an extraction unit 1003, which extracts the tags shared between the user and each user in the similar-interest user set; and an aggregation unit 1004, which gathers the users among the plurality of users who have a tag identical to a shared tag into a user group, with the shared tag used as the group name.
In this embodiment, the plurality of users can be all users of the instant messaging system, or all users in the associated user network centered on a particular user, as described above.
Further, the device 1000 of this embodiment also comprises: a checking unit 1005, which checks whether the tag data of any of the plurality of users contains a new tag, or whether there is a new user; and a search unit 1006, which, when the checking unit 1005 detects a new tag, searches, for the user with the new tag, for the user groups whose group name contains the new tag, and, when a new user is detected, searches, for each tag in the new user's tag data, for the user groups whose group name contains that tag. The aggregation unit 1004 adds the user with the new tag to the corresponding user groups found, and adds the new user to each user group found.
Further, the device 1000 of this embodiment also comprises a receiving unit 1007, which receives from an inquiring user a request to find users with similar interests and provides it to the search unit 1006. After receiving the request, the search unit 1006 searches for the corresponding user groups according to the inquiring user's tag data and returns the user groups found to the inquiring user.
It should be noted that, in operation, the device 1000 of this embodiment can implement the user grouping application shown in Fig. 3.
Fig. 11 is a block diagram of a device 1100 for tagging a user based on user state information according to the fourth embodiment of the invention; description of parts identical to the previous embodiments is omitted where appropriate. This embodiment is described in detail below with reference to the drawing.
As shown in Fig. 11, the device 1100 for tagging a user based on user state information of this embodiment, on the basis of the device 900 shown in Fig. 9, further comprises: an advertisement selection unit 1101, which selects, according to the user's tag data, advertisements that match the user's tags as the user's personalized advertisements; and a push unit 1102, which pushes the personalized advertisements to the user.
In addition, the advertisement selection unit 1101 can also select, according to the group name of a user group as described above, matching advertisements as the personalized advertisements of that user group, and the push unit 1102 pushes them to all users in the user group.
It should be noted that, in operation, the device 1100 of this embodiment can implement the application of delivering personalized advertisements to a user shown in Fig. 5.
Figure 12 is a block diagram of a device 1200 for tagging a user based on user state information according to a fifth embodiment of the invention; description of the parts identical to the preceding embodiments is omitted where appropriate. The present embodiment is described in detail below with reference to the accompanying drawing.
As shown in Figure 12, the device 1200 for tagging a user based on user state information of the present embodiment comprises, in addition to the units of the device 900 shown in Figure 9: a specifying unit 1201, which specifies a range of users; a tag data collection unit 1202, which collects the tag data of the users within the specified range; a second computing unit 1203, which counts the occurrences of each tag in the collected tag data; and a determining unit 1204, which determines a predetermined number (for example, m) of the most frequently occurring tags as the trend information for the specified range of users.
In the present embodiment, the range of users may be all users of the instant messaging system, or the associated user network centered on a certain user as described above.
It should be pointed out that, in operation, the device 1200 for tagging a user based on user state information of the present embodiment can implement the application of inferring trend information shown in Figure 6.
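The counting and selection performed by units 1202 to 1204 can be sketched as below. The function name `trending_tags`, the `tag_data` mapping and the alphabetical tie-break are assumptions made for this illustration; the text only requires that the m most frequent tags be chosen.

```python
from collections import Counter
from typing import Dict, Iterable, List


def trending_tags(tag_data: Dict[str, Iterable[str]], scope: Iterable[str], m: int) -> List[str]:
    """Return the m tags that occur most often among the users in scope."""
    counts: Counter = Counter()
    for user in scope:
        # Each user contributes a tag at most once.
        counts.update(set(tag_data.get(user, ())))
    # Most frequent first; ties broken alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:m]]


if __name__ == "__main__":
    tag_data = {
        "a": ["olympics", "jazz"],
        "b": ["olympics"],
        "c": ["jazz", "olympics", "tennis"],
        "d": ["chess"],
    }
    print(trending_tags(tag_data, scope=["a", "b", "c"], m=2))
```

Passing every user id as `scope` corresponds to the whole instant messaging system, while passing the ids of one user's associated network restricts the trend to that network.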
Figure 13 is a block diagram of a device 1300 for tagging a user based on user state information according to an embodiment of the invention; description of the parts identical to the preceding embodiments is omitted where appropriate. The present embodiment is described in detail below with reference to the accompanying drawing.
As shown in Figure 13, the device 1300 for tagging a user based on user state information of the present embodiment comprises, in addition to the units of the device 900 shown in Figure 9: a tag specifying unit 1301, with which the user specifies at least one tag; a text information search unit 1302, which, for each of the at least one specified tag, searches the stored user state information for the corresponding text information; a non-text information acquiring unit 1303, which obtains, according to the timestamps of the text information found, the non-text information having the same timestamps; and an arranging unit 1304, which arranges the text information and the non-text information in chronological order.
In addition, the device 1300 for tagging a user based on user state information of the present embodiment also comprises a timestamp specifying unit 1305, with which the user specifies a start timestamp and an end timestamp of the state information. In this case, the text information search unit 1302 searches for the corresponding text information within the period from the start timestamp to the end timestamp.
It should be pointed out that, in operation, the device 1300 for tagging a user based on user state information of the present embodiment can implement the application of obtaining a user's historical state information shown in Figure 7.
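The retrieval flow of units 1301 to 1305 can be sketched as follows. The `StatusRecord` type, the integer timestamps and the substring match used to decide whether a text entry corresponds to a tag are assumptions introduced for this sketch; the text does not specify how stored entries are represented or matched.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class StatusRecord:
    timestamp: int   # hypothetical: seconds since some epoch
    kind: str        # "text", or a non-text kind such as "image"
    content: str


def history_for_tags(records: Iterable[StatusRecord], tags: Iterable[str],
                     start: Optional[int] = None,
                     end: Optional[int] = None) -> List[StatusRecord]:
    """Collect text entries matching any tag, plus non-text entries with the same timestamp."""
    records = list(records)
    lowered = [tag.lower() for tag in tags]

    def in_range(ts: int) -> bool:
        return (start is None or ts >= start) and (end is None or ts <= end)

    texts = [r for r in records
             if r.kind == "text" and in_range(r.timestamp)
             and any(tag in r.content.lower() for tag in lowered)]
    stamps = {r.timestamp for r in texts}
    non_texts = [r for r in records if r.kind != "text" and r.timestamp in stamps]
    # Chronological order; at equal timestamps the text entry comes first.
    return sorted(texts + non_texts, key=lambda r: (r.timestamp, r.kind != "text"))


if __name__ == "__main__":
    records = [
        StatusRecord(3, "text", "Back from Paris"),
        StatusRecord(1, "text", "Flying to Paris"),
        StatusRecord(1, "image", "airport.jpg"),
        StatusRecord(2, "text", "Lunch"),
        StatusRecord(2, "image", "lunch.jpg"),
    ]
    for record in history_for_tags(records, ["paris"]):
        print(record)
```

Leaving `start` and `end` unset searches the whole stored history; setting them narrows the search to the period between the two timestamps.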
It should be understood that the devices for tagging a user based on user state information in the above embodiments, and their components, may be implemented by hardware circuits such as very-large-scale integrated circuits or gate arrays, semiconductors such as logic chips and transistors, or programmable hardware devices such as field-programmable gate arrays and programmable logic devices; by software executed by various types of processors; or by a combination of such hardware circuits and software.
Although the method and device for tagging a user based on user state information of the present invention have been described in detail above through some exemplary embodiments, these embodiments are not exhaustive, and those skilled in the art can make various changes and modifications within the spirit and scope of the invention. Therefore, the invention is not limited to these embodiments, and its scope is defined only by the appended claims.

Claims (20)

1. A method for tagging a user based on user state information, comprising:
collecting the state information of the user;
performing text mining on the text information in the collected state information of the user to extract keywords; and
tagging the user using the keywords as tags.
2. The method for tagging a user based on user state information according to claim 1, wherein the text mining step comprises, for each item of the text information:
removing the stop words from the text information;
stemming the remaining words in the text information; and
performing N-gram segmentation on the text information to obtain the keywords of the text information.
3. The method for tagging a user based on user state information according to claim 1 or 2, further comprising:
checking whether the user has new state information; and
if so, performing the collecting step, the text mining step and the tagging step on the new state information.
4. The method for tagging a user based on user state information according to any one of claims 1 to 3, further comprising: storing the collected state information of the user.
5. The method for tagging a user based on user state information according to any one of claims 1 to 4, further comprising:
counting the occurrences of the keywords; and
selecting a predetermined number of the most frequently occurring keywords as the tags of the user.
6. The method for tagging a user based on user state information according to any one of claims 1 to 5, wherein the tags of each user constitute that user's tag data, the method further comprising:
for each of the plurality of users,
calculating the similarity between the tag data of this user and the tag data of each of the other users;
selecting a predetermined number of users with the greatest similarity to form a user set having similar interests to this user;
extracting the tags shared between this user and each user in the user set; and
gathering the users, among the plurality of users, who have tags identical to the shared tags into a user group, wherein the shared tags serve as the group name.
7. The method for tagging a user based on user state information according to claim 6, further comprising:
checking whether there is a new tag in the tag data of the plurality of users;
if there is a new tag, then for the user having the new tag,
searching for the user groups whose group name contains the new tag;
adding this user to the user groups found;
checking whether there is a new user;
if there is a new user, then
for each tag in the tag data of the new user, searching for the user groups whose group name contains this tag; and
adding the new user to each user group found.
8. The method for tagging a user based on user state information according to any one of claims 1 to 6, wherein the tags of the user constitute the user's tag data, the method further comprising:
selecting, according to the tag data of the user, an advertisement matching the tags of the user as the personalized advertisement of the user; and
pushing the personalized advertisement to the user.
9. The method for tagging a user based on user state information according to any one of claims 1 to 6, wherein the tags of the user constitute the user's tag data, the method further comprising:
specifying a range of users;
collecting the tag data of all users within the specified range;
counting the occurrences of each tag in the collected tag data; and
determining a predetermined number of the most frequently occurring tags as the trend information within the specified range of users.
10. The method for tagging a user based on user state information according to any one of claims 1 to 6, further comprising:
specifying, by the user, at least one tag;
for each of the at least one specified tag, searching the stored state information of the user for the corresponding text information;
obtaining, according to the timestamps of the text information found, the non-text information having the same timestamps; and
arranging the text information and the non-text information in chronological order.
11. A device for tagging a user based on user state information, comprising:
a state information collection unit, configured to collect the state information of the user;
a text mining unit, configured to perform text mining on the text information in the state information of the user collected by the state information collection unit, to extract keywords; and
a tagging unit, configured to tag the user using the keywords as tags.
12. The device for tagging a user based on user state information according to claim 11, wherein the text mining unit comprises:
a stop word removal unit, configured to remove the stop words from the text information;
a stemming unit, configured to stem the remaining words in the text information; and
an N-gram segmentation unit, configured to segment the text information using an N-gram algorithm, to obtain the keywords of the text information.
13. The device for tagging a user based on user state information according to claim 11 or 12, further comprising:
a state information checking unit, configured to check whether the user has new state information and, when new state information is found, to control the state information collection unit to collect the new state information.
14. The device for tagging a user based on user state information according to any one of claims 11 to 13, further comprising: a storage unit, configured to store the collected state information of the user.
15. The device for tagging a user based on user state information according to any one of claims 11 to 14, further comprising:
a first computing unit, configured to count the occurrences of the keywords obtained by the text mining unit; and
a selection unit, configured to select a predetermined number of the most frequently occurring keywords as the tags of the user, and to provide them to the tagging unit.
16. The device for tagging a user based on user state information according to any one of claims 11 to 15, wherein the tags of each user constitute that user's tag data, the device further comprising:
a similarity computing unit, configured to calculate, for each of the plurality of users, the similarity between the tag data of this user and the tag data of each of the other users;
a similar-interest user selection unit, configured to select a predetermined number of users with the greatest similarity to form a user set having similar interests to this user;
an extraction unit, configured to extract the tags shared between this user and each user in the user set; and
a gathering unit, configured to gather the users, among the plurality of users, who have tags identical to the shared tags into a user group, wherein the shared tags serve as the group name.
17. The device for tagging a user based on user state information according to claim 16, further comprising:
a checking unit, configured to check whether there is a new tag in the tag data of the plurality of users, or to check whether there is a new user; and
a search unit, configured to search, when the checking unit finds a new tag, for the user groups whose group name contains the new tag on behalf of the user having the new tag, and to search, when the checking unit finds a new user, for each tag in the tag data of the new user, for the user groups whose group name contains this tag;
wherein the gathering unit adds the user having the new tag to the corresponding user groups found, and adds the new user to each user group found.
18. The device for tagging a user based on user state information according to any one of claims 11 to 15, wherein the tags of the user constitute the user's tag data, the device further comprising:
an advertisement selection unit, configured to select, according to the tag data of the user, an advertisement matching the tags of the user as the personalized advertisement of the user; and
a push unit, configured to push the personalized advertisement to the user.
19. The device for tagging a user based on user state information according to any one of claims 11 to 15, wherein the tags of the user constitute the user's tag data, the device further comprising:
a specifying unit, configured to specify a range of users;
a tag data collection unit, configured to collect the tag data of the users within the specified range;
a second computing unit, configured to count the occurrences of each tag in the collected tag data; and
a determining unit, configured to determine a predetermined number of the most frequently occurring tags as the trend information.
20. The device for tagging a user based on user state information according to any one of claims 11 to 15, further comprising:
a tag specifying unit, with which the user specifies at least one tag;
a text information search unit, configured to search, for each of the at least one specified tag, the stored state information of the user for the corresponding text information;
a non-text information acquiring unit, configured to obtain, according to the timestamps of the text information found, the non-text information having the same timestamps; and
an arranging unit, configured to arrange the text information and the non-text information in chronological order.
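
As an illustration of the text mining and tag selection recited in the claims (stop word removal, stemming, N-gram segmentation, then choosing the most frequent keywords), the following sketch may help. It is not the claimed implementation: the stop word list, the toy suffix-stripping `stem` function (a stand-in for a real stemmer), the choice to emit every 1-gram up to n-gram, and the names `extract_keywords` and `tag_user` are all assumptions made here.

```python
import re
from collections import Counter
from typing import Iterable, List

# Hypothetical, deliberately tiny stop word list.
STOP_WORDS = {"a", "an", "the", "is", "am", "are", "to", "in", "on",
              "of", "and", "i", "my", "for", "at"}


def stem(word: str) -> str:
    """Toy suffix stripper standing in for a real stemming algorithm."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word


def extract_keywords(text: str, n: int = 2) -> List[str]:
    """Remove stop words, stem the rest, then emit every 1-gram up to n-gram."""
    words = re.findall(r"[a-z]+", text.lower())
    stems = [stem(w) for w in words if w not in STOP_WORDS]
    keywords = []
    for size in range(1, n + 1):
        for i in range(len(stems) - size + 1):
            keywords.append(" ".join(stems[i:i + size]))
    return keywords


def tag_user(status_texts: Iterable[str], k: int = 3, n: int = 2) -> List[str]:
    """Count keywords over all status texts and keep the k most frequent as tags."""
    counts: Counter = Counter()
    for text in status_texts:
        counts.update(extract_keywords(text, n))
    # Most frequent first; ties broken alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [keyword for keyword, _ in ranked[:k]]


if __name__ == "__main__":
    statuses = ["Watching the Olympics", "Olympics tickets booked",
                "I am watching football"]
    print(tag_user(statuses, k=2, n=1))
```

Re-running `tag_user` whenever new state information arrives would refresh the tags, which is one simple way to realize the update step of claim 3.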
CN200810086931A 2008-03-28 2008-03-28 Method and device for tagging user based on user state information Pending CN101547162A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN200810086931A CN101547162A (en) 2008-03-28 2008-03-28 Method and device for tagging user based on user state information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN200810086931A CN101547162A (en) 2008-03-28 2008-03-28 Method and device for tagging user based on user state information

Publications (1)

Publication Number Publication Date
CN101547162A true CN101547162A (en) 2009-09-30

Family

ID=41194059

Family Applications (1)

Application Number Title Priority Date Filing Date
CN200810086931A Pending CN101547162A (en) 2008-03-28 2008-03-28 Method and device for tagging user based on user state information

Country Status (1)

Country Link
CN (1) CN101547162A (en)

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102693248A (en) * 2011-04-14 2012-09-26 天脉聚源(北京)传媒科技有限公司 Network information searching method and system
CN102790727A (en) * 2011-05-19 2012-11-21 腾讯科技(深圳)有限公司 Method and system for dynamically pushing personal labels of users
CN102833176A (en) * 2011-06-13 2012-12-19 腾讯科技(深圳)有限公司 Method, device and system for obtaining information
CN103188133A (en) * 2011-12-29 2013-07-03 北京神州泰岳软件股份有限公司 Method and device for quick and convenient communication from client-side friend label
CN103218355A (en) * 2012-01-18 2013-07-24 腾讯科技(深圳)有限公司 Method and device for generating tags for user
CN103390002A (en) * 2012-05-09 2013-11-13 北京千橡网景科技发展有限公司 Method and equipment for updating POI (Point of Interest) tags
WO2014071782A1 (en) * 2012-11-09 2014-05-15 腾讯科技(深圳)有限公司 User interest recommendation method and apparatus
CN103942703A (en) * 2013-01-18 2014-07-23 北京米时科技股份有限公司 System and method for providing network advertisements through electronic card
CN104216881A (en) * 2013-05-29 2014-12-17 腾讯科技(深圳)有限公司 Method and device for recommending individual labels
CN104750464A (en) * 2013-12-25 2015-07-01 中国移动通信集团公司 User state sensing and managing methods and equipment
WO2015131748A1 (en) * 2014-03-07 2015-09-11 Tencent Technology (Shenzhen) Company Limited Method and apparatus for pushing target information
CN105373619A (en) * 2015-12-03 2016-03-02 中国联合网络通信集团有限公司 User big data based user group analysis method and system
CN105701176A (en) * 2016-01-04 2016-06-22 浪潮软件股份有限公司 Data integration method and apparatus
CN107171934A (en) * 2017-05-05 2017-09-15 沈思远 Information processing method, instant communication client and the system of immediate communication tool
CN107786943A (en) * 2017-11-15 2018-03-09 北京腾云天下科技有限公司 A kind of tenant group method and computing device
CN111125506A (en) * 2018-11-01 2020-05-08 百度在线网络技术(北京)有限公司 Interest circle subject determination method, device, server and medium
CN112134872A (en) * 2020-09-16 2020-12-25 江苏省未来网络创新研究院 Network system with multi-application-layer cloud computing function
WO2020258773A1 (en) * 2019-06-25 2020-12-30 广州视源电子科技股份有限公司 Method, apparatus, and device for determining pushing user group, and storage medium
US11657607B2 (en) 2020-11-13 2023-05-23 International Business Machines Corporation Non-intrusive image identification

Cited By (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102693248A (en) * 2011-04-14 2012-09-26 天脉聚源(北京)传媒科技有限公司 Network information searching method and system
CN102790727B (en) * 2011-05-19 2016-02-17 腾讯科技(深圳)有限公司 A kind of method and system of dynamic propelling movement individual subscriber label
CN102790727A (en) * 2011-05-19 2012-11-21 腾讯科技(深圳)有限公司 Method and system for dynamically pushing personal labels of users
CN102833176A (en) * 2011-06-13 2012-12-19 腾讯科技(深圳)有限公司 Method, device and system for obtaining information
CN103188133A (en) * 2011-12-29 2013-07-03 北京神州泰岳软件股份有限公司 Method and device for quick and convenient communication from client-side friend label
CN103188133B (en) * 2011-12-29 2016-08-17 北京神州泰岳软件股份有限公司 Client good friend's label UltraDialUp communication method and device
CN103218355A (en) * 2012-01-18 2013-07-24 腾讯科技(深圳)有限公司 Method and device for generating tags for user
CN103218355B (en) * 2012-01-18 2016-08-31 腾讯科技(深圳)有限公司 A kind of method and apparatus generating label for user
CN103390002A (en) * 2012-05-09 2013-11-13 北京千橡网景科技发展有限公司 Method and equipment for updating POI (Point of Interest) tags
WO2014071782A1 (en) * 2012-11-09 2014-05-15 腾讯科技(深圳)有限公司 User interest recommendation method and apparatus
CN103942703A (en) * 2013-01-18 2014-07-23 北京米时科技股份有限公司 System and method for providing network advertisements through electronic card
CN104216881A (en) * 2013-05-29 2014-12-17 腾讯科技(深圳)有限公司 Method and device for recommending individual labels
CN104750464B (en) * 2013-12-25 2019-09-10 中国移动通信集团公司 A kind of perception of User Status, management method and equipment
CN104750464A (en) * 2013-12-25 2015-07-01 中国移动通信集团公司 User state sensing and managing methods and equipment
WO2015131748A1 (en) * 2014-03-07 2015-09-11 Tencent Technology (Shenzhen) Company Limited Method and apparatus for pushing target information
US11196829B2 (en) 2014-03-07 2021-12-07 Tencent Technology (Shenzhen) Company Limited Method and apparatus for pushing target information
CN105373619B (en) * 2015-12-03 2018-12-07 中国联合网络通信集团有限公司 A kind of user group's analysis method and system based on user's big data
CN105373619A (en) * 2015-12-03 2016-03-02 中国联合网络通信集团有限公司 User big data based user group analysis method and system
CN105701176A (en) * 2016-01-04 2016-06-22 浪潮软件股份有限公司 Data integration method and apparatus
CN107171934B (en) * 2017-05-05 2019-10-25 沈思远 Information processing method, instant communication client and the system of immediate communication tool
CN107171934A (en) * 2017-05-05 2017-09-15 沈思远 Information processing method, instant communication client and the system of immediate communication tool
CN107786943B (en) * 2017-11-15 2020-09-01 北京腾云天下科技有限公司 User grouping method and computing device
CN107786943A (en) * 2017-11-15 2018-03-09 北京腾云天下科技有限公司 A kind of tenant group method and computing device
CN111125506A (en) * 2018-11-01 2020-05-08 百度在线网络技术(北京)有限公司 Interest circle subject determination method, device, server and medium
WO2020258773A1 (en) * 2019-06-25 2020-12-30 广州视源电子科技股份有限公司 Method, apparatus, and device for determining pushing user group, and storage medium
CN112134872A (en) * 2020-09-16 2020-12-25 江苏省未来网络创新研究院 Network system with multi-application-layer cloud computing function
CN112134872B (en) * 2020-09-16 2022-07-26 江苏省未来网络创新研究院 Network system with multi-application-layer cloud computing function
US11657607B2 (en) 2020-11-13 2023-05-23 International Business Machines Corporation Non-intrusive image identification

Similar Documents

Publication Publication Date Title
CN101547162A (en) Method and device for tagging user based on user state information
CN101876981B (en) A kind of method and device building knowledge base
CN102279851B (en) Intelligent navigation method, device and system
CN102053983B (en) Method, system and device for querying vertical search
CN102368788B (en) Information pushing method and apparatus thereof
CN104850546B (en) Display method and system of mobile media information
CN106503015A (en) A kind of method for building user's portrait
CN101901450A (en) Media content recommendation method and media content recommendation system
CN106504099A (en) A kind of system for building user's portrait
US9015158B2 (en) Contents creating device and contents creating method
CN103218355A (en) Method and device for generating tags for user
CN104216881A (en) Method and device for recommending individual labels
CN109451147B (en) Information display method and device
CN106415644A (en) Dynamic content item creation
CN107911448A (en) Content pushing method and device
CN103384883A (en) Semantic enrichment by exploiting Top-K processing
TWI417751B (en) Information providing device, information providing method, information application program, and information recording medium
CN106095797A (en) The data display method of handheld terminal, display system and client
CN110059237A (en) A kind of preference information acquisition system and its recommended method based on search engine
CN109726295A (en) Brand knowledge map display methods, device, figure server and storage medium
CN103123651A (en) Method of rapidly searching multiple same-kind paper, device and mobile equipment
CN100555283C (en) A kind of directly at the dissemination method and the system of user's relevant information
CN110297953A (en) Product information recommended method, device, computer equipment and storage medium
TWI399657B (en) A provider, a method of providing information, a program, and an information recording medium
CN114066533A (en) Product recommendation method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Open date: 20090930