WO2013117147A1 - Microblog sorting, searching, and display method and system - Google Patents
- Publication number
- WO2013117147A1 (application PCT/CN2013/071325)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- microblog
- information
- user
- content
- category
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2457—Query processing with adaptation to user needs
- G06F16/24578—Query processing with adaptation to user needs using ranking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/01—Social networking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/14—Digital output to display device ; Cooperation and interconnection of the display device with other functional units
Definitions
- The present invention relates to network technologies, and in particular to a method and system for sorting, searching, and displaying microblogs.
- Microblogs (Weibo) have become an important platform for users to communicate with each other and to present themselves. Users can search microblogs to find the information they are interested in.
- The traditional microblog sorting method generally sorts microblogs in chronological order, placing newer microblogs first.
- Because the traditional method mixes all users' microblogs together and arranges them chronologically, users must spend considerable time and effort to find the interesting and relevant microblogs among the many available.
- a microblog sorting method includes the following steps:
- the microblog information is displayed according to the sorted results.
- a microblog sorting system comprising:
- a microblog information obtaining module configured to obtain the microblog information requested by the user;
- a scoring module including a user information scoring module and a content information scoring module, wherein the user information scoring module is configured to extract the publishing user information from the microblog information and score the microblog information according to that publishing user information, and the content information scoring module is configured to extract the content information from the microblog information and score the microblog information according to the content information;
- a sorting module configured to sort the microblog information according to the score; and
- a display module configured to display the microblog information according to the sorted result.
- A microblog search method that is convenient for users to view is provided.
- A microblog search method sorts microblog search results according to the above microblog sorting method, wherein the step of acquiring the microblog information requested by the user comprises: performing a search according to a keyword input by the user to obtain the microblog information requested by the user.
- A microblog search system that is convenient for users is provided.
- A microblog search system comprises the above microblog sorting system, wherein the microblog information obtaining module is configured to perform a search according to a keyword input by the user to obtain the microblog information requested by the user.
- A microblog display method that is convenient for users to view is provided.
- A microblog display method sorts microblog request results according to the above microblog sorting method, wherein the step of obtaining the microblog information requested by the user includes: obtaining the microblog information requested by the user according to the microblog request information corresponding to the user identifier.
- A microblog display system that is convenient for users is provided.
- A microblog display system comprises the above microblog sorting system, wherein the microblog information obtaining module is configured to obtain the microblog information requested by the user according to the microblog request information corresponding to the user identifier.
- The above microblog sorting, searching, and display methods and systems extract the publishing user information and the content information from the microblog information, score the microblog information, and sort it according to the score, so that microblog information likely to be relevant to the user is ranked first and is convenient for the user to view.
- FIG. 1 is a schematic flow chart of a microblog sorting method in an embodiment
- FIG. 2 is a schematic flowchart of scoring microblog information by extracting the publishing user information from the microblog information in an embodiment
- FIG. 3 is a schematic flowchart of scoring microblog information by extracting the publishing user information from the microblog information in another embodiment
- FIG. 4 is a schematic flowchart of scoring microblog information by extracting the content information from the microblog information in an embodiment
- FIG. 5 is a schematic flowchart of training the features of the microblog topic categories in an embodiment
- FIG. 6 is a schematic flowchart of acquiring a training subset of a microblog topic category in an embodiment
- FIG. 7 is a schematic diagram of acquiring the training subset of the technology network category in an embodiment
- FIG. 8 is a schematic diagram showing the principle of a microblog sorting method in an embodiment
- FIG. 9 is a schematic structural diagram of a microblog sorting system in an embodiment
- FIG. 10 is a schematic structural diagram of a scoring module in an embodiment
- FIG. 11 is a schematic structural diagram of a user information scoring module in an embodiment
- FIG. 12 is a schematic structural diagram of a user information scoring module in another embodiment
- FIG. 13 is a schematic structural diagram of a content information scoring module in an embodiment
- FIG. 14 is a schematic structural diagram of a classification model training module in an embodiment.
- a microblog ordering method includes the following steps:
- Step S101: acquire the microblog information requested by the user.
- Step S102: extract the publishing user information and the content information from the microblog information, and score the microblog information.
- the microblog information is also scored high.
- Step S103: sort the microblog information according to the score.
- The microblog information is sorted by score: the higher the score of a piece of microblog information, the higher it is ranked.
- Step S104: display the microblog information according to the sorted result.
- The microblog sorting method extracts the publishing user information and the content information from the microblog information, scores the microblog information, and sorts it according to the score, so that microblog information relevant to the user is listed first. This makes it convenient for users to view microblog information.
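Steps S101 to S104 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Microblog` record, the `rank_microblogs` function, and the two scorer callables are hypothetical names, and adding the two partial scores is only one possible way to combine them.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Microblog:
    """Hypothetical record for one piece of microblog information."""
    post_id: str
    author_id: str
    text: str


def rank_microblogs(
    posts: List[Microblog],
    user_score: Callable[[Microblog], float],
    content_score: Callable[[Microblog], float],
) -> List[Microblog]:
    """Score each post (S102) and sort by descending score (S103)."""
    # Assumption: the overall score is the plain sum of both partial scores.
    scored = [(user_score(p) + content_score(p), p) for p in posts]
    # sorted() is stable, so posts with equal scores keep their input order.
    scored = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [post for _, post in scored]
```

The returned list is what step S104 would hand to the display layer; the acquisition in step S101 is assumed to have produced `posts` already.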
- In one embodiment, the step in S102 of extracting the publishing user information from the microblog information to score the microblog information includes:
- Step S112: obtain the microblog operation record of the publishing user, and calculate the activity level of the publishing user according to the microblog operation record.
- The microblog operation record may be looked up, by the ID of the publishing user, in a database that stores users' microblog operation records.
- The microblog operation record may include: whether the user is a VIP user, the microblog update frequency, the repost rate, the original-post rate, the number of reposts and comments received, the average word count per microblog, a funny score, and the like.
- The funny score can be obtained from other users' funny ratings of the publishing user's microblogs.
- The microblog operation record of the publishing user reflects that user's activity level. Specifically, if the publishing user is a VIP user, or the update frequency is high, the repost rate is high, the original-post rate is high, the number of reposts and comments is large, the average word count is large, or the funny score is high, the activity level of the publishing user can be set correspondingly high.
- Step S122: score the microblog information according to the activity level.
- If the publishing user has a high activity level, the score of the microblog can be raised accordingly, because microblogs published by highly active users are more likely to interest the user.
- Microblog information from highly active publishing users is therefore scored high and ranked first, so that microblog information more likely to interest the user appears at the front, making it convenient for users to view the microblog information they are interested in.
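One possible way to turn an operation record into an activity level is a weighted sum, sketched below. The source names the factors but gives no formula, so the `OperationRecord` fields, the equal weights, and the assumption that numeric fields are already scaled to [0, 1] are all illustrative.

```python
from dataclasses import dataclass


@dataclass
class OperationRecord:
    """Hypothetical operation record; numeric fields assumed pre-scaled to [0, 1]."""
    is_vip: bool
    update_frequency: float
    repost_rate: float
    original_rate: float
    repost_comment_count: float
    avg_word_count: float
    funny_score: float


# Assumed weights; the source only says that each factor raises the activity level.
ACTIVITY_WEIGHTS = {
    "is_vip": 1.0,
    "update_frequency": 1.0,
    "repost_rate": 1.0,
    "original_rate": 1.0,
    "repost_comment_count": 1.0,
    "avg_word_count": 1.0,
    "funny_score": 1.0,
}


def activity_level(record: OperationRecord) -> float:
    """Weighted sum of the operation-record fields (higher means more active)."""
    return sum(
        weight * float(getattr(record, name))
        for name, weight in ACTIVITY_WEIGHTS.items()
    )
```

Because every weight is positive, raising any single factor raises the activity level, which matches the qualitative rule in step S112.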
- In another embodiment, the step in S102 of extracting the publishing user information from the microblog information to score the microblog information includes:
- Step S132: acquire the personal information of the publishing user and of the requesting user, and calculate the similarity between the two.
- The personal information may be looked up, by user ID, in a database that stores users' personal information.
- The personal information may include: hobbies, education, occupation, region, personal signature, favorited microblog information, common friends, user type information, and the like.
- The user types can be classified into: technology type, entertainment type, sports type, art type, political type, and the like.
- the user type information includes a user type vector, and the component of the user type vector represents a score of the user biased toward a certain user type.
- the first component of the user type vector may be defined to represent the technology type score and the second component representation.
- For example, the user type vector can be expressed as (3, 4, ...).
- the user type corresponding to the component with the highest score among the components of the user type vector may be selected as the user type of the user.
- The user type vector may be set manually by the user, or may be obtained by counting the user types of the microblog users the user follows and of the user's friends. For example, if 5 of the microblog users followed by the user and the user's friends belong to the technology type, the component corresponding to the technology type in the user type vector may be set to 5.
- the value of the similarity between the microblog publishing user and the microblog requesting user may be increased.
- the category to which the user's hobby belongs may be found in a database storing the category to which the hobby belongs.
- the value of similarity between users can also be increased.
- The similarity of the user type information may also be obtained by calculating the distance between the user type vectors. The smaller the distance between two user type vectors, the higher the similarity of the user type information, and correspondingly the higher the similarity between the users.
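The counting rule and the distance-based comparison can be sketched as follows. The component order in `USER_TYPES`, the function names, and the mapping `1 / (1 + distance)` are assumptions made for illustration; the source only requires that a smaller distance yield a higher similarity.

```python
import math
from collections import Counter
from typing import Iterable, List, Sequence

# Assumed component order; the source fixes only that technology may come first.
USER_TYPES = ["technology", "entertainment", "sports", "art", "political"]


def user_type_vector(contact_types: Iterable[str]) -> List[int]:
    """Count how many followed users and friends fall into each user type."""
    counts = Counter(contact_types)
    return [counts[t] for t in USER_TYPES]


def dominant_user_type(vector: Sequence[float]) -> str:
    """Return the user type whose component has the highest score."""
    best = max(range(len(vector)), key=lambda i: vector[i])
    return USER_TYPES[best]


def type_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Map Euclidean distance to (0, 1]; a smaller distance gives a higher value."""
    return 1.0 / (1.0 + math.dist(a, b))
```

For instance, `user_type_vector(["technology"] * 5 + ["sports"])` reproduces the example in the text, where five technology-type contacts set the technology component to 5.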
- Step S142: acquire the interaction record between the publishing user and the requesting user, and calculate the degree of association between them according to the interaction record.
- The interaction record includes records of references, visits, comments, forwarding, and the like between the users. Specifically, if the number of references, visits, comments, and forwards between the users is high, the degree of association between them can be set correspondingly high.
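A degree of association that grows with the interaction counts might be computed as below. The per-interaction weights and the saturation constant `10.0` are hypothetical; the source states only that more interactions imply a higher degree of association.

```python
from typing import Dict, Optional

# Assumed per-interaction weights; the source gives no concrete values.
DEFAULT_WEIGHTS = {"references": 1.0, "visits": 1.0, "comments": 1.0, "forwards": 1.0}


def association_degree(
    counts: Dict[str, int],
    weights: Optional[Dict[str, float]] = None,
) -> float:
    """Return a value in [0, 1) that grows with the number of interactions."""
    weights = DEFAULT_WEIGHTS if weights is None else weights
    total = sum(w * counts.get(kind, 0) for kind, w in weights.items())
    # Saturating map (assumed) so that very frequent interaction cannot dominate.
    return total / (total + 10.0)
```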
- Step S152: score the microblog information according to the similarity and the degree of association.
- The score of the microblog information may be raised if the similarity between the publishing user and the requesting user is high or their degree of association is high.
- Such microblog information is thus scored high and ranked first. It is also the microblog information more likely to interest the requesting user, which makes it convenient for the user to view microblog information of interest.
- In one embodiment, the step in S102 of extracting the publishing user information from the microblog information to score the microblog information includes steps S112 to S152.
- The scoring in step S152 may build on the scoring in step S122. That is, the score obtained from the publishing user's activity level and the score obtained from the personal information similarity and degree of association between the publishing user and the requesting user are combined into a comprehensive score of the microblog information, and the proportion of each of the two scores in the comprehensive score can be set.
- In one embodiment, the step in S102 of extracting the content information from the microblog information to score the microblog information includes:
- Step S162: acquire the microblog content in the microblog information, and obtain the topic category vector of that content according to the microblog content and the features of the microblog topic categories.
- The microblog content includes the text of the microblog, that is, the content published by the publishing user, and may further include the comments on the microblog.
- Other microblog content published by the same publishing user around the time the microblog was published may also be obtained and combined with the microblog content.
- the microblog theme categories include: political and military, culture and art, financial stocks, emotional life, social legal system, entertainment gossip, technology network, healthy food, sports, automobile real estate, education job hunting, fashion tourism, and the like.
- Each component of the topic category vector represents the score of the microblog content toward a certain microblog topic category. For example, the first component represents the political and military score, the second component represents the culture and art score, and so on.
- For example, the topic category vector (5, 10, ...) indicates that the microblog content scores 5 for the political and military category and 10 for the culture and art category.
- The microblog topic category corresponding to the highest-scoring component is the microblog topic category to which the microblog content belongs.
- The features of the microblog topic categories may be trained in advance. Further, an existing naive Bayes text classification algorithm may be used to classify the microblog content and obtain its topic category vector; details are not described here.
- Step S172: acquire the historical microblog content of the requesting user, and obtain the topic category vector of that historical content according to the historical microblog content and the features of the microblog topic categories.
- The requesting user's microblog content from a recent time period (which can be preset) can be obtained.
- When several topic category vectors are obtained, their average may be used as the topic category vector of the requesting user's historical microblog content.
- Step S182: calculate the similarity between the microblog content in the microblog information and the historical microblog content of the requesting user according to their two topic category vectors.
- The similarity between the microblog content in the microblog information and the historical microblog content of the requesting user may be calculated from the distance between the two topic category vectors.
- The smaller the distance, the higher the similarity is set.
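Steps S172 and S182 can be sketched as averaging the history vectors and then comparing by distance. The function names are hypothetical, Euclidean distance is one possible choice of distance, and the `1 / (1 + distance)` mapping is an assumption; the source does not name a specific metric.

```python
import math
from typing import List, Sequence


def mean_vector(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Component-wise average of equally long topic category vectors."""
    if not vectors:
        raise ValueError("at least one vector is required")
    n = len(vectors)
    # zip(*vectors) walks the vectors column by column.
    return [sum(column) / n for column in zip(*vectors)]


def content_similarity(post_vec: Sequence[float], history_vec: Sequence[float]) -> float:
    """Smaller distance between the two topic vectors gives a higher similarity."""
    return 1.0 / (1.0 + math.dist(post_vec, history_vec))
```

A typical use would be `content_similarity(post_vec, mean_vector(history_vecs))`, where `history_vecs` holds one topic category vector per historical microblog.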
- Step S192: score the microblog information according to the similarity.
- Microblog information with a high similarity is scored high, and microblog information with a high score is ranked first.
- This top-ranked microblog content is more likely to attract the user's interest, making it convenient for users to view the microblogs they are interested in.
- The features of the microblog topic categories need to be trained in advance. To this end, the microblog sorting method further includes:
- Step S501 Acquire a preset microblog topic category.
- the microblog theme categories include: political and military, culture and art, financial stocks, emotional life, social legal system, entertainment gossip, technology network, healthy food, sports, automobile real estate, education job hunting, fashion tourism, and the like.
- Step S502: acquire a training subset for each microblog topic category.
- Step S502 includes: step S512, searching microblogs according to the keywords of a microblog topic category to obtain an initial training subset for that category; and step S522, repeating the following steps S532 and S542 a preset number of times: step S532, counting the high-frequency words in the initial training subset; and step S542, searching microblogs according to the high-frequency words and adding the search results to the initial training subset.
- The name of a microblog topic category and the words it splits into can be used as the keywords of that category. For the political and military category, for example, "political", "military", and "political military" can be used as keywords, and microblogs are searched according to these keywords.
- For the technology network category, for example, "technology", "network", and "technology network" may be added to the query set QS1; each word in QS1 is used as a keyword to search microblogs, yielding a training subset RS1; the high-frequency words in RS1 are then counted.
- the method for obtaining the training subset of the microblog topic category in this embodiment can obtain a large number of microblog training samples for each topic category, and provides a basis for extracting the features of each microblog topic category from the training subset.
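The iterative expansion in steps S512 to S542 can be sketched as a small loop. The `search` callable stands in for a microblog search backend and is hypothetical, as are `top_k` and the whitespace tokenization; Chinese text would need a word segmenter instead of `str.split`.

```python
from collections import Counter
from typing import Callable, Iterable, List, Set


def build_training_subset(
    seed_keywords: Iterable[str],
    search: Callable[[str], List[str]],
    rounds: int,
    top_k: int = 5,
) -> List[str]:
    """Grow a training subset by repeatedly searching for high-frequency words."""
    subset: List[str] = []
    seen_posts: Set[str] = set()
    queried: Set[str] = set()
    pending = list(seed_keywords)
    for _ in range(rounds + 1):  # one initial search (S512) plus `rounds` expansions
        for word in pending:
            queried.add(word)
            for text in search(word):  # S542: add the search results
                if text not in seen_posts:
                    seen_posts.add(text)
                    subset.append(text)
        # S532: count word frequencies in the subset collected so far.
        counts = Counter(w for text in subset for w in text.split())
        # Query only the most frequent words that have not been searched yet.
        pending = [w for w, _ in counts.most_common() if w not in queried][:top_k]
    return subset
```

Skipping words that were already queried is a design choice of this sketch to avoid repeating identical searches; the source does not specify it.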
- Step S503: extract the features of each microblog topic category from its training subset.
- An existing classification training method can be used to train on the microblog content in the training subset of each topic category and extract the features of each topic category; details are not repeated here.
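As one example of an existing classification training method, a multinomial naive Bayes model with add-one smoothing can be sketched as follows. Treating per-category word counts as the "features", assuming a uniform prior, and returning normalized posteriors as the topic category vector are all assumptions of this sketch, and the function names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def train_features(training_subsets: Dict[str, List[str]]) -> Dict[str, Counter]:
    """Per-category word counts, used here as the category features."""
    return {
        category: Counter(w for text in texts for w in text.split())
        for category, texts in training_subsets.items()
    }


def topic_category_vector(text: str, features: Dict[str, Counter]) -> Dict[str, float]:
    """Posterior probability of each category under multinomial naive Bayes."""
    vocabulary = set().union(*features.values())
    log_scores = {}
    for category, counts in features.items():
        total = sum(counts.values())
        # Uniform prior assumed; add-one smoothing avoids log(0) for unseen words.
        log_scores[category] = sum(
            math.log((counts[w] + 1) / (total + len(vocabulary)))
            for w in text.split()
        )
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_scores.values())
    exp_scores = {c: math.exp(s - peak) for c, s in log_scores.items()}
    norm = sum(exp_scores.values())
    return {c: v / norm for c, v in exp_scores.items()}
```

Here the vector is keyed by category name rather than by position; the category with the largest value plays the role of the highest-scoring component.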
- In one embodiment, before step S104, the method further includes:
- classifying the microblog content in the microblog information to obtain the display category to which the microblog content belongs.
- The display categories may include the microblog topic categories described above, such as political and military, culture and art, financial stocks, and the like.
- The microblog topic category to which the microblog content belongs may be obtained from the topic category vector acquired in step S162: the topic category corresponding to the highest-scoring component of the vector may be taken as the microblog topic category to which the microblog content belongs.
- In addition to the microblog topic categories, other display categories may be added, such as a friend category, a location category, a funny category, a help-forwarding category, an advertisement-activity category, and the like.
- Whether the microblog information belongs to the friend category can be judged according to whether the publishing user and the requesting user are friends.
- Whether the publishing user and the requesting user are friends can be looked up, by the two users' IDs, in a database that stores friend relationships.
- Whether the microblog information belongs to the location category can be judged according to whether the addresses of the publishing user and the requesting user belong to the same region (which can be set to a county, a district, etc.).
- Whether the microblog belongs to the funny category can be judged according to whether the publishing user's funny score, looked up by user ID in a database that stores users' funny scores, is greater than a preset threshold.
- A user's funny score can be obtained from other users' funny ratings of that user.
- Whether the microblog information belongs to the help-forwarding category or the advertisement-activity category can be judged according to whether high-frequency words such as "help" appear in the microblog content.
- the microblog display category may also include a hot topic class.
- High-frequency records can be obtained by parsing web page content; each high-frequency record is scored according to the historical microblog content of the requesting user; and microblogs in the search results are classified into the hot topic category according to the high-frequency record scores.
- The web page content can be parsed with the existing open source tool Html-parser to obtain phrases whose number of occurrences exceeds a preset threshold, that is, the high-frequency records.
- A high-frequency record can be scored according to its similarity to the requesting user's historical microblog content. Specifically, the number of times the high-frequency record appears in the microblog content that the requesting user has posted, forwarded, and commented on can be counted, and the record is scored according to that count.
- The high-frequency records ranked within a preset number of top positions may then be selected, and microblog information whose content contains those records is classified into the hot topic category.
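The hot topic classification can be sketched in three small functions. Page parsing is out of scope here: `phrases` is assumed to be the list of phrases already extracted from web pages, and all function names and parameters are hypothetical.

```python
from collections import Counter
from typing import Iterable, List


def high_frequency_records(phrases: Iterable[str], threshold: int) -> List[str]:
    """Phrases whose number of occurrences exceeds the preset threshold."""
    counts = Counter(phrases)
    return [phrase for phrase, n in counts.items() if n > threshold]


def score_record(record: str, user_history: List[str]) -> int:
    """Occurrences of the record in the requesting user's posts, forwards, and comments."""
    return sum(text.count(record) for text in user_history)


def hot_topic_posts(
    posts: List[str],
    records: List[str],
    user_history: List[str],
    top_n: int,
) -> List[str]:
    """Posts containing one of the top_n highest-scoring high-frequency records."""
    ranked = sorted(records, key=lambda r: score_record(r, user_history), reverse=True)
    hot = ranked[:top_n]
    return [text for text in posts if any(record in text for record in hot)]
```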
- In this case, the specific process of step S104 is: displaying the microblog information according to the display category to which the microblog content belongs and the sorted result.
- The microblog information may be grouped by display category, with higher-scoring microblog information placed first within each display category.
- Displaying the microblog information in multiple display categories makes it convenient for the user to select and view the categories of interest.
- Within each display category, microblogs are displayed in order of score. The top-ranked microblogs are those whose publishing user is more active, whose publishing user's personal information is highly similar to that of the requesting user, or whose publishing user is closely associated with the requesting user, which makes it convenient for the user to view microblogs of interest.
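Grouping by display category while keeping the score order can be sketched as follows. The `(post_id, display_category, score)` triple layout is an assumption; a microblog that belongs to several display categories would simply appear as several triples.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_for_display(
    scored_posts: List[Tuple[str, str, float]],
) -> Dict[str, List[str]]:
    """Group (post_id, display_category, score) triples, highest score first per group."""
    groups: Dict[str, List[str]] = defaultdict(list)
    # Sorting once by descending score keeps every group in score order.
    for post_id, category, _score in sorted(
        scored_posts, key=lambda triple: triple[2], reverse=True
    ):
        groups[category].append(post_id)
    return dict(groups)
```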
- FIG. 8 is a schematic diagram of the principle of the microblog ranking method in one embodiment:
- A microblog sorting method can score microblog information according to the publishing user information and the content information of the microblog.
- The publishing user information score is denoted U, and the content information score is denoted C.
- The publishing user information score U can be calculated from the publishing user's activity score A, the personal information similarity score P between the publishing user and the requesting user, and the association degree score R between the publishing user and the requesting user.
- The activity score A can be obtained from the following information about the publishing user: whether the user is a VIP user, the microblog update frequency, the repost rate, the original-post rate, the number of reposts and comments received, the average word count per microblog, the funny score, and the like.
- The personal information similarity score P can be obtained from the following information: hobbies, education, occupation, region, personal signature, favorited microblog information, common friends, user type information, and the like.
- The association degree score R can be obtained from the interaction record between the publishing user and the requesting user, which includes records of references, visits, comments, forwarding, and the like.
- The content information score C may be calculated from the similarity between the microblog content and the historical microblog content of the requesting user, where the similarity may be calculated from the distance between the topic category vector of the microblog and the topic category vector of the requesting user's historical microblogs. Finally, the above scores can be combined to obtain a comprehensive score of the microblog information.
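The combination of A, P, R, and C into a comprehensive score can be sketched as two weighted sums. The source says only that the scores are combined and that proportions can be set, so the linear form, the default weights, and the function names below are assumptions.

```python
from typing import Tuple


def user_information_score(
    a: float,
    p: float,
    r: float,
    weights: Tuple[float, float, float] = (1 / 3, 1 / 3, 1 / 3),
) -> float:
    """U as a weighted sum of activity A, similarity P, and association R."""
    w_a, w_p, w_r = weights
    return w_a * a + w_p * p + w_r * r


def comprehensive_score(u: float, c: float, user_share: float = 0.5) -> float:
    """Blend the user information score U with the content information score C."""
    if not 0.0 <= user_share <= 1.0:
        raise ValueError("user_share must lie in [0, 1]")
    # user_share is the configurable proportion of U in the comprehensive score.
    return user_share * u + (1.0 - user_share) * c
```

Setting `user_share` to 1.0 ranks purely by publishing user information, and 0.0 ranks purely by content similarity.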
- a microblog ranking system includes a microblog information obtaining module 10, a scoring module 20, a sorting module 30, and a display module 40, wherein:
- the microblog information obtaining module 10 is configured to obtain microblog information requested by the user.
- The scoring module 20 includes a user information scoring module 201 and a content information scoring module 202, as shown in FIG. 10, wherein the user information scoring module 201 is configured to extract the publishing user information from the microblog information and score the microblog information according to that publishing user information, and the content information scoring module 202 is configured to extract the content information from the microblog information and score the microblog information according to the content information.
- the user information scoring module 201 and the content information scoring module 202 score the microblog information in a comprehensive manner.
- the comprehensive score of the microblog information is also high.
- the sorting module 30 is configured to sort the microblog information according to the above score.
- The sorting module 30 sorts the microblog information by the above comprehensive score: the higher the score of a piece of microblog information, the higher it is ranked.
- The display module 40 is configured to display the microblog information according to the sorted result.
- The microblog sorting system extracts the publishing user information and the content information from the microblog information, scores the microblog information, and sorts it according to the score, so that microblog information relevant to the user is arranged first. This makes it convenient for users to view microblog information.
- the user information scoring module 201 includes an activity calculation unit 211 and a first scoring unit 221, wherein:
- the activity calculation unit 211 is configured to acquire a microblog operation record of the microblog publishing user, and calculate the activity level of the microblog publishing user according to the microblog operation record.
- The activity calculation unit 211 may look up the microblog operation record, by the ID of the publishing user, in a database that stores users' microblog operation records.
- The microblog operation record may include: whether the user is a VIP user, the microblog update frequency, the repost rate, the original-post rate, the number of reposts and comments received, the average word count per microblog, a funny score, and the like.
- The funny score can be obtained from other users' funny ratings of the publishing user's microblogs.
- The microblog operation record of the publishing user reflects that user's activity level. Specifically, if the publishing user is a VIP user, or the update frequency is high, the repost rate is high, the original-post rate is high, the number of reposts and comments is large, the average word count is large, or the funny score is high, the activity level of the publishing user can be set correspondingly high.
- the first scoring unit 221 is configured to score the microblog information according to the activity level.
- if the microblog publishing user has a high activity level, the first scoring unit 221 can correspondingly increase the score of the microblog, because a microblog published by a highly active user is more likely to interest the requesting user.
- the microblog information of the microblog publishing user with high activity is also scored high, and the microblog information with high score is ranked first, and the microblog information that is more likely to cause user interest is ranked in front. It is convenient for users to view the Weibo information they are interested in.
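
One way to turn the operation record into an activity level is sketched below. The field names, the caps (10 updates per day, 1000 forwards and comments, 140 words, a 0 to 10 funny scale), and the equal weighting are all illustrative assumptions; the text only states that higher values of these factors correspond to higher activity.

```python
from dataclasses import dataclass


@dataclass
class OperationRecord:
    is_vip: bool
    updates_per_day: float
    repost_rate: float          # fraction of posts that are reposts, 0..1
    original_rate: float        # fraction of posts that are original, 0..1
    forwards_and_comments: int  # forwards and comments received
    average_word_count: float
    funny_score: float          # hypothetical 0..10 scale


def activity_level(record):
    """Average of seven factors, each scaled into [0, 1] (assumed scaling)."""
    factors = [
        1.0 if record.is_vip else 0.0,
        min(record.updates_per_day / 10.0, 1.0),
        record.repost_rate,
        record.original_rate,
        min(record.forwards_and_comments / 1000.0, 1.0),
        min(record.average_word_count / 140.0, 1.0),
        min(record.funny_score / 10.0, 1.0),
    ]
    return sum(factors) / len(factors)
```

The first scoring unit could then add a term proportional to `activity_level(record)` to the score of every microblog published by that user.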
- the user information scoring module 201 includes a personal information similarity calculation unit 231, an association degree calculation unit 241, and a second scoring unit 251, wherein:
- the personal information similarity calculation unit 231 is configured to acquire the personal information of the microblog publishing user and the personal information of the microblog requesting user, and calculate the similarity between the personal information of the microblog publishing user and the personal information of the microblog requesting user.
- the personal information similarity calculation unit 231 can use a user's ID to look up the corresponding personal information in a database in which users' personal information is stored.
- the personal information may include: hobbies, education, profession, location, personal signature, favorited microblog information, common friends, user type information, and the like.
- the user types can be classified into: technology type, entertainment type, sports type, art type, political type, and the like.
- the user type information includes a user type vector, each component of which represents the score of the user's leaning toward a particular user type.
- the first component of the user type vector may be defined to represent the technology type score, the second component to represent the score of another user type, and so on.
- the user type vector can be expressed as (3, 4, ...).
- the user type corresponding to the component with the highest score among the components of the user type vector may be selected as the user type of the user.
- the user type vector may be set manually by the user, or may be obtained by counting the user types of the microblog users that the user follows and of the user's friends. For example, if 5 of the microblog users followed by the user and the user's friends belong to the technology type, the component corresponding to the technology type in the user type vector may be set to 5.
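
The counting approach can be sketched as below. The order of `USER_TYPES` and the function names are assumptions made for the example; the input is simply the list of user types of the followed users and friends.

```python
from collections import Counter

# Assumed component order of the user type vector.
USER_TYPES = ["technology", "entertainment", "sports", "art", "political"]


def user_type_vector(contact_types):
    """Count how many followed users and friends fall into each user type."""
    counts = Counter(contact_types)
    return [counts.get(user_type, 0) for user_type in USER_TYPES]


def dominant_user_type(vector):
    """Pick the user type whose component has the highest score."""
    best_index = max(range(len(vector)), key=vector.__getitem__)
    return USER_TYPES[best_index]
```

For a user whose contacts include five technology-type users, the technology component is 5, matching the example in the text.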
- the personal information similarity calculation unit 231 can increase the value of the similarity between the microblog publishing user and the microblog requesting user.
- the personal information similarity calculation unit 231 may look up the category to which a user's hobby belongs in a database in which the categories of hobbies are stored.
- if the users' academic qualifications are the same, for example both undergraduate or both doctoral, the value of the similarity between the users can also be increased.
- the similarity of the user type information may also be obtained by calculating the distance between the user type vectors. The smaller the distance between two user type vectors, the higher the similarity of the user type information, and the higher the similarity between the corresponding users.
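
A rough sketch of how these signals might be combined into one similarity value follows. The `Profile` type, the fixed increment of 1.0 per matching field, the assumption that a shared hobby category raises the similarity, the Euclidean distance, and the `1 / (1 + distance)` mapping are all illustrative choices; the text only requires that matching attributes raise the similarity and that a smaller vector distance means a higher similarity.

```python
import math
from dataclasses import dataclass


@dataclass
class Profile:
    hobby_category: str  # category looked up for the user's hobby
    education: str       # e.g. "undergraduate", "doctoral"
    type_vector: list    # user type vector, same length for all users


def personal_similarity(a, b):
    """Add a fixed increment per matching field plus a distance-based term."""
    similarity = 0.0
    if a.hobby_category == b.hobby_category:
        similarity += 1.0
    if a.education == b.education:
        similarity += 1.0
    # Smaller distance between user type vectors gives a larger contribution.
    distance = math.dist(a.type_vector, b.type_vector)
    similarity += 1.0 / (1.0 + distance)
    return similarity
```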
- the association degree calculation unit 241 is configured to acquire an interaction record between the microblog publishing user and the microblog requesting user, and calculate the degree of association between the microblog publishing user and the microblog requesting user according to the interaction record.
- the interaction record includes records of references, accesses, comments, and forwards between the users. Specifically, if the number of references, accesses, comments, and forwards between the users is high, the association degree calculation unit 241 can set the degree of association between the users to be high.
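
As an illustration, the degree of association can be computed as a weighted count of interactions between the two users. The tuple layout `(source, target, kind)` and the per-kind weights are hypothetical; the text does not specify how the four kinds of interaction are weighted.

```python
from collections import Counter

# Hypothetical weights per interaction kind.
INTERACTION_WEIGHTS = {
    "reference": 1.0,
    "access": 0.5,
    "comment": 2.0,
    "forward": 2.0,
}


def association_degree(interactions, user_a, user_b):
    """Weighted count of interactions in either direction between two users."""
    pair = {user_a, user_b}
    counts = Counter(
        kind for source, target, kind in interactions if {source, target} == pair
    )
    return sum(INTERACTION_WEIGHTS.get(kind, 0.0) * n for kind, n in counts.items())
```

More interactions of any kind yield a larger value, which the second scoring unit can use to raise the score of the publisher's microblogs.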
- the second scoring unit 251 is configured to score the microblog information according to the similarity and the degree of association described above.
- if the similarity or the degree of association is high, the second scoring unit 251 may increase the score of the microblog information.
- the score of such microblog information is then also high, and microblog information with a high score is ranked first. This microblog information is also more likely to interest the microblog requesting user, so the user can conveniently view the microblog information of interest.
- the user information scoring module 201 includes an activity calculation unit 211, a first scoring unit 221, a personal information similarity calculation unit 231, an association degree calculation unit 241, and a second scoring unit 251.
- the scoring of the microblog information by the second scoring unit 251 can be performed on the basis of the scoring by the first scoring unit 221. That is, the score obtained from the activity level of the microblog publishing user and the score obtained from the personal information similarity and degree of association between the microblog publishing user and the microblog requesting user are combined into the comprehensive score of the microblog information. The proportion of each of the two scores in the comprehensive score can be set and is not limited here.
- the content information scoring module 202 includes a category vector extracting unit 212, a content similarity calculating unit 222, and a third scoring unit 232, wherein:
- the category vector extracting unit 212 is configured to acquire the microblog content in the microblog information, and to obtain the topic category vector of that microblog content according to the microblog content and the features of the microblog topic categories.
- the microblog content includes the text content of the microblog, that is, the content published by the microblog user, and the microblog content may further include the comment content of the microblog.
- the category vector extracting unit 212 may also obtain the microblog content that the publishing user of the microblog published within a (presettable) time window around the microblog's publishing time, and combine these pieces of microblog content.
- the microblog topic categories include: politics and military, culture and art, finance and stocks, emotional life, society and law, entertainment gossip, technology and internet, health and food, sports, automobiles and real estate, education and job hunting, fashion and travel, and the like.
- each component of the topic category vector represents the score of the microblog content's leaning toward a particular microblog topic category. For example, the first component represents the politics and military score, the second component represents the culture and art score, and so on.
- the topic category vector (5, 10, ...) indicates that the microblog content scores 5 for the politics and military category and 10 for the culture and art category.
- the microblog topic category corresponding to the component with the highest score is the microblog topic category to which the microblog content belongs.
- the feature of the microblog topic category may be pre-trained.
- the category vector extracting unit 212 may use the existing naive Bayesian text classification algorithm to classify the microblog content and obtain its topic category vector; the algorithm is not described in detail here.
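
Since the text leaves the classifier unspecified beyond naming naive Bayes, the following is only a sketch of one common variant: multinomial naive Bayes with add-one smoothing, written from scratch. It assumes the microblogs are already tokenized (word segmentation is out of scope here) and that the "features" of each category are its word counts; the function names and the normalization of the scores into posterior probabilities are choices made for this example.

```python
import math
from collections import Counter


def train_features(training_subsets):
    """training_subsets maps category -> list of tokenized microblogs."""
    word_counts, vocabulary = {}, set()
    for category, documents in training_subsets.items():
        counts = Counter(token for doc in documents for token in doc)
        word_counts[category] = counts
        vocabulary.update(counts)
    total_docs = sum(len(docs) for docs in training_subsets.values())
    priors = {c: len(docs) / total_docs for c, docs in training_subsets.items()}
    return {"word_counts": word_counts, "vocabulary": vocabulary, "priors": priors}


def topic_category_vector(tokens, features, categories):
    """One score per category, in the order given by `categories`."""
    vocab_size = len(features["vocabulary"])
    log_scores = []
    for category in categories:
        counts = features["word_counts"][category]
        total = sum(counts.values())
        log_p = math.log(features["priors"][category])
        for token in tokens:
            # Add-one smoothing keeps unseen words from zeroing the score.
            log_p += math.log((counts[token] + 1) / (total + vocab_size))
        log_scores.append(log_p)
    # Convert log scores to normalized posterior probabilities.
    peak = max(log_scores)
    weights = [math.exp(score - peak) for score in log_scores]
    norm = sum(weights)
    return [weight / norm for weight in weights]
```

The category whose component is largest is then the topic category to which the microblog content belongs.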
- the category vector extracting unit 212 is further configured to acquire the historical microblog content of the microblog requesting user, and to obtain the topic category vector of that historical microblog content according to the historical microblog content and the features of the microblog topic categories.
- the category vector extracting unit 212 may acquire the microblog content that the microblog requesting user published in a recent (presettable) time period.
- the category vector extracting unit 212 may take the average of the topic category vectors of the individual historical microblogs as the topic category vector of the historical microblog content of the microblog requesting user.
- the content similarity calculation unit 222 is configured to calculate the similarity between the microblog content in the microblog information and the historical microblog content of the microblog requesting user, according to the topic category vector of the microblog content in the microblog information and the topic category vector of the historical microblog content of the microblog requesting user.
- the content similarity calculation unit 222 can calculate that similarity from the distance between the two topic category vectors. Preferably, the smaller the distance, the higher the similarity is set.
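
The averaging and distance steps can be sketched as follows. Euclidean distance and the `1 / (1 + distance)` mapping are assumptions; the text only requires that a smaller distance produces a higher similarity. The history list is assumed to be non-empty.

```python
import math


def average_vector(vectors):
    """Component-wise mean of equally long topic category vectors."""
    count = len(vectors)
    return [sum(components) / count for components in zip(*vectors)]


def content_similarity(post_vector, history_vectors):
    """Similarity between one microblog and the requesting user's history."""
    profile = average_vector(history_vectors)
    return 1.0 / (1.0 + math.dist(post_vector, profile))
```

A microblog whose topic category vector coincides with the user's averaged history vector gets the maximum similarity of 1.0, and the value decays toward 0 as the vectors move apart.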
- the third scoring unit 232 is configured to score the microblog information according to the similarity.
- the higher the similarity, the higher the score that the third scoring unit 232 gives the microblog information.
- microblog information with high similarity therefore also scores high, and microblog information with a high score is ranked at the front.
- this top-ranked microblog content is more likely to interest the user, so the user can conveniently view the microblogs of interest.
- the microblog ranking system further includes a classification model training module 50, which is used to train samples of each microblog topic category, and extract features of each microblog topic category.
- the classification model training module 50 includes a topic category acquisition module 501, a training set acquisition module 502, and a feature extraction module 503:
- the topic category obtaining module 501 is configured to acquire a preset microblog topic category.
- the microblog topic categories include: politics and military, culture and art, finance and stocks, emotional life, society and law, entertainment gossip, technology and internet, health and food, sports, automobiles and real estate, education and job hunting, fashion and travel, and the like.
- the training set acquisition module 502 is configured to acquire a training subset of the microblog topic category.
- the training set obtaining module 502 may search microblogs according to the keywords of the microblog topic category to obtain an initial training subset of the microblog topic category, and then repeat the following steps a preset number of times: count the high-frequency words in the initial training subset; search microblogs according to the high-frequency words and add the search results to the initial training subset.
- the training set obtaining module 502 can use the name of the microblog topic category and the words it splits into as the keywords of the category. For the politics and military category, for example, "politics", "military", and "politics and military" can be used as keywords, and a search on these keywords yields the initial training subset of the category. The initial training subset is then preprocessed by word segmentation and stop-word filtering, and the high-frequency words in it are counted. The high-frequency words and combinations of high-frequency words can then be used as keywords for further searches to obtain more microblog training samples. The steps of counting the high-frequency words in the initial training subset, searching microblogs according to the high-frequency words, and adding the search results to the initial training subset are repeated the preset number of times.
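
The iterative expansion of the training subset can be sketched like this. The `search` callable stands in for whatever microblog search backend is available and is purely hypothetical, as are the tiny in-memory corpus, the English stop-word list, and the whitespace tokenization (real microblog text would need proper word segmentation).

```python
from collections import Counter

STOP_WORDS = {"the", "a", "of", "and", "in"}  # illustrative stop-word list

# Hypothetical in-memory corpus standing in for a microblog search service.
CORPUS = [
    "army budget vote",
    "army budget review",
    "budget debate tonight",
    "football final",
]


def search_corpus(term):
    """Return every corpus post that contains the term as a whole word."""
    return [post for post in CORPUS if term in post.lower().split()]


def build_training_subset(search, keywords, rounds, top_n=2):
    """Seed with keyword search, then expand with high-frequency words."""
    subset, seen = [], set()

    def add_results(terms):
        for term in terms:
            for post in search(term):
                if post not in seen:
                    seen.add(post)
                    subset.append(post)

    add_results(keywords)
    for _ in range(rounds):
        counts = Counter(
            word
            for post in subset
            for word in post.lower().split()
            if word not in STOP_WORDS
        )
        high_frequency = [word for word, _count in counts.most_common(top_n)]
        add_results(high_frequency)
    return subset
```

Starting from the keyword "army", one round picks up "budget" as a high-frequency word and pulls in the budget debate post, while the unrelated football post stays out of the subset.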
- the method for obtaining the training subset of the microblog topic category in this embodiment can obtain a large number of microblog training samples for each topic category, and provides a basis for extracting the features of each microblog topic category from the training subset.
- the feature extraction module 503 is configured to extract features of the microblog topic category from the training subset.
- the feature extraction module 503 can use an existing classification training method to train on the microblog content in the training subset of each topic category and extract the features of each topic category; the method is not described in detail here.
- the microblog ranking system further includes a display category classification module (not shown) for classifying the microblog content in the microblog information according to preset microblog display categories, obtaining the display category to which the microblog content belongs.
- the display categories may include the microblog topic categories described above, such as politics and military, culture and art, finance and stocks, and the like.
- the microblog topic category to which the microblog content belongs may be obtained from the topic category vector acquired by the category vector extracting unit 212: the topic category corresponding to the component with the highest score in the topic category vector may be taken as the microblog topic category to which the microblog content belongs.
- the display category classification module may use the ID of the microblog publishing user and the ID of the microblog requesting user to look up, in a database in which friend relationships are stored, whether the two users are friends.
- whether the microblog information belongs to the location class can be judged according to whether the addresses of the microblog publishing user and the microblog requesting user belong to the same region (which can be set as a county, a district, etc.).
- whether the microblog belongs to the funny class can be judged by using the ID of the microblog publishing user to look up the user's funny score in a database in which users' funny scores are stored, and checking whether that score is greater than a preset threshold.
- the user's funny score can be obtained from the funny ratings that other users give the user.
- whether the microblog information belongs to the help forwarding class or the advertisement activity class can be judged according to whether words such as "help" or other high-frequency words of these classes appear in the microblog content.
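
The rule-based checks above might look like the sketch below. The `Context` type, the class labels (including a "friend" class, which is inferred from the friend lookup), the threshold of 8.0, and the help word list are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Context:
    are_friends: bool             # result of the friend relationship lookup
    publisher_region: str         # e.g. county or district of the publisher
    requester_region: str         # e.g. county or district of the requester
    publisher_funny_score: float  # stored funny score of the publisher


FUNNY_THRESHOLD = 8.0                    # hypothetical preset threshold
HELP_WORDS = ("help", "please forward")  # hypothetical marker words


def display_categories(content, ctx):
    """Return every rule-based display class that the microblog falls into."""
    categories = []
    if ctx.are_friends:
        categories.append("friend")
    if ctx.publisher_region == ctx.requester_region:
        categories.append("location")
    if ctx.publisher_funny_score > FUNNY_THRESHOLD:
        categories.append("funny")
    if any(word in content.lower() for word in HELP_WORDS):
        categories.append("help forwarding")
    return categories
```

A microblog can fall into several display classes at once, so the function returns a list rather than a single label.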
- the microblog display category may also include a hot topic class.
- the display category classification module can parse webpage content to obtain high frequency records, score the high frequency records according to the historical microblog content of the microblog requesting user, and classify microblogs in the search results into the hot topic class according to the scores of the high frequency records.
- the display category classification module can parse the webpage content with the existing open source tool Html-parser and obtain the phrases whose number of occurrences exceeds a preset threshold; these phrases are the high frequency records.
- a high frequency record can be scored according to its similarity to the historical microblog content of the microblog requesting user. Specifically, the number of times the high frequency record appears in the microblog content that the microblog requesting user has posted, forwarded, and commented on can be counted, and the high frequency record is scored according to that number.
- the high frequency records in the top preset number of positions may be selected, and the microblog information whose microblog content contains these high frequency records is classified into the hot topic class.
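
Assuming the high frequency records have already been extracted from webpages, the scoring and selection steps can be sketched as below. Using the raw occurrence count as the score and plain substring matching are simplifications chosen for this example.

```python
def score_records(records, history):
    """Score each record by how often it occurs in the user's own microblogs."""
    return {record: sum(post.count(record) for post in history) for record in records}


def hot_topic_posts(posts, records, history, top_k):
    """Keep the posts that mention one of the top_k highest scoring records."""
    scores = score_records(records, history)
    top_records = sorted(records, key=lambda r: scores[r], reverse=True)[:top_k]
    return [post for post in posts if any(r in post for r in top_records)]
```

Here `history` holds the microblogs the requesting user has posted, forwarded, or commented on, and `top_k` plays the role of the preset number of top positions.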
- the display module 40 is configured to display the microblog information according to the display category to which the microblog content belongs and the result of the above sorting.
- the display module 40 can display the microblog information according to each display category, and arrange the microblog information with high scores in the front of each display category.
- dividing the microblog information into multiple display categories makes it convenient for the user to select and view the microblog category of interest.
- within each display category, the microblogs are displayed in order of their scores. The microblogs ranked at the top are those whose publishing users are more active, whose publishing user's personal information is highly similar to that of the microblog requesting user, or whose publishing user has a high degree of association with the microblog requesting user, so the user can conveniently view the microblogs of interest.
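
Grouping by display category and ordering by score inside each group can be sketched as follows; the `(post_id, display_category, score)` tuple layout is an assumption made for the example.

```python
from collections import defaultdict


def group_and_sort(posts):
    """posts: iterable of (post_id, display_category, score) tuples."""
    grouped = defaultdict(list)
    for post_id, category, score in posts:
        grouped[category].append((score, post_id))
    # Within each category, list post ids from highest to lowest score.
    return {
        category: [
            post_id
            for _score, post_id in sorted(items, key=lambda item: item[0], reverse=True)
        ]
        for category, items in grouped.items()
    }
```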
- a microblog search method sorts the microblog search results according to the microblog sorting method described above, wherein the step of obtaining the microblog information requested by the user includes: searching according to the keyword input by the user to obtain the microblog information requested by the user.
- a traditional search engine may be used to search on the keywords input by the user and find the microblog information matching the keywords, thereby obtaining the microblog information requested by the user.
- a microblog search system includes the above microblog ranking system, wherein the microblog information obtaining module 10 is configured to perform a search according to a keyword input by a user, and obtain microblog information requested by the user.
- the microblog information obtaining module 10 may use a traditional search engine to search on the keywords input by the user and find the microblog information matching the keywords, thereby obtaining the microblog information requested by the user.
- a microblog display method sorts the microblog request results according to the microblog sorting method described above, wherein the step of obtaining the microblog information requested by the user includes: obtaining the microblog information requested by the user according to the microblog request information corresponding to the user identifier.
- the microblog request information corresponding to the user identifier may be preset as a request for the microblog information of the group of people corresponding to the user identifier.
- the user identifier (such as the user ID) can be used to find the people the user follows or listens to and the user's friends, and to obtain the microblog information that these people and friends published in a recent period, thereby obtaining the microblog information requested by the user.
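
Collecting the requested microblogs for the display case might look like the sketch below. The dictionary-based `following` and `friends` lookups, the post layout, and the one-day default window are assumptions; the text only speaks of a recent period.

```python
from datetime import datetime, timedelta


def requested_microblogs(user_id, following, friends, posts, now,
                         window=timedelta(days=1)):
    """Recent posts by people the user follows or is friends with."""
    sources = set(following.get(user_id, ())) | set(friends.get(user_id, ()))
    cutoff = now - window
    return [
        post
        for post in posts
        if post["author"] in sources and post["published"] >= cutoff
    ]
```

The returned list is the input to the scoring and sorting steps of the microblog sorting method.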
- a microblog display system comprising the microblog ranking system, wherein the microblog information obtaining module 10 is configured to obtain the microblog information requested by the user according to the microblog request information corresponding to the user identifier.
- the microblog request information corresponding to the user identifier may be preset as a request for the microblog information of the group of people corresponding to the user identifier.
- the microblog information obtaining module 10 may use the user identifier (such as the user ID) to find the people the user follows or listens to and the user's friends, and obtain the microblog information that these people and friends published in a recent period, thereby obtaining the microblog information requested by the user.
- the storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), or a random access memory (RAM).
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- Business, Economics & Management (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Economics (AREA)
- Tourism & Hospitality (AREA)
- Primary Health Care (AREA)
- General Business, Economics & Management (AREA)
- Marketing (AREA)
- Human Resources & Organizations (AREA)
- Strategic Management (AREA)
- Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Computational Linguistics (AREA)
- Human Computer Interaction (AREA)
- Information Transfer Between Computers (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Claims (18)
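- A microblog sorting method, comprising the following steps: obtaining microblog information requested by a user; extracting microblog publishing user information and content information from the microblog information, and scoring the microblog information; sorting the microblog information according to the score; and displaying the microblog information according to the result of the sorting.
- The microblog sorting method according to claim 1, wherein the step of extracting the microblog publishing user information from the microblog information to score the microblog information comprises: obtaining a microblog operation record of the microblog publishing user, and calculating an activity level of the microblog publishing user according to the microblog operation record; and scoring the microblog information according to the activity level.
- The microblog sorting method according to claim 1 or 2, wherein the step of extracting the microblog publishing user information from the microblog information to score the microblog information comprises: obtaining personal information of the microblog publishing user and personal information of a microblog requesting user, and calculating a similarity between the personal information of the microblog publishing user and the personal information of the microblog requesting user; obtaining an interaction record between the microblog publishing user and the microblog requesting user, and calculating a degree of association between the microblog publishing user and the microblog requesting user according to the interaction record; and scoring the microblog information according to the similarity and the degree of association.
- The microblog sorting method according to claim 1, wherein the step of extracting the content information from the microblog information to score the microblog information comprises: obtaining microblog content in the microblog information, and obtaining a topic category vector of the microblog content in the microblog information according to the microblog content and features of microblog topic categories; obtaining historical microblog content of a microblog requesting user, and obtaining a topic category vector of the historical microblog content of the microblog requesting user according to the historical microblog content and the features of the microblog topic categories; calculating a similarity between the microblog content in the microblog information and the historical microblog content of the microblog requesting user according to the topic category vector of the microblog content in the microblog information and the topic category vector of the historical microblog content of the microblog requesting user; and scoring the microblog information according to the similarity.
- The microblog sorting method according to claim 4, wherein before the step of extracting the content information from the microblog information to score the microblog information, the method further comprises: obtaining a preset microblog topic category; obtaining a training subset of the microblog topic category; and extracting features of the microblog topic category from the training subset.
- The microblog sorting method according to claim 5, wherein the step of obtaining the training subset of the microblog topic category comprises: searching microblogs according to keywords of the microblog topic category to obtain an initial training subset of the microblog topic category; and repeating the following steps a preset number of times: counting high-frequency words in the initial training subset; and searching microblogs according to the high-frequency words and adding the search results to the initial training subset.
- The microblog sorting method according to claim 1, wherein before the step of displaying the microblog information according to the result of the sorting, the method further comprises: classifying the microblog content in the microblog information according to preset microblog display categories to obtain the display category to which the microblog content belongs; and the step of displaying the microblog information according to the result of the sorting is: displaying the microblog information according to the display category to which the microblog content belongs and the result of the sorting.
- A microblog sorting system, comprising: a microblog information obtaining module, configured to obtain microblog information requested by a user; a scoring module, comprising a user information scoring module and a content information scoring module, wherein the user information scoring module is configured to extract microblog publishing user information from the microblog information and score the microblog information according to the microblog publishing user information, and the content information scoring module is configured to extract content information from the microblog information and score the microblog information according to the content information; a sorting module, configured to sort the microblog information according to the score; and a display module, configured to display the microblog information according to the result of the sorting.
- The microblog sorting system according to claim 8, wherein the user information scoring module comprises: an activity calculation unit, configured to obtain a microblog operation record of the microblog publishing user and calculate an activity level of the microblog publishing user according to the microblog operation record; and a first scoring unit, configured to score the microblog information according to the activity level.
- The microblog sorting system according to claim 8 or 9, wherein the user information scoring module comprises: a personal information similarity calculation unit, configured to obtain personal information of the microblog publishing user and personal information of a microblog requesting user, and calculate a similarity between the personal information of the microblog publishing user and the personal information of the microblog requesting user; an association degree calculation unit, configured to obtain an interaction record between the microblog publishing user and the microblog requesting user, and calculate a degree of association between the microblog publishing user and the microblog requesting user according to the interaction record; and a second scoring unit, configured to score the microblog information according to the similarity and the degree of association.
- The microblog sorting system according to claim 8, wherein the content information scoring module comprises: a category vector extracting unit, configured to obtain microblog content in the microblog information, and obtain a topic category vector of the microblog content in the microblog information according to the microblog content and features of microblog topic categories, the category vector extracting unit being further configured to obtain historical microblog content of a microblog requesting user, and obtain a topic category vector of the historical microblog content of the microblog requesting user according to the historical microblog content and the features of the microblog topic categories; a content similarity calculation unit, configured to calculate a similarity between the microblog content in the microblog information and the historical microblog content of the microblog requesting user according to the topic category vector of the microblog content in the microblog information and the topic category vector of the historical microblog content of the microblog requesting user; and a third scoring unit, configured to score the microblog information according to the similarity.
- The microblog sorting system according to claim 11, wherein the system further comprises a classification model training module, the classification model training module comprising: a topic category obtaining module, configured to obtain a preset microblog topic category; a training set obtaining module, configured to obtain a training subset of the microblog topic category; and a feature extraction module, configured to extract features of the microblog topic category from the training subset.
- The microblog sorting system according to claim 12, wherein the training set obtaining module is configured to search microblogs according to keywords of the microblog topic category to obtain an initial training subset of the microblog topic category, and to repeat the following steps a preset number of times: counting high-frequency words in the initial training subset, searching microblogs according to the high-frequency words, and adding the search results to the initial training subset.
- The microblog sorting system according to claim 8, wherein the system further comprises: a display category classification module, configured to classify the microblog content in the microblog information according to preset microblog display categories to obtain the display category to which the microblog content belongs; and the display module is further configured to display the microblog information according to the display category to which the microblog content belongs and the result of the sorting.
- A microblog search method, wherein microblog search results are sorted according to the microblog sorting method of any one of claims 1-7, and wherein the step of obtaining the microblog information requested by the user comprises: searching according to a keyword input by the user to obtain the microblog information requested by the user.
- A microblog search system, comprising the microblog sorting system of any one of claims 8-14, wherein the microblog information obtaining module is configured to search according to a keyword input by the user to obtain the microblog information requested by the user.
- A microblog display method, wherein microblog request results are sorted according to the microblog sorting method of any one of claims 1-7, and wherein the step of obtaining the microblog information requested by the user comprises: obtaining the microblog information requested by the user according to microblog request information corresponding to a user identifier.
- A microblog display system, comprising the microblog sorting system of any one of claims 8-14, wherein the microblog information obtaining module is configured to obtain the microblog information requested by the user according to microblog request information corresponding to a user identifier.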
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2014515063A JP2014522540A (ja) | 2012-02-09 | 2013-02-04 | マイクロブログのシーケンシング、検索、表示方法及びシステム |
AP2014007382A AP2014007382A0 (en) | 2012-02-09 | 2013-02-04 | Method and system for sequencing, seeking, and displaying micro-blog |
KR1020137031978A KR20140012750A (ko) | 2012-02-09 | 2013-02-04 | 마이크로 블로그 배열, 검색 및 표시 방법과 시스템 |
EP13746647.0A EP2704040A4 (en) | 2012-02-09 | 2013-02-04 | METHOD AND SYSTEM FOR SEQUENCING, SEARCHING, AND DISPLAYING MICROBLOGS |
US14/109,949 US9785677B2 (en) | 2012-02-09 | 2013-12-17 | Method and system for sorting, searching and presenting micro-blogs |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210028740.7A CN103246670B (zh) | 2012-02-09 | 2012-02-09 | 微博排序、搜索、展示方法和系统 |
CN201210028740.7 | 2012-02-09 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US14/109,949 Continuation US9785677B2 (en) | 2012-02-09 | 2013-12-17 | Method and system for sorting, searching and presenting micro-blogs |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2013117147A1 true WO2013117147A1 (zh) | 2013-08-15 |
Family
ID=48926194
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2013/071325 WO2013117147A1 (zh) | 2012-02-09 | 2013-02-04 | 微博排序、搜索、展示方法和系统 |
Country Status (7)
Country | Link |
---|---|
US (1) | US9785677B2 (zh) |
EP (1) | EP2704040A4 (zh) |
JP (1) | JP2014522540A (zh) |
KR (1) | KR20140012750A (zh) |
CN (1) | CN103246670B (zh) |
AP (1) | AP2014007382A0 (zh) |
WO (1) | WO2013117147A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111860299A (zh) * | 2020-07-17 | 2020-10-30 | 北京奇艺世纪科技有限公司 | 目标对象的等级确定方法、装置、电子设备及存储介质 |
Families Citing this family (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103678474B (zh) * | 2013-09-24 | 2016-10-05 | 浙江大学 | 一种在社交网络中快速获取大量热门话题的方法 |
CN103744918A (zh) * | 2013-12-27 | 2014-04-23 | 东软集团股份有限公司 | 基于垂直领域的微博搜索排序方法及系统 |
CN104899201B (zh) * | 2014-03-04 | 2019-05-14 | 腾讯科技(北京)有限公司 | 文本提取方法、敏感词判定方法、装置和服务器 |
CN104317881B (zh) * | 2014-04-11 | 2017-11-24 | 北京理工大学 | 一种基于用户话题权威性的微博重排序方法 |
US9819633B2 (en) * | 2014-06-18 | 2017-11-14 | Social Compass, LLC | Systems and methods for categorizing messages |
US9819618B2 (en) * | 2014-06-18 | 2017-11-14 | Microsoft Technology Licensing, Llc | Ranking relevant discussion groups |
CN106294363A (zh) * | 2015-05-15 | 2017-01-04 | 厦门美柚信息科技有限公司 | 一种论坛帖子评价方法、装置及系统 |
JP6842825B2 (ja) * | 2015-09-25 | 2021-03-17 | 株式会社ユニバーサルエンターテインメント | 情報提供システム、情報提供方法、及びプログラム |
CN105468714B (zh) * | 2015-11-20 | 2018-11-09 | 北京邮电大学 | 一种基于论坛的自媒体信息展示方法和系统 |
CN105808722B (zh) * | 2016-03-08 | 2020-07-24 | 苏州大学 | 一种信息判别方法和系统 |
CN105824951B (zh) * | 2016-03-23 | 2019-10-11 | 百度在线网络技术(北京)有限公司 | 检索方法和装置 |
CN106027303B (zh) * | 2016-05-24 | 2019-07-16 | 腾讯科技(深圳)有限公司 | 一种征信特征获取方法及其设备 |
CN107844492A (zh) * | 2016-09-19 | 2018-03-27 | 阿里巴巴集团控股有限公司 | 一种进行对象排序和展示搜索对象的方法及设备 |
CN108280198B (zh) * | 2018-01-29 | 2021-03-02 | 口碑(上海)信息技术有限公司 | 榜单生成方法及装置 |
CN109885763B (zh) * | 2019-01-26 | 2021-04-16 | 北京工业大学 | 一种基于用户头像的博文推荐方法 |
CN109948313B (zh) * | 2019-03-15 | 2022-11-25 | 江苏金智教育信息股份有限公司 | 一种个人信息查看赋权的方法和装置 |
CN110941759B (zh) * | 2019-11-20 | 2022-11-11 | 国元证券股份有限公司 | 一种微博情感分析方法 |
CN117093762B (zh) * | 2023-07-18 | 2024-02-13 | 南京特尔顿信息科技有限公司 | 一种舆情数据评估分析系统及方法 |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101305371A (zh) * | 2005-09-13 | 2008-11-12 | 谷歌公司 | 对博客文档进行排名 |
CN101661474A (zh) * | 2008-08-26 | 2010-03-03 | 华为技术有限公司 | 一种搜索方法和系统 |
CN102016825A (zh) * | 2007-08-17 | 2011-04-13 | 谷歌公司 | 对社交网络对象进行排名 |
CN102063488A (zh) * | 2010-12-29 | 2011-05-18 | 南京航空航天大学 | 一种基于语义的代码搜索方法 |
Family Cites Families (30)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH09231238A (ja) * | 1996-02-20 | 1997-09-05 | Omron Corp | テキスト検索結果表示方法及び装置 |
JP2006185356A (ja) * | 2004-12-28 | 2006-07-13 | Canon Inc | 情報処理装置及びその処理方法、プログラム、記憶媒体、並びに文書分類システム |
US7421429B2 (en) * | 2005-08-04 | 2008-09-02 | Microsoft Corporation | Generate blog context ranking using track-back weight, context weight and, cumulative comment weight |
US7765209B1 (en) | 2005-09-13 | 2010-07-27 | Google Inc. | Indexing and retrieval of blogs |
US8171128B2 (en) * | 2006-08-11 | 2012-05-01 | Facebook, Inc. | Communicating a newsfeed of media content based on a member's interactions in a social network environment |
JP2007334502A (ja) * | 2006-06-13 | 2007-12-27 | Fujifilm Corp | 検索装置、方法およびプログラム |
CN101004749A (zh) * | 2006-12-26 | 2007-07-25 | 朱莉君 | 一种互联网用户交流平台的构建方法 |
JP4802125B2 (ja) * | 2007-03-09 | 2011-10-26 | 富士通株式会社 | ウェブログ管理プログラム、ウェブログ管理装置およびウェブログ管理方法 |
CN101561805B (zh) * | 2008-04-18 | 2014-06-25 | 日电(中国)有限公司 | 文档分类器生成方法和系统 |
US20100042612A1 (en) * | 2008-07-11 | 2010-02-18 | Gomaa Ahmed A | Method and system for ranking journaled internet content and preferences for use in marketing profiles |
US8145636B1 (en) * | 2009-03-13 | 2012-03-27 | Google Inc. | Classifying text into hierarchical categories |
JP2010218475A (ja) | 2009-03-19 | 2010-09-30 | Nifty Corp | ブログ分析方法及び装置 |
KR20100125697A (ko) | 2009-05-21 | 2010-12-01 | 장경호 | 블로그를 이용한 광고 및 정보 제공 시스템 |
US8719302B2 (en) * | 2009-06-09 | 2014-05-06 | Ebh Enterprises Inc. | Methods, apparatus and software for analyzing the content of micro-blog messages |
US8539161B2 (en) | 2009-10-12 | 2013-09-17 | Microsoft Corporation | Pre-fetching content items based on social distance |
CN102088419B (zh) * | 2009-12-07 | 2012-08-15 | 倪加元 | 一种在社交网络中查找好友信息的方法和系统 |
US20110178995A1 (en) * | 2010-01-21 | 2011-07-21 | Microsoft Corporation | Microblog search interface |
US8606792B1 (en) * | 2010-02-08 | 2013-12-10 | Google Inc. | Scoring authors of posts |
US20110231296A1 (en) * | 2010-03-16 | 2011-09-22 | UberMedia, Inc. | Systems and methods for interacting with messages, authors, and followers |
US8751511B2 (en) * | 2010-03-30 | 2014-06-10 | Yahoo! Inc. | Ranking of search results based on microblog data |
US20110302103A1 (en) * | 2010-06-08 | 2011-12-08 | International Business Machines Corporation | Popularity prediction of user-generated content |
US8583674B2 (en) * | 2010-06-18 | 2013-11-12 | Microsoft Corporation | Media item recommendation |
US8954451B2 (en) * | 2010-06-30 | 2015-02-10 | Hewlett-Packard Development Company, L.P. | Selecting microblog entries based on web pages, via path similarity within hierarchy of categories |
US20120042020A1 (en) * | 2010-08-16 | 2012-02-16 | Yahoo! Inc. | Micro-blog message filtering |
US9324112B2 (en) * | 2010-11-09 | 2016-04-26 | Microsoft Technology Licensing, Llc | Ranking authors in social media systems |
US8825679B2 (en) * | 2011-02-15 | 2014-09-02 | Microsoft Corporation | Aggregated view of content with presentation according to content type |
US8898151B2 (en) * | 2011-06-22 | 2014-11-25 | Rogers Communications Inc. | System and method for filtering documents |
CN102332006B (zh) * | 2011-08-03 | 2016-08-03 | 百度在线网络技术(北京)有限公司 | 一种信息推送控制方法及装置 |
US8751917B2 (en) * | 2011-11-30 | 2014-06-10 | Facebook, Inc. | Social context for a page containing content from a global community |
US20130159277A1 (en) * | 2011-12-14 | 2013-06-20 | Microsoft Corporation | Target based indexing of micro-blog content |

Filing date | Country | Application number | Publication | Status |
---|---|---|---|---|
2012-02-09 | CN | CN201210028740.7A | CN103246670B (zh) | Active |
2013-02-04 | JP | JP2014515063A | JP2014522540A (ja) | Pending |
2013-02-04 | EP | EP13746647.0A | EP2704040A4 (en) | Withdrawn |
2013-02-04 | KR | KR1020137031978A | KR20140012750A (ko) | Application Discontinuation |
2013-02-04 | AP | AP2014007382A | AP2014007382A0 (xx) | Unknown |
2013-02-04 | WO | PCT/CN2013/071325 | WO2013117147A1 (zh) | Application Filing |
2013-12-17 | US | US14/109,949 | US9785677B2 (en) | Active |

Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101305371A (zh) * | 2005-09-13 | 2008-11-12 | 谷歌公司 | Ranking blog documents |
CN102016825A (zh) * | 2007-08-17 | 2011-04-13 | 谷歌公司 | Ranking social network objects |
CN101661474A (zh) * | 2008-08-26 | 2010-03-03 | 华为技术有限公司 | Search method and system |
CN102063488A (zh) * | 2010-12-29 | 2011-05-18 | 南京航空航天大学 | Semantics-based code search method |
Non-Patent Citations (1)
Title |
---|
See also references of EP2704040A4 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
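CN111860299A (zh) * | 2020-07-17 | 2020-10-30 | 北京奇艺世纪科技有限公司 | Method and apparatus for determining the level of a target object, electronic device, and storage medium |
CN111860299B (zh) * | 2020-07-17 | 2023-09-08 | 北京奇艺世纪科技有限公司 | Method and apparatus for determining the level of a target object, electronic device, and storage medium |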
Also Published As
Publication number | Publication date |
---|---|
US9785677B2 (en) | 2017-10-10 |
CN103246670B (zh) | 2016-02-17 |
EP2704040A4 (en) | 2015-08-05 |
EP2704040A1 (en) | 2014-03-05 |
AP2014007382A0 (en) | 2014-01-31 |
CN103246670A (zh) | 2013-08-14 |
KR20140012750A (ko) | 2014-02-03 |
JP2014522540A (ja) | 2014-09-04 |
US20140108388A1 (en) | 2014-04-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2013117147A1 (zh) | | Methods and systems for ranking, searching, and displaying microblogs |
US20220020056A1 (en) | | Systems and methods for targeted advertising |
US8099415B2 (en) | | Method and apparatus for assessing similarity between online job listings |
WO2012134180A2 (ko) | | Emotion classification method for analyzing emotions inherent in a sentence, and emotion classification method for multiple sentences using context information |
WO2015003480A1 (zh) | | Information recommendation method and apparatus in social media |
Malik et al. | | Comparing mobile apps by identifying ‘Hot’ features |
US20120131020A1 (en) | | Method and apparatus for assembling a set of documents related to a triggering item |
WO2010036013A2 (ko) | | Apparatus and method for extracting and analyzing opinions in web documents |
US20100262597A1 (en) | | Method and system for searching information of collective emotion based on comments about contents on internet |
Choudhari et al. | | Video search engine optimization using keyword and feature analysis |
Jones et al. | | TREC 2020 podcasts track overview |
US20090313217A1 (en) | | Systems and methods for classifying search queries |
JP2009524158A5 (zh) | | |
KR20080044915A (ko) | | Ranking blog documents |
Cheng et al. | | On effective personalized music retrieval by exploring online user behaviors |
JP6429382B2 (ja) | | Content recommendation device and program |
Völske et al. | | What users ask a search engine: Analyzing one billion Russian question queries |
CN110309265A (zh) | | Method for deciding whether to push relevant legal knowledge for a video |
JP2014085862A (ja) | | Prediction server, program, and method for predicting the future number of comments on prediction-target content |
Sawicki et al. | | Exploring usability of Reddit in data science and knowledge processing |
US20070239735A1 (en) | | Systems and methods for predicting if a query is a name |
Zhao et al. | | Why you should listen to this song: Reason generation for explainable recommendation |
CN108140034B (zh) | | Selecting content items based on received terms using a topic model |
US20120023119A1 (en) | | Data searching system |
Mullick et al. | | Harnessing Twitter for answering opinion list queries |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 13746647; Country of ref document: EP; Kind code of ref document: A1 |
| | REEP | Request for entry into the European phase | Ref document number: 2013746647; Country of ref document: EP |
| | WWE | WIPO information: entry into national phase | Ref document number: 2013746647; Country of ref document: EP |
| | ENP | Entry into the national phase | Ref document number: 20137031978; Country of ref document: KR; Kind code of ref document: A |
| | ENP | Entry into the national phase | Ref document number: 2014515063; Country of ref document: JP; Kind code of ref document: A |
| | NENP | Non-entry into the national phase | Ref country code: DE |