WO2010123264A2 - Online community post search method and apparatus based on interactions between online community users and computer readable storage medium storing program thereof - Google Patents
- Publication number
- WO2010123264A2 (PCT/KR2010/002478)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- online community
- score
- users
- user
- posts
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9536—Search customisation based on social or collaborative filtering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9538—Presentation of query results
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
Definitions
- the present invention relates to online community post searching technique based on interactions between online community users; and more particularly, to an online community post search technique based on interactions between online community users to provide specialized results according to search intentions of online community users.
- online community posts created in various forums online e.g., blogs, online social networks and bulletins
- Online community web sites related to specific interests such as photography, music, science and digital equipment are popular.
- Some online community web sites have several tens of thousands of registered users and millions of user-created contents (UCC).
- a blog is a representative one. According to statistical data, the number of blog users ranges from millions to tens of millions. In modern life, which is inseparable from the Internet, the blog extends over all aspects of life, e.g., society, economy, culture and politics, and its influence tends to increase. Blog users, i.e., bloggers, share a cyberspace referred to as the blogosphere. The blogosphere has something in common with the general web in that users are connected by hyperlinks. However, the blog differs from the general web in that it provides a private space. A private blog freely presents individual interests such as politics, economy, culture, art, sports and hobbies via posts containing text, pictures and the like.
- the blog has a more regularized form than the forms of general web pages, and mainly deals with individual concerns. Accordingly, a search engine may be required to provide more detailed and powerful search functions only for blogs. Actually, blog search techniques have been developed all over the world, and many research papers have been published.
- search ranking may be determined based on simple statistical figures of the blog posts. For example, there are the order of replies, track back best, the order of attention, the order of empathy, recommended contents, popular contents, the order of points, attractive blogs, top bloggers and the like.
- search ranking may be determined based on times at which blog posts are created. For example, there are new contents, today's hot blogs, today's hot posts, the order of registration, real-time popular contents, real-time new contents, the order of the latest contents, top bloggers of this week and the like.
- search ranking may be determined based on semantic matching (TF*IDF) between query keywords such as tags, keywords and the order of accuracy, and keywords of blog posts.
- conventional search ranking may be determined based on three factors of simple statistical figures, created times, semantic matching.
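The semantic matching factor can be illustrated with a minimal TF*IDF sketch. The function name `tfidf_scores`, the whitespace-free token lists, and the specific weighting (term frequency normalized by post length, natural-log inverse document frequency) are assumptions chosen for illustration; the text does not specify a TF*IDF variant.

```python
import math
from collections import Counter

def tfidf_scores(query_terms, docs):
    """Score each tokenized post by the summed TF*IDF weight of the query terms.

    Assumed weighting: tf = count / post length, idf = ln(N / document frequency).
    """
    n = len(docs)
    # Document frequency: in how many posts each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = sum(
            (tf[t] / len(doc)) * math.log(n / df[t])
            for t in query_terms
            if t in tf  # terms absent from the post contribute nothing
        )
        scores.append(score)
    return scores
```

For instance, with three posts tokenized as `["landscape", "photo", "art"]`, `["movie", "review"]` and `["landscape", "movie"]`, the query `["landscape"]` gives the second post a score of zero and ranks the shorter third post above the first.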
- the paper of Nitin Agarwal et al. (2008) deals with detection of influential bloggers in the blogosphere.
- This paper proposed four statistical characteristics for measuring social gestures of the bloggers: a recognition characteristic such as the number of citations, an activity generation characteristic such as the number of comments, an originality characteristic such as the number of citations of other blogs, and an eloquence characteristic such as the length of the post.
- the paper of Xiaodan Song et al. (2007) proposed a technique for detecting an influential opinion leader in the blogosphere by introducing the concept of an influence rank.
- the paper of Craig Macdonald et al. (2008) disclosed a technique to search the entire blog or each post in the blog for experts on specific interested topics.
- the paper of Akshay Java et al. (2006) proposed a method for detecting influential bloggers by an influence model.
- the paper of P. Jurczyk et al. (2007) proposed a method for detecting authoritative users in a question/answering system (e.g., 'Yahoo! Answers').
- the paper of A. Mathopoulos et al. (2006) proposed a blog ranking technique using an implicit hyperlink. For example, it is assumed that two blog posts having the same topic are connected to each other by an implicit hyperlink. The implicit hyperlink may make up for a sparse link connection between blog posts. The implicit hyperlink is applied to bloggers and commenters in the same group in addition to the blog posts having the same topic.
- This paper also proposed a blog ranking algorithm using PageRank coefficients disclosed in the paper of L. Page et al. (2008).
- the paper of Xiaochuan Ni et al. (2007) proposed a technique for searching blog posts containing useful and influential information.
- the paper of Kritsada Sriphaew et al. (2008) proposed a technique for searching for a cool blog based on characteristics related to the topics of blogs. It proposed a stochastic model by supposing, for example, that the cool blog has clear topics, contains a sufficient number of posts, and provides topic consistency.
- users who search online community posts such as blogs have various search intentions, fashions, and tendencies. Even though many users input the same search term, they may have different search purposes. For example, when the search term 'landscape' is inputted, one user may intend to search for artistic pictures, whereas another may intend to search for general pictures. Thus, conventional search techniques, in which ranking functions are implemented on the basis of a single aspect, have difficulty meeting the various search demands of online community search users.
- the present invention provides an online community post search technique capable of providing search results according to various intentions of search users by using social network analysis and data mining technique.
- the present invention provides search results satisfying various intentions of the users with a simple search term by introducing the concept of a trend such as expertise and popularity.
- a method of searching online community posts based on interactions between online community users including: receiving first information on online community users and second information on online community posts created by the online community users; acquiring, for each of the online community users, third information on interactions in which online community users other than a specific online community user among the online community users perform on an online community post of the specific online community user; calculating, for each of the online community users, a first score representing trend of the specific online community user by using trend scores representing trends of the other online community users and the third information; and searching the online community users or the online community posts by using the first score based on interactions between the online community posts.
- a computer-readable storage medium storing a program for executing the method described above.
- an apparatus for searching online community posts based on interactions between online community users including: an online community user information storage unit for storing first information on online community users and first scores representing trends of the online community users; an online community post storage unit for storing second information on online community posts created by the online community users and third information on interactions which online community users other than a specific online community user of the online community users perform on an online community post of the specific online community user; and an expertise operation unit for calculating, for each of the online community users, a first score representing a trend of the specific online community user by using trend scores representing trends of the other online community users and the third information.
- FIG. 1 illustrates interactions of online community users in their online community posts
- FIG. 2 is a flowchart showing steps of an online community post search method based on interactions between online community users in accordance with an embodiment of the present invention
- FIG. 3 shows the expertise scores aligned according to the expertise ranks of the online community users
- FIG. 4 shows a process for calculating a popularity score by mapping a Gaussian distribution function to the expertise scores aligned according to the expertise ranks of the online community users
- FIG. 5 is a block diagram showing a schematic configuration of an online community post search apparatus based on interactions between online community users in accordance with the embodiment of the present invention.
- the concept of a trend is introduced as an index of users' various search intentions, fashions, and tendencies, and is used in post search.
- the trend-based search technique will be described in detail with expertise and popularity in the embodiment of the present invention.
- when searching blogs related to movies, the user does not search with only one criterion, so a query such as "Search the blogs related to the most important movie" is meaningless.
- the user may search the blogs related to the movies with the trend of popularity and expertise.
- in movie-related blog search, although different users search blogs related to movies of the same genre, some users prefer popular movies, while other users prefer movies of high expertise and artistry.
- Such a tendency or preference of the searching user is defined in the embodiment of the invention as a trend. Accordingly, when the concept of the trend is introduced into search, it is possible to handle a search query such as "Search the blog posts related to movies of high expertise" or "Search the blog posts related to the most popular movies", unlike a conventional method of searching blog posts in which movie titles or actor/actress names are merely matched.
- Such trend-based blog search cannot be realized by simple comparison of statistical figures such as recommendations and hits for blog posts.
- FIG. 1 illustrates interactions of online community users in their online community posts.
- online community users A, B and C perform interactions in their online community posts.
- An online community post 110 of the online community user A includes three posts: a post A1 111, a post A2 112 and a post A3 113.
- the post A1 111 has a reply 171 of the online community user B.
- the post A2 112 has an empathy 172 of the online community user B.
- the post A2 112 is linked via a link 174 to a post C1 151 written by the online community user C.
- the post A3 113 has a recommendation 175 of the online community user C.
- An online community post 130 of the online community user B includes two posts: a post B1 131 and a post B2 132.
- the post B1 131 has a comment 173 of the online community user A.
- the post B2 132 is taken to a post C2 152 by a scrap 177 of the online community user C.
- An online community post 150 of the online community user C includes two posts of the post C1 151 and the post C2 152.
- the post C1 151 has the link 174 connected to the post A2 112 of the online community user A.
- the post C2 152 drawn from the post B2 132 of the online community user B by the scrap 177 is connected to the post A3 113 of the online community user A by a track back 176.
- the reply 171, the empathy 172, the comment 173, the link 174, the recommendation 175, the track back 176, the scrap 177 and the like have different names and forms depending on served online community services, but can be represented in numbers as interactions between the online community users.
- the interactions between the online community users may have various forms in addition to the above-mentioned ones.
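The interactions of FIG. 1 can be represented in numbers as per-pair counts, as sketched below. The tuple layout, the name `count_interactions`, and the choice to ignore the interaction type are assumptions for illustration. The direction of the link 174 and the track back 176 is also an assumption (both are treated as actions by user C on posts of user A), since the description does not state it.

```python
from collections import Counter

# (actor, post_owner, kind) tuples read off FIG. 1.
# Assumption: the link 174 and the track back 176 are treated as actions by
# user C on posts of user A; the text does not state their direction.
FIG1_INTERACTIONS = [
    ("B", "A", "reply"),           # 171 on post A1
    ("B", "A", "empathy"),         # 172 on post A2
    ("A", "B", "comment"),         # 173 on post B1
    ("C", "A", "link"),            # 174 between C1 and A2
    ("C", "A", "recommendation"),  # 175 on post A3
    ("C", "A", "track back"),      # 176 from C2 to A3
    ("C", "B", "scrap"),           # 177 of post B2
]

def count_interactions(events):
    """Return counts[(owner, actor)]: interactions by actor on owner's posts."""
    counts = Counter()
    for actor, owner, _kind in events:
        if actor != owner:  # only interactions by other users are counted
            counts[(owner, actor)] += 1
    return counts
```

Under these assumptions, user A receives two interactions from B and three from C, user B receives one each from A and C, and user C receives none.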
- FIG. 2 is a flowchart illustrating a method for an online community post searching based on interactions between online community users in accordance with the embodiment of the present invention.
- first information on online community users and second information on online community posts created by the online community users are entered and stored in step S210.
- the first information includes user information such as IDs, names, addresses and the like, but it is not limited thereto.
- the second information includes titles and contents of the posts and the number of feedbacks, but it is not limited thereto.
- third information on interactions that online community users other than a specific user among the online community users perform on the online community post written by the specific user is acquired for each of the online community users and stored in step S220.
- the third information includes the reply 171, the empathy 172, the comment 173, the link 174, the recommendation 175, the track back 176, and the scrap 177, but it is not limited thereto.
- a first score representing the expertise of the specific user is calculated for each of the online community users by using scores on the expertise of the other online community users and the third information in step S230.
- the first score can be used to determine a trend score indicating ranking of the online community posts.
- a second score representing the semantic similarity between the inputted search term and the online community posts is calculated in step S240. Further, the online community users or the online community posts are searched by using the first score and the second score in step S250.
- a social network between users in an online community space is modeled in a graph and online community user clustering is performed to create a sub community including the users having high correlation.
- the sub community created by the online community user clustering includes online community users who are representative in their fields.
- a trend ranking value of the online community user is calculated for each of the online community users in the sub community.
- the trend ranking value of the user is a score numerically showing the expertise and popularity and it can be obtained by social network analysis, data mining technique and the like.
- the online community posts are ranked by using a sum of a semantic similarity score between the user query and the online community post and a trend score of the online community post to return the ranking result.
- the trend score of the online community post is acquired from the trend ranking values of the online community users who perform interactions on the corresponding post or have correlation with it. This technique can be applied in the same way to all types of online communities including blogs. It has been disclosed that the post ranking result is calculated by the sum of the semantic similarity score and the trend score; however, those skilled in the art will recognize that the post ranking may be made by using only the trend score.
- the online community posts created by the online community users in fields of movies, music, photographs and the like can be ranked in aspects of, e.g., popularity and expertise. Some searchers prefer popular photographs, while other searchers prefer photographs of high expertise and photographic quality. Such tendency or preference of the searcher is defined as the trend as described above.
- the basic idea of ranking of the online community users using the trend of expertise is as follows. Assuming that a certain online community post includes high-quality and expert-level contents, the online community post may get more responses of experts than those of amateurs. On the other hand, if a certain post includes low-quality and amateur-level contents, the experts may lose interest in the post.
- an expertise rank (ER) value that is an expertise score of the online community user can be defined based on the interactions and relationships between the online community users.
- the ER value of an online community user 'u' is influenced by the ER values of other online community users 'v' who make responses or interactions such as replies, comments, empathy, recommendations, track backs, scraps and links on the post made by the online community user 'u'.
- among replies, comments, empathy, recommendations, track backs, scraps and links on the post made by the online community user 'u', the case of comments is described as an example.
- a set of posts made by the online community user 'u' is referred to as Au.
- an expertise rank value ER(u) of the online community user 'u' may be formulated as follows:
- ER(u) is an expertise score of a specific online community user 'u'
- ER(v) is an expertise score of other online community users 'v'
- Au is a set of online community posts created by the specific online community user 'u'
- CAu,v is the number of interactions performed by the other online community users 'v' on the online community posts belonging to the Au
- Cv is the number of all interactions performed by the other online community users 'v'
- 'd' is a damping factor representing the minimum influence.
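One form of Math Figure 1 that is consistent with these symbol definitions is given here as an assumption, not as the published equation. It assumes a PageRank-style recursion in which 'd' is the floor every user receives and each other user 'v' distributes its score in proportion to where its interactions go:

$$ER(u) = d + (1 - d) \sum_{v \neq u} ER(v)\,\frac{C_{A_u,v}}{C_v}$$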
- Math Figure 1 is only an example, and the expertise score may be calculated by various modifications of Math Figure 1.
- the expertise score of the other online community users is recursively used to calculate the expertise score of the specific online community user.
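The recursive use of other users' scores can be sketched as a fixed-point iteration. The update rule below, ER(u) = d + (1 - d) * sum over v of ER(v) * CAu,v / Cv, is an assumed PageRank-style reading of the symbol definitions rather than the published Math Figure 1, and the names `expertise_rank`, `counts`, and the default `d=0.15` are hypothetical.

```python
def expertise_rank(counts, users, d=0.15, iterations=100):
    """Iterate an assumed ER recursion until it settles.

    counts[(u, v)] is the number of interactions user v performed on posts
    of user u (CAu,v); the total per actor v plays the role of Cv.
    """
    # Cv: all interactions performed by user v.
    total = {
        v: sum(c for (_owner, actor), c in counts.items() if actor == v)
        for v in users
    }
    er = {u: 1.0 for u in users}  # uniform starting scores
    for _ in range(iterations):
        er = {
            u: d + (1 - d) * sum(
                er[v] * counts.get((u, v), 0) / total[v]
                for v in users
                if v != u and total[v] > 0
            )
            for u in users
        }
    return er
```

A user on whose posts nobody interacts ends at exactly `d`, which matches the description of 'd' as the minimum influence. Because each user splits its score across the owners it interacts with, a user gains more from the attention of high-scoring users than from the same number of interactions by low-scoring ones.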
- various other methods using the interactions between the online community users may also be employed.
- the activity, reputation, sociability and the like of the online community users are considered in the calculation of the expertise score.
- FIG. 3 plots the expertise scores against the expertise ranks of the online community users, which results from the ER distribution of the online community users obtained by Math Figure 1.
- Horizontal and vertical axes are plotted in log scale.
- the expertise score is 1000 or more if the expertise rank of the online community user is less than 10, whereas the expertise score is reduced to be less than 10 if the expertise rank of the online community user is more than 1000.
- FIG. 4 shows a process for calculating a popularity score by mapping a Gaussian distribution function to the plots of the expertise scores to the expertise ranks of the online community users.
- the popularity of the online community user is closely related to the expertise, but the popularity and expertise are not contrary to each other. Some online community users may have high expertise and high popularity, and some online community users may have low expertise and low popularity. It can be determined that the online community user having popularity is preferred by general online community users who have middle-level expertise rather than a high-level and expert user or a low-level and amateur user.
- the popularity score of the online community user can be defined by using the ER value serving as the above-described expertise score.
- the popularity score can be obtained by arranging the online community users according to the ER values, mapping a Gaussian distribution function to the alignment results, and applying a weight.
- the popularity rank (PR) of the online community user can be calculated by the following Math Figure 2:
- PR(u) is a popularity score of a specific online community user 'u'
- 'x' is an expertise rank of the specific online community user 'u'
- 'σ' is a standard deviation of the expertise ranks
- 'μ' is a middle value of the expertise ranks.
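A form of Math Figure 2 consistent with these definitions is the Gaussian density evaluated at the expertise rank, with σ the standard deviation and μ the middle value of the expertise ranks. It is given here as an assumption, not as the published equation, and the weight 'w' is a hypothetical symbol standing for the weight mentioned in the description:

$$PR(u) = w \cdot \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$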
- Math Figure 2 is only an example, and the popularity score may be calculated by various modifications of Math Figure 2.
- the popularity score of the other online community users is calculated by mapping a Gaussian distribution function to the expertise scores aligned according to the expertise ranks of the online community users.
- various other methods for applying a high popularity score to an online community user having middle-level expertise may be employed.
- a Gaussian distribution function 420 is mapped to alignment results 410 of the expertise scores aligned according to the expertise ranks of the online community users.
- the Gaussian distribution function 420 may be mapped to have a maximum value at an approximately middle value of the alignment results 410 of the expertise scores. In this case, a maximum popularity score can be applied to the online community user having middle-level expertise.
- the middle value of the alignment results 410 of the expertise scores is substantially a middle value of results calculated in log scale.
- an expertise score 440 of the online community user having an expertise rank of about 100 is about 100.
- a popularity score 445 of the online community user having an expertise rank of about 100 is calculated to be very high, e.g., about 3000 in mapping of the Gaussian distribution function 420.
- an expertise score 430 of the online community user having an expertise rank of about 10 is about 1000.
- a popularity score 435 of the online community user having an expertise rank of about 10 is calculated to be very low, e.g., about 10 in mapping of the Gaussian distribution function 420.
- the mapping function and mapping scale may be applied in various forms. For example, when a graph having horizontal and vertical axes plotted in general scale instead of log scale is created and a Gaussian distribution function is mapped, if the number of all online community users is 10,000, a maximum popularity score is given to an online community user having an expertise rank of about 5000. The expertise rank at which the maximum popularity score is given may be determined according to the environment of the online community.
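The mapping of a Gaussian distribution function onto the ranked users can be sketched as follows. The name `popularity_scores`, the `weight` parameter, placing the peak at the midpoint of the rank axis, and computing the spread around that midpoint are all assumptions made for illustration.

```python
import math

def popularity_scores(er, weight=1.0, log_scale=True):
    """Map a Gaussian over expertise ranks so mid-ranked users score highest."""
    ranked = sorted(er, key=er.get, reverse=True)  # rank 1 = highest ER
    xs = [
        math.log10(rank) if log_scale else float(rank)
        for rank in range(1, len(ranked) + 1)
    ]
    mu = (xs[0] + xs[-1]) / 2  # assumed peak: middle of the rank axis
    # Assumed spread: root mean square distance of the ranks from mu.
    sigma = math.sqrt(sum((x - mu) ** 2 for x in xs) / len(xs)) or 1.0
    return {
        user: weight
        * math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))
        / (sigma * math.sqrt(2 * math.pi))
        for user, x in zip(ranked, xs)
    }
```

With 10,000 users the log-scale midpoint falls at rank 100 and the general-scale midpoint at about rank 5000, which agrees with the two examples in the description.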
- a method for calculating a score of an online community post based on a specific trend in a search process may be performed by using the following Math Figure 3:
- the online community posts are ranked by using a score obtained by combining a semantic similarity score between the user query and the online community post with a trend score of the online community post, thereby finally returning the ranking result.
- a specific score calculation method will be described later.
- a weighted average concept is used in Math Figure 3
- various other equations using both a semantic similarity score and a trend score or using another value in addition to a semantic similarity score may be employed.
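The weighted average concept can be sketched as below. The mixing parameter `alpha` and the function names are hypothetical, and the sketch assumes both scores have already been normalized to comparable ranges; the description does not give these details.

```python
def post_score(similarity, trend, alpha=0.5):
    """Weighted average of a semantic similarity score and a trend score."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * similarity + (1 - alpha) * trend

def rank_posts(posts, alpha=0.5):
    """posts maps post id -> (similarity, trend); returns ids, best first."""
    return sorted(
        posts,
        key=lambda post_id: post_score(*posts[post_id], alpha),
        reverse=True,
    )
```

Setting `alpha` to 1.0 reduces the ranking to pure semantic matching, while 0.0 ranks by the trend score alone, which the description also allows.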
- the user may select and input a trend of search in advance before entering an initial search term.
- several formulas for calculating a post score based on the trend corresponding to Math Figure 3 are made for selectable trends such that a corresponding calculation formula can be used according to selection of the user.
- when an online community post score is calculated by combining one semantic similarity score with plural trend scores, a search method using one or more trends is available.
- a score of an online community post 'p' related to a specific trend can be calculated by a trend-score function as follows:
- Up is a set of online community users having connections with the post p
- Pp is a set of posts having connections with the post p.
- a score of the online community post related to a specific trend may be defined by interactions between the post and online community users having connection with the post.
- Representative examples of the function for calculating a trend score of the post based on a user score are as follows, but the function is not limited thereto.
- baseline_post_score(p) = the number of feedback users of p
- This method has an advantage of simplicity, but has a disadvantage that the quality of the post cannot be considered.
- ranking of the user who provides a feedback to the post is considered. That is, a score of the user influences a score of the post.
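The difference between the baseline and the user-score-aware method can be sketched as follows. Only `baseline_post_score` is named in the description; `user_weighted_post_score` and the plain summation of user scores are assumptions for illustration.

```python
def baseline_post_score(feedback_users):
    """Number of distinct users who gave feedback on the post."""
    return len(set(feedback_users))

def user_weighted_post_score(feedback_users, user_score):
    """Sum of the scores of the distinct feedback users (assumed combination).

    user_score maps user id -> trend score, e.g. an expertise score.
    """
    return sum(user_score.get(user, 0.0) for user in set(feedback_users))
```

Two posts with the same number of feedback users tie under the baseline, but the weighted variant separates them when one attracted higher-scoring users.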
- a score of the post and a score of the user are simultaneously calculated by mutual reinforcement. That is, each user is given a portion of a score of his/her own post, whereas the post is given a portion of a score of the user who provides a feedback.
- W is a matrix whose element shows a relationship between a post and a maker of the post
- F is a matrix whose element shows a relationship between a feedback provider and a target post.
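Mutual reinforcement between post scores and user scores can be sketched as a HITS-style iteration over the matrices W and F. The matrix orientation (`W[u][p]`, `F[p][u]`), the normalization, and the damping floor `d` are assumptions added to keep the iteration well behaved; the description gives only the roles of W and F.

```python
def mutual_reinforcement(W, F, d=0.15, iterations=50):
    """Alternate between scoring posts from users and users from posts.

    W[u][p] = 1 if user u made post p, else 0.
    F[p][u] = 1 if user u gave feedback on post p, else 0.
    """
    n_users, n_posts = len(W), len(F)
    users = [1.0] * n_users
    posts = [0.0] * n_posts
    for _ in range(iterations):
        # A post receives the scores of the users who gave it feedback.
        posts = [
            sum(F[p][u] * users[u] for u in range(n_users))
            for p in range(n_posts)
        ]
        # A user receives the scores of his or her own posts.
        raw = [
            sum(W[u][p] * posts[p] for p in range(n_posts))
            for u in range(n_users)
        ]
        total = sum(raw) or 1.0
        # Assumed normalization: average user score stays at 1, floor at d.
        users = [d + (1 - d) * n_users * r / total for r in raw]
    return users, posts
```

In a two-user example where only user 1 gives feedback, and only on the post of user 0, the scores settle at 1.85 for user 0 and 0.15 for user 1.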
- a score of a certain post 'p' is calculated by measuring how much contribution a control user 'uc' and users, who have a difference equal to or smaller than 'w' between their scores and the score of the control user 'uc' have made on the post 'p'.
- a function score_similarity calculates the score similarity between a control user and a specific user.
- the score similarity between users can be defined in various forms of a constant function, a trigonometric function, a Gaussian function and the like.
- the score similarity between users using Gaussian distribution can be defined as follows:
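A Gaussian score similarity and the resulting control-user post score can be sketched as follows. Only the name `score_similarity`, the control user 'uc' and the window 'w' come from the description; the unnormalized Gaussian, the `sigma` parameter, and summing the similarities as the measure of contribution are assumptions.

```python
import math

def score_similarity(control_score, user_score, sigma=1.0):
    """Unnormalized Gaussian: 1.0 for equal scores, decaying with distance."""
    return math.exp(-((user_score - control_score) ** 2) / (2 * sigma ** 2))

def control_user_post_score(feedback_users, scores, control_user, w, sigma=1.0):
    """Sum the similarity of feedback users whose score is within w of uc."""
    control = scores[control_user]
    return sum(
        score_similarity(control, scores[user], sigma)
        for user in set(feedback_users)
        if abs(scores[user] - control) <= w  # window around the control user
    )
```

Choosing a control user with middle-level expertise makes the post score reward feedback from similarly ranked users, which is one way to express a popularity trend; choosing a top-ranked control user expresses an expertise trend.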
- the trend-score function can be variously designed according to the trend without departing from the scope of the invention.
- various data mining techniques may be used in the design of the function.
- applicable data mining techniques include 1-mode/2-mode social network analysis, Markov chain random walk model analysis, association rule mining, classification and the like.
- online communities may be classified into categories and expertise-related information may be formed for each category.
- a global trend rank value i.e., the same trend score
- the online community users and posts are clustered and classified according to the fields, and a trend score of the same field has relatively high importance.
- FIG. 5 is a block diagram showing a schematic configuration of an apparatus for searching online community posts based on interactions between online community users in accordance with the embodiment of the present invention.
- the apparatus includes an online community user information storage unit 510, an online community post storage unit 520, an expertise operation unit 530, a semantic similarity operation unit 540, a search unit 550, and a popularity operation unit 560.
- the online community user information storage unit 510 stores first information on online community users and first scores on expertise of online community users. In addition, various information on the online community users may be stored in the online community user information storage unit 510. Online community user expertise information 591 of the first scores and the first information are transferred from the online community user information storage unit 510 to the expertise operation unit 530.
- the online community post storage unit 520 stores second information on the online community posts created by the online community users and third information on interactions which online community users other than a specific user perform on the online community post created by the specific user.
- various information on the online community posts may be stored in the online community post storage unit 520.
- Interaction information 592 of the third information and the second information are transferred from the online community post storage unit 520 to the expertise operation unit 530.
- the expertise operation unit 530 calculates a first score representing the expertise of the specific online community user for each of the online community users by using scores on the expertise of the other online community users and the third information.
- the expertise operation unit 530 may include a multiplier for performing a first operation on each of the other online community users, the first operation including a multiplication of the scores of the other online community users and the third information, and an adder for performing a second operation including a sum of the results of the first operation on the other online community users.
- An expertise score 593 serving as the first score is transferred from the expertise operation unit 530 to the search unit 550 and the popularity operation unit 560.
- the semantic similarity operation unit 540 calculates a second score representing the semantic similarity between the inputted search term and the online community posts.
- a semantic similarity score 595 serving as the second score is transferred from the semantic similarity operation unit 540 to the search unit 550.
- the search unit 550 searches online community users or online community posts by using the first score or a third score and the second score.
- the search unit 550 may provide search results to the users by using the expertise score 593 serving as the first score, the semantic similarity score 595 serving as the second score and a popularity score 596 serving as the third score.
- the popularity operation unit 560 is selectively included according to the embodiment.
- the popularity operation unit 560 calculates the third score representing the popularity of the specific online community user for each of the online community users by ranking the online community users according to their first score values and mapping onto the ranks a function whose value is higher the closer a rank is to the middle rank.
- the search unit 550 may be configured to use the second score and at least one of the first score and the third score.
- the popularity score 596 serving as the third score is transferred from the popularity operation unit 560 to the search unit 550.
- the technical idea of the present invention may be applied to all types of online community posts such as blogs, online social networks and forums or bulletins, which are sharable online contents created by online community users.
- the present invention is not limited to the above-described embodiment.
- modules, functional blocks, means or any combination thereof may be embodied by various well-known devices such as an electronic circuit, an integrated circuit, and an application specific integrated circuit (ASIC).
- Embodiments within the scope of the present invention may also include computer-readable media for carrying computer-executable instructions or data structures stored thereon.
- Such computer-readable media can be any available media that can be accessed by a general purpose or special purpose computer.
- Such computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to carry or store desired program code means in the form of computer-executable instructions or data structures.
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Information Transfer Between Computers (AREA)
Abstract
An online community post search method based on interactions between online community users includes: receiving first information on online community users and second information on online community posts created by the online community users; acquiring, for each of the online community users, third information on interactions which online community users other than a specific online community user among the online community users perform on an online community post of the specific online community user; and calculating, for each of the online community users, a first score representing a trend of the specific online community user by using trend scores representing trends of the other online community users and the third information. Further, the method includes searching the online community users or the online community posts by using the first score based on interactions between the online community posts.
Description
The present invention relates to an online community post search technique based on interactions between online community users; and more particularly, to an online community post search technique based on interactions between online community users to provide specialized results according to search intentions of online community users.
Recently, online community posts created in various online forums, e.g., blogs, online social networks and bulletins, have been spreading rapidly as social media using the Internet. Online community web sites related to specific interests such as photography, music, science and digital equipment are popular. Some online community web sites have several tens of thousands of registered users and millions of items of user-created content (UCC).
Among the online community posts, a blog is a representative one. According to statistical data, the number of blog users ranges from millions to tens of millions. In modern life, which is inseparable from the Internet, the blog extends over all aspects of life, e.g., society, economy, culture and politics, and its influence tends to increase. Blog users, i.e., bloggers, share a cyberspace referred to as the blogosphere. The blogosphere has something in common with the general web in that users are connected by hyperlinks. However, the blog is different from the general web in that the blog provides a private space. A private blog freely presents individual interests such as politics, economy, culture, art, sports and hobbies via posts containing text, pictures and the like.
The blog has a more regularized form than the forms of general web pages, and mainly deals with individual concerns. Accordingly, a search engine may be required to provide more detailed and powerful search functions only for blogs. Actually, blog search techniques have been developed all over the world, and many research papers have been published.
At present, search results based on three types of information can be obtained from blog sites. First, search ranking may be determined based on simple statistical figures of the blog posts. For example, there are the order of replies, track back best, the order of attention, the order of empathy, recommended contents, popular contents, the order of points, attractive blogs, top bloggers and the like. Second, search ranking may be determined based on times at which blog posts are created. For example, there are new contents, today's hot blogs, today's hot posts, the order of registration, real-time popular contents, real-time new contents, the order of the latest contents, top bloggers of this week and the like. Third, search ranking may be determined based on semantic matching (TF*IDF) between query keywords such as tags, keywords and the order of accuracy, and keywords of blog posts. In brief, conventional search ranking may be determined based on three factors: simple statistical figures, creation times and semantic matching.
Meanwhile, the themes of the research papers published in the blog search field may be classified into two categories: influential blogger searching techniques and blog ranking techniques. The research on each category is described in brief below.
First, the research on influential blogger detection techniques is as follows. The paper of Nitin Agarwal et al. (2008) deals with detection of influential bloggers in the blogosphere. This paper proposed four statistical characteristics for measuring social gestures of the bloggers: a recognition characteristic such as the number of citations, an activity generation characteristic such as the number of comments, an originality characteristic such as the number of citations of other blogs, and an eloquence characteristic such as the length of the post. In order to calculate an influence score of a blog post, the weighted sum of the respective characteristics is calculated, and for each blogger the highest value of his/her blog post scores is determined as his/her recognition score. The paper of Bi Chen et al. (2007) proposed a method for predicting a future topic by analyzing behavior patterns of bloggers in the time dimension and the social dimension. The paper of Xiaodan Song et al. (2007) proposed a technique for detecting an influential opinion leader in the blogosphere by introducing the concept of an influence rank. The paper of Craig Macdonald et al. (2008) disclosed a technique to search the entire blog or each post in the blog for experts on specific topics of interest. The paper of Akshay Java et al. (2006) proposed a method for detecting influential bloggers by an influence model. Further, the paper of P. Jurczyk et al. (2007) proposed a method for detecting authoritative users in a question/answering system (e.g., 'Yahoo! Answers').
Second, the research on blog ranking techniques is as follows. The paper of A. Kritikopoulos et al. (2006) proposed a blog ranking technique using an implicit hyperlink. For example, it is assumed that two blog posts having the same topic are connected to each other by an implicit hyperlink. The implicit hyperlink may make up for a sparse link connection between blog posts. The implicit hyperlink is applied to bloggers and commenters in the same group in addition to the blog posts having the same topic. This paper also proposed a blog ranking algorithm using PageRank coefficients disclosed in the paper of L. Page et al. (2008). The paper of Xiaochuan Ni et al. (2007) proposed a technique for searching blog posts containing useful and influential information. The paper of Kritsada Sriphaew et al. (2008) proposed a technique for searching a cool blog based on characteristics related to topics of blogs. It proposed a stochastic model by supposing that the cool blog has clear topics, contains a sufficient number of posts, and provides topic consistency, for example.
From the above description, it can be seen that conventional search techniques of bloggers and blog posts place the focus on one aspect, i.e., influence or information. The most influential or informative blogger or blog post was an important solution in conventional search techniques. However, the problems in the conventional search techniques of bloggers and blog posts have also been recognized in other search techniques of online community posts and users.
Users who search online community posts such as blogs have various search intentions, fashions, and tendencies. Even though many users searching online community posts input the same search term, they may have different search purposes. For example, when a search term of 'landscape' is inputted, a certain user may intend to search artistic pictures, whereas another user may intend to search general pictures. Thus, it is difficult for the conventional search techniques, in which ranking functions were implemented on the basis of one aspect, to meet the various search demands of the online community search users.
In view of the above, the present invention provides an online community post search technique capable of providing search results according to various intentions of search users by using social network analysis and data mining technique.
Further, the present invention provides search results satisfying various intentions of the users with a simple search term by introducing the concept of a trend such as expertise and popularity.
In accordance with a first aspect of the present invention, there is provided a method of searching online community posts based on interactions between online community users, the method including: receiving first information on online community users and second information on online community posts created by the online community users; acquiring, for each of the online community users, third information on interactions in which online community users other than a specific online community user among the online community users perform on an online community post of the specific online community user; calculating, for each of the online community users, a first score representing trend of the specific online community user by using trend scores representing trends of the other online community users and the third information; and searching the online community users or the online community posts by using the first score based on interactions between the online community posts.
In accordance with a second aspect of the present invention, there is provided a computer-readable storage medium storing a program for executing the method described above.
In accordance with a third aspect of the present invention, there is provided an apparatus for searching online community posts based on interactions between online community users, the apparatus including: an online community user information storage unit for storing first information on online community users and first scores representing trends of the online community users; an online community post storage unit for storing second information on online community posts created by the online community users and third information on interactions which online community users other than a specific online community user of the online community users perform on an online community post of the specific online community user; and an expertise operation unit for calculating, for each of the online community users, a first score representing a trend of the specific online community user by using trend scores representing trends of the other online community users and the third information.
In accordance with the present invention, it is possible to provide search results satisfying various intentions of users with a simple search term.
Further, it is possible to reflect changes in the online community post environment on a real time basis by defining a trend such as expertise and popularity based on interactions between online community users and performing a search operation with the trend.
FIG. 1 illustrates interactions of online community users in their online community posts;
FIG. 2 is a flowchart showing steps of an online community post search method based on interactions between online community users in accordance with an embodiment of the present invention;
FIG. 3 shows the expertise scores aligned according to the expertise ranks of the online community users;
FIG. 4 shows a process for calculating a popularity score by mapping a Gaussian distribution function to the expertise scores aligned according to the expertise ranks of the online community users; and
FIG. 5 is a block diagram showing a schematic configuration of an online community post search apparatus based on interactions between online community users in accordance with the embodiment of the present invention.
Hereinafter, an embodiment of the present invention will be described in detail with reference to the accompanying drawings which form a part hereof.
In the embodiment of the present invention, the concept of a trend is introduced as an index showing various search intentions of users, fashion, and tendency to utilize in post search. The trend-based search technique will be described in detail with expertise and popularity in the embodiment of the present invention.
For instance, when searching blogs related to movies, the user does not search the blogs with only one criterion, so that a query such as "Search the blogs related to the most important movie" is meaningless. Here, the user may search the blogs related to the movies with the trend of popularity and expertise. In movie-related blog search, although different users search blogs related to movies of the same genre, some users prefer popular movies, while other users prefer movies of high expertise and artistry. Such tendency or preference of the searching user is defined in the embodiment of the invention as a trend. Accordingly, when the concept of the trend is introduced in search, it is possible to treat a search query such as "Search the blog posts related to movies of high expertise" or "Search the blog posts related to the most popular movies", unlike a conventional method of searching blog posts in which the movie titles or actor/actress names are merely matched. Such trend-based blog search cannot be realized by simple comparison of statistical figures such as recommendations and hits for blog posts.
Hereinafter, there will be described the embodiment of the present invention in which post search is performed by using the concept of the trend.
FIG. 1 illustrates interactions of online community users in their online community posts. In this case, online community users A, B and C perform interactions in their online community posts.
An online community post 110 of the online community user A includes three posts: a post A1 111, a post A2 112 and a post A3 113. The post A1 111 has a reply 171 of the online community user B. The post A2 112 has an empathy 172 of the online community user B. The post A2 112 is linked via a link 174 to a post C1 151 written by the online community user C. The post A3 113 has a recommendation 175 of the online community user C. An online community post 130 of the online community user B includes two posts: a post B1 131 and a post B2 132. The post B1 131 has a comment 173 of the online community user A. The post B2 132 is taken to a post C2 152 by a scrap 177 of the online community user C. An online community post 150 of the online community user C includes two posts: the post C1 151 and the post C2 152. The post C1 151 has the link 174 connected to the post A2 112 of the online community user A. The post C2 152 drawn from the post B2 132 of the online community user B by the scrap 177 is connected to the post A3 113 of the online community user A by a track back 176. Here, the reply 171, the empathy 172, the comment 173, the link 174, the recommendation 175, the track back 176, the scrap 177 and the like have different names and forms depending on the online community service, but can be represented in numbers as interactions between the online community users. The interactions between the online community users may have various forms in addition to the above-mentioned ones.
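The interactions of FIG. 1 can be represented in numbers roughly as follows. This is only a minimal sketch: the tuple layout and the names INTERACTIONS and interaction_counts are hypothetical, and the direction assigned to the link 174 and the track back 176 (from user C toward posts of user A) is an assumption drawn from the description above.

```python
from collections import Counter

# Each entry: (acting user, owner of the target post, interaction type, reference numeral).
# Assumption: the link 174 and the track back 176 are counted as interactions
# performed by user C on posts of user A.
INTERACTIONS = [
    ("B", "A", "reply", 171),
    ("B", "A", "empathy", 172),
    ("A", "B", "comment", 173),
    ("C", "A", "link", 174),
    ("C", "A", "recommendation", 175),
    ("C", "A", "track back", 176),
    ("C", "B", "scrap", 177),
]


def interaction_counts(interactions):
    """Count the interactions for each (acting user, post owner) pair."""
    return Counter((actor, owner) for actor, owner, _kind, _ref in interactions)


counts = interaction_counts(INTERACTIONS)
for (actor, owner), n in sorted(counts.items()):
    print(f"{actor} -> {owner}: {n}")
```

Counts of this kind are one possible concrete form of the third information used in the steps described below.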
FIG. 2 is a flowchart illustrating a method for an online community post searching based on interactions between online community users in accordance with the embodiment of the present invention.
First, first information on online community users and second information on online community posts created by the online community users are entered and stored in step S210. The first information includes user information such as IDs, names, addresses and the like, but it is not limited thereto. The second information includes titles and contents of the posts and the number of feedbacks, but it is not limited thereto. Further, third information on interactions in which online community users other than a specific user of the online community users perform on the online community post written by the specific user is acquired for each of the online community users and stored in step S220. The third information includes the reply 171, the empathy 172, the comment 173, the link 174, the recommendation 175, the track back 176, and the scrap 177, but it is not limited thereto. Then, a first score representing the expertise of the specific user is calculated for each of the online community users by using scores on the expertise of the other online community users and the third information in step S230. The first score can be used to determine a trend score indicating ranking of the online community posts. Further, a second score representing the semantic similarity between the inputted search term and the online community posts is calculated in step S240. Further, the online community users or the online community posts are searched by using the first score and the second score in step S250.
The online community post search method in accordance with the embodiment of the present invention will be explained in detail. In accordance with the embodiment of the present invention, a social network between users in an online community space is modeled in a graph and online community user clustering is performed to create a sub community including the users having high correlation. The sub community created by the online community user clustering includes online community users who are representative in their fields. A trend ranking value of the online community user is calculated for each of the online community users in the sub community. The trend ranking value of the user is a score numerically showing the expertise and popularity and it can be obtained by social network analysis, data mining technique and the like.
When the user submits a query in the form of a search term to search the posts, the online community posts are ranked by using a sum of a semantic similarity score between the user query and the online community post and a trend score of the online community post to return the ranking result. The trend score of the online community post is acquired from the trend ranking values of the online community users who perform interactions on the corresponding post or have correlation with it. This technique can be applied in the same way to all types of online communities including blogs. Although it has been described that the post ranking result is calculated by the sum of the semantic similarity score and the trend score, those skilled in the art will recognize that the post ranking may be made by using only the trend score.
Hereinafter, the embodiment of the present invention will be described by using the social network analysis. The online community posts created by the online community users in fields of movies, music, photographs and the like can be ranked in aspects of, e.g., popularity and expertise. Some searchers prefer popular photographs, while other searchers prefer photographs of high expertise and photographic quality. Such tendency or preference of the searcher is defined as the trend as described above.
The basic idea of ranking of the online community users using the trend of expertise is as follows. Assuming that a certain online community post includes high-quality and expert-level contents, the online community post may get more responses of experts than those of amateurs. On the other hand, if a certain post includes low-quality and amateur-level contents, the experts may lose interest in the post.
From the above assumption, an expertise rank (ER) value that is an expertise score of the online community user can be defined based on the interactions and relationships between the online community users. The ER value of an online community user 'u' is influenced by the ER values of other online community users 'v' who make responses or interactions such as replies, comments, empathy, recommendations, track backs, scraps and links on the posts made by the online community user 'u'. Here, a case of comments is described as an example. The set of posts made by the online community user 'u' is referred to as Au; the set of all comments written by an online community user 'v' is referred to as Cv; and the set of comments written by the online community user 'v' on the posts which belong to Au is referred to as CAu,v. In this case, an expertise rank value ER(u) of the online community user 'u' may be formulated as in Math Figure 1,
where ER(u) is an expertise score of a specific online community user 'u', ER(v) is an expertise score of other online community users 'v', Au is a set of online community posts created by the specific online community user 'u', CAu,v is the number of interactions performed by the other online community users 'v' on the online community posts belonging to Au, Cv is the number of all interactions performed by the other online community users 'v' and 'd' is a damping factor representing the minimum influence.
Math Figure 1 is only an example, and the expertise score may be calculated by various modifications of Math Figure 1. In this embodiment, the expertise score of the other online community users is recursively used to calculate the expertise score of the specific online community user. However, various other methods using the interactions between the online community users may also be employed. The activity, reputation, sociability and the like of the online community users are considered in the calculation of the expertise score.
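As an illustration of such a recursive calculation, the following sketch assumes a PageRank-style form, ER(u) = (1 - d) + d * sum over v of ER(v) * CAu,v / Cv, which is consistent with the symbol definitions above but is not taken from Math Figure 1 itself. The function name expertise_rank, the default d = 0.85 and the fixed iteration count are hypothetical choices.

```python
def expertise_rank(counts, d=0.85, iterations=50):
    """Iteratively estimate ER values.

    counts[(v, u)] is the number of interactions user v performed on posts of
    user u (CAu,v). Assumed update: ER(u) = (1 - d) + d * sum_v ER(v) * CAu,v / Cv.
    """
    users = sorted({name for pair in counts for name in pair})
    # Cv: number of all interactions performed by each user v.
    total_by_actor = {v: 0 for v in users}
    for (v, _u), n in counts.items():
        total_by_actor[v] += n

    er = {u: 1.0 for u in users}
    for _ in range(iterations):
        new_er = {}
        for u in users:
            incoming = sum(
                er[v] * n / total_by_actor[v]
                for (v, target), n in counts.items()
                if target == u and v != u
            )
            new_er[u] = (1 - d) + d * incoming
        er = new_er
    return er


# Interaction counts between three users (hypothetical example data).
counts = {("B", "A"): 2, ("C", "A"): 3, ("A", "B"): 1, ("C", "B"): 1}
scores = expertise_rank(counts)
for user, value in sorted(scores.items(), key=lambda item: -item[1]):
    print(user, round(value, 4))
```

In this example user C receives no interactions from others, so its score stays at the floor (1 - d), which matches the description of 'd' as representing the minimum influence.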
FIG. 3 plots the expertise scores against the expertise ranks of the online community users, which results from the ER distribution of the online community users obtained by Math Figure 1. Horizontal and vertical axes are plotted in log scale. As the value of the expertise rank of the online community user increases, in other words, as the expertise rank is lowered, the expertise score or the ER value decreases. The expertise score is 1000 or more if the expertise rank of the online community user is less than 10, whereas the expertise score is reduced to be less than 10 if the expertise rank of the online community user is more than 1000.
FIG. 4 shows a process for calculating a popularity score by mapping a Gaussian distribution function to the plot of the expertise scores against the expertise ranks of the online community users. First, the concept of the popularity trend will be explained before explanation of the mapping process.
The popularity of the online community user is closely related to the expertise, but the popularity and expertise are not contrary to each other. Some online community users may have high expertise and high popularity, and some online community users may have low expertise and low popularity. It can be determined that the online community user having popularity is preferred by general online community users who have middle-level expertise rather than a high-level and expert user or a low-level and amateur user.
From this point of view, the popularity score of the online community user can be defined by using the ER value serving as the above-described expertise score. The popularity score can be obtained by arranging the online community users according to the ER values, mapping a Gaussian distribution function to the alignment results, and applying a weight. In this case, the popularity rank (PR) of the online community user can be calculated by the following Math Figure 2:
where PR(u) is a popularity score of a specific online community user 'u'; 'x' is an expertise rank of the specific online community user 'u'; 'σ' is a standard deviation of the expertise ranks; and 'μ' is a middle value of the expertise ranks.
Math Figure 2 is only an example, and the popularity score may be calculated by various modifications of Math Figure 2. In this embodiment, the popularity score of the online community users is calculated by mapping a Gaussian distribution function to the expertise scores aligned according to the expertise ranks of the online community users. However, various other methods for applying a high popularity score to an online community user having middle-level expertise may be employed.
As can be seen from FIG. 4, a Gaussian distribution function 420 is mapped to alignment results 410 of the expertise scores aligned according to the expertise ranks of the online community users. The Gaussian distribution function 420 may be mapped to have a maximum value at an approximately middle value of the alignment results 410 of the expertise scores. In this case, a maximum popularity score can be applied to the online community user having middle-level expertise.
In FIG. 4, horizontal and vertical axes are plotted in log scale. Accordingly, the middle value of the alignment results 410 of the expertise scores is substantially a middle value of results calculated in log scale. When the number of all online community users is 10,000 (log 10,000 = 4), a maximum popularity score is applied to the online community user having an expertise rank of about 100 (log 100 = 2). For example, in FIG. 4, an expertise score 440 of the online community user having an expertise rank of about 100 is about 100. However, a popularity score 445 of the online community user having an expertise rank of about 100 is calculated to be very high, e.g., about 3000 in mapping of the Gaussian distribution function 420. Further, an expertise score 430 of the online community user having an expertise rank of about 10 is about 1000. However, a popularity score 435 of the online community user having an expertise rank of about 10 is calculated to be very low, e.g., about 10 in mapping of the Gaussian distribution function 420.
However, the mapping function and mapping scale may be applied in various forms. For example, when a graph having horizontal and vertical axes plotted in general scale instead of log scale is created and a Gaussian distribution function is mapped, if the number of all online community users is 10,000, a maximum popularity score is given to an online community user having an expertise rank of about 5000. What expertise rank the user, who is given a maximum popularity score, has may be determined according to the various environment of the online community.
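The effect of the mapping scale can be sketched as follows. The code assumes a weighted Gaussian density, PR(u) = w * exp(-(x - μ)^2 / (2σ^2)) / (σ * sqrt(2π)), which agrees with the symbols defined for Math Figure 2 but is an assumption rather than a reproduction of it. The names popularity_scores, weight and log_scale are hypothetical, and μ is taken as the midpoint of the (transformed) rank range.

```python
import math


def popularity_scores(num_users, weight=1.0, log_scale=True):
    """Map a Gaussian over expertise ranks 1..num_users and return {rank: score}."""
    if num_users < 2:
        raise ValueError("at least two users are needed to define a spread")
    transform = math.log10 if log_scale else float
    xs = [transform(rank) for rank in range(1, num_users + 1)]
    mu = (xs[0] + xs[-1]) / 2.0  # middle value of the (transformed) ranks
    mean = sum(xs) / len(xs)
    sigma = math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return {
        rank: weight * norm * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))
        for rank, x in enumerate(xs, start=1)
    }


log_pr = popularity_scores(10_000, log_scale=True)
lin_pr = popularity_scores(10_000, log_scale=False)
peak_log = max(log_pr, key=log_pr.get)  # rank with the highest score, log scale
peak_lin = max(lin_pr, key=lin_pr.get)  # rank with the highest score, general scale
print(peak_log, peak_lin)
```

With 10,000 users, the log-scale mapping peaks at an expertise rank of 100 and the general-scale mapping peaks near rank 5000, in line with the two cases described above. The absolute score values depend on the weight, which is not specified here.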
Next, a method for calculating a score of an online community post based on a specific trend in a search process may be performed by using the following Math Figure 3:
According to Math Figure 3, when the user submits a query, the online community posts are ranked by using a score obtained by combining a semantic similarity score between the user query and the online community post with a trend score of the online community post, thereby finally returning the ranking result. Meanwhile, a specific score calculation method will be described later. Although a weighted average concept is used in Math Figure 3, various other equations using both a semantic similarity score and a trend score or using another value in addition to a semantic similarity score may be employed.
In some cases, the user may select and input a trend of search in advance before entering an initial search term. In this case, several formulas for calculating a post score based on the trend corresponding to Math Figure 3 are prepared for the selectable trends such that a corresponding calculation formula can be used according to the selection of the user. Further, when an online community post score is calculated by combining one semantic similarity score with plural trend scores, a search method using one or more trends is available.
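A weighted-average combination of this kind can be sketched as follows. The form alpha * similarity + (1 - alpha) * trend, the parameter names alpha and trend_weights, and the assumption that all scores are normalized to the range 0 to 1 are illustrative choices, not the content of Math Figure 3.

```python
def post_score(similarity, trend_scores, alpha=0.5, trend_weights=None):
    """Weighted average of one semantic similarity score and one or more trend scores."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if trend_weights is None:
        trend_weights = {name: 1.0 for name in trend_scores}
    total = sum(trend_weights[name] for name in trend_scores)
    if total <= 0:
        raise ValueError("at least one trend needs a positive weight")
    trend = sum(trend_weights[name] * value for name, value in trend_scores.items()) / total
    return alpha * similarity + (1.0 - alpha) * trend


def rank_posts(posts, alpha=0.5, trend_weights=None):
    """posts maps post id -> (similarity, {trend name: trend score}); best post first."""
    def key(post_id):
        similarity, trends = posts[post_id]
        return post_score(similarity, trends, alpha, trend_weights)
    return sorted(posts, key=key, reverse=True)


posts = {
    "p1": (0.9, {"expertise": 0.2, "popularity": 0.8}),
    "p2": (0.6, {"expertise": 0.9, "popularity": 0.1}),
}
# The same query ranked under two trends selected by the user.
expert_first = rank_posts(posts, alpha=0.3, trend_weights={"expertise": 1.0, "popularity": 0.0})
popular_first = rank_posts(posts, alpha=0.3, trend_weights={"expertise": 0.0, "popularity": 1.0})
print(expert_first, popular_first)
```

Selecting a trend then amounts to choosing the trend weights, so the same search term can return different orderings for different search intentions.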
A score of an online community post 'p' related to a specific trend can be calculated by a function Γ as follows:
where Up is a set of online community users having connections with the post p, and Pp is a set of posts having connections with the post p.
That is, a score of the online community post related to a specific trend may be defined by interactions between the post and the online community users having connections with the post. Representative examples of a function Γ for calculating a trend score of the post based on a user score are as follows, but the function is not limited thereto.
① Baseline method
It is the simplest method in which the number of feedback users on the post created by the user is regarded as a score of the post.
baseline_post_score (p) = the number of feedback users of p
The more feedback the post gets, the higher score the post gets. This method has an advantage of simplicity, but has a disadvantage that the quality of the post cannot be considered.
② Weight application method
In a different way from the baseline method, ranking of the user who provides a feedback to the post is considered. That is, a score of the user influences a score of the post.
Unlike the baseline method, a high score is given to the posts preferred by high-ranking users. Accordingly, there is an advantage that the quality of the post can be considered.
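The difference between the baseline method and the weight application method can be sketched as follows. The weighted score is assumed here to be the sum of the scores of the distinct feedback users, which is only one possible form. The function name weighted_post_score and the example data are hypothetical.

```python
def baseline_post_score(feedback_users):
    """Baseline method: the number of distinct users who gave feedback on the post."""
    return len(set(feedback_users))


def weighted_post_score(feedback_users, user_scores):
    """Weight application method (assumed form): sum of the feedback users' scores."""
    return sum(user_scores.get(user, 0.0) for user in set(feedback_users))


# Hypothetical user scores, e.g. expertise scores.
user_scores = {"expert1": 900.0, "expert2": 700.0,
               "novice1": 5.0, "novice2": 4.0, "novice3": 3.0}
post_x = ["novice1", "novice2", "novice3"]  # more feedback, from low-ranking users
post_y = ["expert1", "expert2"]             # less feedback, from high-ranking users

print(baseline_post_score(post_x), baseline_post_score(post_y))
print(weighted_post_score(post_x, user_scores), weighted_post_score(post_y, user_scores))
```

Under the baseline method post_x ranks above post_y, while under the weighted form the order is reversed because post_y is preferred by high-ranking users.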
③ User/content co-ranking method
In a co-ranking method, a score of the post and a score of the user are simultaneously calculated by mutual reinforcement. That is, each user is given a portion of the score of his/her own posts, whereas each post is given a portion of the score of the users who provide feedback. These operations can be expressed by the following equations:
Here, the two vectors in these equations are a user score vector and a post score vector, respectively, W is a matrix whose elements show the relationship between a post and the creator of the post, and F is a matrix whose elements show the relationship between a feedback provider and a target post. In the co-ranking method, the users and the contents exchange scores, and both the amount and the quality of the contents may be considered. That is, content which gets feedback from a high-ranking user has a high score, and a user who creates content having a high score has a high score.
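The mutual reinforcement between user scores and post scores may be sketched as the following iteration. This is an assumption-laden illustration and not the equations of the embodiment: the dictionaries `authors` and `feedback` stand in for the matrices W and F, and the normalization and the `damping` term are added here only to keep the iteration stable.

```python
def co_rank(authors, feedback, damping=0.85, iterations=50):
    """Co-rank users and posts by mutual reinforcement (illustrative sketch).

    authors: dict mapping post id -> user id of the creator (role of W).
    feedback: dict mapping post id -> iterable of feedback user ids (role of F).
    Returns (user_score, post_score) as two dicts.
    """
    users = set(authors.values())
    for givers in feedback.values():
        users.update(givers)
    posts = list(authors)

    user_score = {u: 1.0 / len(users) for u in users}
    post_score = {p: 1.0 / len(posts) for p in posts}

    for _ in range(iterations):
        # Each post receives the scores of the users who gave it feedback.
        raw = {
            p: sum(user_score[u] for u in set(feedback.get(p, ())))
            for p in posts
        }
        total = sum(raw.values())
        if total > 0:
            post_score = {p: value / total for p, value in raw.items()}
        else:
            post_score = {p: 1.0 / len(posts) for p in posts}

        # Each user receives the scores of the posts he or she created,
        # plus a small base score (assumed) shared equally by all users.
        user_score = {
            u: (1.0 - damping) / len(users)
            + damping * sum(post_score[p] for p in posts if authors[p] == u)
            for u in users
        }
    return user_score, post_score
```

In this sketch a post that receives feedback from a user whose own posts are well received ends up with a higher score than a post that receives feedback from a user whose posts receive none.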
④ Control user method
This is a post scoring method in which a specific control user is designated and a score is given to posts preferred by the designated user and users having ranks close to the rank of the designated user. This method is represented as follows:
According to this equation, the score of a certain post 'p' is calculated by measuring how much the control user 'uc', together with the users whose scores differ from the score of the control user 'uc' by 'w' or less, have contributed to the post 'p'. The function score_similarity calculates the score similarity between the control user and a specific user. The score similarity between users can be defined in various forms, such as a constant function, a trigonometric function or a Gaussian function. For example, the score similarity between users using a Gaussian distribution can be defined as follows:
Further, the function Γ can be variously designed according to the trend without departing from the scope of the invention. For a main trend other than the expertise and popularity described in the embodiment, various data mining techniques may be used in the design of the function Γ. Examples of applicable data mining techniques include 1-mode/2-mode social network analysis, Markov chain random walk model analysis, association rule mining, classification and the like.
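As one possible design of the function Γ, the control user method described above may be sketched as follows. The exact equations of the embodiment are not reproduced here; the sketch assumes an unnormalized Gaussian for the score similarity and assumes that the contribution of a user is the number of interactions that user performed on the post. The parameter `sigma` and all names are hypothetical.

```python
import math


def score_similarity(score_a, score_b, sigma=1.0):
    """Gaussian similarity between two user scores (assumed, unnormalized form)."""
    return math.exp(-((score_a - score_b) ** 2) / (2.0 * sigma ** 2))


def control_user_post_score(contributions, user_score, control_user, w, sigma=1.0):
    """Score a post from the contributions of users ranked close to a control user.

    contributions: dict mapping user id -> number of interactions on the post.
    user_score: dict mapping user id -> score of that user.
    w: maximum allowed difference from the control user's score.
    """
    center = user_score[control_user]
    total = 0.0
    for user, count in contributions.items():
        # Users whose score is farther than w from the control user are ignored.
        if abs(user_score[user] - center) <= w:
            total += score_similarity(user_score[user], center, sigma) * count
    return total
```

A constant or trigonometric similarity could be substituted for `score_similarity` without changing the rest of the sketch.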
Meanwhile, in order to apply trends such as expertise and popularity efficiently, online communities may be classified into categories and expertise-related information may be formed for each category. In this case, it is possible to obtain more accurate results compared to a case in which a global trend rank value, i.e., the same trend score, is applied without classification of categories of online community users and posts. For example, when a post related to landscape photography is evaluated by landscape photography experts, a more accurate evaluation can be obtained compared to a case in which the post is evaluated by portrait photography experts. To this end, the online community users and posts are clustered and classified by field, and a trend score from the same field is given relatively high importance.
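Keeping a separate trend score per category, rather than one global score per user, may be sketched as follows. For brevity the sketch assumes that the per-category score of a user is simply the amount of feedback received on that user's posts in the category; any of the scoring methods described above could be used in its place. The names are hypothetical.

```python
from collections import defaultdict


def per_category_user_scores(posts):
    """Accumulate a user score separately for each category.

    posts: iterable of (author, category, feedback_count) tuples.
    Returns a dict mapping category -> dict mapping author -> score.
    """
    scores = defaultdict(lambda: defaultdict(float))
    for author, category, feedback_count in posts:
        scores[category][author] += feedback_count
    return {category: dict(by_user) for category, by_user in scores.items()}
```

With such a structure, a post in the landscape category would be weighted by the landscape scores of its feedback users rather than by their portrait scores.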
FIG. 5 is a block diagram showing a schematic configuration of an apparatus for searching online community posts based on interactions between online community users in accordance with the embodiment of the present invention. The apparatus includes an online community user information storage unit 510, an online community post storage unit 520, an expertise operation unit 530, a semantic similarity operation unit 540, a search unit 550, and a popularity operation unit 560.
The online community user information storage unit 510 stores first information on online community users and first scores on expertise of online community users. In addition, various information on the online community users may be stored in the online community user information storage unit 510. Online community user expertise information 591 of the first scores and the first information are transferred from the online community user information storage unit 510 to the expertise operation unit 530.
The online community post storage unit 520 stores second information on the online community posts created by the online community users and third information on interactions which online community users other than a specific user perform on the online community post created by the specific user. In addition, various information on the online community posts may be stored in the online community post storage unit 520. Interaction information 592 of the third information and the second information are transferred from the online community post storage unit 520 to the expertise operation unit 530.
The expertise operation unit 530 calculates a first score representing the expertise of the specific online community user for each of the online community users by using scores on the expertise of the other online community users and the third information. The expertise operation unit 530 may include a multiplier for performing a first operation on each of the other online community users, the first operation including a multiplication of the scores of the other online community users and the third information, and an adder for performing a second operation including a sum of the results of the first operation on the other online community users. An expertise score 593 serving as the first score is transferred from the expertise operation unit 530 to the search unit 550 and the popularity operation unit 560.
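The multiply-and-sum operation of the expertise operation unit 530 may be sketched as the following iteration. The update rule used here, ER(u) = (1 - d) + d · Σ ER(v) · C(Au, v) / C(v), is an assumption modeled on PageRank-style ranking with a damping factor d; it is offered only as an illustration of the multiplier and adder roles and is not asserted to be the equation of the embodiment.

```python
def expertise_rank(interactions, d=0.85, iterations=50):
    """Iteratively compute an expertise score for every user (illustrative sketch).

    interactions: dict mapping user v -> dict mapping user u -> number of
    interactions v performed on posts created by u.
    d: damping factor (assumed PageRank-style usage).
    """
    users = set(interactions)
    for targets in interactions.values():
        users.update(targets)

    # Total number of interactions each user performed on other users' posts.
    totals = {
        v: sum(c for u, c in interactions.get(v, {}).items() if u != v)
        for v in users
    }

    er = {u: 1.0 for u in users}
    for _ in range(iterations):
        new_er = {}
        for u in users:
            acc = 0.0
            for v in users:
                if v == u or totals[v] == 0:
                    continue
                count = interactions.get(v, {}).get(u, 0)
                # Multiplier: score of v times v's share of interactions on u's posts.
                acc += er[v] * count / totals[v]
            # Adder result combined with the minimum influence (1 - d).
            new_er[u] = (1.0 - d) + d * acc
        er = new_er
    return er
```

A user whose posts receive interactions from users who themselves have high scores ends up with a higher score than a user whose posts receive none.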
The semantic similarity operation unit 540 calculates a second score representing the semantic similarity between the inputted search term and the online community posts. A semantic similarity score 595 serving as the second score is transferred from the semantic similarity operation unit 540 to the search unit 550.
The search unit 550 searches online community users or online community posts by using the second score together with the first score or a third score. The search unit 550 may provide search results to the users by using the expertise score 593 serving as the first score, the semantic similarity score 595 serving as the second score and a popularity score 596 serving as the third score.
The popularity operation unit 560 is optionally included, depending on the embodiment. The popularity operation unit 560 calculates, for each of the online community users, the third score representing the popularity of the specific online community user by ranking the online community users according to the first score values and applying a mapping function whose value is higher the closer a rank is to the middle rank. In this case, the search unit 550 may be configured to use the second score and at least one of the first score and the third score. The popularity score 596 serving as the third score is transferred from the popularity operation unit 560 to the search unit 550.
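The rank-then-map operation of the popularity operation unit 560 may be sketched as follows, using a Gaussian as the mapping function. The sketch assumes an unnormalized Gaussian (peak value 1 at the middle rank) and, unless a value is supplied, takes the standard deviation of the ranks as the spread; both choices and all names are hypothetical.

```python
import math


def popularity_scores(first_scores, sigma=None):
    """Map each user's rank through a Gaussian centered on the middle rank.

    first_scores: dict mapping user id -> first (expertise) score.
    Returns a dict mapping user id -> third (popularity) score.
    """
    if not first_scores:
        return {}
    # Rank 1 is the user with the highest first score.
    ranked = sorted(first_scores, key=first_scores.get, reverse=True)
    n = len(ranked)
    ranks = range(1, n + 1)
    mu = (n + 1) / 2.0  # middle rank
    if sigma is None:
        # Standard deviation of the ranks; fall back to 1.0 for a single user.
        sigma = math.sqrt(sum((x - mu) ** 2 for x in ranks) / n) or 1.0
    return {
        user: math.exp(-((rank - mu) ** 2) / (2.0 * sigma ** 2))
        for rank, user in zip(ranks, ranked)
    }
```

Users at the top and at the bottom of the expertise ranking receive equally low popularity scores under this mapping, while users near the middle rank receive the highest.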
The technical idea of the present invention may be applied to all types of online community posts such as blogs, online social networks and forums or bulletins, which are sharable online contents created by online community users. The present invention is not limited to the above-described embodiment.
In the embodiment of the present invention, the modules, functional blocks, means or any combination thereof may be embodied by various well-known devices such as an electronic circuit, an integrated circuit, and an application specific integrated circuit (ASIC).
Embodiments within the scope of the present invention may also include computer-readable media for carrying computer-executable instructions or data structures stored thereon. Such computer-readable media can be any available media that can be accessed by a general purpose or special purpose computer. By way of example, and not limitation, such computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to carry or store desired program code means in the form of computer-executable instructions or data structures.
Although the embodiment of the present invention has been disclosed for illustrative purposes, those skilled in the art will appreciate that various modifications, changes and substitutions are possible without departing from the scope and spirit of the invention. For example, the present invention may be applied, in addition to text, to pictures, images and the like which can be displayed on, e.g., an LCD. Therefore, it is intended to cover all modifications and changes within the scope and spirit of the invention as disclosed in the accompanying claims.
Claims (15)
- A method of searching online community posts based on interactions between online community users, the method comprising: receiving first information on online community users and second information on online community posts created by the online community users; acquiring, for each of the online community users, third information on interactions in which online community users other than a specific online community user among the online community users perform on an online community post of the specific online community user; calculating, for each of the online community users, a first score representing trend of the specific online community user by using trend scores representing trends of the other online community users and the third information; and searching the online community users or the online community posts by using the first score based on interactions between the online community posts.
- The method of claim 1, further comprising: calculating a second score representing semantic similarity between a search term entered by a user and the online community posts; and searching the online community users or the online community posts by using the first score and the second score.
- The method of claim 1, wherein the third information is the number of the interactions, and said calculating a first score includes: performing a first operation on each of the other online community users, the first operation including a multiplication of the trend scores of the other online community users and the third information; and performing a second operation including a sum of results of the first operation on the other online community users.
- The online community post search method of claim 3, wherein said calculating a first score is performed by the following equation: where ER(u) is the first score of the specific online community user 'u', ER(v) is the trend scores of the other online community users 'v', Au is a set of online community posts created by the specific online community user u, CAu,v is the number of interactions performed by the other online community users 'v' on the online community posts belonging to the Au, Cv is the number of all interactions performed by the other online community users 'v', and 'd' is a damping factor representing a minimum influence.
- The method of claim 1, further comprising: ranking the online community users in order of the first score; calculating a second score representing semantic similarity between a search term entered by a user and the online community posts; and calculating a third score representing the trend of the specific online community user for each of the online community users by using a mapping function having a higher value as the rank is closer to a middle rank, wherein said searching is performed by using the second score and at least one of the first score and the third score.
- The method of claim 5, wherein the mapping function includes a Gaussian distribution function.
- The method of claim 6, wherein the Gaussian distribution function is represented by the following equation: where PR(u) is the third score of the specific online community user 'u', 'x' is the alignment rank of the specific online community user 'u', 'σ' is a standard deviation of alignment ranks, and 'μ' is a middle value of the alignment ranks.
- The method of claim 5, further comprising: selecting at least one of the first score and the third score for using in search to enter the selected one before entering the search term.
- The method of claim 1, further comprising: classifying the online community posts into categories based on specific criteria; and calculating the trend scores of the other online community users and the first score representing the trend of the specific online community user for each of the categories.
- The method of claim 1, wherein the interactions include at least one of replies, comments, empathy, recommendations, track backs, scraps and links.
- A computer-readable storage medium storing a program for executing the method described in any one of claims 1 to 10.
- An apparatus for searching online community posts based on interactions between online community users, the apparatus comprising: an online community user information storage unit for storing first information on online community users and first scores representing trends of the online community users; an online community post storage unit for storing second information on online community posts created by the online community users and third information on interactions which online community users other than a specific online community user of the online community users perform on an online community post of the specific online community user; and an expertise operation unit for calculating, for each of the online community users, a first score representing a trend of the specific online community user by using trend scores representing trends of the other online community users and the third information.
- The online community post search apparatus of claim 12, further comprising: a semantic similarity operation unit for calculating a second score representing semantic similarity between an inputted search term and the online community posts; and a search unit for searching the online community users or the online community posts by using the first score and the second score.
- The online community post search apparatus of claim 12, wherein the third information is the number of the interactions, and the expertise operation unit includes a multiplier for performing a first operation on each of the other online community users, the first operation including a multiplication of trend scores of the other online community users and the third information, and an adder for performing a second operation including a sum of results of the first operation on the other online community users.
- The online community post search apparatus of claim 13, further comprising a popularity operation unit for ranking the online community users according to the first score to obtain an alignment rank of the specific online community user and calculating a third score representing the trend of the specific online community user for each of the online community users by using a mapping function having a higher value as the alignment rank is closer to a middle rank, wherein the search unit uses the second score and at least one of the first score and the third score.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020090034250A KR101088710B1 (en) | 2009-04-20 | 2009-04-20 | Method and Apparatus for Online Community Post Searching Based on Interactions between Online Community User and Computer Readable Recording Medium Storing Program thereof |
KR10-2009-0034250 | 2009-04-20 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2010123264A2 (en) | 2010-10-28 |
WO2010123264A3 (en) | 2011-01-27 |
Family
ID=43011601
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/KR2010/002478 WO2010123264A2 (en) | 2009-04-20 | 2010-04-20 | Online community post search method and apparatus based on interactions between online community users and computer readable storage medium storing program thereof |
Country Status (2)
Country | Link |
---|---|
KR (1) | KR101088710B1 (en) |
WO (1) | WO2010123264A2 (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2012109487A2 (en) * | 2011-02-10 | 2012-08-16 | Microsoft Corporation | Social network based contextual ranking |
AU2013201006B2 (en) * | 2012-09-06 | 2014-07-03 | Fujifilm Business Innovation Corp. | Information classification program, information classification method, and information processing apparatus |
US20140244631A1 (en) * | 2012-02-17 | 2014-08-28 | Digitalsmiths Corporation | Identifying Multimedia Asset Similarity Using Blended Semantic and Latent Feature Analysis |
US20150373064A1 (en) * | 2014-06-18 | 2015-12-24 | International Business Machines Corporation | Enabling digital asset reuse through dynamically curated shared personal collections with eminence propagation |
CN106776792A (en) * | 2016-11-23 | 2017-05-31 | 北京锐安科技有限公司 | The method for digging and device of Web Community |
CN106886561A (en) * | 2016-12-29 | 2017-06-23 | 中国科学院自动化研究所 | Web Community's model influence sort method based on association in time interaction fusion |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102010418B1 (en) * | 2017-04-03 | 2019-08-14 | 네이버 주식회사 | Method and system for subject-based ranking considering writer-reader interaction |
US11320973B1 (en) | 2020-10-30 | 2022-05-03 | Oxopolitics Inc. | Method of providing user interface for social networking |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6845374B1 (en) * | 2000-11-27 | 2005-01-18 | Mailfrontier, Inc | System and method for adaptive text recommendation |
US20060069663A1 (en) * | 2004-09-28 | 2006-03-30 | Eytan Adar | Ranking results for network search query |
- 2009-04-20: KR application KR1020090034250A, patent KR101088710B1 (en), not active, IP Right Cessation
- 2010-04-20: WO application PCT/KR2010/002478, patent WO2010123264A2 (en), active, Application Filing
Non-Patent Citations (2)
Title |
---|
KIM, J. H. ET AL.: 'The Blog-Rank algorithm for the effective blog search.' 2008 AUTUMN CONFERENCE, KIISE vol. 35, 01 October 2008, pages 93 - 94 * |
SONG, C. W. ET AL.: 'Contents Recommendation Search System using Personalized Profile on Semantic Web.' JOURNAL OF THE KOREA CONTENTS ASSOCIATION vol. 8, no. 1, January 2008, pages 322 - 324 * |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9870424B2 (en) | 2011-02-10 | 2018-01-16 | Microsoft Technology Licensing, Llc | Social network based contextual ranking |
WO2012109487A3 (en) * | 2011-02-10 | 2012-11-08 | Microsoft Corporation | Social network based contextual ranking |
WO2012109487A2 (en) * | 2011-02-10 | 2012-08-16 | Microsoft Corporation | Social network based contextual ranking |
US10331785B2 (en) * | 2012-02-17 | 2019-06-25 | Tivo Solutions Inc. | Identifying multimedia asset similarity using blended semantic and latent feature analysis |
US20140244631A1 (en) * | 2012-02-17 | 2014-08-28 | Digitalsmiths Corporation | Identifying Multimedia Asset Similarity Using Blended Semantic and Latent Feature Analysis |
US10185765B2 (en) | 2012-09-06 | 2019-01-22 | Fuji Xerox Co., Ltd. | Non-transitory computer-readable medium, information classification method, and information processing apparatus |
AU2013201006B2 (en) * | 2012-09-06 | 2014-07-03 | Fujifilm Business Innovation Corp. | Information classification program, information classification method, and information processing apparatus |
US9628551B2 (en) * | 2014-06-18 | 2017-04-18 | International Business Machines Corporation | Enabling digital asset reuse through dynamically curated shared personal collections with eminence propagation |
US20150373064A1 (en) * | 2014-06-18 | 2015-12-24 | International Business Machines Corporation | Enabling digital asset reuse through dynamically curated shared personal collections with eminence propagation |
US10298676B2 (en) | 2014-06-18 | 2019-05-21 | International Business Machines Corporation | Cost-effective reuse of digital assets |
CN106776792A (en) * | 2016-11-23 | 2017-05-31 | 北京锐安科技有限公司 | The method for digging and device of Web Community |
CN106776792B (en) * | 2016-11-23 | 2020-07-17 | 北京锐安科技有限公司 | Network community mining method and device |
CN106886561A (en) * | 2016-12-29 | 2017-06-23 | 中国科学院自动化研究所 | Web Community's model influence sort method based on association in time interaction fusion |
Also Published As
Publication number | Publication date |
---|---|
KR101088710B1 (en) | 2011-12-01 |
KR20100115600A (en) | 2010-10-28 |
WO2010123264A3 (en) | 2011-01-27 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 10767278; Country of ref document: EP; Kind code of ref document: A2 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 10767278; Country of ref document: EP; Kind code of ref document: A2 |