KR20120130962A

KR20120130962A - System and method for analyzing similar inclination

Info

Publication number: KR20120130962A
Application number: KR1020110049080A
Authority: KR
Inventors: 성도헌
Original assignee: 성도헌
Priority date: 2011-05-24
Filing date: 2011-05-24
Publication date: 2012-12-04

Abstract

PURPOSE: A similar-tendency analysis method and system are provided to identify users with similar tendencies and supply the analysis result to the user, so that the user can easily find out another person's tendencies online. CONSTITUTION: A database (300) stores questionnaire response information for each user. A similarity calculation means (250) checks, for each user, the categories of the questionnaires to which a first user and a second user responded, based on the questionnaire response information of the first user and the second user extracted from the database. The similarity calculation means sums the number of categories for each user. The similarity calculation means calculates the number of categories that overlap between the first user and the second user. The similarity calculation means calculates the interest similarity of the second user with respect to the first user by applying a weight. [Reference numerals] (200) Similarity tendency analyzing server; (210) Transmitting/receiving means; (220) Member management means; (230) Questionnaire providing means; (240) Access information management means; (250) Similarity calculation means; (260) Personal management means; (270) Payment process means; (300) Database; (310) Member information storage means; (320) Access information storage means; (330) Questionnaire storage means

Description

Similarity Analysis Method and System {SYSTEM AND METHOD FOR ANALYZING SIMILAR INCLINATION}

The present invention relates to a technique for analyzing similarity between users, and more particularly, to a method and system for providing a questionnaire to a user and analyzing the similarity between users based on the contents answered to the questionnaire.

Today, due to the development of communication networks and communication terminals, various types of services are provided through communication systems. Such services range from User Created Contents (UCC) services to Video On Demand (VOD) services. Among them, social networking services have recently been in the spotlight.

A social networking service is an online service for forming new connections with unspecified people, or strengthening connections with existing friends, through an Internet communication medium such as a mini homepage. Such services are provided to users in various forms by almost every portal service company (e.g., Naver's MeToday, Daum, Nate's Cyworld, etc.).

These social networking services also serve to form online communities among people with common interests. For example, people interested in inline skating exchange online contact details (e.g., mini homepage address, blog address, etc.) and thereby share knowledge about inline skating within their community.

However, to form connections with others who have similar interests, a user of a social networking service must grasp the main interests of other users by visiting their homepages or blogs or by checking their profiles. That is, the user checks whether other users' interests match his or her own by accessing a homepage, a blog, or a profile. Because this approach depends on the user's manual effort, it takes a long time to form a network; and because it relies on the user's intuition, the interests may in fact turn out not to match.

The present invention has been proposed to solve this conventional problem, and its purpose is to provide a similar propensity analysis method and system that analyze similar propensities among users based on data stored in a database and recommend users with similar interests based on the analyzed information.

In particular, it is another object of the present invention to provide a method and system for analyzing similarity between users by calculating interest similarity and consensus similarity between users using response information of a user described in an online questionnaire.

Other objects and advantages of the present invention can be understood by the following description, and will be more clearly understood by the embodiments of the present invention. It will also be readily apparent that the objects and advantages of the invention may be realized and attained by means of the instrumentalities and combinations particularly pointed out in the appended claims.

According to a first aspect of the present invention for achieving the above object, a method for analyzing the similarity between users in a similarity analysis system comprises: a questionnaire category summing step of extracting, for each user, the categories of the questionnaires to which a first user and a second user responded, and summing the number of questionnaire categories for each user; a duplicate count calculating step of calculating the number of questionnaire categories that overlap between the first user and the second user; and an interest similarity calculating step of calculating the interest similarity of the second user with respect to the first user by multiplying the ratio of the number of overlapping questionnaire categories calculated in the duplicate count calculating step to the number of questionnaire categories of the first user summed in the questionnaire category summing step by, as a weight, the relative ratio of the numbers of questionnaire categories of the two users summed in the questionnaire category summing step.

According to a second aspect of the present invention for achieving the above object, a method for analyzing the similarity between users in a similarity analysis system comprises: a category summing step of extracting, for each user, at least one of a category list of the questionnaires to which a first user and a second user responded and a category list of the contents accessed by the first user and the second user, and summing, for each user, the number of categories recorded in the extracted category list; a duplicate count calculating step of calculating the number of categories that overlap between the first user and the second user; and an interest similarity calculating step of calculating the interest similarity of the second user with respect to the first user by multiplying the ratio of the number of overlapping categories calculated in the duplicate count calculating step to the number of categories of the first user summed in the category summing step by, as a weight, the relative ratio of the numbers of categories of the two users summed in the category summing step.

According to a third aspect of the present invention for achieving the above object, a method for analyzing the similarity between users in a similarity analysis system comprises: a questionnaire question summing step of extracting, for each user, the questionnaire questions answered by a first user and a second user, and summing the number of extracted questionnaire questions for each user; an identical answer calculating step of calculating the number of questionnaire questions that the first user and the second user answered identically; and a consensus similarity calculating step of calculating the consensus similarity of the second user with respect to the first user by multiplying the ratio of the number of questions calculated in the identical answer calculating step to the number of questionnaire questions of the first user summed in the questionnaire question summing step by, as a weight, the relative ratio of the numbers of questionnaire questions of the two users summed in the questionnaire question summing step.

According to a fourth aspect of the present invention for achieving the above object, a similarity analysis system comprises: a database for storing questionnaire response information for each user; and similarity calculating means for, using the questionnaire response information of a first user and a second user extracted from the database, checking for each user the categories of the questionnaires to which the first user and the second user responded, summing the number of questionnaire categories for each user, calculating the number of questionnaire categories that overlap between the first user and the second user, and calculating the interest similarity of the second user with respect to the first user by multiplying the ratio of the number of overlapping questionnaire categories to the summed number of questionnaire categories of the first user by, as a weight, the relative ratio of the summed numbers of questionnaire categories of the two users.

The present invention analyzes users who have similar tendencies and provides the analysis results to the users, which has the advantage of helping them grasp another person's disposition online easily and accurately, as well as contributing to the formation of connections on social networks.

In addition, the present invention has the advantage of providing users with a more intuitive similar-tendency analysis result by calculating the interest similarity and consensus similarity between users based on questionnaire response information and information on the content each user has accessed.

In addition, the present invention has the advantage of rationally quantifying the similarity between users by calculating the interest similarity and the consensus similarity with weights that vary depending on the users' underlying data.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate exemplary embodiments of the invention and, together with the description, serve to explain the principles of the invention. The invention shall not be construed as limited to the matters shown in the drawings.
FIG. 1 is a view showing the configuration of a similarity analysis system according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating a configuration of a similarity analysis server and a database according to an embodiment of the present invention.
FIG. 3 is a flowchart illustrating a method of analyzing similarity between users based on questionnaire response information and content access information in the similarity analysis system according to an embodiment of the present invention.
FIG. 4 is a flowchart illustrating a method of analyzing similarity between users based on questionnaire response information and content access information in the similarity analysis system according to another embodiment of the present invention.
FIG. 5 is a flowchart illustrating a method of analyzing similarity between users based on questionnaire response information and content access information in a similarity analysis system according to another embodiment of the present invention.
FIG. 6 is a flowchart illustrating a method of analyzing similarity between users based on questionnaire response information and content access information in the similarity analysis system according to another embodiment of the present invention.

The foregoing and other objects, features, and advantages of the present invention will become more apparent from the following detailed description taken in conjunction with the accompanying drawings. In the following description, well-known functions or constructions are not described in detail, since they would obscure the invention in unnecessary detail.

Prior to describing a method and system for analyzing similarity tendency according to an embodiment of the present invention, the terms used below are defined.

Interest similarity is a quantification of the overlap rate of the kinds of content in which two users share an interest.

Consensus similarity is a quantification of how much two users empathize with each other. Specifically, the consensus similarity quantifies the proportion of content that both users accessed and the proportion of questionnaire questions that both users answered identically.

Hereinafter, a preferred embodiment of the present invention will be described in detail with reference to the accompanying drawings.

FIG. 1 is a view showing the configuration of a similarity analysis system according to an embodiment of the present invention.

As shown in FIG. 1, the similarity analysis system according to an embodiment of the present invention includes a similarity analysis server 200 and a database 300.

The similarity analysis server 200 communicates with each communication terminal 100-N and with other external servers through the network 400. The network 400 includes a mobile communication network and a broadband wired Internet network; since these are well-known conventional technologies, a detailed description thereof is omitted.

The communication terminal 100-N receives an online questionnaire from the similarity analysis server 200, records response information for each item listed in the online questionnaire according to the user's operation, and transmits the response information to the similarity analysis server 200. In addition, the communication terminal 100-N may receive, from the similarity analysis server 200, contact information (e.g., a profile, an e-mail address, etc.) of another user whose interests or consensus are similar to the user's. The communication terminal 100-N may be a mobile communication terminal such as a personal digital assistant (PDA), a smartphone, a wideband code division multiple access (WCDMA) phone, or a code division multiple access (CDMA) phone, or may be a general-purpose computer such as a laptop computer, a desktop computer, or a tablet computer.

The database 300 stores member information such as an ID, a password, an address, a blog address, and a homepage address, and stores payment information for each user. In addition, the database 300 stores content to which category information is added. In particular, the database 300 stores, for each user, the questionnaire response information and the content access information received from that user's communication terminal 100-N. The content access information maps the identification information of content accessed by the user to the category information of that content. For example, the content access information may map multimedia content executed by the user to the category information of that multimedia content, or map product information for a product the user purchased or registered as an item of interest to the category information of that product.

The similarity analysis server 200 analyzes the similarity between users based on the per-user questionnaire response information and content access information in the database 300. In particular, the similarity analysis server 200 calculates interest similarity and consensus similarity as the similarity analysis. The interest similarity is the overlap rate of the content, goods, hobbies, etc. in which two users share an interest, with a weight applied, and is calculated by Equation 1 described later. The consensus similarity is the proportion of content that both users accessed and of questionnaire questions that both users answered identically, with a weight applied, and is calculated by Equation 2 described later.

FIG. 2 is a diagram illustrating a configuration of a similarity analysis server and a database according to an embodiment of the present invention.

As shown in FIG. 2, the similarity analysis server 200 according to an embodiment of the present invention includes a transceiver 210, a subscriber manager 220, a questionnaire provider 230, an access information manager 240, a similarity calculator 250, a network manager 260, and a payment processor 270. In addition, the database 300 includes a subscriber information storage unit 310, an access information storage unit 320, and a questionnaire storage unit 330.

The subscriber information storage unit 310 stores member information such as ID / password, gender, age, occupation, address, blog address, and homepage address of the subscriber.

The access information storage unit 320 stores content access information for each user. In detail, for each user, the access information storage unit 320 stores content access information in which the identification information of content accessed by that user is mapped to the category information of the content. Here, "access" refers to any act of using online content, such as playing, viewing, executing, or purchasing it. For example, the access information storage unit 320 may store content access information in which the identification information of multimedia content executed by the user is mapped to the category information of that content, or in which the identification information of a product purchased or registered as an item of interest by the user is mapped to the product's category information.

The category information indicates the classification of the content. Category information for classifying video, such as sports, drama, movie, entertainment, or documentary, may be added to each item of video content, and category information such as clothing, mobile phone, MP3, PMP, or notebook may be added to each item of shopping content. In addition, category information for classifying music, such as jazz, dance, hip hop, classical, or ballad, may be added to each item of audio content. Various other category information may likewise be added to the corresponding content and stored in the database 300.

Meanwhile, the access information storage unit 320 may store the content itself (that is, a content file) as the content identification information, or may store a URL (Uniform Resource Locator) of the content. That is, the access information storage unit 320 stores the content file or the content URL accessed by the user as the content identification information.

The questionnaire storage unit 330 stores online questionnaires, each of which lists one or more questions and has category information (e.g., workbook, movie, game, etc.) added. In addition, the questionnaire storage unit 330 stores, classified by user, questionnaire response information that records the user's response to each item of the online questionnaire together with the questionnaire's category information.

The transceiver 210 of the similarity analysis server 200 exchanges data with other servers or the communication terminals 100-N via the network 400.

The subscriber manager 220 creates, modifies, and deletes member information. In detail, the subscriber manager 220 receives data such as a name, ID/password, home address, blog address, and homepage address from a user who has requested a new membership, and stores the data in the subscriber information storage unit 310, thereby carrying out the new member registration process. When a user who has successfully logged in changes his or her information, the subscriber manager 220 reflects the changed information in the subscriber information storage unit 310 of the database 300 and updates the user data. When a user who has successfully logged in requests membership withdrawal, the subscriber manager 220 may delete that user's data from the subscriber information storage unit 310. The subscriber manager 220 may also authenticate a user by checking whether the ID and password received from the communication terminal 100-N are stored as member information in the subscriber information storage unit 310.

The questionnaire provider 230 provides an online questionnaire to members or non-members using the transceiver 210 and collects questionnaire response information. That is, the questionnaire provider 230 extracts a specific online questionnaire from the questionnaire storage unit 330, provides it to one or more communication terminals 100-N that requested the questionnaire or were designated to receive it, and stores the questionnaire response information of the users who respond in the questionnaire storage unit 330.

The access information manager 240 maps the identification information of content accessed by a user to the category information of that content and stores the mapping in the access information storage unit 320. That is, when a specific user accesses online content, the access information manager 240 checks the identification information (e.g., URL or file name) and the category information (e.g., music, game, etc.) of the content, generates content access information by mapping the identified content identification information to the category information, and stores the generated content access information in the access information storage unit 320 as data of the corresponding user.
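The mapping performed by the access information manager can be illustrated with a minimal Python sketch. The names `ContentAccess`, `access_store`, and `record_access` are hypothetical and do not appear in the specification; the in-memory dictionary merely stands in for the access information storage unit 320.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentAccess:
    """One content access record: what was accessed and its category."""
    content_id: str  # URL or file name of the content
    category: str    # e.g. "music", "game"


# Hypothetical in-memory stand-in for the access information storage unit 320:
# maps a user ID to the list of that user's content access records.
access_store = {}


def record_access(user_id, content_id, category):
    """Map the content identifier to its category and store it under the user."""
    record = ContentAccess(content_id=content_id, category=category)
    access_store.setdefault(user_id, []).append(record)
    return record


record_access("userA", "https://example.com/songs/1", "music")
record_access("userA", "game_installer.exe", "game")
```

Keeping the category alongside each content identifier means the later category counting only needs to read these records, without looking the content up again.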

The payment processor 270 processes payment of a service fee or a product purchase cost for a user who uses a paid service or purchases a specific product. In this case, the payment processor 270 may receive the service fee or product purchase cost from the user through various payment means such as a credit card, a mobile phone, a bank deposit, or a gift certificate.

The similarity calculator 250 calculates interest similarity and consensus similarity between a plurality of users using the data stored in the access information storage unit 320 and the questionnaire storage unit 330 of the database 300. In detail, the similarity calculator 250 extracts the questionnaire response information written by the reference user and the comparison target user from the questionnaire storage unit 330, extracts the content access information of the two users from the access information storage unit 320, and then calculates the interest similarity using the extracted information and Equation 1 below. Here, the reference user is the user who serves as the reference for analyzing similarity, and the comparison target user is the user whose degree of similarity to the reference user is analyzed.

n(A): number of categories of content accessed by the comparison user + number of categories of questionnaires responded to by the comparison user

n(B): number of categories of content accessed by the reference user + number of categories of questionnaires responded to by the reference user

n(A∩B): sum of the number of content categories and the number of questionnaire categories that overlap between the reference user and the comparison user

s: the smaller of n(A) and n(B)

l: the larger of n(A) and n(B)
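Equation 1 is not reproduced in this text. As a hedged reconstruction only, a form consistent with the symbol definitions above, with the claim language (the overlap ratio relative to the reference user, multiplied by a weight based on the relative ratio of the two category counts), and with the 18.69% result of the worked example given later in this description would be the following. The square root on the weight is an assumption inferred from that example (it yields approximately 18.70%, whereas an unrooted s/l would yield approximately 12.24%); the text itself does not state it.

```latex
\text{Interest similarity}\,(\%) \;=\; \frac{n(A \cap B)}{n(B)} \times \sqrt{\frac{s}{l}} \times 100
```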

In addition, the similarity calculator 250 calculates the consensus similarity by substituting the questionnaire response information and the content access information of the reference user and the comparison target user into Equation 2 below.

n(X): number of questionnaire questions answered by the comparison user + number of contents accessed by the comparison user

n(Y): number of questionnaire questions answered by the reference user + number of contents accessed by the reference user

n(X∩Y): number of questions that the reference user and the comparison user answered identically + number of contents that both the reference user and the comparison user accessed

N: the integer obtained by discarding the fractional part of (1/m) × |n(X) - n(Y)|, with N = 2 when N < 2

(m is an arbitrary constant with an initial value of 1 that increases as the sum of the total number of contents stored in the database and the total number of questions answered increases)

s: the smaller of n(X) and n(Y)

l: the larger of n(X) and n(Y)
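Equation 2 is not reproduced in this text. One reading consistent with the definitions above is that N, an integer with a minimum value of 2, serves as the index of a root applied to the ratio s/l, so that N = 2 gives a square-root weight. This role for N is an assumption, and the following sketch should not be taken as the exact formula of the specification.

```latex
\text{Consensus similarity}\,(\%) \;=\; \frac{n(X \cap Y)}{n(Y)} \times \sqrt[N]{\frac{s}{l}} \times 100,
\qquad
N \;=\; \max\!\left(2,\ \left\lfloor \frac{\lvert n(X) - n(Y) \rvert}{m} \right\rfloor \right)
```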

In Equation 2, the arbitrary constant 'm' increases as the sum of the total number of contents stored in the database 300 and the total number of question items increases, and the size of the increase is determined by an administrator. For example, when the sum of the total number of contents registered in the database 300 and the total number of questionnaires is less than 1000, 'm' may be set to 1; when the sum is more than 1000 and less than 2000, 'm' may be set to 2; and when the sum is more than 2000, 'm' may be set to 3.
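The quantities defined for Equation 2 can be sketched in Python as follows. The function names are hypothetical, the behavior at exactly 1000 and 2000 is an assumption (the text does not specify those boundaries), and the sketch only computes the inputs n(X), n(Y), n(X∩Y), m, and N; it does not assert the final form of Equation 2.

```python
def select_m(total_items):
    """Pick m from the combined count of contents and questions.

    Thresholds follow the example in the text; treating exactly 1000 as m=2
    and exactly 2000 as m=3 is an assumption.
    """
    if total_items < 1000:
        return 1
    if total_items < 2000:
        return 2
    return 3


def compute_n(n_x, n_y, m):
    """N = integer part of |n(X) - n(Y)| / m, raised to 2 when below 2."""
    return max(2, abs(n_x - n_y) // m)


def consensus_counts(ref_answers, cmp_answers, ref_contents, cmp_contents):
    """Return (n_x, n_y, n_xy) for a comparison user X and reference user Y.

    *_answers map question ID -> chosen answer; *_contents hold content IDs.
    """
    ref_contents, cmp_contents = set(ref_contents), set(cmp_contents)
    n_x = len(cmp_answers) + len(cmp_contents)
    n_y = len(ref_answers) + len(ref_contents)
    same_answers = sum(
        1 for q, a in ref_answers.items()
        if q in cmp_answers and cmp_answers[q] == a
    )
    shared_contents = len(ref_contents & cmp_contents)
    return n_x, n_y, same_answers + shared_contents


ref_answers = {"q1": "yes", "q2": "no", "q3": "yes"}
cmp_answers = {"q1": "yes", "q2": "yes"}
n_x, n_y, n_xy = consensus_counts(ref_answers, cmp_answers, {"c1", "c2"}, {"c2"})
n_root = compute_n(n_x, n_y, select_m(500))
```

In this toy data the two users agree only on question q1 and share only content c2, so the overlap count is 2 out of the reference user's 5 items.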

The network manager 260 provides the reference user with the contact information of other users who are similar to the reference user, using at least one of the interest similarity and the consensus similarity calculated by the similarity calculator 250. In this case, the network manager 260 may refer to the interest similarity calculated for each user by the similarity calculator 250 to check whether there is a user whose interest similarity exceeds a predetermined first threshold (e.g., 50%), and if so, may provide that user's profile, phone number, e-mail address, and the like to the reference user as contact information. Alternatively, the network manager 260 may refer to the consensus similarity calculated for each user by the similarity calculator 250 and, if there is a user whose consensus similarity exceeds a second threshold (e.g., 70%), provide that user's contact information to the reference user.

In addition, the network management unit 260 also serves to form a network between a plurality of users. In detail, the network manager 260 registers other users in a friend list of a specific user, thereby forming a network of users.

Hereinafter, a method of analyzing similarity between users in the similarity analysis system according to the present invention will be described in more detail with reference to FIGS. 3 to 6.

FIG. 3 is a flowchart illustrating a method of analyzing similarity between users based on questionnaire response information and content access information in the similarity analysis system according to an embodiment of the present invention.

Referring to FIG. 3, the questionnaire provider 230 extracts one or more online questionnaires from the questionnaire storage unit 330 and, using the transceiver 210, transmits the extracted online questionnaires to a plurality of communication terminals 100-N that requested them or were designated to receive them. Subsequently, the questionnaire provider 230 receives the questionnaire response information from the communication terminals 100-N that respond to the online questionnaire during the survey period and stores it in the questionnaire storage unit 330 (S301). At this time, the questionnaire provider 230 identifies the user ID, maps the user ID to the questionnaire response information, and stores them in the questionnaire storage unit 330. The questionnaire response information includes the questionnaire category information and the user's response to each question.

Subsequently, when the survey period ends, the similarity calculator 250 identifies the user information (e.g., IDs) of the one or more users who responded to the survey during the survey period and, using the user information, extracts the content access information of each user from the access information storage unit 320 of the database 300 (S303). That is, the similarity calculator 250 identifies the ID of each user who responded during the survey period and extracts the content access information mapped to each identified user ID from the access information storage unit 320. In this case, the similarity calculator 250 may select and extract only the content access information stored in the access information storage unit 320 during a predetermined period (e.g., the survey period).

Next, the similarity calculator 250 checks the content category information in the extracted content access information and, for each user, sums the total number of categories of the content that user accessed (S305). In this case, when a specific user's content category information contains duplicate category information, the similarity calculator 250 may count the duplicates as one rather than counting each occurrence when calculating the total number of categories. For example, when three items of category information, 'music', 'game', and 'game', are extracted for the content of user 'A', the similarity calculator 250 may count the duplicated 'game' category once, so that the total number of categories for user 'A' is two.
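The duplicate handling in step S305 amounts to counting distinct categories. A minimal Python sketch, with the hypothetical helper name `count_categories`, reproduces the example of user 'A':

```python
def count_categories(categories):
    """Count categories, treating repeated entries as a single category."""
    return len(set(categories))


total_for_a = count_categories(["music", "game", "game"])
print(total_for_a)  # 2
```

Converting the list to a set is what collapses the repeated 'game' entry into one.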

Next, the similarity calculator 250 extracts, separately for each user, the questionnaire response information collected during the survey period from the questionnaire storage unit 330 of the database 300 (S307). The similarity calculator 250 then checks the category information recorded in the extracted questionnaire response information and, for each user, sums the total number of categories of the questionnaires that user responded to (S309). In this case, when the category information of the questionnaires answered by a specific user contains duplicates, the similarity calculator 250 may count the duplicated category information as one rather than counting each occurrence when calculating the total number of questionnaire categories. For example, if the category information of the questionnaires answered by user 'B' is 'meeting', 'leisure', and 'meeting', the similarity calculator 250 may count the duplicated 'meeting' category once, so that the total number of questionnaire categories for user 'B' is two.

Next, the similarity calculator 250 selects a specific user as the reference user and selects the users other than the reference user as comparison target users (S311). Here, the reference user is the user who serves as the reference for analyzing similarity, and the users who answered the questionnaire may be selected in turn as the reference user. A comparison target user is a user whose degree of similarity to the reference user is analyzed.

After selecting the reference user, the similarity calculator 250 calculates the sum of the number of content categories and the number of questionnaire categories that overlap between the reference user and a specific comparison target user. Subsequently, the similarity calculator 250 calculates the interest similarity between the reference user and that comparison target user by substituting into Equation 1 the number of content categories of the reference user, the number of categories of the questionnaires in which the reference user participated, the number of content categories of the comparison target user, the number of categories of the questionnaires in which the comparison target user participated, and the sum of the overlapping content categories and questionnaire categories between the two users (S313). Using Equation 1 in the same way, the similarity calculator 250 sequentially calculates the interest similarity between the reference user and each of the remaining comparison target users.
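The selection of each respondent in turn as the reference user (S311) and the scoring of every other user against that reference (S313) can be sketched as a nested iteration. The names `pairwise_similarity` and `overlap_rate` are hypothetical, and the toy scoring function is a plain overlap rate used only to make the sketch runnable; it stands in for Equation 1 and is not Equation 1 itself.

```python
def pairwise_similarity(user_ids, score_fn):
    """For each reference user, score every other user as a comparison target.

    score_fn(reference, comparison) is a hypothetical callable standing in for
    Equation 1; results are keyed as results[reference][comparison].
    """
    results = {}
    for reference in user_ids:
        results[reference] = {
            comparison: score_fn(reference, comparison)
            for comparison in user_ids
            if comparison != reference
        }
    return results


# Toy data and a toy scoring function (plain overlap rate, no weight).
categories = {"u1": {"game", "music"}, "u2": {"game"}, "u3": {"movie"}}


def overlap_rate(reference, comparison):
    ref, cmp_ = categories[reference], categories[comparison]
    return 100 * len(ref & cmp_) / len(ref)


table = pairwise_similarity(list(categories), overlap_rate)
```

Because the overlap is divided by the reference user's own category count, the score is not symmetric, so both directions of each pair are computed rather than only one.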

For example, assume that the content categories of the reference user stored in the access information storage unit 320 are 'game', 'music', 'stock', 'movie', and 'shopping'; the categories of the questionnaires answered by the reference user are 'securities' and 'movie'; the content categories of the first comparison target user stored in the access information storage unit 320 are 'game' and 'fiction'; and the category of the questionnaire answered by the first comparison target user is 'movie'. In this case, 'game' overlaps among the content categories and 'movie' overlaps among the questionnaire categories, so '2' is substituted for n(A∩B) in Equation 1. Since the reference user has 5 content categories and 2 questionnaire categories, '7' is substituted for n(B). Since the first comparison target user has 2 content categories and 1 questionnaire category, '3' is substituted for n(A). Because n(B) is the larger and n(A) the smaller of the two, '3' and '7' are substituted for s and l, respectively. Substituting these numbers into Equation 1, the similarity calculator 250 calculates the interest similarity between the reference user and the first comparison target user as 18.69(%).
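The worked example can be checked with a short Python sketch. Since Equation 1 is not shown in this text, the function below assumes the form n(A∩B)/n(B) × sqrt(s/l) × 100; the square-root weight is an assumption chosen because it evaluates to about 18.70% for these inputs, which agrees with the stated 18.69% to within rounding. The function name and parameter names are hypothetical.

```python
import math


def interest_similarity(ref_content, ref_survey, cmp_content, cmp_survey):
    """Interest similarity (%) of a comparison user relative to a reference user.

    ASSUMPTION: the weight is sqrt(s / l); the source text does not show
    Equation 1, so this form is inferred from the worked example.
    """
    ref_content, ref_survey = set(ref_content), set(ref_survey)
    cmp_content, cmp_survey = set(cmp_content), set(cmp_survey)

    n_b = len(ref_content) + len(ref_survey)  # reference user, n(B)
    n_a = len(cmp_content) + len(cmp_survey)  # comparison user, n(A)
    # Overlap is counted separately for content and questionnaire categories.
    n_ab = len(ref_content & cmp_content) + len(ref_survey & cmp_survey)
    if n_a == 0 or n_b == 0:
        return 0.0
    s, l = min(n_a, n_b), max(n_a, n_b)
    return n_ab / n_b * math.sqrt(s / l) * 100


score = interest_similarity(
    ref_content=["game", "music", "stock", "movie", "shopping"],
    ref_survey=["securities", "movie"],
    cmp_content=["game", "fiction"],
    cmp_survey=["movie"],
)
print(f"{score:.2f}")
```

Content categories and questionnaire categories are kept in separate sets, matching the example, where 'movie' counts once among the reference user's content categories and once among the questionnaire categories.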

The interest similarity calculated through Equation 1 is high when the categories of accessed content match between the reference user and the comparison target user or when the categories of the answered questionnaires match. In addition, because a weight (the relative ratio of the category counts of the two users) is applied to the category match rate between the reference user and the comparison target user, the interest similarity is calculated more reasonably. That is, the weight becomes larger as the numbers of categories of the reference user and the comparison target user approach each other, and becomes smaller as the difference between the numbers of categories grows.

When the similarity calculator 250 has calculated the interest similarity between the reference user and all comparison target users in this manner, the network manager 260 checks whether there is a comparison target user whose interest similarity with the reference user exceeds a threshold (e.g., 50%) (S315). If such a comparison target user exists, the network manager 260 extracts that user's contact information, such as a blog address, email address, profile information, or mobile phone number, from the subscriber information storage unit 310 of the database 300 and provides it to the reference user as network recommendation information (S317, S319). In this case, the network manager 260 may provide the reference user with the contact information of all subscribers whose interest similarity is equal to or greater than the threshold. Alternatively, the network manager 260 may select a predetermined number (e.g., three) of comparison target users with high interest similarity from among the subscribers whose interest similarity is equal to or greater than the threshold and recommend them to the reference user.

On the other hand, when there is no comparison target user whose interest similarity exceeds the threshold (e.g., 50%), the network manager 260 notifies the reference user that there is no user with a matching interest similarity.
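The threshold check and recommendation of steps S315 to S319 can be sketched as follows. This is an illustrative sketch only: the function name, the dictionary layout, and the example addresses are hypothetical, a strict "exceeds" comparison is assumed, and the optional limit assumes that the users with the highest similarity are the ones selected.

```python
def recommend_contacts(similarities, contacts, threshold=50.0, limit=None):
    """Return (user_id, contact) pairs for users whose similarity exceeds the threshold."""
    # Keep only comparison target users above the threshold, highest similarity first.
    qualified = sorted(
        (uid for uid, score in similarities.items() if score > threshold),
        key=lambda uid: similarities[uid],
        reverse=True,
    )
    # Optionally keep only a predetermined number of users (e.g., three).
    if limit is not None:
        qualified = qualified[:limit]
    return [(uid, contacts[uid]) for uid in qualified]


# Hypothetical similarity scores (%) and contact details.
scores = {"user_a": 72.5, "user_b": 18.7, "user_c": 55.0, "user_d": 90.1}
contacts = {uid: f"{uid}@example.com" for uid in scores}

top_two = recommend_contacts(scores, contacts, threshold=50.0, limit=2)
none_found = recommend_contacts(scores, contacts, threshold=95.0)
if not none_found:
    print("No user with a matching interest similarity")
```

An empty result corresponds to the case in which the reference user is notified that no matching user exists.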

FIG. 4 is a flowchart illustrating a method of analyzing similarity between users based on questionnaire response information and content access information in the similarity analysis system according to another embodiment of the present invention.

In the following description with reference to FIG. 4, portions overlapping with the description of FIG. 3 are described briefly, and the differences are mainly described.

Referring to FIG. 4, the questionnaire providing unit 230 extracts the online questionnaire from the questionnaire storage unit 330 and provides it to a plurality of communication terminals 100-N by using the transceiver 210. During the survey period, the questionnaire providing unit 230 receives questionnaire response information from each communication terminal 100-N responding to the questionnaire and stores it in the questionnaire storage unit 330 (S401).

Subsequently, when the survey period ends, the similarity calculator 250 checks the IDs of the users who responded to the questionnaire during the survey period, and extracts the content access information mapped to the identified user IDs from the access information storage unit 320 (S403). In this case, the similarity calculator 250 may select and extract only the content access information stored in the access information storage unit 320 during a predetermined period (e.g., the survey period).

Next, the similarity calculator 250 sums, for each user, the total number of contents accessed by that user, based on the content identification information included in the extracted content access information (S405). Subsequently, the similarity calculator 250 extracts the questionnaire response information collected during the survey period from the questionnaire storage unit 330 separately for each user (S407). Next, the similarity calculator 250 sums, for each user, the total number of questionnaire questions answered by that user, based on the extracted questionnaire response information (S409).

Next, the similarity calculator 250 selects a specific user as the reference user and selects the users other than the reference user as comparison target users (S411). Subsequently, for the reference user and a specific comparison target user, the similarity calculator 250 calculates the number of matching pieces of content identification information (that is, the number of contents accessed by both users) and the number of questionnaire questions that the two users answered identically. The similarity calculator 250 then substitutes into Equation 2 the number of contents accessed by the reference user, the number of questions answered by the reference user, the number of contents accessed by the specific comparison target user, the number of questions answered by the specific comparison target user, the number of contents accessed by both users, and the number of questions answered identically by the two users, thereby calculating the consensus similarity between the reference user and the comparison target user (S413). In the same way, using Equation 2, the similarity calculator 250 sequentially calculates the consensus similarity between the reference user and each of the other comparison target users.

For example, assume that the content identification information of the reference user stored in the access information storage unit 320 is 'www.XXX.com/aaa.asf' and 'www.XXX.com/bbb.asp', that the total number of questionnaire questions answered by the reference user is 20, that the content identification information of the first comparison target user stored in the access information storage unit 320 is 'www.XXX.com/aaa.asf', and that the total number of questionnaire questions answered by the first comparison target user is 10. In addition, assume that the reference user and the first comparison target user gave the same answer to five questionnaire questions, and that the arbitrary constant 'm' is '4'.

In this case, 'www.XXX.com/aaa.asf' overlaps in the content identification information and five questionnaire answers overlap, so '6' is substituted for n(X∩Y) in Equation 2. In addition, because the number of pieces of content identification information stored in the access information storage unit 320 for the reference user is '2' and the total number of questions answered is '20', '22' is substituted for n(Y) in Equation 2. Likewise, because the number of pieces of content identification information stored for the first comparison target user is '1' and the total number of questions answered is '10', '11' is substituted for n(X) in Equation 2. Since n(Y) is the larger and n(X) the smaller of the two, '11' and '22' are substituted for s and l in Equation 2, respectively. Finally, because 'm' is 4 and |n(X)-n(Y)| is calculated as 11, N is obtained by discarding the fractional part of 11/4, so '2' is substituted for N.

When each of these numbers is substituted into Equation 2, the similarity calculator 250 calculates a consensus similarity between the reference user and the first comparison target user as 19.28 (%).
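The figure of 19.28 (%) can be reproduced with a short calculation. The following is a hedged sketch rather than the patented implementation: the formula image for Equation 2 is not reproduced in this text, so the sketch assumes the form n(X∩Y)/n(Y) × (s/l)^(1/N) × 100, with N taken as the integer part of |n(X)-n(Y)|/m and a minimum of 2, as N is defined in the claims. The function and parameter names are illustrative.

```python
def consensus_similarity(n_x, n_y, n_common, m=1):
    """Consensus similarity (%) of the comparison user with respect to the reference user.

    ASSUMED form of Equation 2: n(X∩Y) / n(Y) * (s / l) ** (1 / N) * 100.
    n_x: contents accessed + questions answered by the comparison target user
    n_y: contents accessed + questions answered by the reference user
    n_common: contents accessed by both + questions answered identically
    m: arbitrary constant (integer, initial value 1)
    """
    if n_x == 0 or n_y == 0:
        return 0.0  # nothing to compare
    # N: integer part of |n(X) - n(Y)| / m, never smaller than 2
    n_root = max(2, abs(n_x - n_y) // m)
    s, l = min(n_x, n_y), max(n_x, n_y)
    return n_common / n_y * (s / l) ** (1 / n_root) * 100


# Values taken from the example in the description:
# comparison user: 1 content + 10 questions, reference user: 2 contents + 20 questions,
# overlap: 1 content + 5 identical answers, m = 4.
example_consensus = consensus_similarity(n_x=1 + 10, n_y=2 + 20, n_common=1 + 5, m=4)
print(round(example_consensus, 2))
```

Under this assumed form, a larger N pushes the weight (s/l)^(1/N) toward 1, so the difference in totals between the two users is penalized less.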

The consensus similarity is calculated to be high when the number of contents accessed by both the reference user and the comparison target user is large or when the number of questionnaire questions answered identically is large. In addition, because a weight (the relative ratio of the totals of the two users) is applied to the ratio of matching contents and matching question items between the reference user and the comparison target user, the consensus similarity is calculated more reasonably. That is, the weight is higher when the number of questions answered and the number of contents accessed by the reference user match those of the comparison target user, and is lower as the difference between the two users in the number of contents accessed or the number of questions answered grows. The weight may also be increased as the total number of contents registered in the database 300 or the total number of questionnaire questions stored in the database 300 increases.

When the similarity calculator 250 has completed the consensus similarity calculation between the reference user and all comparison target users in this manner, the network manager 260 checks whether there is a comparison target user whose consensus similarity with the reference user exceeds a threshold (e.g., 70%) (S415). If there is a comparison target user whose consensus similarity exceeds the threshold, the network manager 260 extracts the contact information of one or more such users from the subscriber information storage unit 310 and provides it to the reference user as network recommendation information (S417, S419).

On the other hand, the similarity analysis server 200 may accurately find other users having similar tendencies with the reference user by using both interest similarity and consensus similarity, and provide the reference user with the recommendation information.

FIG. 5 is a flowchart illustrating a method of analyzing similarity between users based on questionnaire response information and content access information in a similarity analysis system according to another embodiment of the present invention.

Hereinafter, in the description with reference to FIG. 5, portions overlapping with the descriptions of FIGS. 3 and 4 are described briefly, and the differences are mainly described.

Referring to FIG. 5, the questionnaire providing unit 230 transmits the online questionnaire extracted from the questionnaire storage unit 330 to the plurality of communication terminals 100-N using the transceiver 210, receives questionnaire response information from each communication terminal 100-N, and stores it in the questionnaire storage unit 330 (S501).

Subsequently, the similarity calculator 250 checks the IDs of the users who responded to the questionnaire during the survey period, and extracts the content access information mapped to the identified user IDs from the access information storage unit 320 of the database 300 (S503). The content access information includes content identification information and category information of the corresponding content.

Next, the similarity calculator 250 checks the category information in each piece of extracted content access information and sums, for each user, the total number of categories of the content accessed by that user (S505). Subsequently, the similarity calculator 250 extracts the questionnaire response information of the users who responded during the survey period from the questionnaire storage unit 330 of the database 300 separately for each user (S507). The questionnaire response information includes questionnaire category information and the answers to each question of the questionnaire. Next, the similarity calculator 250 checks the category information in each piece of extracted questionnaire response information and, based on that category information, sums for each user the total number of categories of the questionnaires answered by that user (S509).
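The per-user category totals of steps S505 and S509 amount to grouping records by user ID and counting categories. A minimal sketch follows, assuming that each record is a (user ID, item ID, category) tuple and that a category accessed several times by the same user is counted once; the record layout, the identifiers, and the function name are hypothetical.

```python
from collections import defaultdict


def categories_by_user(records):
    """Map each user ID to the set of distinct categories found in that user's records."""
    result = defaultdict(set)
    for user_id, _item_id, category in records:
        result[user_id].add(category)
    return dict(result)


# Hypothetical content access records: (user ID, content ID, category).
access_records = [
    ("u1", "content-1", "game"),
    ("u1", "content-2", "music"),
    ("u1", "content-3", "game"),  # second 'game' item, same category
    ("u2", "content-1", "game"),
]

content_categories = categories_by_user(access_records)
category_counts = {uid: len(cats) for uid, cats in content_categories.items()}
print(category_counts)
```

The same helper can be applied to questionnaire response records, since they carry category information in the same way.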

Next, the similarity calculator 250 selects a specific user as the reference user and selects the users other than the reference user as comparison target users (S511). The similarity calculator 250 then substitutes into Equation 1 the number of content categories of the reference user, the number of categories of questionnaires in which the reference user participated, the number of content categories of a specific comparison target user, the number of categories of questionnaires in which the specific comparison target user participated, and the sum of the numbers of content categories and questionnaire categories that overlap between the two users, thereby calculating the interest similarity between the reference user and each comparison target user (S513).

Next, the similarity calculator 250 determines whether there is a comparison target user whose interest similarity exceeds a first threshold (e.g., 50%) and, if so, selects that user as a consensus analysis target user (S515). Subsequently, the similarity calculator 250 sums, for each user, the total number of contents accessed by the reference user and by the consensus analysis target user, based on the content access information extracted in step S503 (S517). Next, the similarity calculator 250 sums, for each user, the total number of questionnaire questions answered by the reference user and by the consensus analysis target user, based on the questionnaire response information extracted in step S507 (S519).

Subsequently, the similarity calculator 250 substitutes into Equation 2 the number of contents accessed by the reference user, the total number of questions answered by the reference user, the number of contents accessed by the consensus analysis target user, the total number of questions answered by the consensus analysis target user, the number of contents accessed by both the reference user and the consensus analysis target user, and the number of questionnaire questions answered identically by the two users, thereby calculating the consensus similarity between the reference user and the consensus analysis target user (S521). In the same way, using Equation 2, the similarity calculator 250 sequentially calculates the consensus similarity between the reference user and each of the other consensus analysis target users.

When the similarity calculator 250 has completed the consensus similarity calculation, the network manager 260 checks whether there is a user whose consensus similarity exceeds a second threshold (e.g., 70%) and, if so, extracts the contact information of that user from the subscriber information storage unit 310 (S523, S525). Next, the network manager 260 provides the extracted contact information to the reference user as network recommendation information (S527).

Through the method of FIG. 5, a user whose interest similarity is greater than or equal to the first threshold and whose consensus similarity is greater than or equal to the second threshold is recommended to the reference user as a network recommendation user, so that the reference user can form connections with other users having more similar tendencies.
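The two-stage selection of FIG. 5 can be sketched as a filter followed by a second, narrower calculation. The sketch below is illustrative only: the interest scores are assumed to be already computed, the consensus calculation is passed in as a callable, and all names and scores are hypothetical. Strict "exceeds" comparisons are assumed for both thresholds.

```python
def two_stage_recommendation(interest, consensus_fn,
                             first_threshold=50.0, second_threshold=70.0):
    """Return the user IDs that pass the interest filter and then the consensus filter."""
    # Stage 1 (S515): keep users whose interest similarity exceeds the first threshold.
    candidates = [uid for uid, score in interest.items() if score > first_threshold]
    # Stage 2 (S521 to S523): consensus similarity is computed only for those candidates.
    return [uid for uid in candidates if consensus_fn(uid) > second_threshold]


# Hypothetical precomputed scores (%).
interest_scores = {"u2": 62.0, "u3": 35.0, "u4": 81.0}
consensus_scores = {"u2": 74.0, "u3": 99.0, "u4": 40.0}
evaluated = []  # records which users reached the second stage


def lookup_consensus(uid):
    evaluated.append(uid)
    return consensus_scores[uid]


recommended = two_stage_recommendation(interest_scores, lookup_consensus)
print(recommended, evaluated)
```

Because the consensus calculation runs only for users who pass the first stage, user u3 is never evaluated in the second stage despite a high consensus score.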

Meanwhile, the similarity analysis server 200 according to the present invention may calculate similarity of interest and consensus similarity for a designated user and provide the same to the user who has requested the similarity analysis.

FIG. 6 is a flowchart illustrating a method of analyzing similarity between users based on questionnaire response information and content access information in the similarity analysis system according to another embodiment of the present invention.

Hereinafter, in the description with reference to FIG. 6, a user who owns communication terminal 1 (100-1) is referred to as a first user, and a user who owns communication terminal 2 (100-2) is referred to as a second user.

Referring to FIG. 6, communication terminal 1 (100-1) transmits a connection request message including an ID and a password to the similarity analysis server 200 according to the operation of the first user (S601). Then, the subscriber management unit 220 of the similarity analysis server 200 authenticates whether the ID and password are stored as member information in the subscriber information storage unit 310 and, if the authentication succeeds, transmits an authentication success message to communication terminal 1 (100-1) using the transceiver 210 (S603).

Subsequently, after successfully logging in, communication terminal 1 (100-1) requests the similarity analysis server 200 to analyze the similarity of tendencies with the second user according to the first user's operation (S605). For example, while reading the blog, mini homepage, profile, or comments of the second user on communication terminal 1 (100-1), the first user may transmit the ID of the second user to the similarity analysis server 200 and request an analysis of the similarity between the first user and the second user.

Then, the similarity calculator 250 of the similarity analysis server 200 checks the IDs of the first user and the second user, and extracts the content access information and the questionnaire response information mapped to the two IDs from the access information storage unit 320 and the questionnaire storage unit 330, respectively (S607).

Next, the similarity calculator 250 checks category information in the extracted content access information, and adds the total number of categories for content accessed by the first user and the second user for each user. Subsequently, the similarity calculator 250 checks category information in each questionnaire response information, and based on the category information, sums the total number of categories for the questionnaire answered by the first user and the second user for each user.

Next, the similarity calculator 250 selects the first user as the reference user and the second user as the comparison target user. The similarity calculator 250 then substitutes into Equation 1 the number of content categories of the first user (i.e., the reference user), the number of categories of questionnaires in which the first user participated, the number of content categories of the second user (i.e., the comparison target user), the number of categories of questionnaires in which the second user participated, and the sum of the numbers of content categories and questionnaire categories that overlap between the two users, thereby calculating the interest similarity between the first user and the second user (S609).

Next, the similarity calculator 250 sums, for each user, the total number of contents accessed by the first user and by the second user, based on the content access information extracted in step S607. The similarity calculator 250 likewise sums, for each user, the total number of questionnaire questions answered by the first user and by the second user, based on the questionnaire response information extracted in step S607. Subsequently, the similarity calculator 250 substitutes into Equation 2 the number of contents accessed by the first user (i.e., the reference user), the total number of questions answered by the first user, the number of contents accessed by the second user (i.e., the comparison target user), the total number of questions answered by the second user, the number of contents accessed by both users, and the number of questionnaire questions answered identically by both users, thereby calculating the consensus similarity between the first user and the second user (S611).

When the analysis of the similarity of tendencies between the first user and the second user is completed, the similarity calculator 250 transmits the calculated interest similarity and consensus similarity to communication terminal 1 (100-1) using the transceiver 210 (S613). The first user may then check the interest similarity and consensus similarity with the second user through communication terminal 1 (100-1) and request the similarity analysis server 200 to add the second user as a friend (S615).

Then, the network manager 260 of the similarity analysis server 200 transmits a message indicating that the first user has requested to add the second user as a friend to communication terminal 2 (100-2) using the transceiver 210 (S617). Next, when the transceiver 210 receives an acknowledgment message from communication terminal 2 (100-2), the network manager 260 forms an online network between the first user and the second user, and transmits to communication terminal 1 (100-1) a network formation message indicating that the network with the second user has been formed (S621, S623). At this time, the network manager 260 registers the second user in the friend list of the first user, thereby forming the network between the first user and the second user.

As described above, the similarity analysis server 200 according to the present invention analyzes the similarity between users and provides the results to the corresponding users, so that the tendencies of a counterpart can be identified easily and accurately online. In addition, the similarity analysis server 200 calculates interest similarity and consensus similarity between users based on the users' questionnaire response information and the content access information of the content they accessed, and can thereby provide more intuitive similarity analysis results. Furthermore, because the similarity analysis server 200 calculates interest similarity and consensus similarity by applying a weight that varies according to the underlying data of the comparison target user and the reference user, it has the advantage of quantifying the similarity of tendencies between the two users more reasonably.

Meanwhile, the above-described embodiments calculate interest similarity and consensus similarity using the questionnaire response information collected during the survey period, but the present invention is not limited thereto. Interest similarity and consensus similarity between users may also be calculated by extracting the questionnaire response information and content access information collected during a period set by the administrator and using the extracted information.

In addition, the above-described embodiments use both questionnaire response information and content access information when calculating consensus similarity and interest similarity, but interest similarity and consensus similarity between users may also be calculated using only one of the two. Specifically, when calculating interest similarity, the similarity analysis server 200 may use only the questionnaire response information of each user, substituting into Equation 1 the number of categories of the questionnaires answered by each user and the number of questionnaire categories overlapping between the users. In this case, the similarity analysis server 200 sets the number of categories of content accessed by the comparison target user and the reference user to '0' when substituting into Equation 1. Similarly, the similarity analysis server 200 may use only the content access information of each user, checking the number of categories of the content accessed by each user and the number of content categories overlapping between the two users and substituting these numbers into Equation 1. In this case, the similarity analysis server 200 substitutes '0' into Equation 1 as the number of categories of questionnaires answered by the comparison target user and the reference user.
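The single-source variant can be illustrated by setting the unused counts to zero. The sketch below covers the questionnaire-only case and relies on an assumption: since the formula image for Equation 1 is not reproduced in this text, the form n(A∩B)/n(B) × sqrt(s/l) × 100 is assumed. The function name and the example categories are hypothetical.

```python
from math import sqrt


def interest_similarity_from_counts(n_a, n_b, n_ab):
    """Interest similarity (%) from counts, using an ASSUMED form of Equation 1."""
    if n_a == 0 or n_b == 0:
        return 0.0  # nothing to compare
    s, l = min(n_a, n_b), max(n_a, n_b)
    return n_ab / n_b * sqrt(s / l) * 100


# Questionnaire-only case: the content category counts are set to 0.
ref_survey = {"securities", "movie"}   # reference user
cmp_survey = {"movie"}                 # comparison target user
content_categories = 0                 # substituted for both users
overlapping_content = 0

survey_only = interest_similarity_from_counts(
    n_a=content_categories + len(cmp_survey),
    n_b=content_categories + len(ref_survey),
    n_ab=overlapping_content + len(ref_survey & cmp_survey),
)
print(round(survey_only, 2))
```

The content-only case is symmetric: the questionnaire category counts are set to 0 and only content categories contribute to n(A), n(B), and n(A∩B).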

In addition, the similarity analysis server 200 may calculate the consensus similarity using only the questionnaire response information of each user, by checking the number of questionnaire questions answered by each user and the number of questions answered identically by the two users and substituting these numbers into Equation 2. In this case, the similarity analysis server 200 sets the number of contents accessed by the comparison target user and the reference user to '0' when substituting into Equation 2, and the arbitrary constant 'm' of Equation 2 increases according to the total number of questions stored in the database 300. Similarly, the similarity analysis server 200 may use only the content access information, checking the number of contents accessed by each user and the number of contents accessed by both users and substituting these numbers into Equation 2, with '0' substituted as the number of questionnaire questions of both users. In this case, the arbitrary constant 'm' of Equation 2 increases according to the total number of contents stored in the database 300.

While the specification contains many features, such features should not be construed as limiting the scope of the invention or the scope of the claims. In addition, features described in separate embodiments herein may be combined and implemented in a single embodiment. Conversely, various features described herein in a single embodiment may be implemented in multiple embodiments individually or in any suitable combination.

Although the operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all described operations be performed, to obtain a desired result. In certain circumstances, multitasking and parallel processing may be advantageous. It should also be understood that the division of various system components in the above embodiments does not require such division in all embodiments. The above-described program components and systems can generally be implemented as a single software product or packaged into multiple software products.

The method of the present invention as described above may be implemented as a program and stored in a computer-readable form on a recording medium (CD-ROM, RAM, ROM, floppy disk, hard disk, magneto-optical disk, etc.). Since this process can be easily carried out by those skilled in the art, it is not described in further detail.

It will be apparent to those skilled in the art that various modifications and variations can be made in the present invention without departing from the spirit or scope of the invention. The present invention is not limited to the drawings.

100-N: communication terminal
200: similarity analysis server
210: transceiver unit
220: subscriber management unit
230: questionnaire providing unit
240: access information management unit
250: similarity calculation unit
260: network management unit
270: payment processing unit
300: database
310: subscriber information storage unit
320: access information storage unit
330: questionnaire storage unit
400: network

Claims

1. A method of analyzing similarity between users in a similarity analysis system, the method comprising:
a questionnaire category summing step of extracting, for each user, the categories of the questionnaires answered by a first user and a second user, and summing the number of questionnaire categories for each user;
a duplicate number calculating step of calculating the number of questionnaire categories that overlap between the first user and the second user; and
an interest similarity calculating step of calculating an interest similarity of the second user with respect to the first user by multiplying the ratio of the number of overlapping questionnaire categories calculated in the duplicate number calculating step to the number of questionnaire categories of the first user summed in the questionnaire category summing step by a weight that is the relative ratio of the numbers of questionnaire categories of the respective users summed in the questionnaire category summing step.

2. The method of claim 1,
wherein the interest similarity calculating step calculates the interest similarity of the second user with respect to the first user according to Equation 3 below.
(3)

n (A): Number of categories of questionnaire responded by second user
n (B): Number of categories of questionnaire responded by first user
n (A∩B): Sum of the number of questionnaire categories overlapping between the first user and the second user
s: the smaller of n (A), n (B)
l: the larger of n (A), n (B)

3. The method according to claim 1 or 2, further comprising,
after the interest similarity calculating step:
checking whether the calculated interest similarity exceeds a preset threshold; and
if the calculated interest similarity exceeds the threshold, extracting contact information of the second user and providing the contact information to the first user as social networking recommendation information.

4. A method of analyzing similarity between users in a similarity analysis system, the method comprising:
a category summing step of extracting, for each user, at least one of a list of categories of questionnaires answered by a first user and a second user and a list of categories of content accessed by the first user and the second user, and summing, for each user, the number of categories recorded in the extracted list;
a duplicate number calculating step of calculating the number of categories that overlap between the first user and the second user; and
an interest similarity calculating step of calculating an interest similarity of the second user with respect to the first user by multiplying the ratio of the number of overlapping categories calculated in the duplicate number calculating step to the number of categories of the first user summed in the category summing step by a weight that is the relative ratio of the numbers of categories of the respective users summed in the category summing step.

5. The method of claim 4,
wherein the interest similarity calculating step calculates the interest similarity of the second user with respect to the first user according to Equation 4 below.
(4)

n (A): Number of categories of content accessed by the second user + Number of categories of questionnaire responded by the second user
n (B): number of categories of content accessed by the first user + number of categories of questionnaire responded by the first user
n (A∩B): Sum of the number of content categories duplicated between the first user and the second user and the number of questionnaire categories
s: the smaller of n (A), n (B)
l: the larger of n (A), n (B)

6. A method of analyzing similarity between users in a similarity analysis system, the method comprising:
a questionnaire question summing step of extracting, for each user, the questionnaire questions answered by a first user and a second user, and summing the number of extracted questionnaire questions for each user;
an identical answer calculating step of calculating the number of questionnaire questions that the first user and the second user answered identically; and
a consensus similarity calculating step of calculating a consensus similarity of the second user with respect to the first user by multiplying the ratio of the number of questions calculated in the identical answer calculating step to the number of questionnaire questions of the first user summed in the questionnaire question summing step by a weight that is the relative ratio of the numbers of questionnaire questions of the respective users summed in the questionnaire question summing step.

7. The method according to claim 6,
wherein the consensus similarity calculating step calculates the consensus similarity of the second user with respect to the first user according to Equation 5 below.
(5)

n (X): Number of questions in questionnaire responded by second user
n (Y): Number of questions in each questionnaire responded by first user
n (X∩Y): Number of questions answered by the first and second users equally
N: the integer obtained by discarding the fractional part of 1/m × |n (X) - n (Y)|, with N = 2 when N < 2
(m is an arbitrary constant with an initial value of 1 that increases with the total number of questions stored in the database.)
s: the smaller of n (X), n (Y)
l: the larger of n (X), n (Y)

8. The method according to claim 6 or 7, further comprising,
after the consensus similarity calculating step:
checking whether the calculated consensus similarity exceeds a preset threshold; and
if the calculated consensus similarity exceeds the threshold, extracting contact information of the second user and providing the contact information to the first user as contact recommendation information.

A method of analyzing similarity between users in a similarity analysis system, the method comprising:
A summing step of extracting, for each user, at least one of a list of questionnaire questions answered by a first user and a second user and a list of contents accessed by the first user and the second user, and summing the number of detailed data items recorded in the extracted list for each user;
An overlapping number calculating step of calculating at least one of the number of contents accessed by both the first user and the second user and the number of questionnaire questions answered identically by the first user and the second user; and
A consensus similarity calculating step of calculating a consensus similarity of the second user with respect to the first user by multiplying the ratio of the number calculated in the overlapping number calculating step to the number of detailed data items of the first user summed in the summing step by, as a weight, the relative ratio of the numbers of detailed data items of the respective users summed in the summing step.
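
A minimal Python sketch of this variant is given below, for the case where both lists are used. It assumes questionnaire responses are mappings from question identifier to answer, accessed contents are sets of content identifiers, and the weight is the smaller combined count divided by the larger one. These names, data layouts, and the form of the weight are illustrative assumptions.

```python
def combined_counts(answers_first, answers_second, accessed_first, accessed_second):
    """Summing step and overlapping number calculating step.

    Returns (count for second user, count for first user, overlap count).
    """
    n_first = len(answers_first) + len(accessed_first)
    n_second = len(answers_second) + len(accessed_second)
    same_answers = sum(
        1
        for question, answer in answers_first.items()
        if question in answers_second and answers_second[question] == answer
    )
    shared_contents = len(set(accessed_first) & set(accessed_second))
    return n_second, n_first, same_answers + shared_contents


def combined_similarity(answers_first, answers_second, accessed_first, accessed_second):
    """Consensus similarity of the second user with respect to the first.

    Assumption: weight = smaller combined count / larger combined count.
    """
    n_second, n_first, overlap = combined_counts(
        answers_first, answers_second, accessed_first, accessed_second
    )
    if n_first == 0 or n_second == 0:
        return 0.0
    weight = min(n_first, n_second) / max(n_first, n_second)
    return (overlap / n_first) * weight


print(combined_similarity({1: "a", 2: "b"}, {1: "a", 2: "c", 3: "d"}, {"p", "q"}, {"q"}))
```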

The method according to claim 9,
wherein the consensus similarity calculating step
calculates the consensus similarity of the second user with respect to the first user according to Equation 6 below.
(6)

n(X): number of questionnaire questions answered by the second user + number of contents accessed by the second user
n(Y): number of questionnaire questions answered by the first user + number of contents accessed by the first user
n(X∩Y): number of questionnaire questions answered identically by the first user and the second user + number of contents accessed by both the first user and the second user
N: the integer obtained by discarding the fractional part of (1/m) × |n(X) - n(Y)|, with N = 2 when N < 2
(m has an initial value of 1 and is an arbitrary constant that increases with the sum of the total number of contents stored in the database and the total number of questions answered.)
s: the smaller of n(X) and n(Y)
l: the larger of n(X) and n(Y)

A database storing questionnaire response information for each user; And
Similarity calculation means for, using the questionnaire response information of a first user and a second user extracted from the database, checking for each user the categories of the questionnaires answered by the first user and the second user, summing the number of questionnaire categories for each user, calculating the number of questionnaire categories that overlap between the first user and the second user, and calculating a similarity of interest of the second user to the first user by multiplying the ratio of the number of overlapping questionnaire categories to the summed number of questionnaire categories of the first user by, as a weight, the relative ratio of the summed numbers of questionnaire categories of the respective users.
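
The category-based calculation performed by the similarity calculation means can be sketched in Python as follows. The sketch assumes that the categories of the questionnaires each user answered are available as a list of category names, and that the weight is the smaller category count divided by the larger one. The function name, the input format, and the form of the weight are assumptions made for illustration.

```python
def interest_similarity(categories_first, categories_second):
    """Similarity of interest of the second user to the first user.

    Inputs are iterables of category names, one entry per answered
    questionnaire; duplicates are collapsed into distinct categories.
    Assumption: weight = smaller category count / larger category count.
    """
    cats_first = set(categories_first)
    cats_second = set(categories_second)
    if not cats_first or not cats_second:
        return 0.0

    overlap = len(cats_first & cats_second)  # categories both users answered
    weight = min(len(cats_first), len(cats_second)) / max(
        len(cats_first), len(cats_second)
    )
    return (overlap / len(cats_first)) * weight


print(interest_similarity(["music", "sports", "travel", "music"], ["music", "food"]))
```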