WO2022216034A1

WO2022216034A1 - Device for book recommendations using hybrid method combining collaborative filtering and book-based recommendations, and method therefor

Info

Publication number: WO2022216034A1
Application number: PCT/KR2022/004937
Authority: WO
Inventors: 김덕기; 허소연
Original assignee: 주식회사 피씨엔씨
Priority date: 2021-04-07
Filing date: 2022-04-06
Publication date: 2022-10-13
Also published as: KR20220139140A; KR102576484B1

Abstract

A device for book recommendations is provided. The device comprises: a service-providing unit for providing a menu such that a plurality of users performs post-reading activities; a data collection unit for collecting post-reading activity information of the plurality of users; and a recommendation unit for deriving preferences for a plurality of books on the basis of the post-reading activity information of the plurality of users, calculating a user-based preference and a book-based preference for a target book of a target user from the derived preferences for the plurality of books of the plurality of users, and merging the user-based preference and the book-based preference to calculate a preference for the target book of the target user.

Description

Apparatus and method for a hybrid method of book recommendation combining collaborative filtering and book-based recommendation

The present invention relates to a technology for recommending a book, and more particularly, to an apparatus for recommending a book in a hybrid method combining collaborative filtering and book-based recommendation, and a method therefor.

Collaborative filtering is a method that automatically predicts users' interests according to taste information obtained from many users.

SUMMARY OF THE INVENTION

It is an object of the present invention to provide an apparatus and a method for recommending a book in a hybrid manner.

A method for recommending a book according to a preferred embodiment of the present invention for achieving the above object includes: deriving, by a recommendation unit, preferences for a plurality of books based on post-reading activity information of a plurality of users; calculating, by the recommendation unit, a user-based preference and a book-based preference for a target book of a target user from the derived preferences of the plurality of users for the plurality of books; and calculating, by the recommendation unit, the target user's preference for the target book by merging the user-based preference and the book-based preference.

Before the step in which the recommendation unit derives preferences for a plurality of books based on the post-reading activity information of the plurality of users, the method further includes: embedding, by the data collection unit, user information and preferred book information for each of the plurality of users in a predetermined vector space, thereby generating a plurality of user preference vectors in which the user's characteristics and the characteristics of the user's preferred books are embedded; clustering, by the data collection unit, the plurality of user preference vectors into a plurality of clusters; deriving, by the data collection unit, a convex hull from each of the plurality of clusters; and setting, by the data collection unit, the center of gravity of the polygon constituting the convex hull in each of the plurality of clusters as the center of the cluster, and calculating a critical loss based on the center of the cluster and the plurality of user preference vectors.

In the calculating of the critical loss, the data collection unit, for each of the plurality of clusters, calculates a loss representing the difference between the center of the cluster and the plurality of user preference vectors, and then calculates the critical loss from the mean and the standard deviation of the loss. Here, Et is the critical loss, m is the mean of the loss representing the difference between the center of the cluster and the plurality of user preference vectors, D is the standard deviation of that loss, and a weight is applied to the standard deviation.

After the step of calculating the target user's preference for the target book by merging the user-based preference and the book-based preference, the method further includes: confirming, by the recommendation unit, whether the user's preference for the target book is greater than or equal to a predetermined value; generating, by the recommendation unit, a user preference vector from the user's characteristics and the characteristics of the target book when the user's preference for the target book is greater than or equal to the predetermined value; confirming, by the recommendation unit, whether the generated user preference vector lies within the critical loss of any one of the plurality of clusters; and recommending the target book when the generated user preference vector lies within the critical loss of any one of the clusters.
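
The two checks described above can be sketched as follows. This is a minimal illustration and not the claimed implementation: the function name `should_recommend`, the representation of each cluster as a `(center, critical_loss)` pair, and the use of Euclidean distance as the loss are all assumptions, since the loss equation is not reproduced in this text.

```python
import math


def should_recommend(preference, min_preference, vector, clusters):
    """Return True when both recommendation conditions hold.

    preference     -- merged preference of the user for the target book
    min_preference -- the predetermined value (hypothetical parameter)
    vector         -- user preference vector built from the user's and the
                      target book's characteristics (embedding not shown)
    clusters       -- iterable of (center, critical_loss) pairs
    """
    # First check: the merged preference must reach the predetermined value.
    if preference < min_preference:
        return False
    # Second check: the vector must lie within the critical loss of at
    # least one cluster. Euclidean distance stands in for the loss here
    # (an assumption).
    return any(math.dist(vector, center) <= et for center, et in clusters)


clusters = [((0.0, 0.0), 1.5), ((10.0, 10.0), 2.0)]
print(should_recommend(4.2, 3.5, (1.0, 1.0), clusters))
print(should_recommend(4.2, 3.5, (5.0, 5.0), clusters))
```

The first call passes both checks; the second fails the cluster check because the vector is far from both centers.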

The post-reading activity information includes identification information of a book covered by a reading quiz in which the user participated, the number of attempts and the scores obtained by the user for the reading quiz, identification information of a book stored by the user through a predetermined menu, identification information of a book rated by the user through a predetermined menu, and the user's rating.

The calculating of the user-based preference and the book-based preference may include: calculating, by the recommendation unit, a degree of similarity between users corresponding to the same book based on the derived preference for a plurality of books of the plurality of users; and predicting a user-based preference for a target book of a target user based on the calculated similarity between users.

The calculating of the user-based preference and the book-based preference may include calculating, by the recommendation unit, a degree of similarity between books corresponding to the same user based on the derived preference for a plurality of books of the plurality of users; and predicting the book-based preference of the target user for the target book based on the calculated similarity between the books.

In the step of calculating the target user's preference for the target book by merging the user-based preference and the book-based preference, the recommendation unit calculates the target user's preference for the target book according to an equation in which Pt is the target user's preference for the target book, Pu is the user-based preference, Pb is the book-based preference, wu is a weight for the user-based preference, and wb is a weight for the book-based preference.

The weights wu and wb are defined by a formula and are characterized in that wu is set in proportion to the number of data items used when deriving the user-based preference, and wb is set in proportion to the number of data items used when deriving the book-based preference.
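
As a rough sketch of the merging step, the code below assumes the merge is a weighted sum, Pt = wu * Pu + wb * Pb, with the two weights normalized so that they sum to 1. Both the weighted-sum form and the normalization are assumptions, because the equations themselves are not reproduced in this text; only the proportionality of each weight to its data count comes from the description. The function and parameter names are hypothetical.

```python
def merge_preferences(p_user, p_book, n_user, n_book):
    """Merge a user-based and a book-based preference into one score.

    p_user, p_book -- Pu and Pb
    n_user, n_book -- number of data items behind Pu and Pb
    Assumed form: Pt = wu * Pu + wb * Pb with wu + wb = 1.
    """
    total = n_user + n_book
    if total == 0:
        raise ValueError("no data behind either prediction")
    w_user = n_user / total  # wu, proportional to the data used for Pu
    w_book = n_book / total  # wb, proportional to the data used for Pb
    return w_user * p_user + w_book * p_book


print(merge_preferences(p_user=3.0, p_book=4.0, n_user=30, n_book=10))  # 3.25
```

With three times as much data behind the user-based prediction, the merged score sits closer to Pu than to Pb.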

In order to achieve the above object, an apparatus for recommending a book according to a preferred embodiment of the present invention includes: a service providing unit that provides a menu for a plurality of users to perform post-reading activities; a data collection unit that collects post-reading activity information of the plurality of users; and a recommendation unit that derives preferences for a plurality of books based on the post-reading activity information of the plurality of users, calculates a user-based preference and a book-based preference for a target book of a target user from the derived preferences of the plurality of users for the plurality of books, and calculates the target user's preference for the target book by merging the user-based preference and the book-based preference.

According to the present invention, by calculating the preference of the target user for the target book using both the user-based preference and the book-based preference, it is possible to calculate a more reliable preference.

FIG. 1 is a diagram for explaining a system for recommending a book in a hybrid method according to an embodiment of the present invention.

FIG. 2 is a diagram for explaining the configuration of a recommendation server according to an embodiment of the present invention.

FIG. 3 is a diagram for explaining a detailed configuration of a recommendation server control unit according to an embodiment of the present invention.

FIG. 4 is a flowchart illustrating a method of calculating a user's reliability according to an embodiment of the present invention.

FIG. 5 is a flowchart illustrating a method for clustering a plurality of users according to a book tendency according to an embodiment of the present invention.

FIG. 6 is a diagram for explaining a method for clustering a plurality of users according to a book tendency according to an embodiment of the present invention.

FIG. 7 is a view for explaining a method of setting a critical range of a cluster according to the book tendency of a plurality of users according to an embodiment of the present invention.

FIG. 8 is a flowchart illustrating a method of calculating a book preference based on collaborative filtering according to an embodiment of the present invention.

FIG. 9 is a flowchart illustrating a method for recommending a book according to a user's preference according to an embodiment of the present invention.

FIG. 10 is a diagram illustrating a computing device according to an embodiment of the present invention.

Prior to the detailed description of the present invention, the terms or words used in the present specification and claims should not be construed as being limited to their ordinary or dictionary meanings. Based on the principle that an inventor may appropriately define the concept of a term in order to describe the invention in the best way, they should be interpreted as having meanings and concepts consistent with the technical idea of the present invention. Accordingly, the embodiments described in this specification and the configurations shown in the drawings are only the most preferred embodiments of the present invention and do not represent all of its technical ideas, so it should be understood that there may be various equivalents and variations that could replace them at the time of the present application.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings. In this case, it should be noted that the same components in the accompanying drawings are denoted by the same reference numerals as much as possible. In addition, detailed descriptions of well-known functions and configurations that may obscure the gist of the present invention will be omitted. For the same reason, some components are exaggerated, omitted, or schematically illustrated in the accompanying drawings, and the size of each component does not fully reflect the actual size.

First, a configuration of a system for recommending a book in a hybrid method according to an embodiment of the present invention will be described. FIG. 1 is a diagram for explaining a system for recommending a book in a hybrid method according to an embodiment of the present invention. Referring to FIG. 1, a recommendation system according to an embodiment of the present invention includes a recommendation server 10, a book web server 20, and a user device 30.

The recommendation server 10 provides a social network service (SNS) platform on which users who have subscribed to the service can exchange and use various information about books. In particular, the recommendation server 10 may identify a user's preference for books based on the user's activity on the platform and then recommend books that the user is likely to prefer.

The book web server 20 collectively refers to various types of web servers that provide information on books. For example, the book web server 20 may be a server operated by a bookstore for providing information on books and selling the books. As another example, the book web server 20 may be a server operated by a publisher and providing information on books published by the publisher.

The user device 30 is a device used by a user who has subscribed to the service provided by the recommendation server 10, and can receive that service by accessing the recommendation server 10 through a network. Examples of the user device 30 include a personal computer, a notebook computer, a tablet, a phablet, a smartphone, and the like.

Next, the configuration of the recommendation server 10 according to the embodiment of the present invention will be described in more detail. FIG. 2 is a diagram for explaining the configuration of a recommendation server according to an embodiment of the present invention. FIG. 3 is a diagram for explaining a detailed configuration of a recommendation server control unit according to an embodiment of the present invention.

First, referring to FIG. 2, the recommendation server 10 according to an embodiment of the present invention includes a communication module 11, a storage module 12, and a control module 13.

The communication module 11 is for communicating with the book web server 20 and the user device 30 through a network. The communication module 11 may include a modem for modulating a signal to be transmitted and demodulating a received signal to transmit/receive data through a network. The communication module 11 may transmit the data received from the control module 13 to the book web server 20 and the user device 30 through the network. In addition, the communication module 11 may transmit data received from any one of the book web server 20 and the user device 30 to the control module 13 .

The storage module 12 stores programs and data necessary for the operation of the recommendation server 10. For example, the storage module 12 may store information related to users' social and post-reading activities, book information collected from private and public sources, cover and inside images of books, metadata of books, and the like. The various types of data stored in the storage module 12 may be registered, deleted, changed, or added according to the operation of the administrator.

The control module 13 may control the overall operation of the recommendation server 10 and the signal flow between internal blocks of the recommendation server 10, and may perform a data processing function of processing data. The control module 13 may be a central processing unit, a digital signal processor, or the like. In addition, the control module 13 may further include an image processor or a graphic processing unit (GPU). As shown in FIG. 3 , the control module 13 includes a data collection unit 100 , a service provision unit 200 , and a recommendation unit 300 .

The data collection unit 100 collects the data that the recommendation server 10 needs to provide the service and to make recommendations. The data collection unit 100 may collect book information by accessing the book web server 20 through the communication module 11, for example by using a crawling technique. The book information includes book meta-information such as the title, author, translator, publication date, edition, and printing of the book, as well as cover and inside images, the text of major pages, and the like. The book information may further include book evaluation information, such as bestseller lists published by a publicly trusted organization. The data collection unit 100 may also collect information on the user's social activity and post-reading activity, as described in more detail below.

The service providing unit 200 enables the user to engage in social activities and post-reading activities. The service providing unit 200 allocates an account to the user for the user's social activity, and provides an interface through which the user can upload, via the user device 30, personal post-reading information (information about the books the user has read) to the allocated account. The personal post-reading information includes, for example, cover and inside images of books, text of major pages, and book reviews. In addition, the service providing unit 200 provides a menu that allows the user to follow other accounts, and a feed function that lets the user view the personal post-reading information uploaded to the accounts the user follows. The service providing unit 200 also provides a menu that allows the user to comment on or express a preference for the personal post-reading information of other accounts, for example through likes, votes, recommendations, or emotion emoticons. Accordingly, the data collection unit 100 can collect and store, for each user, activity information such as the number of items of personal post-reading information uploaded by the user, the number of comments and preference expressions received by that information, the number of accounts the user follows, and the number of the user's followers.

Also, the service providing unit 200 may provide a reading quiz menu, a book drawer menu, and a Book Ping Talk menu for the user's post-reading activities. Through the reading quiz menu, the service providing unit 200 may present quizzes related to a specific book and add to the user's score when a quiz is answered correctly. The user may participate in the reading quiz in the corresponding menu through the user device 30. Through the book drawer menu, the service providing unit 200 may provide a function for storing the book information of a book the user wants to keep. Through the Book Ping Talk menu, the service providing unit 200 provides a function by which the user can numerically rate a specific book. Accordingly, the data collection unit 100 can collect and store, for each user, post-reading activity information such as identification information of the books covered by the reading quizzes in which the user participated, the number of attempts and the scores obtained by the user, identification information of the books stored by the user through the book drawer menu, identification information of the books rated by the user through the Book Ping Talk menu, and the numerical ratings given by the user.

The recommendation unit 300 is for analyzing the user's preference based on the user's social activity and reading activity through collaborative filtering, and recommending a book preferred by the user. The operation of the recommendation unit 300 will be described in more detail below.

Next, a method for hybrid book recommendation according to an embodiment of the present invention will be described. First, a method for calculating user reliability according to an embodiment of the present invention will be described. FIG. 4 is a flowchart illustrating a method of calculating a user's reliability according to an embodiment of the present invention.

Referring to FIG. 4 , the data collection unit 100 collects activity information indicating the degree of the user's social activity and post-reading activity for the book according to the embodiment of the present invention in step S110. Here, the activity information includes the number of likes (A), the number of comments (B), the number of content used (Cu), the number of followings (Fing), and the number of followers (Fwer). Here, the number of likes (A) represents the number of all likes tagged on the content of the corresponding user. The number of comments (B) indicates the number of all comments tagged on the content of the corresponding user. The number of contents used (Cu) indicates the number of all contents used by the corresponding user. The number of following (Fing) indicates the number of following of a corresponding user in a social network for a book according to an embodiment of the present invention. The number of followers (Fwer) indicates the number of followers of the corresponding user in the social network for the book according to the embodiment of the present invention.

Next, the recommendation unit 300 calculates a social behavior index indicating the degree of the user's social behavior by analyzing the user's social behavior based on the activity information in step S120. The recommendation unit 300 may calculate the social behavior index (K) according to Equation 1 below.

Here, K is the social behavior index. Also, A is the number of likes and B is the number of comments.

Next, the recommendation unit 300 analyzes the user's content use based on the activity information in step S130 and calculates a content use index indicating the user's content use degree. At this time, the recommendation unit 300 may calculate the content use index (C) according to the following Equation (2).

Here, C represents the content usage index. Call is the total number of contents, and Cu is the number of contents used by the user. In addition, w is a value preset as a weight.

Next, the recommendation unit 300 calculates a social relationship index indicating the strength of the user's social relationship based on the activity information in step S140. In this case, the recommendation unit 300 may calculate the social relationship index (F) according to the following Equation (3).

Here, F represents the social relationship index. In addition, Fing is the number of following, and Fwer indicates the number of followers.

Next, the recommendation unit 300 calculates the user's reliability (T) in consideration of all of the previously calculated social behavior index (K), content use index (C), and social relationship index (F) in step S150. In this case, the recommendation unit 300 may calculate the reliability T according to Equation 4 below.

Here, K is a social behavior index, and wa is a social behavior weight that is a weight for the social behavior index. C is the content use index, and wb is the content use weight, which is a weight for the content use index. And F is the social relationship index, and wc represents the social relationship weight, which is a weight for the social relationship index.

When the reliability is calculated, the recommendation unit 300 selects users whose reliability is greater than or equal to a predetermined value in step S160 .
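
Steps S150 and S160 can be sketched as below. Equations 1 to 4 are not reproduced in this text, so the sketch takes the indices K, C, and F as already computed and assumes that the reliability is their weighted sum, T = wa * K + wb * C + wc * F. The weighted-sum form, the function names, and all numeric values are assumptions chosen for illustration.

```python
def reliability(k, c, f, wa, wb, wc):
    """Reliability T from the three indices (assumed weighted sum).

    k, c, f    -- social behavior, content use, social relationship indices
    wa, wb, wc -- the corresponding weights
    """
    return wa * k + wb * c + wc * f


def select_reliable_users(indices, threshold, wa, wb, wc):
    """Step S160: keep users whose reliability reaches the threshold.

    indices -- mapping of user id to a (K, C, F) tuple
    """
    return [
        user
        for user, (k, c, f) in indices.items()
        if reliability(k, c, f, wa, wb, wc) >= threshold
    ]


# Hypothetical index values for two users.
indices = {"u1": (0.8, 0.5, 0.2), "u2": (0.1, 0.2, 0.1)}
print(select_reliable_users(indices, threshold=0.5, wa=0.5, wb=0.3, wc=0.2))
```

Only the selected users would then be passed on to the clustering stage.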

According to an embodiment of the present invention, a plurality of users may be clustered according to the book tendency. FIG. 5 is a flowchart illustrating a method for clustering a plurality of users according to a book tendency according to an embodiment of the present invention. FIG. 6 is a diagram for explaining a method for clustering a plurality of users according to a book tendency according to an embodiment of the present invention. FIG. 7 is a view for explaining a method of setting a critical range of a cluster according to the book tendency of a plurality of users according to an embodiment of the present invention.

Referring to FIG. 5 , the data collection unit 100 collects, for each of a plurality of users, user information indicating the characteristics of the user and preference book information indicating the characteristics of the book preferred by the user in step S200 . In this case, the plurality of users may be limited to users whose reliability selected according to the method shown in FIG. 4 is greater than or equal to a predetermined value. User information includes information such as the user's age, gender, region of residence, educational background, and the like, and can be obtained from the information entered at the time of membership registration. The preferred book information may be a genre, a subject, a word, a phrase, etc. of a book preferred by the user. Such preference book information may be obtained from information about the user's social activity and post-reading activity.

In step S210, the data collection unit 100 embeds the user information and the preferred book information of each of the plurality of users in a predetermined vector space (VS), thereby generating a plurality of user preference vectors in which the user's characteristics and the characteristics of the user's preferred books are embedded. FIG. 6 shows a vector space VS in which a plurality of user preference vectors are embedded.

In step S220, the data collection unit 100 randomly selects center vectors for a predetermined number of clusters from among the plurality of user preference vectors in the vector space VS. Once the center vector of each cluster has been selected, the data collection unit 100, in step S230, clusters the plurality of user preference vectors in the vector space VS according to their distances from the center vectors, thereby generating a plurality of clusters (CL1, CL2, CL3, ...) as shown in FIG. 7. That is, in step S230, the data collection unit 100 calculates the distance from each of the plurality of user preference vectors in the vector space VS, as shown in FIG. 6, to each of the center vectors, and assigns each user preference vector to the cluster whose center vector is nearest. Next, in step S240, the data collection unit 100 reselects the center vector of each of the generated clusters CL1, CL2, CL3, .... For example, in step S240, the data collection unit 100 may select, as the center vector, the user preference vector having an intermediate value among the plurality of user preference vectors of each cluster.

Subsequently, in step S250, the data collection unit 100 checks whether every user preference vector belongs to the cluster whose center vector, as reselected in step S240, is nearest to it. If the check in step S250 shows that not all user preference vectors belong to the cluster with the nearest reselected center vector, the data collection unit 100 repeats steps S230 to S250 described above. That is, the data collection unit 100 repeats steps S230 to S250 until Equation 5 below is minimized.

In Equation 5, i is the index of the user preference vector, and Si is the i-th user preference vector. Also, j is the index of the cluster, and Cj is the center vector of the j-th cluster. And fij is a flag variable, 1 if the i-th user preference vector belongs to the j-th cluster, and 0 otherwise.

On the other hand, if the check in step S250 shows that every user preference vector belongs to the cluster with the nearest center vector as reselected in step S240, the data collection unit 100 determines the current clusters to be the optimized clusters in step S260.
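
The loop of steps S220 to S260 can be sketched as follows. Several points are assumptions rather than details taken from the description: Euclidean distance is used as the distance measure, the "intermediate value" vector of step S240 is interpreted as the member vector closest to the cluster mean, and the `seed` and `max_iter` parameters are added only to make the sketch reproducible and guaranteed to terminate.

```python
import math
import random


def nearest_center(vector, centers):
    """Index of the center vector closest to the given vector."""
    return min(range(len(centers)), key=lambda j: math.dist(vector, centers[j]))


def cluster_preference_vectors(vectors, k, seed=0, max_iter=100):
    """Cluster user preference vectors; returns (centers, assignment)."""
    rng = random.Random(seed)
    centers = rng.sample(vectors, k)                             # S220
    assignment = [nearest_center(v, centers) for v in vectors]   # S230
    for _ in range(max_iter):
        # S240: reselect each center as the member closest to the mean.
        for j in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == j]
            if not members:
                continue
            mean = [sum(xs) / len(members) for xs in zip(*members)]
            centers[j] = min(members, key=lambda v: math.dist(v, mean))
        # S250: stop once every vector already sits with its nearest center.
        new_assignment = [nearest_center(v, centers) for v in vectors]
        if new_assignment == assignment:
            break                                                # S260
        assignment = new_assignment
    return centers, assignment


vectors = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
           (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, assignment = cluster_preference_vectors(vectors, k=2, seed=1)
print(centers, assignment)
```

Because the initial centers are chosen at random, the resulting clusters depend on the seed, so a practical system would likely run the procedure several times and keep the result with the smallest objective value.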

Then, in step S270, the data collection unit 100 derives a convex hull from each of the plurality of optimized clusters. For example, any one cluster is shown in FIG. 7. As shown, the data collection unit 100 may construct the convex hull by selecting, from among the plurality of user preference vectors included in the cluster, the vectors that form the vertices of the hull. Next, in step S280, the data collection unit 100 sets the center of gravity of the polygon constituting the convex hull as the center of the cluster. For example, the data collection unit 100 may set the center of gravity of the hexagon constituting the convex hull as the center of the cluster, as shown in FIG. 7. No user preference vector need exist at the center of the cluster, but the center can itself be expressed as a tensor of the same dimension as the user preference vectors.
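
For the two-dimensional case drawn in FIG. 7, steps S270 and S280 can be sketched with a standard convex hull routine and the polygon centroid formula. The restriction to two dimensions, the choice of Andrew's monotone chain algorithm, and the function names are assumptions made for illustration; the description does not say which hull algorithm is used.

```python
def convex_hull(points):
    """Vertices of the 2D convex hull in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # Positive when o -> a -> b makes a counter-clockwise turn.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_centroid(vertices):
    """Center of gravity of a simple polygon (shoelace-based formula)."""
    area2 = cx = cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        c = x0 * y1 - x1 * y0
        area2 += c
        cx += (x0 + x1) * c
        cy += (y0 + y1) * c
    if area2 == 0:
        raise ValueError("degenerate hull: points are collinear")
    return (cx / (3 * area2), cy / (3 * area2))


cluster = [(0, 0), (4, 0), (4, 4), (0, 4), (1, 1)]
hull = convex_hull(cluster)
print(hull)                    # [(0, 0), (4, 0), (4, 4), (0, 4)]
print(polygon_centroid(hull))  # (2.0, 2.0)
```

The interior point (1, 1) does not affect the result, which shows how a hull-based center differs from the plain mean of all vectors in the cluster.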

Next, in step S290, the data collection unit 100 obtains, for each of the plurality of clusters, a loss representing the difference between the center of the cluster and the plurality of user preference vectors, and calculates a critical loss (Et) from the mean and standard deviation of the loss.

In more detail, the data collection unit 100 calculates a loss E representing the difference between the center Mj of the cluster and the plurality of user preference vectors Si according to Equation 6 below.

In Equation 6, E denotes a loss, j denotes an index of a cluster, and Mj denotes the center of the j-th cluster. i is the index of the user preference vector, and Si is the i-th user preference vector.

Then, the data collection unit 100 calculates the critical loss (Et) according to Equation 7 below from the mean and standard deviation of the loss.

In Equation 7, Et denotes the critical loss. m denotes the mean of the loss (E), calculated according to Equation 6 above, that represents the difference between the center (Mj) of the cluster and the plurality of user preference vectors (Si). D denotes the standard deviation of that loss (E). The weight applied to the standard deviation is a hyperparameter with a defined range, and is preset in proportion to the size of the population. The data collection unit 100 stores, in step S300, the critical loss Et calculated for each cluster.
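
A minimal sketch of the critical loss calculation follows. Equations 6 and 7 are not reproduced in this text, so the sketch assumes that the loss of each vector is its Euclidean distance from the cluster center and that the critical loss has the form Et = m + alpha * D, where `alpha` is a hypothetical name for the weight on the standard deviation. The use of the population standard deviation is likewise an assumption.

```python
import math
import statistics


def critical_loss(center, vectors, alpha=1.0):
    """Critical loss Et of one cluster (assumed form: m + alpha * D).

    center  -- cluster center Mj (for example, the hull centroid)
    vectors -- user preference vectors Si that belong to the cluster
    alpha   -- weight on the standard deviation (hypothetical name)
    """
    # Loss of each vector: its distance from the center (assumed form).
    losses = [math.dist(center, v) for v in vectors]
    m = statistics.fmean(losses)    # mean of the losses
    d = statistics.pstdev(losses)   # standard deviation of the losses
    return m + alpha * d


# Losses are 1 and 3, so m = 2 and D = 1; with alpha = 2 this gives 4.0.
print(critical_loss((0.0, 0.0), [(1.0, 0.0), (0.0, 3.0)], alpha=2.0))
```

A larger weight widens the critical range around the center, so more candidate vectors would count as belonging to the cluster.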

Next, a method for calculating book preference based on collaborative filtering according to an embodiment of the present invention will be described. FIG. 8 is a flowchart illustrating a method of calculating a book preference based on collaborative filtering according to an embodiment of the present invention.

Referring to FIG. 8, the recommendation unit 300 derives the preferences of a plurality of users for a plurality of books based on the post-reading activity information in step S410. As described above, the post-reading activity information includes identification information of the book covered by a reading quiz in which the user participated, the number of attempts and the scores obtained by the user for the reading quiz, identification information of a book stored by the user through the book drawer menu, identification information of a book rated by the user through the Book Ping Talk menu, and the user's rating. From the number of attempts and the scores obtained for the reading quiz, the recommendation unit 300 calculates a preference score for the corresponding book that is inversely proportional to the number of attempts and proportional to the scores obtained. For example, the recommendation unit 300 may calculate this score according to an equation such as [weight x (number of attempts in which the obtained score is greater than or equal to a predetermined score / number of attempts)]. The recommendation unit 300 gives a fixed preference score to a book stored by the user through the book drawer menu. For example, the recommendation unit 300 may give 1 point to a book stored through the book drawer menu. In addition, the recommendation unit 300 gives a fixed preference score to a book rated by the user through the Book Ping Talk menu, and additionally gives a preference score in proportion to the user's rating. For example, the recommendation unit 300 gives 1 point to a book rated by the user through the Book Ping Talk menu, and additionally gives a preference score according to the equation [user rating / maximum rating]. The recommendation unit 300 then calculates the user's preference for the book by adding up these preference scores.
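
The scoring rules of step S410 can be sketched as below, using the example equations given in the text. The function and parameter names, the passing score of 80, and the maximum rating of 5 are hypothetical values chosen for illustration.

```python
def quiz_score(scores, pass_score, weight=1.0):
    """weight x (attempts reaching pass_score / total attempts)."""
    if not scores:
        return 0.0
    passed = sum(1 for s in scores if s >= pass_score)
    return weight * passed / len(scores)


def book_preference(quiz_scores=(), pass_score=80, quiz_weight=1.0,
                    in_drawer=False, rating=None, max_rating=5):
    """Sum the preference scores one user earned for one book."""
    total = quiz_score(quiz_scores, pass_score, quiz_weight)
    if in_drawer:
        total += 1.0                          # stored in the book drawer
    if rating is not None:
        total += 1.0 + rating / max_rating    # rated via Book Ping Talk
    return total


# Two quiz attempts (one passed), stored in the drawer, and rated 4 of 5:
# 0.5 + 1.0 + (1.0 + 0.8), which is about 3.3.
print(book_preference(quiz_scores=[90, 60], in_drawer=True, rating=4))
```

Repeating this calculation for every user and book pair would yield a preference table of the kind used in the example that follows.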

As an example, the recommendation unit 300 shows the preference for a plurality of books (B1, B2, B3, ...) of a plurality of users (U1, U2, U3...) based on the post-reading activity information in the following table. Assume it is equal to 1.

	B1	B2	B3	B4	B5	...
U1	5	4	4	3	-
U2	1	0	1	-	4
U3	4	4	-	5	3
U4	-	2	1	4	3
U5	4	-	4	4	2
...						...

Next, in step S420, the recommendation unit 300 calculates a degree of similarity between users with respect to the same books, based on the plurality of users' preferences for the plurality of books. Such similarity may be calculated using any one of cosine similarity, adjusted cosine similarity, and Pearson's correlation coefficient. For example, the recommendation unit 300 may calculate the similarity (cosine similarity) between the second user U2 and the fourth user U4 according to Equation 8 below.

Here, S represents the degree of similarity. Also, n represents the number of books read by both the first user and the second user, and i is an index of books read by both the first user and the second user. R1 represents the preference of the first user, and R2 represents the preference of the second user. Accordingly, the degree of similarity between the first user and the second user may be calculated as in Equation 9 below.
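
As a hedged reconstruction from the symbol definitions above, the standard cosine similarity over the n books rated by both users can be written as follows. The indexing notation R_{1,i}, R_{2,i} is an assumption. The numerical line applies the same form to the second user U2 and the fourth user U4, who share books B2, B3 and B5 in Table 1.

```latex
S = \frac{\sum_{i=1}^{n} R_{1,i}\, R_{2,i}}
         {\sqrt{\sum_{i=1}^{n} R_{1,i}^{2}}\;\sqrt{\sum_{i=1}^{n} R_{2,i}^{2}}}

S(U2, U4) = \frac{0 \cdot 2 + 1 \cdot 1 + 4 \cdot 3}
                 {\sqrt{0^{2} + 1^{2} + 4^{2}}\;\sqrt{2^{2} + 1^{2} + 3^{2}}}
          = \frac{13}{\sqrt{17}\,\sqrt{14}} \approx 0.84
```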

It is assumed that the degree of similarity between users corresponding to the same book calculated by the above method is as shown in Table 2 below.

	U1	U2	U3	U4	U5	...
U1	1.00	0.84	0.96	0.82	0.98
U2	0.84	1.00	0.61	0.84	0.63
U3	0.96	0.61	1.00	0.97	0.99
U4	0.82	0.84	0.97	1.00	0.85
U5	0.98	0.63	0.99	0.85	1.00
...						...
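
The values of Table 2 can be checked with a short sketch that applies cosine similarity only to the books for which both users have a preference. The dictionary `PREFS` transcribes Table 1. The function name `user_similarity` and the choice to restrict the vectors to co-rated books are assumptions of this sketch, although under that choice the rounded results agree with Table 2.

```python
import math
from typing import Dict, Optional

# Preferences from Table 1; books for which a user has no preference are omitted.
PREFS: Dict[str, Dict[str, float]] = {
    "U1": {"B1": 5, "B2": 4, "B3": 4, "B4": 3},
    "U2": {"B1": 1, "B2": 0, "B3": 1, "B5": 4},
    "U3": {"B1": 4, "B2": 4, "B4": 5, "B5": 3},
    "U4": {"B2": 2, "B3": 1, "B4": 4, "B5": 3},
    "U5": {"B1": 4, "B3": 4, "B4": 4, "B5": 2},
}


def user_similarity(a: str, b: str) -> Optional[float]:
    """Cosine similarity over the books both users have a preference for."""
    common = sorted(set(PREFS[a]) & set(PREFS[b]))
    dot = sum(PREFS[a][k] * PREFS[b][k] for k in common)
    norm_a = math.sqrt(sum(PREFS[a][k] ** 2 for k in common))
    norm_b = math.sqrt(sum(PREFS[b][k] ** 2 for k in common))
    if norm_a == 0 or norm_b == 0:
        return None  # undefined without overlapping non-zero preferences
    return dot / (norm_a * norm_b)


users = sorted(PREFS)
similarity_table = {
    a: {b: round(user_similarity(a, b), 2) for b in users} for a in users
}
```

Because the similarity is symmetric, only the upper triangle would need to be computed in practice. The full table is built here to mirror the layout of Table 2.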

The recommendation unit 300 predicts the user-based preference for the target book of the target user based on the degree of similarity between the users in step S430 . In this case, the recommendation unit 300 may calculate the user-based preference for the target book of the target user by obtaining a weighted sum according to Equation 10 below.

Here, Pu is a predicted value and represents user-based preference. S is the similarity between the target user and other users, and R is the preference of other users for the target book.

For example, the recommendation unit 300 may calculate the user-based preference Pu for the third book B3 of the third user U3 according to Equation 11 below.
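
The weighted sum for this example can be sketched as below, using row U3 of Table 2 and column B3 of Table 1. Dividing by the sum of the similarities is an assumption of this sketch (it keeps the prediction on the same scale as the preferences). The function name `weighted_preference` is likewise hypothetical.

```python
# Similarities between the target user U3 and the other users (row U3 of Table 2).
SIM_TO_U3 = {"U1": 0.96, "U2": 0.61, "U4": 0.97, "U5": 0.99}
# Preferences of the other users for the target book B3 (column B3 of Table 1).
PREF_FOR_B3 = {"U1": 4, "U2": 1, "U4": 1, "U5": 4}


def weighted_preference(similarity, preference):
    """Similarity-weighted average of preferences.

    Normalizing by the sum of similarities is an assumption, not taken from the text.
    """
    shared = [k for k in similarity if k in preference]
    total = sum(similarity[k] for k in shared)
    if total == 0:
        raise ValueError("no similar users with a preference for the target book")
    return sum(similarity[k] * preference[k] for k in shared) / total


pu = weighted_preference(SIM_TO_U3, PREF_FOR_B3)
```

Under this assumption Pu is 9.38 / 3.53, or about 2.66, and four data points (U1, U2, U4, U5) enter the calculation.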

Meanwhile, in step S440, the recommendation unit 300 calculates a degree of similarity between books with respect to the same users, based on the plurality of users' preferences for the plurality of books. Such similarity may be calculated using any one of cosine similarity, adjusted cosine similarity, and Pearson's correlation coefficient.

For example, the recommendation unit 300 may calculate the similarity (cosine similarity) between the second book B2 and the third book B3 according to Equation 12 below.

Here, S represents the degree of similarity. Further, n represents the number of users who have read both the second book and the third book, and i is an index of users who have read both the second book B2 and the third book B3. R2 represents the preference for the second book (B2), and R3 represents the preference for the third book (B3). Accordingly, the degree of similarity between the second book B2 and the third book B3 may be calculated as in Equation 13 below.
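
As a hedged reconstruction from these definitions, the cosine similarity between the second book B2 and the third book B3 can be written in the standard form below. The indexing notation is an assumption. In Table 1 the users with a preference for both books are U1, U2 and U4, which gives the numerical line.

```latex
S = \frac{\sum_{i=1}^{n} R_{2,i}\, R_{3,i}}
         {\sqrt{\sum_{i=1}^{n} R_{2,i}^{2}}\;\sqrt{\sum_{i=1}^{n} R_{3,i}^{2}}}

S(B2, B3) = \frac{4 \cdot 4 + 0 \cdot 1 + 2 \cdot 1}
                 {\sqrt{4^{2} + 0^{2} + 2^{2}}\;\sqrt{4^{2} + 1^{2} + 1^{2}}}
          = \frac{18}{\sqrt{20}\,\sqrt{18}} \approx 0.95
```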

It is assumed that the degree of similarity between books corresponding to the same user calculated by the above method is as shown in Table 3 below.

	B1	B2	B3	B4	B5	...
B1	1.00	0.84	0.61	0.82	0.98
B2	0.84	1.00	0.95	0.84	0.63
B3	0.61	0.95	1.00	0.97	0.99
B4	0.82	0.84	0.97	1.00	0.85
B5	0.98	0.63	0.99	0.85	1.00
...						...

The recommendation unit 300 predicts the book-based preference of the target user for the target book based on the similarity between the books in step S450. In this case, the recommendation unit 300 may calculate the book-based preference of the target user for the target book by obtaining a weighted sum according to Equation 14 below.

Here, Pb is a predicted value and represents a book-based preference. S is the degree of similarity between the target book and other books, and R represents the target user's preference for other books. For example, the recommendation unit 300 may calculate the book-based preference Pb of the third user U3 for the third book B3 according to Equation 15 below.
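
A sketch of this book-based prediction for the third user U3 and the third book B3 follows, using row B3 of Table 3 and row U3 of Table 1. As an assumption of this sketch, the weighted sum is normalized by the sum of the similarities. The function name `book_based_preference` is hypothetical.

```python
# Similarities between the target book B3 and the other books (row B3 of Table 3).
SIM_TO_B3 = {"B1": 0.61, "B2": 0.95, "B4": 0.97, "B5": 0.99}
# Preferences of the target user U3 for the other books (row U3 of Table 1).
U3_PREFS = {"B1": 4, "B2": 4, "B4": 5, "B5": 3}


def book_based_preference(similarity, user_prefs):
    """Average of the user's own preferences, weighted by similarity to the target book.

    Dividing by the sum of similarities is an assumption, not taken from the text.
    """
    rated = [b for b in similarity if b in user_prefs]
    weight_sum = sum(similarity[b] for b in rated)
    if weight_sum == 0:
        raise ValueError("the user has no preference for any similar book")
    return sum(similarity[b] * user_prefs[b] for b in rated) / weight_sum


pb = book_based_preference(SIM_TO_B3, U3_PREFS)
```

Under this assumption Pb is 14.06 / 3.52, or about 3.99, again computed from four data points (B1, B2, B4, B5).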

Next, the recommendation unit 300 calculates the target user's preference for the target book by merging the user-based preference Pu and the book-based preference Pb to which weights are applied in step S460 . That is, the recommendation unit 300 may calculate the target user's preference for the target book according to Equation 16 below.

Here, Pt is the target user's preference for the target book. Also, Pu is a user-based preference, and Pb is a book-based preference. In particular, wu is a weight for user-based preference, and wb is a weight for book-based preference.

wu and wb are hyperparameters, and have a rule as shown in Equation 17 below.

In this case, wu and wb are set in proportion to the number of data used when deriving preference. That is, wu is proportional to the number of data used when deriving the user-based preference Pu, and wb is proportional to the number of data used when deriving the book-based preference Pb.
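
The merging step can be sketched as follows. Two points are assumptions of this sketch rather than statements of the text: that the rule on the weights is wu + wb = 1, and that the merge is the weighted sum wu x Pu + wb x Pb. The function names are hypothetical.

```python
from typing import Tuple


def merge_weights(n_user_data: int, n_book_data: int) -> Tuple[float, float]:
    """Weights proportional to the number of data behind Pu and Pb.

    Assumes the weights are constrained to sum to 1.
    """
    total = n_user_data + n_book_data
    if total == 0:
        raise ValueError("no data available for either preference")
    return n_user_data / total, n_book_data / total


def hybrid_preference(pu: float, pb: float,
                      n_user_data: int, n_book_data: int) -> float:
    """Merge the user-based and book-based preferences with data-proportional weights."""
    wu, wb = merge_weights(n_user_data, n_book_data)
    return wu * pu + wb * pb


equal_weights = merge_weights(4, 4)
merged = hybrid_preference(2.0, 4.0, 1, 3)
```

With equal amounts of data the two predictions are simply averaged. When one prediction rests on more data, it contributes proportionally more to the result.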

In the above example, the number of data used when deriving the user-based preference Pu in Equations 10 and 11 is 4, and the number of data used when deriving the book-based preference Pb in Equations 14 and 15 is also 4, so wu and wb are both set to 0.5. Accordingly, the final preference of the third user U3 for the third book B3 is calculated from Equations 10, 11, 14, 15, 16 and 17 as in Equation 18 below.
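
As a worked illustration under stated assumptions, suppose the weighted sums for Pu and Pb are normalized by the sum of the similarities and the merge is the weighted sum of the two preferences. Neither form is confirmed by the text. With the values of Tables 1 to 3, the calculation for the third user U3 and the third book B3 would then read:

```latex
P_u = \frac{0.96 \cdot 4 + 0.61 \cdot 1 + 0.97 \cdot 1 + 0.99 \cdot 4}{0.96 + 0.61 + 0.97 + 0.99}
    = \frac{9.38}{3.53} \approx 2.66

P_b = \frac{0.61 \cdot 4 + 0.95 \cdot 4 + 0.97 \cdot 5 + 0.99 \cdot 3}{0.61 + 0.95 + 0.97 + 0.99}
    = \frac{14.06}{3.52} \approx 3.99

P_t = w_u P_u + w_b P_b \approx 0.5 \cdot 2.66 + 0.5 \cdot 3.99 \approx 3.33
```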

As described above, according to the present invention, by calculating the preference of the target user for the target book using both the user-based preference and the book-based preference, it is possible to calculate a more reliable preference.

Meanwhile, in the above-described embodiment of FIG. 8, the user's preference for the target book was calculated by performing calculations in units of individual users and individual books. However, according to an additional embodiment of the present invention, instead of performing calculations for each individual user and book, the plurality of users may be clustered into a plurality of user clusters and the plurality of books may be clustered into a plurality of book clusters. Thereafter, the preference of a user cluster for a book cluster may be calculated by performing the calculations in units of user clusters and book clusters.

Next, a method for recommending a book according to a user's preference according to an embodiment of the present invention will be described. FIG. 9 is a flowchart illustrating a method for recommending a book according to a user's preference according to an embodiment of the present invention.

Referring to FIG. 9, when a target book, for which it is to be determined whether to recommend it to a user, is input in step S510, the recommendation unit 300 calculates the user's preference for the target book in step S520. The preference calculation may be performed by collaborative filtering as described above with reference to FIG. 8.

After calculating the preference, the recommendation unit 300 checks whether the user's preference for the target book is greater than or equal to a predetermined value in step S530 . As a result of the check, if the user's preference for the target book is greater than or equal to a predetermined value, the process proceeds to step S540. If it is less than the predetermined value, the process proceeds to step S570 and the corresponding book is not recommended.

Meanwhile, in step S540, the recommendation unit 300 generates a user preference vector using the user's characteristics and the target book's characteristics. Next, in step S550, the recommendation unit 300 checks whether the user preference vector generated in step S540 lies within the threshold loss Et of a specific cluster. For example, assuming that the user preference vector belongs to the cluster shown in FIG. 7, when the user preference vector is the first vector V1, it falls outside the threshold loss Et, and when the user preference vector is the second vector V2, it can be confirmed that it lies within the threshold loss Et.

As a result of checking in step S550, if the generated user preference vector does not exist within the threshold loss Et within a specific cluster, the recommendation unit 300 proceeds to step S570 and does not recommend the corresponding book.

On the other hand, as a result of checking in step S550, if the generated user preference vector exists within the threshold loss Et within a specific cluster, the recommendation unit 300 proceeds to step S560 and recommends the corresponding book.
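
The decision flow of steps S530 to S570 can be sketched as below. This is an illustration under assumptions: the loss is taken to be the Euclidean distance between the user preference vector and a cluster center, each cluster is represented as a dictionary with hypothetical keys `center` and `threshold_loss`, and the function names are introduced here.

```python
import math
from typing import Mapping, Sequence


def within_threshold_loss(vector: Sequence[float],
                          center: Sequence[float],
                          threshold_loss: float) -> bool:
    """Assumes the loss is the Euclidean distance from the cluster center."""
    return math.dist(vector, center) <= threshold_loss


def should_recommend(preference: float,
                     min_preference: float,
                     preference_vector: Sequence[float],
                     clusters: Sequence[Mapping]) -> bool:
    # S530: the preference must reach the predetermined value.
    if preference < min_preference:
        return False  # S570: do not recommend
    # S550: the vector must lie within the threshold loss Et of some cluster.
    return any(
        within_threshold_loss(preference_vector, c["center"], c["threshold_loss"])
        for c in clusters
    )  # True corresponds to S560, False to S570


clusters = [{"center": (0.0, 0.0), "threshold_loss": 1.0}]
inside = should_recommend(3.3, 3.0, (0.5, 0.5), clusters)
outside = should_recommend(3.3, 3.0, (2.0, 2.0), clusters)
too_low = should_recommend(2.0, 3.0, (0.5, 0.5), clusters)
```

The two checks are applied in sequence, so the preference vector only needs to be generated and compared for books whose predicted preference already reaches the predetermined value.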

FIG. 10 is a diagram illustrating a computing device according to an embodiment of the present invention. The computing device TN100 of FIG. 10 may be a device described herein, for example, the recommendation server 10, the book web server 20, or the user device 30.

In the embodiment of FIG. 10 , the computing device TN100 may include at least one processor TN110 , a transceiver device TN120 , and a memory TN130 . Also, the computing device TN100 may further include a storage device TN140 , an input interface device TN150 , an output interface device TN160 , and the like. Components included in the computing device TN100 may be connected by a bus TN170 to communicate with each other.

The processor TN110 may execute a program command stored in at least one of the memory TN130 and the storage device TN140. The processor TN110 may mean a central processing unit (CPU), a graphics processing unit (GPU), or a dedicated processor on which methods according to an embodiment of the present invention are performed. The processor TN110 may be configured to implement procedures, functions, and methods described in connection with an embodiment of the present invention. The processor TN110 may control each component of the computing device TN100 .

Each of the memory TN130 and the storage device TN140 may store various information related to the operation of the processor TN110 . Each of the memory TN130 and the storage device TN140 may be configured as at least one of a volatile storage medium and a nonvolatile storage medium. For example, the memory TN130 may include at least one of a read only memory (ROM) and a random access memory (RAM).

The transceiver TN120 may transmit or receive a wired signal or a wireless signal. The transceiver TN120 may be connected to a network to perform communication.

Meanwhile, the above-described method according to an embodiment of the present invention may be implemented in the form of a program readable by various computer means and recorded in a computer-readable recording medium. Here, the recording medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the recording medium may be specially designed and configured for the present invention, or may be known and available to those skilled in the art of computer software. For example, the recording medium includes magnetic media such as hard disks, floppy disks, and magnetic tapes, optical recording media such as CD-ROMs and DVDs, magneto-optical media such as floptical disks, and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of the program instructions may include not only machine language code such as that generated by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like. Such hardware devices may be configured to operate as one or more software modules to perform the operations of the present invention, and vice versa.

Although the present invention has been described above using several preferred embodiments, these examples are illustrative and not restrictive. As such, those of ordinary skill in the art to which the present invention pertains will understand that various changes and modifications can be made in accordance with the doctrine of equivalents without departing from the spirit of the present invention and the scope of rights set forth in the appended claims.

Claims

1. A method for recommending a book, the method comprising:

deriving, by a recommendation unit, preferences of a plurality of users for a plurality of books based on post-reading activity information of the plurality of users;

calculating, by the recommendation unit, a user-based preference and a book-based preference of a target user for a target book from the derived preferences of the plurality of users for the plurality of books; and

calculating, by the recommendation unit, a preference of the target user for the target book by merging the user-based preference and the book-based preference.
2. The method of claim 1, further comprising, before the deriving of the preferences for the plurality of books based on the post-reading activity information of the plurality of users:

mapping, by a data collection unit, user information and preferred-book information for each of the plurality of users into a predetermined vector space to generate a plurality of user preference vectors, each having the characteristics of a user and the characteristics of a book preferred by that user;

clustering, by the data collection unit, the plurality of user preference vectors into a plurality of clusters;

deriving, by the data collection unit, a convex hull from each of the plurality of clusters; and

setting, by the data collection unit, a center of gravity of a polygon constituting the convex hull of each of the plurality of clusters as a center of the cluster, and calculating a critical loss based on the set center of the cluster and the plurality of user preference vectors.
3. The method of claim 2, wherein the calculating of the critical loss includes:

calculating, by the data collection unit, for each of the plurality of clusters, a loss representing a difference between the center of the cluster and the plurality of user preference vectors through an equation; and

calculating, by the data collection unit, for each of the plurality of clusters, the critical loss through an equation from the mean and standard deviation of the loss representing the difference between the center of the cluster and the plurality of user preference vectors,

wherein, in the equations, j is the index of the cluster, Mj is the center of the j-th cluster, i is the index of the user preference vector, Si is the i-th user preference vector, Et is the critical loss, m is the mean of the loss representing the difference between the center of the cluster and the plurality of user preference vectors, D is the standard deviation of that loss, and the remaining coefficient is a weight for the standard deviation.
4. The method of claim 2, further comprising, after the calculating of the preference of the target user for the target book by merging the user-based preference and the book-based preference:

checking, by the recommendation unit, whether the user's preference for the target book is greater than or equal to a predetermined value;

generating, by the recommendation unit, a user preference vector by using the user's characteristics and the characteristics of the target book, when the user's preference for the target book is greater than or equal to the predetermined value;

checking, by the recommendation unit, whether the generated user preference vector exists within the critical loss of any one of the plurality of clusters; and

recommending the corresponding book when, as a result of the check, the generated user preference vector exists within the critical loss of one of the clusters.
5. The method of claim 1, wherein the post-reading activity information includes identification information of a book covered by a reading quiz in which the user participated, the number of applications and the scores obtained by the user for the reading quiz, identification information of a book stored by the user through a predetermined menu, identification information of a book to which the user has given a rating through a predetermined menu, and the user's rating.
6. The method of claim 1, wherein the calculating of the user-based preference and the book-based preference includes:

calculating, by the recommendation unit, a degree of similarity between users with respect to the same books based on the derived preferences of the plurality of users for the plurality of books; and

predicting, by the recommendation unit, the user-based preference of the target user for the target book based on the calculated similarity between users.
7. The method of claim 1, wherein the calculating of the user-based preference and the book-based preference includes:

calculating, by the recommendation unit, a degree of similarity between books with respect to the same users based on the derived preferences of the plurality of users for the plurality of books; and

predicting, by the recommendation unit, the book-based preference of the target user for the target book based on the calculated similarity between the books.
8. The method of claim 1, wherein, in the calculating of the preference of the target user for the target book by merging the user-based preference and the book-based preference, the recommendation unit calculates the target user's preference for the target book according to an equation in which Pt is the target user's preference for the target book, Pu is the user-based preference, Pb is the book-based preference, wu is a weight for the user-based preference, and wb is a weight for the book-based preference.
9. The method of claim 8, wherein wu and wb follow a rule given by an equation, wu is set in proportion to the number of data used when deriving the user-based preference, and wb is set in proportion to the number of data used when deriving the book-based preference.
10. A device for recommending a book, the device comprising:

a service providing unit that provides a menu for a plurality of users to perform post-reading activities;

a data collection unit that collects post-reading activity information of the plurality of users; and

a recommendation unit that derives preferences of the plurality of users for a plurality of books based on the post-reading activity information of the plurality of users, calculates a user-based preference and a book-based preference of a target user for a target book from the derived preferences of the plurality of users for the plurality of books, and merges the user-based preference and the book-based preference to calculate a preference of the target user for the target book.