CN113590945A - Book recommendation method and device based on user borrowing behavior-interest prediction - Google Patents

Book recommendation method and device based on user borrowing behavior-interest prediction

Info

Publication number
CN113590945A
Authority
CN
China
Prior art keywords
user
label
behavior
feature
labels
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110846763.8A
Other languages
Chinese (zh)
Other versions
CN113590945B (en)
Inventor
赵雪青
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Polytechnic University
Original Assignee
Xian Polytechnic University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Polytechnic University filed Critical Xian Polytechnic University
Priority to CN202110846763.8A priority Critical patent/CN113590945B/en
Publication of CN113590945A publication Critical patent/CN113590945A/en
Application granted granted Critical
Publication of CN113590945B publication Critical patent/CN113590945B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/06Buying, selling or leasing transactions
    • G06Q30/0601Electronic shopping [e-shopping]
    • G06Q30/0631Item recommendations
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Business, Economics & Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Health & Medical Sciences (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Development Economics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a book recommendation method and device based on user borrowing behavior-interest prediction. The method comprises: acquiring user borrowing behavior data; determining basic feature labels based on the user borrowing behavior data, and determining prediction class feature labels using the TFIDF weight calculation algorithm and a cosine similarity method; inputting the basic feature labels and the prediction class feature labels into DeepFM, a neural network model built on a factorization machine, for Embedding feature vectorization; performing feature crossing on the feature vectors; inputting the feature vectors into a deep neural network; and outputting a recommendation result. The invention constructs interest prediction labels from an analysis of library users' borrowing behavior and recommends books to users with DeepFM. The method builds a user behavior label system, makes personalized recommendations that reflect user interests, locates the user's core requirements accurately, and improves user satisfaction.

Description

Book recommendation method and device based on user borrowing behavior-interest prediction
Technical Field
The invention relates to the field of big data, in particular to a book recommendation method and device based on user borrowing behavior-interest prediction.
Background
Digital libraries are rising in the big data era, and discovering users' reading interests and recommending books accordingly has become an inevitable trend. With the rapid development of the mobile internet and self-media, users' attention keeps shifting from desktop computers to mobile devices. How to capture a user's focus in the shortest possible time and keep improving user satisfaction remains an urgent problem for recommendation systems.
At present, the neural network model built on a Factorization Machine (Deep Factorization Machine, hereinafter DeepFM) is widely used for click-through rate (CTR) prediction in recommendation, advertising and similar fields. DeepFM combines the memorization capability of logistic regression with the generalization capability of a neural network: memorization lets it learn user and resource features directly, while generalization lets it mine a user's rare features, discover their correlation with the labels, and combine features automatically through the neural network, yielding a more stable recommendation result.
In 2010, Massanari et al. proposed that a user portrait is a model built from user features that can effectively capture a user's important characteristics. Library users' borrowing behavior is of a single kind, so a traditional user portrait struggles to describe the variety of user behavior features accurately, and the application of user portraits in the library field is still at an exploratory stage. In recent years, Yu et al. proposed a behavior-content fusion model; Yao Yuan et al. proposed constructing a user portrait with a knowledge graph; Chen Dan et al. held that user portrait labels can be obtained from three sources, namely user behavior, user social data and a user label set; He et al. proposed a hybrid model of decision trees and logistic regression; Kong et al. proposed a new context-aware attention convolutional neural network; Zhou et al. proposed the deep interest network, which handles users' diverse behavior features by generating different representation vectors for different candidate ads; and Zhang et al. proposed a time-interval division algorithm to analyze and quantify the distribution of user interest within a time interval. However, none of this research can precisely locate the user's core requirements.
Disclosure of Invention
To make the service supply side of a library more accurate, the invention provides a book recommendation method and device that combine user borrowing behavior analysis with interest prediction.
The embodiment of the invention provides a book recommendation method based on user borrowing behavior-interest prediction, which comprises the following steps:
acquiring user borrowing behavior data;
determining a basic feature tag based on the user borrowing behavior data;
determining a prediction class feature label by adopting a weight calculation algorithm TFIDF and cosine similarity method based on the user borrowing behavior data;
inputting the basic feature labels and the prediction class feature labels into DeepFM, a neural network model built on a factorization machine, for Embedding feature vectorization, performing feature crossing on the feature vectors, inputting the feature vectors into a deep neural network, and outputting a recommendation result.
In one embodiment, a book recommendation method based on user borrowing behavior-interest prediction further comprises the following steps:
preprocessing the user borrowing behavior data, specifically including data deduplication, abnormal value processing, missing value processing and time format normalization; wherein
the abnormal value processing includes: normalizing abnormal data that fall outside the normal time range;
the missing value processing includes: deleting rows whose book number is empty;
the time format normalization includes: converting data with non-uniform time formats to a single format through the strptime() function in Python.
In one embodiment, the determining of the basic feature tag includes:
drawing a histogram and a word cloud for the preprocessed user borrowing behavior data to carry out a visual analysis of basic features; determining the basic feature labels according to the result of that analysis;
the basic feature labels comprise:
a fact class label, containing the total number of actions generated by the same user;
a rule class label, for which, on the basis of the fact class label, thresholds are set on the user's fact class label according to manual experience, with different thresholds corresponding to different user features;
a text class label, comprising text information generated by the user; wherein the text class label comprises the gender, occupation, address and book name of the user, and the jieba word segmentation method is used to extract features from the text class labels.
In one embodiment, the weight calculation algorithm TFIDF includes:
the weight calculation algorithm TFIDF has two logical parts: the interaction depth and the total TFIDF label weight;
the interaction depth is measured, for each interaction behavior, by a set of features: the user behavior type weight, the number of user behaviors, and the decay of the behavior over time; the total TFIDF label weight reflects how important each label is to the prediction result, a more important label receiving a larger weight, and is calculated as follows:
$$W = B_i \times I_t \times C_i \times \mathrm{TFIDF}$$
wherein $B_i$ represents the behavior type weight; $C_i$ represents the number of user behaviors, i.e. the total number of behaviors generated by the user; $I_t$ represents the time-decayed interest degree; and TFIDF represents the weight assigned to the user behavior label.
In one embodiment, the formula for calculating the user behavior label assignment weight is as follows:
$$\mathrm{TFIDF}(U,L) = \mathrm{TF}(U,L) \times \mathrm{IDF}(U,L)$$
wherein TFIDF(U, L) represents the objective weight of label L with respect to user U, i.e. the product of the importance of label L to user U, TF(U, L), and the importance of the label among all the labels of the user, IDF(U, L);
the formula for TF (U, L) is as follows:
$$\mathrm{TF}(U,L) = \frac{S(U,L)}{\sum_i S(U,L_i)}$$
where $S(U,L)$ represents the number of times label L marks user U, $\sum_i S(U,L_i)$ represents the total number of markings of user U by all labels, and TF(U, L) represents the proportion of the markings by label L among the markings by all labels of user U;
the formula for IDF (U, L) is as follows:
$$\mathrm{IDF}(U,L) = \log \frac{\sum S(U_i,L_i)}{\sum S(U_i,L)}$$
wherein $\sum S(U_i,L_i)$ represents the sum of all labels over all users, $\sum S(U_i,L)$ represents the sum of label L over all users, and IDF(U, L) represents how scarce label L is among all labels, which reflects the probability of this label occurring.
In one embodiment, the time decay interest level is calculated as follows:
$$I_t = e^{-\lambda \Delta t}$$
where $\Delta t$ denotes the number of days between the moment t at which the behavior occurred and the observation point, $\lambda$ denotes the attenuation factor, and $I_t$ denotes the interest degree at each moment.
In one embodiment, the cosine similarity method specifically includes:
when cosine similarity is used for weight classification, the input is the user borrowing behavior data and the output is the label with the highest recommendation score; the calculation process is as follows:
calculating the similarity of every two labels, and orthogonalizing the two labels to obtain the combination of every two labels under each user;
calculating the number of users corresponding to each label, namely the number of the labels appearing in different users;
calculating the similarity of every two labels by using the cosine similarity, and finally obtaining a cosine similarity weight p;
calculating the related labels to recommend to the user, so that the user corresponds to all related labels, wherein the recommendation score is calculated as follows:
$$R = W \times p$$
wherein W is the total weight of the label and p is the cosine similarity weight, calculated as follows:
$$p = \frac{\sigma}{\sqrt{\eta \times \lambda}}$$
where $\sigma$ represents the number of users paying attention to both resource X and resource Y, $\eta$ represents the number of users paying attention to resource X, $\lambda$ represents the number of users paying attention to resource Y, and p is a similarity coefficient measuring the two resources the users pay attention to.
In one embodiment, a book recommendation method based on user borrowing behavior-interest prediction further comprises the following steps:
normalizing the numerical features in the basic feature labels and the prediction class feature labels by logarithmic function conversion; and converting the categorical features in the basic feature labels and the prediction class feature labels into numerical vectors by one-hot encoding.
In one embodiment, the determining of the recommendation result includes:
the DeepFM model comprises an input layer, an Embedding layer, a model layer and an output layer;
inputting the basic feature labels and the prediction class feature labels into the DeepFM model, and carrying out Embedding feature vectorization through the Embedding layer; the model layer splices the linear features together through a linear model, the linear model being the sum of the products of each feature and its corresponding weight; and the output layer outputs the recommendation result through two fully connected layers.
A book recommendation apparatus based on user borrowing behavior-interest prediction, comprising:
the data acquisition module is used for acquiring data of the user borrowing behaviors;
the basic tag determining module is used for determining a basic feature tag based on the user borrowing behavior data;
the category label determining module is used for determining a prediction category feature label by adopting a weight calculation algorithm TFIDF and cosine similarity method based on the user borrowing behavior data;
and the recommendation result determining module is used for inputting the basic feature labels and the prediction class feature labels into DeepFM, a neural network model built on a factorization machine, for Embedding feature vectorization, performing feature crossing on the feature vectors, inputting the feature vectors into a deep neural network, and outputting a recommendation result.
Compared with the prior art, the book recommendation method and device based on the user borrowing behavior-interest prediction provided by the embodiment of the invention have the following beneficial effects:
Aiming at the problem that library book recommendations are not accurate enough, the invention provides a book recommendation method based on user borrowing behavior-interest prediction. The method builds a user behavior label system, makes personalized recommendations that reflect user interests, locates the user's core requirements accurately, and improves user satisfaction.
Drawings
FIG. 1 is a flow diagram illustrating a method for user borrowing behavior-interest prediction provided in one embodiment;
FIG. 2 is a user type distribution histogram provided in one embodiment;
FIG. 3 is a family readership word cloud provided in one embodiment;
FIG. 4 is an individual user word cloud provided in one embodiment;
FIG. 5 compares, in one embodiment, the Accuracy curves of a book recommendation method without prediction class labels and the book recommendation method of the present invention (which includes prediction class labels).
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
In one embodiment, a book recommendation method based on user borrowing behavior-interest prediction is provided, and the method comprises the following steps:
Step 1: read the data.
Step 2: preprocess the data. The data read in step 1 are preprocessed.
Step 3: visualize the data and construct basic labels. The data preprocessed in step 2 are analyzed visually by drawing histograms and word clouds, and three types of basic labels are constructed: fact class, rule class and text class.
Step 4: construct prediction class labels. Prediction class labels are constructed for the data preprocessed in step 2 using TFIDF and the cosine similarity method.
Step 5: normalize the feature labels. The numerical feature labels and categorical feature labels generated in steps 3 and 4 are normalized.
Step 6: recommend books. The feature labels obtained in step 5 are input into the DeepFM model for Embedding feature vectorization; feature crossing is performed on the embedded feature vectors, which are then input into a deep neural network, and a recommendation result is output.
Step 7: compare and analyze the book recommendation results. Using recommendation accuracy (Accuracy) as the objective evaluation index, the results of a book recommendation method without prediction class labels are compared with those of the present method, which includes prediction class labels.
The data preprocessing in step 2 is specifically as follows:
The data read in step 1 are preprocessed by data deduplication, abnormal value processing, missing value processing and time format normalization. Abnormal value processing normalizes abnormal data that fall outside the normal time range; missing value processing deletes rows whose book number is empty; time format normalization converts data with non-uniform time formats through the strptime() function in Python.
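The preprocessing described above can be sketched in plain Python. This is a minimal illustration, not the patented implementation: the field names (`user_id`, `book_id`, `borrow_time`), the list of time formats, the choice to clamp out-of-range timestamps, and the choice to drop rows whose time cannot be parsed are all assumptions, since the source does not specify them.

```python
from datetime import datetime

# Hypothetical time formats; the source does not list the formats in the raw data.
TIME_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y/%m/%d %H:%M", "%Y%m%d")


def parse_time(raw):
    """Try each known format with strptime; return None if none matches."""
    for fmt in TIME_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt)
        except ValueError:
            continue
    return None


def preprocess(records, earliest, latest):
    """Deduplicate, drop rows with an empty book number, unify the time
    format, and clamp timestamps that fall outside [earliest, latest]."""
    seen = set()
    cleaned = []
    for rec in records:
        if not rec.get("book_id"):            # missing value: empty book number
            continue
        ts = parse_time(rec["borrow_time"])
        if ts is None:                        # unparseable time (assumption: drop)
            continue
        ts = min(max(ts, earliest), latest)   # abnormal value (assumption: clamp)
        key = (rec["user_id"], rec["book_id"], ts)
        if key in seen:                       # deduplication
            continue
        seen.add(key)
        cleaned.append({
            "user_id": rec["user_id"],
            "book_id": rec["book_id"],
            "borrow_time": ts.strftime("%Y-%m-%d %H:%M:%S"),
        })
    return cleaned


records = [
    {"user_id": "u1", "book_id": "b1", "borrow_time": "2021-03-01 10:00:00"},
    {"user_id": "u1", "book_id": "b1", "borrow_time": "2021/03/01 10:00"},
    {"user_id": "u2", "book_id": "", "borrow_time": "2021-03-02 09:00:00"},
    {"user_id": "u3", "book_id": "b2", "borrow_time": "19000101"},
]
cleaned = preprocess(records, datetime(2015, 1, 1), datetime(2021, 12, 31))
```

In this example the second record is a duplicate of the first in a different time format, the third has an empty book number, and the fourth lies before the assumed valid range, so two cleaned rows remain.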
The data visualization analysis in step 3 is specifically as follows:
The data preprocessed in step 2 are visualized with the third-party Python library wordcloud, which renders word clouds, and the plotting library matplotlib.
The basic label construction in step 3 is specifically as follows:
On the basis of the data visualization analysis, three types of basic labels (fact class, rule class and text class) are constructed from the preprocessed data. Fact class labels contain the total number of actions taken by the same user. Rule class labels are built on the fact class labels: thresholds are set on a user's fact class label according to manual experience, and different thresholds correspond to different user features. Text class labels comprise text information generated by the user, such as the user's gender, occupation, address and book names; the jieba word segmentation method is used to extract features from the text class labels.
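As a rough illustration of the fact class and rule class labels, the following sketch counts borrowing actions per user and maps each count to a rule label. The threshold values and label names are hypothetical; the source only states that thresholds are set from manual experience.

```python
from collections import Counter

# Hypothetical thresholds, checked from highest to lowest.
RULE_THRESHOLDS = [(50, "heavy reader"), (10, "regular reader"), (0, "occasional reader")]


def fact_labels(borrow_log):
    """Fact class label: total number of borrowing actions per user."""
    return Counter(user_id for user_id, _book_id in borrow_log)


def rule_label(total_borrows):
    """Rule class label: the first threshold that the count reaches."""
    for threshold, name in RULE_THRESHOLDS:
        if total_borrows >= threshold:
            return name
    return RULE_THRESHOLDS[-1][1]


log = [("u1", "b1")] * 12 + [("u2", "b7")] * 3
facts = fact_labels(log)
rules = {user: rule_label(count) for user, count in facts.items()}
```

Text class labels are not covered here, because they depend on the jieba segmenter and on the actual text fields.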
The construction of prediction class labels in step 4 is specifically as follows:
Prediction class labels are constructed for the data preprocessed in step 2 using TFIDF and the cosine similarity method, as follows:
The TFIDF weight calculation logic has two main parts: the interaction depth and the total TFIDF label weight. The interaction depth is the depth of user behavior under each interaction behavior; several features can measure it, such as the user behavior type weight, the number of user behaviors and the decay of the behavior over time. The TFIDF weight reflects how important each label is to the prediction result: the more important the label, the larger its weight. The TFIDF label weight is calculated as follows:
$$\mathrm{TFIDF}(U,L) = \mathrm{TF}(U,L) \times \mathrm{IDF}(U,L) \tag{1}$$
where TFIDF(U, L) represents the objective weight of label L with respect to user U, i.e. the product of the importance of label L to user U, TF(U, L), and the importance of the label among all the labels of the user, IDF(U, L).
The formula for TF (U, L) is as follows:
$$\mathrm{TF}(U,L) = \frac{S(U,L)}{\sum_i S(U,L_i)} \tag{2}$$
where $S(U,L)$ represents the number of times label L marks user U, $\sum_i S(U,L_i)$ represents the total number of markings of user U by all labels, and TF(U, L) represents the proportion of the markings by label L among the markings by all labels of user U.
The formula for IDF (U, L) is as follows:
$$\mathrm{IDF}(U,L) = \log \frac{\sum S(U_i,L_i)}{\sum S(U_i,L)} \tag{3}$$
where $\sum S(U_i,L_i)$ represents the sum of all labels over all users, $\sum S(U_i,L)$ represents the sum of label L over all users, and IDF(U, L) represents how scarce label L is among all labels, which reflects the probability of this label occurring. If label L rarely occurs but is used to mark user U, the relationship between user U and label L is correspondingly closer.
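The TF and IDF definitions above can be tried on a small example. This is only a sketch: `counts` is a hypothetical mapping from user to per-label marking counts, and the logarithm in `idf` is an assumption based on the usual TF-IDF form, because the original formula image is not legible in this text.

```python
import math


def tf(counts, user, label):
    """Share of this label's markings among all markings of the user."""
    total = sum(counts[user].values())
    return counts[user].get(label, 0) / total if total else 0.0


def idf(counts, label):
    """log(all markings over all users / markings of this label over all users).
    The logarithm is an assumption; the source formula is not legible."""
    all_marks = sum(sum(c.values()) for c in counts.values())
    label_marks = sum(c.get(label, 0) for c in counts.values())
    return math.log(all_marks / label_marks) if label_marks else 0.0


def tfidf(counts, user, label):
    """Objective weight of a label for a user: TF times IDF."""
    return tf(counts, user, label) * idf(counts, label)


# Hypothetical marking counts: user -> {label: number of markings}.
counts = {
    "u1": {"history": 3, "fiction": 1},
    "u2": {"fiction": 4},
}
weight_history = tfidf(counts, "u1", "history")
weight_fiction = tfidf(counts, "u1", "fiction")
```

Here "history" is both frequent for u1 and rare overall, so it receives a larger weight for u1 than "fiction", which matches the scarcity argument in the text.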
In analyzing how behavior decays over time, the interest degree at each moment is given by the following time decay function:
$$I_t = e^{-\lambda \Delta t} \tag{4}$$
where $\Delta t$ denotes the number of days between the moment t at which the behavior occurred and the observation point, $\lambda$ denotes the attenuation factor, and $I_t$ denotes the interest degree at each moment.
The calculation formula of the total weight of the user label is as follows:
$$W = B_i \times I_t \times C_i \times \mathrm{TFIDF} \tag{5}$$
where $B_i$ represents the behavior type weight; $C_i$ represents the number of user behaviors, i.e. the total number of behaviors generated by the user; $I_t$ represents the time-decayed interest degree; and TFIDF represents the weight assigned to the user behavior label.
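A short numerical sketch of the time decay and the total label weight follows. The decay rate of 0.1 per day and the sample inputs are hypothetical values chosen only to show how the factors combine.

```python
import math


def time_decay(days_since, decay_rate):
    """Interest degree I_t = exp(-lambda * delta_t)."""
    return math.exp(-decay_rate * days_since)


def total_weight(behavior_weight, days_since, decay_rate, count, tfidf_weight):
    """Total label weight W = B_i * I_t * C_i * TFIDF."""
    return behavior_weight * time_decay(days_since, decay_rate) * count * tfidf_weight


# Hypothetical inputs: the same behavior observed today and 30 days ago.
recent = total_weight(1.0, 0, 0.1, 3, 0.5)
older = total_weight(1.0, 30, 0.1, 3, 0.5)
```

With these inputs the recent behavior keeps its full weight, while the 30-day-old behavior is scaled by exp(-3), so older borrowing contributes far less to the label.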
The cosine similarity method in step 4 is specifically as follows:
Cosine similarity measures the difference between two vectors by the cosine of the angle between them: the closer the value is to 1, the closer the angle is to 0 degrees and the stronger the correlation. It is calculated as follows:
$$p = \frac{\sigma}{\sqrt{\eta \times \lambda}} \tag{6}$$
wherein, σ represents the number of users paying attention to the resource X and the resource Y at the same time, η represents the number of users paying attention to the resource X only, λ represents the number of users paying attention to the resource Y only, and p is a similarity coefficient for measuring the two resources concerned by the user. The larger the value of p, the higher the probability that the user will focus on two resources at the same time.
When cosine similarity is used for weight classification, the input is the preprocessed data feature list and the output is the label with the highest recommendation score. The calculation process is as follows:
1. Calculate the similarity of every two labels: orthogonalize the two tables to obtain every pairwise combination of labels under each user.
2. Calculate the number of users corresponding to each label, that is, the number of different users in which the label appears.
3. Calculate the similarity of every two labels using cosine similarity, finally obtaining the cosine similarity weight p.
4. Calculate the related labels to recommend to the user, so that the user corresponds to all labels related to them. The recommendation score is calculated as follows:
$$R = W \times p \tag{7}$$
where W is the total weight of the label and p is the cosine similarity weight.
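The four steps above can be sketched as follows. The sketch assumes the set-based cosine form p = σ / sqrt(η × λ), where η and λ are the numbers of users holding each label, and it assumes that a candidate label reached from several owned labels keeps its highest score; the source does not say how such scores are combined. The data and names are hypothetical.

```python
import math
from itertools import combinations


def label_similarity(users_by_label):
    """Cosine similarity weight p for every pair of labels, from user sets."""
    sims = {}
    for x, y in combinations(sorted(users_by_label), 2):
        sigma = len(users_by_label[x] & users_by_label[y])   # users with both labels
        eta, lam = len(users_by_label[x]), len(users_by_label[y])
        p = sigma / math.sqrt(eta * lam) if eta and lam else 0.0
        sims[(x, y)] = sims[(y, x)] = p
    return sims


def recommend_label(user_weights, sims):
    """Score labels the user lacks by R = W * p and return the best one."""
    scores = {}
    for owned, weight in user_weights.items():
        for (a, b), p in sims.items():
            if a == owned and b not in user_weights:
                # Assumption: keep the highest score per candidate label.
                scores[b] = max(scores.get(b, 0.0), weight * p)
    return max(scores, key=scores.get) if scores else None


# Hypothetical data: label -> set of users marked with that label.
users_by_label = {
    "history": {"u1", "u2", "u3"},
    "biography": {"u2", "u3"},
    "cooking": {"u4"},
}
sims = label_similarity(users_by_label)
best = recommend_label({"history": 1.5}, sims)
```

For a user whose only label is "history" with total weight 1.5, "biography" shares two of three users with "history" and therefore outranks "cooking", which shares none.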
The normalization of feature labels in step 5 is specifically as follows:
The numerical feature labels and categorical feature labels generated in steps 3 and 4 are normalized. Numerical features (including user numbers, book numbers, ages and the like) are normalized by logarithmic function conversion, as in formula (8), and one-hot encoding converts categorical features (including the user's gender, occupation, personal information and resource categories) into numerical vectors.
$$f(a) = \log_{10}(a) \tag{8}$$
where a is the specific value of the numerical feature to be processed and f(a) is the value normalized by the base-10 logarithm.
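Both transformations are easy to illustrate. In this sketch the occupation list is a hypothetical category set, and the guard against non-positive inputs is an added assumption, since the base-10 logarithm is undefined there and the source does not say how such values are handled.

```python
import math


def log_normalize(value):
    """f(a) = log10(a); only defined for a > 0."""
    if value <= 0:
        raise ValueError("log10 normalization needs a positive value")
    return math.log10(value)


def one_hot(value, categories):
    """Return a 0/1 vector with a single 1 at the position of value."""
    return [1 if value == category else 0 for category in categories]


OCCUPATIONS = ["student", "teacher", "engineer"]  # hypothetical category set

age_feature = log_normalize(100)
occupation_feature = one_hot("teacher", OCCUPATIONS)
```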
The book recommendation in step 6 proceeds as follows:
The DeepFM model comprises an input layer, an Embedding layer, a model layer and an output layer. The feature labels obtained in step 5 are input into the DeepFM model and vectorized through the Embedding layer.
The model layer first splices the linear features together through a generalized linear model, given by formula (9); feature crossing is then performed on the embedded feature vectors, which are input into a deep neural network:
$$y = \omega_1 x_1 + \omega_2 x_2 + \dots + \omega_n x_n \tag{9}$$
where $x_i$ denotes each input feature and $\omega_i$ denotes the weight of that feature; through back-propagation learning, the weights reach values consistent with the model's prediction effect.
The output layer outputs the recommendation result through two fully connected layers.
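The forward pass of a DeepFM-style model can be sketched in pure Python to show how the linear part, the pairwise feature crossing and the deep part fit together. This is a sketch of the generic DeepFM structure under stated assumptions, not the patented network: the layer sizes, the random untrained weights, the final sigmoid, and the sample feature indices are all hypothetical.

```python
import math
import random


def fm_second_order(embeddings):
    """Pairwise feature crossing: sum over i<j of <v_i, v_j>, computed per
    dimension with the identity 0.5 * ((sum v)^2 - sum v^2)."""
    total = 0.0
    for k in range(len(embeddings[0])):
        s = sum(v[k] for v in embeddings)
        sq = sum(v[k] ** 2 for v in embeddings)
        total += 0.5 * (s * s - sq)
    return total


def dense(inputs, weights, biases, relu=True):
    """One fully connected layer; weights[j] is the row of output unit j."""
    out = [sum(w * x for w, x in zip(row, inputs)) + b
           for row, b in zip(weights, biases)]
    return [max(0.0, o) for o in out] if relu else out


def deepfm_forward(active, params):
    """active holds one active feature index per field (one-hot input)."""
    linear = params["bias"] + sum(params["w"][i] for i in active)  # sum of w_i * x_i, x_i = 1
    emb = [params["V"][i] for i in active]                         # embedding lookup
    fm = fm_second_order(emb)
    flat = [x for v in emb for x in v]
    hidden = dense(flat, params["W1"], params["b1"])               # first fully connected layer
    deep = dense(hidden, params["W2"], params["b2"], relu=False)[0]  # second layer
    return 1.0 / (1.0 + math.exp(-(linear + fm + deep)))


def init_params(n_features, n_fields, dim=4, hidden=8, seed=0):
    """Random, untrained parameters; sizes are hypothetical."""
    rng = random.Random(seed)
    r = lambda: rng.uniform(-0.1, 0.1)
    return {
        "bias": 0.0,
        "w": [r() for _ in range(n_features)],
        "V": [[r() for _ in range(dim)] for _ in range(n_features)],
        "W1": [[r() for _ in range(n_fields * dim)] for _ in range(hidden)],
        "b1": [0.0] * hidden,
        "W2": [[r() for _ in range(hidden)]],
        "b2": [0.0],
    }


params = init_params(n_features=10, n_fields=3)
score = deepfm_forward([0, 4, 7], params)  # e.g. gender, occupation, predicted label
```

Because the weights are untrained, `score` only demonstrates the data flow; in practice the parameters would be learned by back propagation, as the text describes.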
The comparative analysis of book recommendation results in step 7 proceeds as follows:
Recommendation accuracy (Accuracy) is used as the objective evaluation index. The Accuracy values are calculated and the Accuracy curves are drawn for a book recommendation method without prediction class labels and for the present method, which includes prediction class labels, and the book recommendation results are compared.
Example 1
Executing steps 1 and 2:
Using offline borrowing data from a provincial library, 100,000 user behavior records and 600,000 book resource records of 10,000 users are randomly selected as the experimental data set. Users have attributes such as user number, age, gender and occupation; books have attributes such as book number, book name, category, author and publisher. Preprocessing of the data comprises deduplication, abnormal value processing, missing value processing and time format normalization.
Executing step 3:
The preprocessed data are visualized with the third-party Python library wordcloud, which renders word clouds, and the plotting library matplotlib. On this basis the data are given basic labels, divided by how the labels are generated into three types: fact class, text class and rule class. Each user's labels are stored as a two-dimensional table.
Executing step 4:
Prediction class labels are constructed for the preprocessed data using TFIDF and the cosine similarity method. A time decay function is added when the label weight is calculated with TFIDF, so that the decay of user behavior over time indirectly reflects how the user's interest in the corresponding resources changes. Cosine similarity is then used to calculate the similarity of every two labels, giving the cosine similarity weight, and the label with the highest recommendation score is output.
Executing steps 5 and 6:
Numerical features are normalized by logarithmic function conversion, and categorical features are one-hot encoded into numerical vectors. The processed feature labels are vectorized through the Embedding layer. So that the model can learn the various features in the data, the linear features are spliced together with a generalized linear model, and feature crossing is then performed on the embedded feature vectors; for example, recommending books to a user from the two features of age and occupation combined works far better than relying on age or occupation alone. Finally, the embedded feature vectors are input into a deep neural network for training, and the output is obtained through two fully connected layers.
Step 7 is performed:
When the DeepFM model is tested, two label sets are used: the basic labels alone, and the basic labels with the prediction-class labels added. In each case, 80% of the samples are used as the training set and 20% as the test set for evaluating the model. 20% of the training set is held out as a validation set, which shows the result of each training round. The Accuracy curve shows that the recommendation effect of the present method (book recommendation with prediction-class labels) is better than that of book recommendation without prediction-class labels.
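The split described above can be sketched as follows. The text gives only the proportions, so the random shuffling, the fixed seed, and the function name `split_dataset` are assumptions made for illustration.

```python
import random


def split_dataset(samples, test_ratio=0.2, val_ratio=0.2, seed=42):
    """Hold out test_ratio of the samples as the test set, then val_ratio of the
    remaining training samples as the validation set."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    test = shuffled[:n_test]
    train_full = shuffled[n_test:]
    n_val = int(len(train_full) * val_ratio)
    val = train_full[:n_val]
    train = train_full[n_val:]
    return train, val, test


# 1000 samples give 640 training, 160 validation, and 200 test samples.
train, val, test = split_dataset(range(1000))
```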
Here, offline borrowing data from a provincial library are used: 100,000 user behavior records of 10,000 users and 600,000 book resource records are randomly selected as the experimental data set.
For objective evaluation, the commonly used Accuracy index is selected to verify the experimental results. Accuracy is calculated as follows:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
wherein TP represents the number of samples labeled as positive samples and predicted to be also positive samples; TN represents the number of samples labeled as negative and predicted to be also negative; FP represents the number of samples labeled negative and predicted positive; FN represents the number of samples labeled as positive samples and predicted as negative samples.
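As a minimal sketch, the accuracy computation can be written from the four counts defined above, assuming the standard definition (TP + TN) / (TP + TN + FP + FN). The helper names and the binary 0/1 label encoding are illustrative assumptions.

```python
def confusion_counts(labels, predictions):
    """Count TP, TN, FP, FN for binary labels encoded as 1 (positive) and 0 (negative)."""
    tp = tn = fp = fn = 0
    for y, p in zip(labels, predictions):
        if y == 1 and p == 1:
            tp += 1
        elif y == 0 and p == 0:
            tn += 1
        elif y == 0 and p == 1:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Share of correctly predicted samples among all samples."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("accuracy is undefined for an empty sample set")
    return (tp + tn) / total


counts = confusion_counts([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
score = accuracy(*counts)
```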
As can be seen from the Accuracy curve in Fig. 5, the present method has a better recommendation effect than the book recommendation method without prediction-class labels and can be used for the book recommendation service of a digital library.
In one embodiment, a book recommendation apparatus based on user borrowing behavior-interest prediction is provided, the apparatus comprising:
a data acquisition module, configured to acquire user borrowing behavior data;
a basic label determining module, configured to determine basic feature labels based on the user borrowing behavior data;
a category label determining module, configured to determine prediction-class feature labels using the TFIDF weight calculation algorithm and the cosine similarity method, based on the user borrowing behavior data; and
a recommendation result determining module, configured to input the basic feature labels and the prediction-class feature labels into DeepFM, a neural network model built on a factorization machine, for Embedding feature vectorization, to perform feature crossing on the feature vectors, to input the feature vectors into a deep neural network, and to output a recommendation result.
For the specific limitations of the book recommendation apparatus based on user borrowing behavior-interest prediction, reference may be made to the limitations of the book recommendation method based on user borrowing behavior-interest prediction above, which are not repeated here. Each module of the apparatus may be implemented wholly or partly in software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor of the computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can invoke them and execute the corresponding operations.
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features. Furthermore, the above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (10)

1. A book recommendation method based on user borrowing behavior-interest prediction is characterized by comprising the following steps:
acquiring user borrowing behavior data;
determining a basic feature tag based on the user borrowing behavior data;
determining a prediction class feature label by adopting a weight calculation algorithm TFIDF and cosine similarity method based on the user borrowing behavior data;
inputting the basic feature labels and the prediction class feature labels into DeepFM, a neural network model built on a factorization machine, for Embedding feature vectorization, performing feature crossing on the feature vectors, inputting the feature vectors into a deep neural network, and outputting a recommendation result.
2. The book recommendation method based on user borrowing behavior-interest prediction as claimed in claim 1, further comprising:
preprocessing the user borrowing behavior data, specifically including: data deduplication, outlier processing, missing-value processing, and time format normalization; wherein
the outlier processing includes: normalizing abnormal data that fall outside the normal time range;
the missing-value processing includes: deleting rows in which the book number is empty;
the time format normalization includes: converting data with inconsistent time formats to a uniform format using the strptime() function in Python.
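The preprocessing steps above can be sketched with the standard library alone. Several details are assumptions, since the text does not specify them: the three input time formats, the record field names, treating "normalizing" an out-of-range time as clamping it into the valid range, and dropping rows whose time cannot be parsed.

```python
from datetime import datetime

# Assumed input formats; the source does not say which formats occur in the data.
TIME_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y/%m/%d %H:%M", "%Y%m%d")


def parse_time(text):
    """Try each assumed format in turn; return None if none matches."""
    for fmt in TIME_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    return None


def preprocess(records, earliest, latest):
    """Clean borrowing records given as dicts with user_id, book_id and time keys."""
    seen = set()
    cleaned = []
    for rec in records:
        if not rec.get("book_id"):
            continue  # missing-value processing: drop rows with an empty book number
        when = parse_time(rec["time"])
        if when is None:
            continue  # assumption: rows with an unparseable time are dropped
        # Outlier processing, assumed here to mean clamping into the valid range.
        when = min(max(when, earliest), latest)
        key = (rec["user_id"], rec["book_id"], when)
        if key in seen:
            continue  # deduplication of identical user/book/time rows
        seen.add(key)
        cleaned.append({**rec, "time": when.strftime("%Y-%m-%d %H:%M:%S")})
    return cleaned


records = [
    {"user_id": "u1", "book_id": "b1", "time": "2020-01-05 10:00:00"},
    {"user_id": "u1", "book_id": "b1", "time": "2020-01-05 10:00:00"},  # duplicate
    {"user_id": "u2", "book_id": "", "time": "2020/02/01 09:30"},  # empty book number
    {"user_id": "u3", "book_id": "b2", "time": "2020/02/01 09:30"},
    {"user_id": "u4", "book_id": "b3", "time": "20991231"},  # beyond the valid range
]
cleaned = preprocess(records, datetime(2019, 1, 1), datetime(2021, 1, 1))
```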
3. The book recommendation method based on user borrowing behavior-interest prediction as claimed in claim 2, wherein the determination of the base feature tag comprises:
drawing histograms and word clouds of the preprocessed user borrowing behavior data to analyze its basic features visually, and determining the basic feature labels from the results of this analysis;
the basic feature labels comprise:
a fact-class label, which contains the total number of behaviors generated by the same user;
a rule-class label, which is built on the fact-class label by setting thresholds on the user's fact-class label according to manual experience, with different thresholds corresponding to different user characteristics;
a text-class label, which contains text information generated by the user, including the user's gender, occupation, and address and the book titles; features of the text-class labels are extracted using the jieba word segmentation method.
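The relationship between fact-class and rule-class labels can be sketched as follows. The threshold values, the label names, and the column names of the resulting two-dimensional table are hypothetical, since the text leaves them to manual experience. Text-class labels are omitted because their extraction relies on the third-party jieba library.

```python
from collections import Counter

# Hypothetical thresholds chosen for illustration: (minimum borrow count, rule label).
RULE_THRESHOLDS = [
    (20, "frequent reader"),
    (5, "regular reader"),
    (0, "occasional reader"),
]


def fact_labels(events):
    """Fact-class label: total number of borrowing behaviors per user.

    events is an iterable of (user_id, book_id) pairs.
    """
    return Counter(user for user, _ in events)


def rule_label(count, thresholds=RULE_THRESHOLDS):
    """Rule-class label: the first threshold, from highest to lowest, that the count reaches."""
    for minimum, name in thresholds:
        if count >= minimum:
            return name
    return thresholds[-1][1]


def build_label_table(events):
    """Store each user's labels as one row of a two-dimensional table."""
    counts = fact_labels(events)
    return [
        {"user_id": user, "borrow_count": n, "activity_level": rule_label(n)}
        for user, n in sorted(counts.items())
    ]


events = [("u1", "b%d" % i) for i in range(6)] + [("u2", "b1")]
table = build_label_table(events)
```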
4. The book recommendation method based on user borrowing behavior-interest prediction according to claim 3, wherein the weight calculation algorithm TFIDF comprises:
the weight calculation algorithm TFIDF has two logical parts: the interaction depth and the total TFIDF label weight;
the interaction depth refers to the features used to measure the depth of user behavior under each interaction: the behavior type weight, the number of user behaviors, and the decay of the behavior over time; the total TFIDF label weight reflects how important each label is to the prediction result, with a more important label receiving a larger weight, and is calculated as follows:
$$W = B_i \times I_t \times C_i \times \mathrm{TFIDF}$$
wherein $B_i$ represents the behavior type weight; $C_i$ represents the number of user behaviors, i.e. the total number of behaviors generated by the user; $I_t$ represents the time-decayed interest degree; and TFIDF represents the assigned weight of the user behavior label.
5. The book recommendation method based on user borrowing behavior-interest prediction as claimed in claim 3, wherein the calculation formula of the user behavior tag assignment weight is as follows:
$$\mathrm{TFIDF}(U, L) = \mathrm{TF}(U, L) \times \mathrm{IDF}(U, L)$$
wherein TFIDF(U, L) represents the objective weight of label L for user U, i.e. the product of the importance of label L to user U, TF(U, L), and the importance of label L among all labels, IDF(U, L);
the formula for TF(U, L) is as follows:
$$\mathrm{TF}(U, L) = \frac{S(U, L)}{\sum_i S(U, L_i)}$$
where S(U, L) represents the number of times label L has been applied to user U, $\sum_i S(U, L_i)$ represents the total number of times all labels have been applied to user U, and TF(U, L) represents the share of label L among all label markings of user U;
the formula for IDF(U, L) is as follows:
$$\mathrm{IDF}(U, L) = \log \frac{\sum_i \sum_j S(U_i, L_j)}{\sum_i S(U_i, L)}$$
wherein $\sum_i \sum_j S(U_i, L_j)$ represents the total number of markings of all labels over all users, $\sum_i S(U_i, L)$ represents the total number of markings of label L over all users, and IDF(U, L) represents the scarcity of label L among all labels, which reflects the probability of occurrence of this label.
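The label-weight calculation can be sketched as below. TF follows the definition in the text (the share of label L among all markings of user U). For IDF, the original formula is an image that is not reproduced in this text, so the usual logarithmic form, the log of all markings divided by the markings of label L, is assumed. The data layout and function names are hypothetical.

```python
import math


def tf(marks, user, label):
    """Share of `label` among all label markings of `user`."""
    user_marks = marks[user]
    return user_marks.get(label, 0) / sum(user_marks.values())


def idf(marks, label):
    """Assumed form: log(total markings of all labels / total markings of `label`)."""
    total = sum(sum(m.values()) for m in marks.values())
    with_label = sum(m.get(label, 0) for m in marks.values())
    if with_label == 0:
        raise ValueError("label %r has never been applied" % label)
    return math.log(total / with_label)


def tfidf(marks, user, label):
    """Objective weight of `label` for `user`."""
    return tf(marks, user, label) * idf(marks, label)


# marks[user][label] = number of times the label was applied to the user.
marks = {
    "u1": {"history": 3, "fiction": 1},
    "u2": {"fiction": 4},
    "u3": {"history": 1, "science": 3},
}
weight = tfidf(marks, "u1", "history")
```

With these toy counts, "history" accounts for 3 of the 4 markings of u1 and 4 of the 12 markings overall, so the weight is 0.75 × ln 3.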
6. The book recommendation method based on user borrowing behavior-interest prediction as claimed in claim 3, wherein the time decay interestingness calculation formula is as follows:
$$I_t = e^{-\lambda \Delta t}$$
where $\Delta t$ denotes the number of days between the moment t at which the behavior occurred and the observation point, $\lambda$ denotes the decay factor, and $I_t$ denotes the interest level at each moment.
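A minimal sketch of the time-decayed interest, assuming the exponential form exp(−λΔt) with Δt measured in whole days. The default decay factor of 0.05 is a hypothetical value, as the text does not give one. The second function shows how the decay would enter the total label weight W as the product of behavior type weight, decay, behavior count, and TFIDF weight.

```python
import math
from datetime import date


def time_decay(behavior_day, observation_day, lam=0.05):
    """Interest level exp(-lam * days elapsed between behavior and observation)."""
    delta_days = (observation_day - behavior_day).days
    if delta_days < 0:
        raise ValueError("behavior must not occur after the observation point")
    return math.exp(-lam * delta_days)


def label_weight(behavior_weight, behavior_count, decay, tfidf_weight):
    """Total label weight: behavior type weight x decay x behavior count x TFIDF."""
    return behavior_weight * decay * behavior_count * tfidf_weight


# A borrow made 30 days before the observation point, with lam = 0.1.
decay = time_decay(date(2021, 6, 1), date(2021, 7, 1), lam=0.1)
w = label_weight(behavior_weight=1.0, behavior_count=2, decay=decay, tfidf_weight=0.4)
```

A larger λ makes older behavior fade faster, so the factor would normally be tuned to how quickly reader interest is expected to shift.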
7. The book recommendation method based on user borrowing behavior-interest prediction as claimed in claim 3, wherein the cosine similarity method specifically comprises:
the input of the cosine similarity calculation is the user borrowing behavior data, and the output is the label with the highest recommendation score; the calculation proceeds as follows:
pairing the labels to obtain every combination of two labels for each user, in preparation for calculating the pairwise label similarity;
counting the number of users corresponding to each label, i.e. the number of different users for whom the label appears;
calculating the similarity of every pair of labels using cosine similarity, to obtain the cosine similarity weight p;
calculating the related labels to recommend to each user, and associating the user with all related labels, the recommendation score being calculated as follows:
R=W×p
wherein W is the total label weight and p is the cosine similarity weight, calculated as follows:
$$p = \frac{\sigma}{\sqrt{\eta \times \lambda}}$$
where $\sigma$ represents the number of users who follow both resource X and resource Y, $\eta$ represents the number of users who follow resource X, $\lambda$ represents the number of users who follow resource Y, and p is the similarity coefficient between the two resources followed by users.
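The pairwise label similarity and the recommendation score can be sketched as below. The similarity is assumed to be the set-based cosine form, the number of users sharing both labels divided by the square root of the product of the two labels' user counts. How scores from several of a user's own labels are combined is not stated in the text; taking the maximum is an assumption, as are the function names and data layout.

```python
import math
from itertools import combinations


def label_similarities(user_labels):
    """Return {(x, y): p} for every label pair (x < y) that co-occurs for some user.

    user_labels maps each user to the set of labels attached to that user.
    """
    users_per_label = {}
    pair_users = {}
    for labels in user_labels.values():
        for label in labels:
            users_per_label[label] = users_per_label.get(label, 0) + 1
        for x, y in combinations(sorted(labels), 2):
            pair_users[(x, y)] = pair_users.get((x, y), 0) + 1
    return {
        (x, y): both / math.sqrt(users_per_label[x] * users_per_label[y])
        for (x, y), both in pair_users.items()
    }


def recommend_label(user_weights, similarities):
    """Score labels the user lacks by R = W * p and return the best (label, score).

    user_weights maps the user's own labels to their total weight W.
    """
    scores = {}
    for (x, y), p in similarities.items():
        for own, other in ((x, y), (y, x)):
            if own in user_weights and other not in user_weights:
                score = user_weights[own] * p
                scores[other] = max(scores.get(other, 0.0), score)
    if not scores:
        return None
    return max(scores.items(), key=lambda item: item[1])


user_labels = {
    "u1": {"history", "fiction"},
    "u2": {"history", "fiction"},
    "u3": {"history", "science"},
    "u4": {"fiction"},
}
sims = label_similarities(user_labels)
best = recommend_label({"history": 0.9}, sims)
```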
8. The book recommendation method based on user borrowing behavior-interest prediction as claimed in claim 4, further comprising:
normalizing the numerical features in the basic feature labels and the prediction class feature labels by logarithmic transformation; and converting the categorical features in the basic feature labels and the prediction class feature labels into numerical vectors by one-hot encoding.
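Both transformations can be sketched in a few lines. The exact logarithmic function is not specified in the text, so log(1 + x) is assumed here because it is defined at zero; the function names and the alphabetical ordering of categories are also illustrative choices.

```python
import math


def log_normalize(values):
    """Compress non-negative numerical features with log(1 + x)."""
    return [math.log1p(v) for v in values]


def one_hot(values):
    """Encode categorical values as 0/1 vectors, one position per distinct category."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    vectors = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1
        vectors.append(row)
    return categories, vectors


borrow_counts = log_normalize([0, 9, 99])
categories, vectors = one_hot(["student", "teacher", "student"])
```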
9. The book recommendation method based on user borrowing behavior-interest prediction as claimed in claim 4, wherein the determination of recommendation result comprises:
the DeepFM model comprises an input layer, an embedding layer, a model layer, and an output layer;
the basic feature labels and the prediction class feature labels are input into the DeepFM model and vectorized by the Embedding layer; the model layer combines the linear features through a linear model, the linear model being the sum of the products of the features and their corresponding feature weights; and the output layer outputs the recommendation result through two fully connected layers.
10. A book recommendation apparatus based on user borrowing behavior-interest prediction, comprising:
a data acquisition module, configured to acquire user borrowing behavior data;
a basic label determining module, configured to determine basic feature labels based on the user borrowing behavior data;
a category label determining module, configured to determine prediction class feature labels using the TFIDF weight calculation algorithm and the cosine similarity method, based on the user borrowing behavior data; and
a recommendation result determining module, configured to input the basic feature labels and the prediction class feature labels into DeepFM, a neural network model built on a factorization machine, for Embedding feature vectorization, to perform feature crossing on the feature vectors, to input the feature vectors into a deep neural network, and to output a recommendation result.
CN202110846763.8A 2021-07-26 2021-07-26 Book recommendation method and device based on user borrowing behavior-interest prediction Active CN113590945B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110846763.8A CN113590945B (en) 2021-07-26 2021-07-26 Book recommendation method and device based on user borrowing behavior-interest prediction

Publications (2)

Publication Number Publication Date
CN113590945A true CN113590945A (en) 2021-11-02
CN113590945B CN113590945B (en) 2023-07-28

Family

ID=78250153

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110846763.8A Active CN113590945B (en) 2021-07-26 2021-07-26 Book recommendation method and device based on user borrowing behavior-interest prediction

Country Status (1)

Country Link
CN (1) CN113590945B (en)

Also Published As

Publication number Publication date
CN113590945B (en) 2023-07-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant