CN116204721A - Concept lattice recommendation method and device based on user record feedback and search content - Google Patents

Concept lattice recommendation method and device based on user record feedback and search content

Info

Publication number
CN116204721A
Authority
CN
China
Prior art keywords
user
recommendation
search
keywords
content
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310228763.0A
Other languages
Chinese (zh)
Inventor
赵学健
陈宇
孙知信
孙哲
曹亚东
宫婧
汪胡青
胡冰
徐玉华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications
Priority to CN202310228763.0A
Publication of CN116204721A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Artificial Intelligence (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a concept lattice recommendation method and device based on user record feedback and search content. The method comprises the following steps: constructing a concept lattice of related text content according to search information input by a user; constructing a formal background of hidden keywords based on the search content input by the user, and screening list keywords according to the similarity among the hidden keywords; constructing a concept lattice according to user record feedback and user characteristics, and applying grouping-based reduction to construct a user-item recommendation scoring matrix; and predicting ratings from the recommended items obtained with the screened list keywords together with user comments, substituting the ratings into the recommendation scoring matrix, and predicting the target user's scores for unrated items, thereby completing the recommendation. In the current network environment, the invention can play an important role in helping users select lower-cost, higher-quality, and more suitable recommended content despite the uncertainty, heterogeneity, and dimensionality that a recommendation system faces in managing services.

Description

Concept lattice recommendation method and device based on user record feedback and search content
Technical Field
The invention relates to the technical field of Internet of things recommendation, in particular to a concept lattice recommendation method and device based on user record feedback and search content.
Background
The amount of data on the network and the number of internet users have been increasing at an unprecedented rate. Given this growth, it has become critical to improve the ability of network users to distinguish relevant information from irrelevant information.
Recommendation systems have recently been widely used in various fields to recommend items of interest to users. These systems help users find the most relevant information they need by spending less time and effort.
Recommendation algorithms are susceptible to popularity bias, and the quality of recommendations may vary across demographic groups. Recommendation systems and search tools increasingly guide how we obtain information online, including news, entertainment, academic resources, and social connections. When the quality of their results is evaluated, performance averaged over all users is typically reported. As a result, the majority population tends to dominate the overall statistics used to measure the effectiveness of search and recommendation tools, even though effectiveness may vary across individuals and demographic groups.
Recommendation systems play an important role in providing users with useful information, particularly in e-commerce applications. Many proposed models use a user's rating history to predict unknown ratings. In recent years, user comments have attracted the attention of researchers as a valuable knowledge source, and new comment-based recommendation systems have been developed. Collaborative filtering is currently the most widely applied single-domain recommendation technique and has received close attention in both academia and industry. Matrix decomposition is one of the most popular collaborative filtering methods and is likewise a focus of attention. At present, these techniques still have the following problems:
Data sparsity problem: a user rates only a few items, and because there are so many items, the rating data commonly used for matrix decomposition is sparse; this sparsity reduces the rating prediction precision of traditional collaborative filtering to a certain extent.
Cold start problem: collaborative filtering requires a large amount of interaction history with the website to give high-quality recommendations, which creates the user cold start problem. In a new domain, users typically have little or no interaction history with the website, so conventional recommendation methods often fail to provide high-quality recommendations for cold-start users.
Noise problem: noise in a recommendation system refers to data in the dataset that distorts the prediction of rating values. Noise in recommendation system datasets is classified into malicious noise and non-malicious (natural) noise; both types matter, and both can adversely affect recommendation performance.
A variety of sophisticated recommendation algorithms have been developed; they can be broadly classified into collaborative filtering recommendation algorithms, content-based recommendation algorithms, and hybrid recommendation algorithms. Formal Concept Analysis (FCA) is an important technique for analyzing and extracting formal knowledge from a group of objects and their attributes. The extracted ontology consists of formal concepts, which capture the closure relation between objects and attributes and, owing to the close connection with order theory, can be further organized into an intuitive hierarchical structure called a concept lattice. FCA is therefore widely used for data interpretation, visualization, and analysis.
Disclosure of Invention
This section is intended to outline some aspects of embodiments of the invention and to briefly introduce some preferred embodiments. Some simplifications or omissions may be made in this section as well as in the description summary and in the title of the application, to avoid obscuring the purpose of this section, the description summary and the title of the invention, which should not be used to limit the scope of the invention.
The present invention has been made in view of the above-described problems.
Therefore, the technical problems solved by the invention are as follows: traditional recommender methods fail to utilize ever-increasing, dynamic and heterogeneous internet of things data when building internet of things recommender systems.
In order to solve the technical problems, the invention provides the following technical scheme:
in a first aspect, an embodiment of the present invention provides a concept lattice recommendation method based on user record feedback and search content, including:
constructing a concept lattice of related text content according to search information input by a user;
constructing a formal background of hidden keywords based on search content input by a user, and screening list keywords according to similarity among the hidden keywords;
constructing a concept lattice according to user record feedback and user characteristics, grouping and reducing to construct a recommendation scoring matrix of the user item;
and carrying out prediction rating on the recommended items and user comments obtained based on the screened list keywords, substituting the rating into a recommendation rating matrix, and scoring the unrated items of the prediction target user, thereby completing recommendation.
As a preferred scheme of the concept lattice recommendation method based on user record feedback and search content, wherein: the constructing of the concept lattice of the related text content comprises: constructing a formal background (formal context) $T(U, G, I)$ of hidden keywords based on search content input by a user, wherein $U$ is a set of users, $G$ is a set of keywords, and $I$ is a binary relation between $U$ and $G$;
let the association matrix of the category label set $B$ and the keyword set $G$ be $T = \{t_{ij}\}$, where $t_{ij} = 1$ when category label $B_i$ contains keyword $g_j$ and $t_{ij} = 0$ otherwise; transposing $T$ yields the incidence matrix $T_B$ of the keyword set $G$ and the category label set $B$;
based on the incidence matrix $T_B$ of the keyword set $G$ and the category label set $B$, a vocabulary characterization based on category labels is obtained:
Figure BDA0004119407400000033
wherein the value $v_{ij}$ indicates whether the keyword $g_i$ is contained in category label $B_j$: $v_{ij} = 0$ when $B_j$ does not contain $g_i$, and $v_{ij} = 1$ otherwise, which is expressed as:
$g_i = \{B_j \mid g_i \in B_j\}$
take keywords $g_1$ and $g_2$, wherein $g_1$ corresponds to $m$ category labels, denoted $g_1 = \{B_{11}, B_{12}, \ldots, B_{1m}\}$, and $g_2$ corresponds to $n$ category labels, denoted $g_2 = \{B_{21}, B_{22}, \ldots, B_{2n}\}$.
As a preferred scheme of the concept lattice recommendation method based on user record feedback and search content, wherein: the screening of list keywords comprises: determining, for the category label $B_1$ of the keyword input by the user, the category label $B_2$ with the highest similarity to $B_1$, and calculating the similarity between each other keyword under $B_2$ and the keyword input by the user; if the similarity is greater than or equal to a threshold $\beta$, the compared word is retained, otherwise it is deleted from the list;
the similarity calculation formula between keywords can be expressed as:
Figure BDA0004119407400000031
wherein $B_{1i}$ is the $i$-th category label corresponding to $g_1$, $B_{2j}$ is the $j$-th category label corresponding to $g_2$, and $\mathrm{sim}(B_{1i}, B_{2j})$ represents the similarity of the category label pair $(B_{1i}, B_{2j})$;
the similarity between category labels $B_i$ and $B_j$ can be expressed as:
Figure BDA0004119407400000032
wherein $B_i$ and $B_j$ respectively represent the nodes of the two category labels in the concept lattice structure; LCS denotes the nearest common parent node of $B_i$ and $B_j$; $\mathrm{dep}(\mathrm{LCS})$ represents the depth at which the LCS is located; and $\mathrm{dis}(B_i, B_j)$ represents the path length.
As a preferred scheme of the concept lattice recommendation method based on user record feedback and search content, wherein: the user record feedback comprises a user search state and user history behavior; the user search state includes the numbers of searches, selections, collections, and views; the user history behavior comprises the user's historical item comments and scores; the user characteristics include a search status value;
the users are classified into groups according to their behavior state on the search recommendation page: if a user's number of searches exceeds that of 70% of the users, the user is defined as a frequent user; if it exceeds that of 40% of the users, as a general user; otherwise, as an infrequent user;
if the user performs 2 or more kinds of interaction behavior on the recommended page content, the user is considered satisfied with the recommended content, and the score $m$ is 5; if the user performs 1 kind of interaction behavior on the recommended page content, the user is considered moderately satisfied with the recommended content, and the score $m$ lies in $[2, 4]$; if the user performs no interaction behavior on the recommended page content, the user is considered dissatisfied with the recommended content, and the score $m$ is less than 2; the interaction behaviors comprise: clicking; staying on the page for longer than 30 seconds; and collecting, selecting, or copying page content.
As a preferred scheme of the concept lattice recommendation method based on user record feedback and search content, wherein: the grouping-based reduction comprises: using $A_T(g)$ to denote the attributes that object $g$ has in the formal background $T$, and using $O_T(m)$ to denote the objects that have attribute $m$ in the formal background $T$;
if $(g, m') \in T$, then $m' \in A_T(g)$; if $(g', m) \in T$, then $g' \in O_T(m)$; for a set of objects $G$ and a set of attributes $M$, $A_T(G)$ is defined as
Figure BDA0004119407400000042
$A_T(g_i)$, and $O_T(M)$ is defined as
Figure BDA0004119407400000043
$O_T(m_i)$;
when there is no ambiguity, the subscript $T$ is omitted; furthermore, double-subscript letters such as $x_{ij}$ represent the incidence of the binary relation $(g_i, m_j)$ in the corresponding formal background; given a formal background $(G, M, I)$ with $n$ objects in $G$ and $m$ attributes in $M$, in a particular grouping-based reduction the object set $G$ is divided into $k$ disjoint groups $S^{(1)}, S^{(2)}, S^{(3)}, \ldots, S^{(k)}$, wherein $S^{(1)} \cup S^{(2)} \cup S^{(3)} \cup \ldots \cup S^{(k)} = G$; objects from the same group are considered similar, and after reduction all $g \in S^{(i)}$ are replaced by a single object $r_i$; the fidelity of the reduced background is defined as:
Figure BDA0004119407400000041
solving a conceptual lattice reduction based on grouping by utilizing an integer linear programming technology, wherein an integer linear programming model is defined as follows:
maximize $c^{T} x$
subject to $Ax \le b$, $x \ge 0$, and $x \in \mathbb{Z}^{n}$
wherein $x$ is the vector of variables to be determined, $\mathbb{Z}$ denotes the integers, and $A$, $b$, and $c$ are the coefficient matrix and vectors of the model;
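As an illustration of the model form above, the following minimal sketch solves a tiny integer linear program by plain enumeration. The function name `solve_small_ilp`, the `upper_bound` cap, and the example coefficients are assumptions made for illustration only; enumeration merely stands in for a real ILP solver and is practical only for very small models.

```python
from itertools import product

def solve_small_ilp(c, A, b, upper_bound):
    """Maximize c.x subject to A x <= b, x >= 0, x integer, by enumeration.

    `upper_bound` (an assumption of this sketch) caps each variable so the
    search space is finite.
    """
    best_x, best_value = None, None
    for x in product(range(upper_bound + 1), repeat=len(c)):
        # Check every constraint row: sum_j A[i][j] * x[j] <= b[i].
        feasible = all(
            sum(a_ij * x_j for a_ij, x_j in zip(row, x)) <= b_i
            for row, b_i in zip(A, b)
        )
        if not feasible:
            continue
        value = sum(c_j * x_j for c_j, x_j in zip(c, x))
        if best_value is None or value > best_value:
            best_x, best_value = x, value
    return best_x, best_value

# Hypothetical model: maximize 3*x1 + 2*x2 with x1 + x2 <= 4 and x1 <= 2.
x_opt, value = solve_small_ilp(c=[3, 2], A=[[1, 1], [1, 0]], b=[4, 2], upper_bound=4)
print(x_opt, value)
```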
the global goal of the integer linear programming model is to maximize $T$, and the rule for assigning variables is as follows: for a variable $x_{ij} = B$ of object $g_i$, if there is another object $g_k$ such that:
$|(A(g_i) - A(g_k)) \cup (A(g_k) - A(g_i))| \le \varepsilon_r$
wherein $\varepsilon_r$ is a set threshold, and $x_{kj} = 1 - B$, then $x_{ij}$ is marked as changeable, denoted $c_{ij}$, and a variable $v_{ij}$ is assigned to represent whether $x_{ij}$ is modified; wherein $B$ is set to true for the separation rule and false for the connection rule;
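The marking rule can be sketched as follows, under the reading that an incidence of object $g_i$ is changeable when another object $g_k$ has an attribute set within symmetric difference $\varepsilon_r$ and the opposite incidence for the same attribute. The function name `changeable_incidences` and the toy data are hypothetical, and the sketch covers both values of $B$ in one pass rather than treating the separation and connection rules separately.

```python
def changeable_incidences(context, attributes, eps_r):
    """Return the set of (object, attribute) pairs whose incidence may be flipped.

    `context` maps each object to the set of attributes it has.
    """
    changeable = set()
    for g_i, attrs_i in context.items():
        for g_k, attrs_k in context.items():
            if g_i == g_k:
                continue
            # |(A(g_i) - A(g_k)) U (A(g_k) - A(g_i))| is the symmetric difference.
            if len(attrs_i ^ attrs_k) > eps_r:
                continue
            for m_j in attributes:
                # Opposite incidence: x_ij = B while x_kj = 1 - B.
                if (m_j in attrs_i) != (m_j in attrs_k):
                    changeable.add((g_i, m_j))
    return changeable

# Hypothetical formal background with three objects.
toy_context = {"g1": {"a", "b"}, "g2": {"a", "b", "c"}, "g3": {"d"}}
marked = changeable_incidences(toy_context, ["a", "b", "c", "d"], eps_r=1)
print(sorted(marked))
```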
the basic constraint conditions of the integer linear programming model are as follows:
Figure BDA0004119407400000051
Figure BDA0004119407400000052
in addition, the following constraints need to be added to maintain formal background fidelity:
Figure BDA0004119407400000053
wherein ε is m Is a set threshold.
As a preferred scheme of the concept lattice recommendation method based on user record feedback and search content, wherein: the constructing of the recommendation scoring matrix of user items comprises: converting the recommendation scoring matrix into a binary scoring matrix, in which scores of 4 to 5 are defined as recommended and set to 1, and all remaining scores are defined as not recommended and set to 0; the matrix is regarded as a formal background and denoted $C(U, I, S)$, wherein $U$, $I$, and $S$ respectively represent the user set, the item set, and the relation between users and items; after the user interest formal background is obtained, a concept lattice structure model is built according to the binary relation between objects and attributes in the user interest formal background $C$.
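The binarization step can be sketched as below. The function name `binarize_ratings`, the dictionary layout, and the sample ratings are assumptions made for illustration; the threshold of 4 follows the rule that scores of 4 to 5 count as recommended.

```python
def binarize_ratings(ratings, threshold=4):
    """Convert user -> {item: score} into a binary formal background.

    Each user is mapped to the set of items whose score is at least
    `threshold`, i.e. the items whose matrix entry would be 1.
    """
    return {
        user: {item for item, score in scores.items() if score >= threshold}
        for user, scores in ratings.items()
    }

# Hypothetical rating data on a 1-5 scale.
ratings = {
    "u1": {"i1": 5, "i2": 3},
    "u2": {"i1": 4, "i3": 2},
    "u3": {"i2": 1},
}
interest_context = binarize_ratings(ratings)
print(interest_context)
```

Representing each row as a set of recommended items keeps the result directly usable as an object-to-attributes mapping when building the concept lattice.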
As a preferred scheme of the concept lattice recommendation method based on user record feedback and search content, wherein: the prediction rating comprises: based on its score, the membership of a given comment $r$ in the $i$-th item is denoted $\mu_i(r)$ and is calculated by the following simple heuristic:
Figure BDA0004119407400000054
wherein the actual rating of $r$ is the actual rating score corresponding to the comment $r$ in the dataset;
the $k$-th fuzzy sets belonging to a user $u$ and an item $i$ are denoted $f_k(u)$ and $f_k(i)$, where $1 \le k \le 5$; each comment is converted into a semantic vector, so that each $f_k(u)$ and $f_k(i)$ has a corresponding set of vectors; averaging the vectors contained in each $f_k$ yields a single vector, so that 5 vectors are finally obtained for each of the user $u$ and the item $i$, each corresponding to one possible rating value and denoted $V_k(u)$ and $V_k(i)$; $V_k(u)$ and $V_k(i)$ are calculated as follows:
$V_k(u) = \mathrm{Avg}(\mu_k(r, u) \times V_k(r, u))$
$V_k(i) = \mathrm{Avg}(\mu_k(r, i) \times V_k(r, i))$
wherein $V_k(r, u)$ is the semantic vector of the $r$-th comment of user $u$ that belongs to $f_k(u)$, and $\mu_k(r, u)$ is its membership in $f_k(u)$;
the vectors of user $u$ and item $i$ are compared for each corresponding rating value, and the rating value that yields the greatest similarity is taken as the predicted rating for the pair $(u, i)$:
Figure BDA0004119407400000061
wherein $V_k(u)$ and $V_k(i)$ are the vectors belonging to user $u$ and item $i$, respectively, that correspond to the score $k$, and $r$ is the predicted score of $u$ for $i$.
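A minimal sketch of this prediction step follows. The membership values are taken as given inputs because the membership heuristic appears only as a figure, and cosine similarity is an assumption of this sketch, since the similarity measure used for the comparison is likewise not spelled out in the text. All function names and sample vectors are hypothetical.

```python
import math

def weighted_average(vectors, memberships):
    """V_k = Avg(mu_k(r) * V_k(r)) over the comments in one fuzzy set."""
    dim = len(vectors[0])
    total = [0.0] * dim
    for vec, mu in zip(vectors, memberships):
        for d in range(dim):
            total[d] += mu * vec[d]
    return [t / len(vectors) for t in total]

def cosine(a, b):
    """Cosine similarity (assumed measure); 0.0 if either vector is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def predict_rating(user_vectors, item_vectors):
    """Return the rating k whose user and item vectors are most similar."""
    return max(user_vectors, key=lambda k: cosine(user_vectors[k], item_vectors[k]))

# Hypothetical 2-dimensional vectors V_k(u) and V_k(i) for k = 1..5.
user_vectors = {1: [1, 0], 2: [1, 0], 3: [1, 0], 4: [1, 1], 5: [0, 1]}
item_vectors = {1: [0, 1], 2: [0, 1], 3: [1, 1], 4: [1, 1], 5: [1, 0]}
predicted = predict_rating(user_vectors, item_vectors)
averaged = weighted_average([[1.0, 2.0], [3.0, 4.0]], [1.0, 0.5])
print(predicted, averaged)
```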
In a second aspect, an embodiment of the present invention provides a concept lattice recommendation system based on user record feedback and search content, which is characterized by comprising:
the text content concept lattice construction module is used for constructing concept lattices of related text content according to search information input by a user;
the screening module is used for constructing a formal background of the hidden keywords based on search content input by a user, and screening list keywords according to the similarity among the hidden keywords;
the user concept lattice construction module is used for constructing concept lattices according to user record feedback and user characteristics, and constructing a recommendation scoring matrix of user items by grouping reduction;
and the prediction recommendation module is used for carrying out prediction rating on the recommended items and user comments obtained based on the screened list keywords, substituting the rating into a recommendation scoring matrix and scoring the unscored items of the prediction target user so as to finish recommendation.
In a third aspect, embodiments of the present invention provide a computing device comprising:
a memory and a processor;
the memory is configured to store computer-executable instructions that, when executed by the one or more processors, cause the one or more processors to implement a concept lattice recommendation method based on user record feedback and search content according to any embodiment of the present invention.
In a fourth aspect, an embodiment of the present invention provides a computer readable storage medium storing computer executable instructions that when executed by a processor implement the concept lattice recommendation method based on user record feedback and search content.
The invention has the following beneficial effects: the invention provides a concept lattice recommendation method based on user record feedback and search content, in which concept lattices are constructed from the user record feedback and the search content, user objects are grouped, and the concept lattices are reduced, thereby improving recommendation precision; in the current network environment, it can play an important role in helping users select lower-cost, higher-quality, and more suitable recommended content despite the uncertainty, heterogeneity, and dimensionality that a recommendation system faces in managing services.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art. Wherein:
FIG. 1 is a general flow chart of a concept lattice recommendation method based on user record feedback and search content according to a first embodiment of the present invention;
FIG. 2 is a schematic diagram showing MAE comparison of different methods in simulation and control experiments of a concept lattice recommendation method based on user record feedback and search content according to a second embodiment of the present invention;
FIG. 3 is a schematic diagram of the results based on CC metrics in simulation and control experiments of a concept lattice recommendation method based on user record feedback and search content according to a second embodiment of the present invention.
Detailed Description
So that the manner in which the above recited objects, features and advantages of the present invention can be understood in detail, a more particular description of the invention, briefly summarized above, may be had by reference to the embodiments, some of which are illustrated in the appended drawings. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, but the present invention may be practiced in other ways other than those described herein, and persons skilled in the art will readily appreciate that the present invention is not limited to the specific embodiments disclosed below.
Further, reference herein to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic can be included in at least one implementation of the invention. The appearances of the phrase "in one embodiment" in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments.
While the embodiments of the present invention have been illustrated and described in detail in the drawings, the cross-sectional view of the device structure is not to scale in the general sense for ease of illustration, and the drawings are merely exemplary and should not be construed as limiting the scope of the invention. In addition, the three-dimensional dimensions of length, width and depth should be included in actual fabrication.
Also in the description of the present invention, it should be noted that the orientation or positional relationship indicated by the terms "upper, lower, inner and outer", etc. are based on the orientation or positional relationship shown in the drawings, are merely for convenience of describing the present invention and simplifying the description, and do not indicate or imply that the apparatus or elements referred to must have a specific orientation, be constructed and operated in a specific orientation, and thus should not be construed as limiting the present invention. Furthermore, the terms "first, second, or third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
The terms "mounted, connected, and coupled" should be construed broadly in this disclosure unless otherwise specifically indicated and defined, such as: can be fixed connection, detachable connection or integral connection; it may also be a mechanical connection, an electrical connection, or a direct connection, or may be indirectly connected through an intermediate medium, or may be a communication between two elements. The specific meaning of the above terms in the present invention will be understood in specific cases by those of ordinary skill in the art.
Example 1
Referring to fig. 1, a first embodiment of the present invention provides a concept lattice recommendation method based on user record feedback and search content, including:
S1: constructing a concept lattice of related text content according to search information input by a user;
specifically, the construction of the concept lattice of the related text content includes: constructing a formal background (formal context) $T(U, G, I)$ of hidden keywords based on search content input by a user, wherein $U$ is a set of users, $G$ is a set of keywords, and $I$ is a binary relation between $U$ and $G$;
let the association matrix of the category label set $B$ and the keyword set $G$ be $T = \{t_{ij}\}$, where $t_{ij} = 1$ when category label $B_i$ contains keyword $g_j$ and $t_{ij} = 0$ otherwise; transposing $T$ yields the incidence matrix $T_B$ of the keyword set $G$ and the category label set $B$;
based on the incidence matrix $T_B$ of the keyword set $G$ and the category label set $B$, a vocabulary characterization based on category labels is obtained:
Figure BDA0004119407400000091
wherein the value $v_{ij}$ indicates whether the keyword $g_i$ is contained in category label $B_j$: $v_{ij} = 0$ when $B_j$ does not contain $g_i$, and $v_{ij} = 1$ otherwise, which is expressed as:
$g_i = \{B_j \mid g_i \in B_j\}$
take keywords $g_1$ and $g_2$, wherein $g_1$ corresponds to $m$ category labels, denoted $g_1 = \{B_{11}, B_{12}, \ldots, B_{1m}\}$, and $g_2$ corresponds to $n$ category labels, denoted $g_2 = \{B_{21}, B_{22}, \ldots, B_{2n}\}$.
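The construction in step S1 can be sketched as follows: build the label-by-keyword matrix, transpose it, and read off the set of category labels for each keyword. The label names, keywords, and variable names are hypothetical toy data chosen for illustration.

```python
# Hypothetical toy data: category labels and the keywords each one contains.
category_labels = {
    "B1": {"laptop", "tablet"},
    "B2": {"laptop", "charger"},
    "B3": {"novel"},
}
keywords = sorted({g for members in category_labels.values() for g in members})
labels = sorted(category_labels)

# T[i][j] = 1 when category label B_i contains keyword g_j, else 0.
T = [[1 if g in category_labels[b] else 0 for g in keywords] for b in labels]

# Transpose T to get T_B, whose rows are keywords and whose columns are labels.
T_B = [list(col) for col in zip(*T)]

# g_i = {B_j | g_i in B_j}: the labels whose entry in the keyword's row is 1.
keyword_to_labels = {
    g: {labels[j] for j, v in enumerate(row) if v == 1}
    for g, row in zip(keywords, T_B)
}
print(keyword_to_labels["laptop"])
```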
It should be noted that, because a recommender system requires enough information to generate accurate recommendations, its performance in predicting unseen items depends largely on the keywords searched by each user. In general, because the keywords a user searches for are too simple, the accuracy of the recommended information may be reduced. In this step, an effective strategy is proposed to enrich the content of the user's search.
When a user enters a query, the recommendation information associated with the query field must be retrieved from the large number of recommendations available in the registry, which remains a challenging task. Consequently, extensive research is being conducted in the field of recommendation systems.
Formal concept analysis may be viewed as analyzing the structure of data through concepts. A set of objects sharing a common set of attributes is referred to as a concept. The group of objects in a concept that have the shared attributes is referred to as the extent (extension) of the concept, while the set of shared attributes itself, on which the concept hierarchy is built, is referred to as the intent (connotation).
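To make the notions of extent and intent concrete, the sketch below enumerates all formal concepts of a small formal background by closing every subset of objects. The context data and function names are hypothetical, and exhaustive enumeration is used only because the example is tiny; it does not scale to realistic data.

```python
from itertools import combinations

# Hypothetical formal background: object -> set of attributes it has.
context = {
    "u1": {"a", "b"},
    "u2": {"a", "b", "c"},
    "u3": {"c"},
}
all_attributes = set().union(*context.values())

def common_attributes(objects):
    """Attributes shared by every object in `objects` (all attributes if empty)."""
    result = set(all_attributes)
    for obj in objects:
        result &= context[obj]
    return result

def objects_having(attributes):
    """Objects that have every attribute in `attributes`."""
    return {obj for obj, attrs in context.items() if attributes <= attrs}

def formal_concepts():
    """Enumerate (extent, intent) pairs by closing every subset of objects."""
    concepts = set()
    names = sorted(context)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            intent = common_attributes(subset)
            extent = objects_having(intent)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts

concepts = formal_concepts()
for extent, intent in sorted(concepts, key=lambda c: len(c[0])):
    print(sorted(extent), sorted(intent))
```

Ordering these pairs by inclusion of their extents gives the hierarchy that forms the concept lattice.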
It should also be noted that the relatedness of the keywords in the user's input text can be measured through category labels. When keywords co-occur in one label, they have a certain relevance. When they co-occur in multiple category labels, they have strong relevance. The similarity between keywords $g_1$ and $g_2$ is influenced by the number of category labels they belong to and by the degree of semantic association between those labels; it is directly proportional to both the number of category labels and the degree of association between the labels.
S2: constructing a formal background of hidden keywords based on search content input by a user, and screening list keywords according to similarity among the hidden keywords;
still further, the screening of list keywords includes: determining, for the category label $B_1$ of the keyword input by the user, the category label $B_2$ with the highest similarity to $B_1$, and calculating the similarity between each other keyword under $B_2$ and the keyword input by the user; if the similarity is greater than or equal to a threshold $\beta$, the compared word is retained, otherwise it is deleted from the list;
the similarity calculation formula between keywords can be expressed as:
Figure BDA0004119407400000101
wherein $B_{1i}$ is the $i$-th category label corresponding to $g_1$, $B_{2j}$ is the $j$-th category label corresponding to $g_2$, and $\mathrm{sim}(B_{1i}, B_{2j})$ represents the similarity of the category label pair $(B_{1i}, B_{2j})$;
the similarity between category labels $B_i$ and $B_j$ can be expressed as:
Figure BDA0004119407400000102
wherein $B_i$ and $B_j$ respectively represent the nodes of the two category labels in the concept lattice structure; LCS denotes the nearest common parent node of $B_i$ and $B_j$; $\mathrm{dep}(\mathrm{LCS})$ represents the depth at which the LCS is located; and $\mathrm{dis}(B_i, B_j)$ represents the path length.
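The screening in step S2 can be sketched as below. Because both similarity formulas survive only as figures, the sketch makes two loud assumptions: the label similarity takes a Wu-Palmer-style form, $2 \cdot \mathrm{dep}(\mathrm{LCS}) / (\mathrm{dis}(B_i, B_j) + 2 \cdot \mathrm{dep}(\mathrm{LCS}))$, and the keyword similarity is a placeholder mean over all label pairs. Neither is confirmed as the patent's formula. The hierarchy, names, and threshold are hypothetical, and a simple tree stands in for the concept lattice structure.

```python
# Hypothetical label hierarchy given as child -> parent; "root" has depth 0.
parent = {"electronics": "root", "books": "root",
          "computers": "electronics", "phones": "electronics"}

def path_to_root(node):
    """Return [node, parent, ..., root]."""
    path = [node]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path

def depth(node):
    return len(path_to_root(node)) - 1

def label_similarity(b_i, b_j):
    """Assumed Wu-Palmer-style score built from dep(LCS) and dis(B_i, B_j)."""
    ancestors_j = set(path_to_root(b_j))
    lcs = next(n for n in path_to_root(b_i) if n in ancestors_j)  # nearest common parent
    dep = depth(lcs)
    dis = (depth(b_i) - dep) + (depth(b_j) - dep)  # path length via the LCS
    return 1.0 if dis == 0 and dep == 0 else 2 * dep / (dis + 2 * dep)

def keyword_similarity(labels_1, labels_2):
    """Placeholder aggregate (assumption): mean similarity over all label pairs."""
    pairs = [(a, b) for a in labels_1 for b in labels_2]
    return sum(label_similarity(a, b) for a, b in pairs) / len(pairs)

def screen_keywords(query_labels, candidates, beta):
    """Keep candidate keywords whose similarity to the query is >= beta."""
    return [g for g, labels in candidates.items()
            if keyword_similarity(query_labels, labels) >= beta]

candidates = {"smartphone": ["phones"], "paperback": ["books"]}
kept = screen_keywords(["computers"], candidates, beta=0.4)
print(kept)
```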
S3: constructing a concept lattice according to user record feedback and user characteristics, and performing grouping reduction to construct a user-item recommendation scoring matrix;
specifically, the user record feedback comprises the user search state and the user history behavior; the user search state includes the numbers of searches, selections, collections, and browsing events; the user history behavior comprises the user's historical item comments and scores; the user characteristics include a search state value;
users are classified into groups according to their behavior on the search recommendation page: if a user's search count exceeds that of 70% of users, the user is defined as a frequent user; if it exceeds that of 40% of users, as a general user; otherwise, as an infrequent user;
if the user performs 2 or more kinds of interaction with the recommended page content, the user is considered satisfied with the recommended content, and the score m is 5; if the user performs 1 kind of interaction, the user's satisfaction is considered average, and the score m lies in [2,4]; if the user performs no interaction, the user is considered dissatisfied, and the score m is less than 2; the interaction behaviors comprise: clicking; a dwell time longer than 30 seconds; and collecting, selecting, or copying page content.
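The grouping and scoring rules above can be sketched as follows. This is a minimal illustration only: reading "exceeds 70% of the users" as a percentile rank, choosing 3 as the representative score inside [2,4], choosing 1 as the representative score below 2, and all function and behavior names are assumptions.

```python
def usage_group(user_count, all_counts):
    """Classify a user by the share of users whose search count is lower."""
    share_below = sum(1 for c in all_counts if c < user_count) / len(all_counts)
    if share_below > 0.7:
        return "frequent"
    if share_below > 0.4:
        return "general"
    return "infrequent"


# Interaction kinds named in the text (identifiers are hypothetical).
INTERACTIONS = {"click", "dwell_over_30s", "collect", "select", "copy"}


def satisfaction_score(observed, mid_score=3, low_score=1):
    """Map the number of distinct interaction kinds to a score m."""
    kinds = len(set(observed) & INTERACTIONS)
    if kinds >= 2:
        return 5
    if kinds == 1:
        return mid_score   # assumed value inside [2, 4]
    return low_score       # assumed value below 2


counts = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
groups = [usage_group(c, counts) for c in (9, 6, 2)]
scores = [satisfaction_score(s) for s in (["click", "copy"], ["click"], [])]
```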
It should be noted that comments provided by different users for a particular item may represent key characteristics of the item; the history of user-written comments may suggest user preferences. The present invention recognizes that measuring the similarity between a comment written by a user and a comment provided for an item may reveal the similarity between the user's preference and the item's content.
It should also be noted that the present invention proposes a reliable method for analyzing grouping-based concept reduction, mainly considering two aspects: formal background fidelity and lattice fidelity.
Specifically, the grouping reduction includes: $A_T(g)$ denotes the attributes that object g has in formal background T, and $O_T(m)$ denotes the objects that have attribute m in formal background T;
if $(g, m') \in T$, then $m' \in A_T(g)$; if $(g', m) \in T$, then $g' \in O_T(m)$; for a set of objects G and a set of attributes M, $A_T(G)$ is defined as
Figure BDA0004119407400000113
$A_T(g_i)$, and $O_T(M)$ is defined as
Figure BDA0004119407400000114
$O_T(m_i)$;
when there is no ambiguity, the subscript T is omitted; furthermore, double-subscript letters such as $x_{ij}$ denote the incidence of the binary relation $(g_i, m_j)$ in the corresponding formal background; given a formal background (G, M, I) with n objects in G and m attributes in M, a particular grouping-based reduction divides the object set G into k disjoint groups $S^{(1)}, S^{(2)}, S^{(3)}, ..., S^{(k)}$, where $S^{(1)} \cup S^{(2)} \cup S^{(3)} \cup ... \cup S^{(k)} = G$; objects from the same group are considered similar, and after reduction all $g \in S^{(i)}$ are replaced with a single object $r_i$; the reduced background fidelity is defined as:
Figure BDA0004119407400000111
solving the grouping-based concept lattice reduction by means of integer linear programming, where the integer linear programming model is defined as follows:
maximize $c^T x$
subject to $Ax \le b$, $x \ge 0$ and $x \in \mathbb{Z}^n$
where x is the vector of variables to be determined, $\mathbb{Z}$ denotes the integers, and A, b, and c are the coefficient matrix and vectors of the model;
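To make the general form concrete, the sketch below solves a very small integer linear program by exhaustive enumeration. It is only an illustration of the model "maximize c^T x subject to Ax ≤ b, x ≥ 0, x integer"; the enumeration bound `upper`, the function name, and the example coefficients are assumptions, and a practical system would use a dedicated ILP solver rather than brute force.

```python
from itertools import product


def solve_small_ilp(c, A, b, upper=4):
    """Enumerate integer x with 0 <= x_i <= upper and return the best one."""
    best_x, best_value = None, None
    for x in product(range(upper + 1), repeat=len(c)):
        feasible = all(
            sum(a * xi for a, xi in zip(row, x)) <= bound
            for row, bound in zip(A, b)
        )
        if not feasible:
            continue
        value = sum(ci * xi for ci, xi in zip(c, x))
        if best_value is None or value > best_value:
            best_x, best_value = x, value
    return best_x, best_value


# Hypothetical instance: maximize 3x + 2y
# subject to x + y <= 4, x + 3y <= 6, x <= 3.
c = [3, 2]
A = [[1, 1], [1, 3], [1, 0]]
b = [4, 6, 3]
best_x, best_value = solve_small_ilp(c, A, b)
```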
it should be noted that the main goal of the alternative form of grouping-based reduction is to maximize the number of objects o that have an identical counterpart $o_i$, i.e. $A(o) = A(o_i)$. This goal may be expressed by the following global objective function, where $I_{ij}$ is a variable indicating whether the modified objects i and j are identical, and n is the size of the original object set.
Figure BDA0004119407400000112
This function counts the objects that have at least one identical object $o_j$ with $j > i$. It can easily be deduced that this equals the number of objects removed by the original form of grouping-based reduction. Thus, the global goal of the ILP model is to maximize T.
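The counting behavior described here can be checked on a small example. The sketch below assumes each object is represented by its attribute set and simply counts the objects that have an identical object with a larger index; the function name and the data are hypothetical.

```python
def reduction_objective(rows):
    """Count objects i that have an identical object j with j > i."""
    n = len(rows)
    return sum(
        1
        for i in range(n)
        if any(rows[i] == rows[j] for j in range(i + 1, n))
    )


# Four objects, two distinct attribute sets.
rows = [{"a"}, {"a"}, {"b"}, {"a"}]
T = reduction_objective(rows)
distinct = len({frozenset(r) for r in rows})
```

In this example the count equals the number of objects minus the number of distinct attribute sets, which is the number of objects a merge of identical objects would remove.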
Theoretically, every incidence in the formal background can be modified, so a formal background with n objects and m attributes could be allocated a total of n×m variables. However, excessive modification compromises the fidelity of the formal background and incurs significant computational cost, so only objects that are already similar are considered for modification.
This preference can be formally described as follows: only when $|A(g_1) - A(g_2)|$ is smaller than a threshold $\varepsilon$ are objects $g_1$ and $g_2$ considered for modification so that $A(g_1) = A(g_2)$. That is, if an object $g_1$ satisfies $|A(g_1) - A(g_2)| > \varepsilon$ for every other object $g_2$ in the original formal background, $g_1$ is marked as unmodifiable, and no variables are assigned to the incidences of the attributes of $g_1$.
Specifically, the global objective of the integer linear programming model is to maximize T, and variables are allocated according to the following rule: for an incidence $x_{ij} = B$ of object $g_i$, if there is another object $g_k$ satisfying:
$|(A(g_i) - A(g_k)) \cup (A(g_k) - A(g_i))| \le \varepsilon_r$
where $\varepsilon_r$ is a set threshold, and $x_{kj} = 1 - B$, then $x_{ij}$ is marked as changeable, denoted $c_{ij}$, and a variable $v_{ij}$ is assigned to indicate whether $x_{ij}$ is modified; B is set to true for the separation rule and false for the connection rule;
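A minimal sketch of this variable-allocation rule is given below, assuming the formal background is stored as a mapping from objects to attribute sets. The symmetric difference operator `^` computes (A(g_i) − A(g_k)) ∪ (A(g_k) − A(g_i)); the function name, the data layout, and the example data are assumptions.

```python
def changeable_cells(context, attributes, eps_r, B):
    """Return the (object, attribute) incidences marked as changeable.

    An incidence x_ij == B of g_i is changeable if some other object g_k is
    within symmetric-difference distance eps_r and has x_kj == 1 - B.
    """
    cells = set()
    for gi, attrs_i in context.items():
        for gk, attrs_k in context.items():
            if gi == gk or len(attrs_i ^ attrs_k) > eps_r:
                continue
            for m in attributes:
                if (m in attrs_i) == B and (m in attrs_k) == (not B):
                    cells.add((gi, m))
    return cells


# Hypothetical formal background with three objects.
context = {"g1": {"a", "b"}, "g2": {"a", "b", "c"}, "g3": {"d"}}
attributes = ["a", "b", "c", "d"]
cells_false = changeable_cells(context, attributes, eps_r=1, B=False)
cells_true = changeable_cells(context, attributes, eps_r=1, B=True)
```

Here `g3` differs from every other object by more than the threshold, so none of its incidences receive a variable, which matches the intent of limiting modifications to already similar objects.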
the basic constraint conditions of the integer linear programming model are as follows:
Figure BDA0004119407400000121
Figure BDA0004119407400000122
in addition, the following constraints need to be added to maintain formal background fidelity:
Figure BDA0004119407400000123
where $\varepsilon_m$ is a set threshold.
It should be noted that, to ensure concept lattice fidelity, the method strictly follows a group-then-merge procedure, merges objects using the separation or connection strategy, and applies the merging strategy globally.
For any formal background T = (G, M, I), if $g_1, g_2 \in G$ and $A(g_1) = A(g_2)$, then removing $g_1$ or $g_2$ from G does not change the concept lattice structure. That is, for $T' = (G - \{g_1\}, M, I)$ and $T'' = (G - \{g_2\}, M, I)$, the lattices L(T), L(T'), and L(T'') are identical in structure.
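This property can be checked on a small example. The sketch below assumes the lattice structure is characterized by its set of concept intents (all intersections of object attribute sets, together with the full attribute set) and compares that set before and after removing one of two identical objects; the function name and the data are hypothetical.

```python
def intents(context, attributes):
    """All concept intents: M plus every intersection of object attribute sets."""
    result = {frozenset(attributes)}
    for attrs in context.values():
        # Intersect the new object's attributes with every intent found so far.
        result |= {frozenset(attrs) & s for s in result}
    return result


attributes = {"a", "b", "c"}
T = {"g1": {"a", "b"}, "g2": {"a", "b"}, "g3": {"b", "c"}}
T1 = {g: a for g, a in T.items() if g != "g1"}   # remove g1
T2 = {g: a for g, a in T.items() if g != "g2"}   # remove g2
T3 = {g: a for g, a in T.items() if g != "g3"}   # remove a non-duplicate

same_without_g1 = intents(T, attributes) == intents(T1, attributes)
same_without_g2 = intents(T, attributes) == intents(T2, attributes)
same_without_g3 = intents(T, attributes) == intents(T3, attributes)
```

Removing either duplicate leaves the intents unchanged, while removing the non-duplicate object `g3` does not, which is why merging is restricted to objects with identical attribute sets.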
Constructing the recommendation scoring matrix of user items includes: converting the recommendation scoring matrix into a binary scoring matrix, in which scores of 4-5 are defined as recommended and set to 1, and all other scores are defined as not recommended and set to 0; the matrix is regarded as a formal background and denoted C(U, I, S), where U, I, and S denote the user set, the item set, and the relationship between users and items, respectively; after the user interest formal background is obtained, a concept lattice structure model is built according to the binary relation between objects and attributes in the user interest formal background C.
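The binarization step can be sketched as follows, assuming ratings are stored as a nested mapping from users to items to scores on a 1-5 scale. The function names and the sample ratings are hypothetical.

```python
def binarize(ratings, threshold=4):
    """Map scores of 4-5 to 1 (recommended) and all other scores to 0."""
    return {
        user: {item: 1 if score >= threshold else 0 for item, score in items.items()}
        for user, items in ratings.items()
    }


def to_formal_background(binary):
    """View the binary matrix as a relation S: user -> set of recommended items."""
    return {
        user: {item for item, flag in items.items() if flag == 1}
        for user, items in binary.items()
    }


ratings = {
    "u1": {"i1": 5, "i2": 3, "i3": 4},
    "u2": {"i1": 2, "i2": 4, "i3": 1},
}
binary = binarize(ratings)
background = to_formal_background(binary)
```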
S4: performing prediction rating on the recommended items obtained from the screened list keywords and on the user comments, substituting the ratings into the recommendation scoring matrix, and predicting the target user's scores for unrated items, thereby completing the recommendation.
Specifically, the prediction rating includes: the membership degree of a given comment r in the i-th score-based fuzzy set is denoted $\mu_i(r)$ and is calculated by the following simple heuristic:
Figure BDA0004119407400000131
where the actual rating of r denotes the actual rating score attached to comment r in the dataset;
it should be noted that, because assigning scores is uncertain, a fuzzy partitioning process is used to divide all reviews into five groups according to score. In this approach, one review may belong to several sets, with a different membership degree in each.
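To illustrate what a fuzzy partition of reviews into five score-based sets could look like, the sketch below uses a triangular membership function. The triangular shape, the `width` parameter, and the function names are purely hypothetical stand-ins chosen for illustration; they are not the heuristic formula of the invention.

```python
def membership(actual_rating, k, width=2.0):
    """Hypothetical triangular membership of a review in fuzzy set k."""
    return max(0.0, 1.0 - abs(k - actual_rating) / width)


def fuzzy_memberships(actual_rating):
    """Membership degree of one review in each of the five score sets."""
    return {k: membership(actual_rating, k) for k in range(1, 6)}


degrees = fuzzy_memberships(4)
```

With this stand-in, a review rated 4 belongs fully to set 4 and partially to the neighboring sets 3 and 5, which matches the statement that one review may belong to several sets with different membership degrees.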
It should also be noted that this approach constructs a semantic vector for each review using the Word2vec embedding model. Word2vec resembles an autoencoder in that it encodes each word as a vector, but rather than training on the input words through reconstruction, it trains each word against the words that neighbor it in the corpus. Given a training corpus, the target word and each word in its context are converted into one-hot vectors whose length is the size of the vocabulary or of a chosen subset of it. Such a vector has a single 1 at the position corresponding to the word and 0 at all other positions. These vectors serve as the input layer of the neural network. The output of the network is a single vector containing, for every vocabulary word, the probability that a randomly selected nearby word is that word.
After a vector has been created for each word, an aggregate vector representing the entire review text is constructed. There are various ways to obtain the final vector of a document, including the sum and the average of the word vectors. This method uses the average of the vectors as the final vector.
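The averaging step can be sketched as follows. The two-dimensional embedding table is a toy stand-in for vectors that a trained Word2vec model would supply, and skipping out-of-vocabulary tokens is an assumption of this sketch.

```python
def review_vector(tokens, embeddings):
    """Average the word vectors of the tokens that have an embedding."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None  # no known words in this review
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


# Toy embeddings (hypothetical values).
embeddings = {"great": [1.0, 0.0], "plot": [0.0, 1.0]}
vec = review_vector(["great", "plot", "unknownword"], embeddings)
```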
Only two single vectors are ultimately needed, representing the user and the item respectively, so the vectors contained in one set must be aggregated into a single vector. Because the vectors in a set generally do not have equal membership degrees, each vector is first multiplied by its membership degree, so that a review with a lower membership degree has less influence on the final vector. The vectors contained in $f_k$ are then averaged to obtain a single vector.
Specifically, the k-th fuzzy sets belonging to a user u and an item i are denoted $f_k(u)$ and $f_k(i)$, with $1 \le k \le 5$; each comment is converted into a semantic vector, so each $f_k(u)$ and $f_k(i)$ has a corresponding set of vectors; averaging the vectors contained in each $f_k$ yields a single vector, which finally gives 5 vectors for user u and 5 vectors for item i, each corresponding to one of the possible rating values and denoted $V_k(u)$ and $V_k(i)$; $V_k(u)$ and $V_k(i)$ are calculated as follows:
$V_k(u) = Avg(\mu_k(r, u) \times V_k(r, u))$
$V_k(i) = Avg(\mu_k(r, i) \times V_k(r, i))$
where $V_k(r, u)$ is the semantic vector of the r-th comment of user u that belongs to $f_k(u)$, and $\mu_k(r, u)$ is its membership degree in $f_k(u)$;
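The membership-weighted average can be sketched as below, assuming a fuzzy set is given as a list of (membership degree, semantic vector) pairs. The function name and the sample values are hypothetical.

```python
def aggregate(fuzzy_set):
    """Compute Avg(mu * V) over the (membership, vector) pairs of one fuzzy set."""
    count = len(fuzzy_set)
    dim = len(fuzzy_set[0][1])
    return [sum(mu * vec[d] for mu, vec in fuzzy_set) / count for d in range(dim)]


# Two reviews in the same fuzzy set, with memberships 1.0 and 0.5.
f_k_u = [(1.0, [2.0, 0.0]), (0.5, [0.0, 4.0])]
V_k_u = aggregate(f_k_u)
```

The second review contributes only half of its vector, so reviews with a lower membership degree influence the aggregated vector less.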
the vectors of user u and item i are compared for each corresponding rating value, and the rating value that yields the greatest similarity is taken as the predicted rating for the pair u and i:
Figure BDA0004119407400000141
where $V_k(u)$ and $V_k(i)$ are the vectors of user u and item i, respectively, that correspond to rating score k, and r is the predicted score of u for i.
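The selection of the predicted rating can be sketched as follows. The use of cosine similarity is an assumption, since the similarity measure is given only in the formula figure; breaking ties in favor of the lowest rating value and all names are also assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def predict_rating(user_vecs, item_vecs):
    """Return the rating k whose user and item vectors are most similar."""
    shared = sorted(user_vecs.keys() & item_vecs.keys())
    return max(shared, key=lambda k: cosine(user_vecs[k], item_vecs[k]))


# Hypothetical vectors for two of the five rating values.
user_vecs = {4: [1.0, 0.0], 5: [0.6, 0.8]}
item_vecs = {4: [0.0, 1.0], 5: [0.6, 0.8]}
predicted = predict_rating(user_vecs, item_vecs)
```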
Example 2
Referring to fig. 2-3, for one embodiment of the present invention, a concept lattice recommendation method based on user record feedback and search content is provided, and in order to verify the beneficial effects of the present invention, scientific demonstration is performed through simulation and comparison experiments.
The dataset used in this study is an Amazon dataset (https:// www.amazon.com/movements-tv-dvd-bluray/b/ref = sd_alcat_movement = UTF8& node = 2625373011). Four well-known evaluation metrics were used: mean absolute error (MAE), catalog coverage (CC), diversity, and novelty.
Table 1 reports the experimental results on the movie dataset and compares the performance of the recommendation methods on the Amazon dataset, with the best results shown in bold and the second-best results underlined. The present method achieved the best MAE, CC, diversity, and novelty values among the compared methods, which shows that it can significantly improve recommendation quality across the different evaluation metrics.
Table 1. Experimental results using the movie dataset
Figure BDA0004119407400000151
To verify the effectiveness of the present method in sparse scenarios, available ratings in the sample dataset were randomly set to zero to form 5 datasets with sparsities of 93%, 96%, 97%, 98%, and 99%. The present method and the existing PCC re-prediction method were then tested on each of the 5 datasets, and Fig. 2 compares the MAE of the different methods.
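The sparsification procedure and the MAE metric can be sketched as follows. Treating a zero entry as a missing rating, defining sparsity as the fraction of zero cells, and the fixed random seed are assumptions of this sketch; the matrix is synthetic and does not reproduce the experimental data.

```python
import random


def sparsity(matrix):
    """Fraction of cells that hold no rating (value 0)."""
    cells = [v for row in matrix for v in row]
    return sum(1 for v in cells if v == 0) / len(cells)


def sparsify(matrix, target, seed=0):
    """Randomly zero out ratings until the target sparsity is reached."""
    rng = random.Random(seed)
    result = [list(row) for row in matrix]
    rated = [(r, c) for r, row in enumerate(result) for c, v in enumerate(row) if v != 0]
    total = sum(len(row) for row in result)
    keep = round(total * (1 - target))
    for r, c in rng.sample(rated, max(0, len(rated) - keep)):
        result[r][c] = 0
    return result


def mae(predicted, actual):
    """Mean absolute error between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


full = [[5] * 10 for _ in range(10)]   # synthetic 10 x 10 rating matrix
sparse = sparsify(full, 0.93)
```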
The number of items recommended to the user in the recommendation list (N in top-N) affects the diversity, novelty, and CC values obtained. The influence of different values of N on recommender performance was therefore investigated experimentally. Fig. 3 shows the results for the CC metric: CC increases as N increases.
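The dependence of CC on N can be illustrated with a small sketch. It assumes the common definition of catalog coverage as the share of catalog items that appear in at least one user's top-N list; the predicted scores and all names are hypothetical.

```python
def top_n(scores, n):
    """Items with the n highest predicted scores (ties broken by item name)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:n]]


def catalog_coverage(all_scores, catalog, n):
    """Share of the catalog recommended to at least one user in a top-n list."""
    recommended = set()
    for scores in all_scores.values():
        recommended.update(top_n(scores, n))
    return len(recommended) / len(catalog)


catalog = ["a", "b", "c", "d"]
all_scores = {
    "u1": {"a": 5, "b": 4, "c": 1, "d": 0},
    "u2": {"a": 5, "c": 3, "b": 2, "d": 1},
}
coverage_by_n = [catalog_coverage(all_scores, catalog, n) for n in (1, 2, 4)]
```

In this toy case the coverage grows from a quarter of the catalog at N = 1 to the full catalog at N = 4, consistent with the trend reported for Fig. 3.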
It should be noted that the above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that the technical solution of the present invention may be modified or substituted without departing from the spirit and scope of the technical solution of the present invention, which is intended to be covered in the scope of the claims of the present invention.

Claims (10)

1. A concept lattice recommendation method based on user record feedback and search content, characterized by comprising the following steps:
constructing a concept lattice of related text content according to search information input by a user;
constructing a formal background of hidden keywords based on the search content input by the user, and screening the list keywords according to the similarity among the hidden keywords;
constructing a concept lattice according to user record feedback and user characteristics, and performing grouping reduction to construct a user-item recommendation scoring matrix;
performing prediction rating on the recommended items obtained from the screened list keywords and on the user comments, substituting the ratings into the recommendation scoring matrix, and predicting the target user's scores for unrated items, thereby completing the recommendation.
2. The concept lattice recommendation method based on user record feedback and search content according to claim 1, wherein constructing the concept lattice of related text content comprises: constructing a formal background T(U, G, I) of hidden keywords based on the search content input by the user, where U is the set of users, G is the set of keywords, and I is the binary relation between U and G;
letting the association matrix of the category label set B and the keyword set G be $T = \{t_{ij}\}$, where $t_{ij} = 1$ when $B_i$ contains keyword $g_j$ and $t_{ij} = 0$ otherwise, and transposing T to obtain the association matrix $T_B$ of the keyword set G and the category label set B;
obtaining, from the association matrix $T_B$ of the keyword set G and the category label set B, the category-label-based vocabulary representation:
Figure FDA0004119407380000011
where the value $v_{ij}$ indicates whether keyword $g_i$ is contained in category label $B_j$: $v_{ij} = 0$ when $B_j$ does not contain $g_i$, and $v_{ij} = 1$ otherwise, expressed as:
$g_i = \{B_j \mid g_i \in B_j\}$
given keywords $g_1$ and $g_2$, where $g_1$ corresponds to m category labels, denoted $g_1 = \{B_{11}, B_{12}, ..., B_{1m}\}$, and $g_2$ corresponds to n category labels, denoted $g_2 = \{B_{21}, B_{22}, ..., B_{2n}\}$.
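By way of illustration only, the association matrix, its transpose, and the label-based representation of each keyword might be built as in the sketch below. The label contents, the list-of-lists matrix layout, and the function names are hypothetical and form no part of the claim.

```python
def association_matrix(label_keywords, labels, keywords):
    """T[i][j] = 1 if label B_i contains keyword g_j, else 0."""
    return [[1 if g in label_keywords[b] else 0 for g in keywords] for b in labels]


def transpose(matrix):
    """Swap rows and columns, giving one row per keyword."""
    return [list(col) for col in zip(*matrix)]


def keyword_label_sets(T_B, keywords, labels):
    """Represent each keyword g_i as the set of labels that contain it."""
    return {
        g: {labels[j] for j, v in enumerate(row) if v == 1}
        for g, row in zip(keywords, T_B)
    }


label_keywords = {"B1": {"camera", "lens"}, "B2": {"lens", "tripod"}}
labels = ["B1", "B2"]
keywords = ["camera", "lens", "tripod"]
T = association_matrix(label_keywords, labels, keywords)
T_B = transpose(T)
representation = keyword_label_sets(T_B, keywords, labels)
```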
3. The concept lattice recommendation method based on user record feedback and search content according to claim 1 or 2, wherein screening the list keywords comprises: determining the category label $B_1$ of the keyword input by the user and the category label $B_2$ with the highest similarity to $B_1$, and calculating the similarity between each of the other keywords in $B_2$ and the keyword input by the user; if the similarity is greater than or equal to a threshold $\beta$, the word compared with the user's keyword is retained; otherwise, the word is deleted from the list;
the similarity calculation formula between keywords can be expressed as:
Figure FDA0004119407380000021
where $B_{1i}$ is the i-th category label corresponding to $g_1$, $B_{2j}$ is the j-th category label corresponding to $g_2$, and $sim(B_{1i}, B_{2j})$ denotes the similarity of the category label pair $(B_{1i}, B_{2j})$;
the similarity between category labels $B_i$ and $B_j$ can be expressed as:
Figure FDA0004119407380000022
where $B_i$ and $B_j$ denote the nodes of the two category labels in the concept lattice structure; LCS denotes the closest common parent node of $B_i$ and $B_j$; $dep(LCS)$ denotes the depth at which the LCS is located; and $dis(B_i, B_j)$ denotes the path length.
4. The concept lattice recommendation method based on user record feedback and search content according to claim 1, wherein the user record feedback comprises the user search state and the user history behavior; the user search state includes the numbers of searches, selections, collections, and browsing events; the user history behavior comprises the user's historical item comments and scores; the user characteristics include a search state value;
users are classified into groups according to their behavior on the search recommendation page: if a user's search count exceeds that of 70% of users, the user is defined as a frequent user; if it exceeds that of 40% of users, as a general user; otherwise, as an infrequent user;
if the user performs 2 or more kinds of interaction with the recommended page content, the user is considered satisfied with the recommended content, and the score m is 5; if the user performs 1 kind of interaction, the user's satisfaction is considered average, and the score m lies in [2,4]; if the user performs no interaction, the user is considered dissatisfied, and the score m is less than 2; the interaction behaviors comprise: clicking; a dwell time longer than 30 seconds; and collecting, selecting, or copying page content.
5. The concept lattice recommendation method based on user record feedback and search content according to claim 1 or 4, wherein the grouping reduction comprises: $A_T(g)$ denotes the attributes that object g has in formal background T, and $O_T(m)$ denotes the objects that have attribute m in formal background T;
if $(g, m') \in T$, then $m' \in A_T(g)$; if $(g', m) \in T$, then $g' \in O_T(m)$; for a set of objects G and a set of attributes M, $A_T(G)$ is defined as
Figure FDA0004119407380000023
and $O_T(M)$ is defined as
Figure FDA0004119407380000024
when there is no ambiguity, the subscript T is omitted; furthermore, double-subscript letters such as $x_{ij}$ denote the incidence of the binary relation $(g_i, m_j)$ in the corresponding formal background; given a formal background (G, M, I) with n objects in G and m attributes in M, a particular grouping-based reduction divides the object set G into k disjoint groups $S^{(1)}, S^{(2)}, S^{(3)}, ..., S^{(k)}$, where $S^{(1)} \cup S^{(2)} \cup S^{(3)} \cup ... \cup S^{(k)} = G$; objects from the same group are considered similar, and after reduction all $g \in S^{(i)}$ are replaced with a single object $r_i$; the reduced background fidelity is defined as:
Figure FDA0004119407380000031
solving the grouping-based concept lattice reduction by means of integer linear programming, where the integer linear programming model is defined as follows:
maximize $c^T x$
subject to $Ax \le b$, $x \ge 0$ and $x \in \mathbb{Z}^n$
where x is the vector of variables to be determined, $\mathbb{Z}$ denotes the integers, and A, b, and c are the coefficient matrix and vectors of the model;
the global objective of the integer linear programming model is to maximize T, and variables are allocated according to the following rule: for an incidence $x_{ij} = B$ of object $g_i$, if there is another object $g_k$ satisfying:
$|(A(g_i) - A(g_k)) \cup (A(g_k) - A(g_i))| \le \varepsilon_r$
where $\varepsilon_r$ is a set threshold, and $x_{kj} = 1 - B$, then $x_{ij}$ is marked as changeable, denoted $c_{ij}$, and a variable $v_{ij}$ is assigned to indicate whether $x_{ij}$ is modified; B is set to true for the separation rule and false for the connection rule;
the basic constraint conditions of the integer linear programming model are as follows:
Figure FDA0004119407380000032
Figure FDA0004119407380000033
in addition, the following constraints need to be added to maintain formal background fidelity:
Figure FDA0004119407380000034
where $\varepsilon_m$ is a set threshold.
6. The concept lattice recommendation method based on user record feedback and search content according to claim 5, wherein constructing the recommendation scoring matrix of user items comprises: converting the recommendation scoring matrix into a binary scoring matrix, in which scores of 4-5 are defined as recommended and set to 1, and all other scores are defined as not recommended and set to 0; the matrix is regarded as a formal background and denoted C(U, I, S), where U, I, and S denote the user set, the item set, and the relationship between users and items, respectively; after the user interest formal background is obtained, a concept lattice structure model is built according to the binary relation between objects and attributes in the user interest formal background C.
7. The concept lattice recommendation method based on user record feedback and search content according to claim 6, wherein the prediction rating comprises: the membership degree of a given comment r in the i-th score-based fuzzy set is denoted $\mu_i(r)$ and is calculated by the following simple heuristic:
Figure FDA0004119407380000041
where the actual rating of r denotes the actual rating score attached to comment r in the dataset;
the k-th fuzzy sets belonging to a user u and an item i are denoted $f_k(u)$ and $f_k(i)$, with $1 \le k \le 5$; each comment is converted into a semantic vector, so each $f_k(u)$ and $f_k(i)$ has a corresponding set of vectors; averaging the vectors contained in each $f_k$ yields a single vector, which finally gives 5 vectors for user u and 5 vectors for item i, each corresponding to one of the possible rating values and denoted $V_k(u)$ and $V_k(i)$; $V_k(u)$ and $V_k(i)$ are calculated as follows:
$V_k(u) = Avg(\mu_k(r, u) \times V_k(r, u))$
$V_k(i) = Avg(\mu_k(r, i) \times V_k(r, i))$
where $V_k(r, u)$ is the semantic vector of the r-th comment of user u that belongs to $f_k(u)$, and $\mu_k(r, u)$ is its membership degree in $f_k(u)$;
the vectors of user u and item i are compared for each corresponding rating value, and the rating value that yields the greatest similarity is taken as the predicted rating for the pair u and i:
Figure FDA0004119407380000042
where $V_k(u)$ and $V_k(i)$ are the vectors of user u and item i, respectively, that correspond to rating score k, and
Figure FDA0004119407380000043
is the predicted score of u for i.
8. A concept lattice recommendation system based on user record feedback and search content, comprising:
the text content concept lattice construction module is used for constructing concept lattices of related text content according to search information input by a user;
the screening module is used for constructing a formal background of hidden keywords based on the search content input by the user, and screening the list keywords according to the similarity among the hidden keywords;
the user concept lattice construction module is used for constructing a concept lattice according to user record feedback and user characteristics, and performing grouping reduction to construct a user-item recommendation scoring matrix;
the prediction recommendation module is used for performing prediction rating on the recommended items obtained from the screened list keywords and on the user comments, substituting the ratings into the recommendation scoring matrix, and predicting the target user's scores for unrated items, thereby completing the recommendation.
9. A computing device, comprising:
a memory and a processor;
the memory is configured to store computer executable instructions, and the processor is configured to execute the computer executable instructions, which when executed by the processor, implement the steps of the concept lattice recommendation method based on user record feedback and search content according to any one of claims 1 to 7.
10. A computer readable storage medium storing computer executable instructions which when executed by a processor implement the steps of the user record feedback and search content based concept lattice recommendation method of any one of claims 1 to 7.
CN202310228763.0A 2023-03-10 2023-03-10 Concept lattice recommendation method and device based on user record feedback and search content Pending CN116204721A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310228763.0A CN116204721A (en) 2023-03-10 2023-03-10 Concept lattice recommendation method and device based on user record feedback and search content

Publications (1)

Publication Number Publication Date
CN116204721A true CN116204721A (en) 2023-06-02

Family

ID=86509380

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310228763.0A Pending CN116204721A (en) 2023-03-10 2023-03-10 Concept lattice recommendation method and device based on user record feedback and search content

Country Status (1)

Country Link
CN (1) CN116204721A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116521159A (en) * 2023-07-05 2023-08-01 中国科学院文献情报中心 Knowledge service platform zero code construction method and system based on scene driving
CN116521159B (en) * 2023-07-05 2023-09-01 中国科学院文献情报中心 Knowledge service platform zero code construction method and system based on scene driving
CN116756431A (en) * 2023-08-14 2023-09-15 西南石油大学 Information or article recommendation method based on approximate concepts under incomplete form background
CN116756431B (en) * 2023-08-14 2023-10-31 西南石油大学 Information or article recommendation method based on approximate concepts under incomplete form background
CN117708437A (en) * 2024-02-05 2024-03-15 四川日报网络传媒发展有限公司 Recommendation method and device for personalized content, electronic equipment and storage medium
CN117708437B (en) * 2024-02-05 2024-04-16 四川日报网络传媒发展有限公司 Recommendation method and device for personalized content, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination