Detailed Description
Features and exemplary embodiments of various aspects of the present invention are described in detail below. To make the objects, technical solutions and advantages of the present invention more apparent, the invention is described in further detail with reference to the accompanying drawings and the detailed embodiments. It should be understood that the specific embodiments described herein are merely intended to illustrate the invention and not to limit it. It will be apparent to one skilled in the art that the present invention may be practiced without some of these specific details. The following description of the embodiments is merely intended to provide a better understanding of the invention by showing examples of the invention.
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising … …" does not exclude the presence of other like elements in a process, method, article or apparatus that comprises the element.
As the concept of smart cities becomes increasingly popular and information overload grows, more and more attention is being paid to how to find, quickly and accurately, the objects a user likes. A recommendation algorithm can recommend commodities, stocks, news, film and television works and the like to the user, greatly improving the user's consumption experience. However, current recommendations to users still suffer from low accuracy.
Sparse linear methods have developed through several stages. To better describe the sparse linear algorithm involved in the present invention, several derivative types of the sparse linear algorithm are described in turn below.
First, the Top-N recommendation system is introduced. Algorithms for Top-N recommendation generally fall into two main categories: neighbor-based (similarity-based) and model-based. Neighbor-based algorithms calculate the similarity between users or between items using various distance measures and make recommendations according to the k users (or items) most similar to a given user (or item); they use only user behavior data, are easy to implement and run very fast. Model-based algorithms generally perform matrix factorization on the user behavior data matrix, or use a model to learn latent variables of users and items, and multiply the learned user matrix by the learned item matrix to predict the result. Neighbor-based algorithms are fast to compute, but their recommendation quality is unsatisfactory; model-based algorithms give better recommendations but require longer training time.
To address these problems of recommendation systems, the SLIM algorithm was developed. The sparse linear method SLIM combines the advantages of both types of collaborative filtering algorithm (matrix factorization and learning based on item similarity), offering both quality and efficiency: it computes an item-item aggregation coefficient matrix W by minimizing a loss function and then generates the Top-N recommendation candidate set. It improves on the recommendation quality of neighbor-based algorithms and on the running time of model-based algorithms.
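As an illustration of the neighbor-based category described above, the following is a minimal sketch of item-based k-nearest-neighbor scoring. It assumes cosine similarity as the distance measure and a dense user-item matrix; the function names `item_knn_scores` and `top_n` are hypothetical and do not come from the embodiments.

```python
import numpy as np

def item_knn_scores(R, k=2):
    """Score every item for every user with item-based k-nearest-neighbour CF.

    R is a (users x items) array of user behavior data (assumed dense here).
    """
    R = np.asarray(R, dtype=float)
    norms = np.linalg.norm(R, axis=0)
    norms[norms == 0] = 1.0                      # avoid division by zero for empty items
    sim = (R.T @ R) / np.outer(norms, norms)     # cosine similarity between item columns
    np.fill_diagonal(sim, 0.0)                   # an item is not its own neighbour
    # In each column, zero out everything except the k largest similarities.
    drop = np.argsort(sim, axis=0)[:-k, :]
    np.put_along_axis(sim, drop, 0.0, axis=0)
    return R @ sim

def top_n(scores, R, n=1):
    """Return the indices of the n best-scoring items each user has not interacted with."""
    masked = np.where(np.asarray(R) > 0, -np.inf, scores)
    return np.argsort(-masked, axis=1)[:, :n]
```

Only the behavior matrix is needed and the work is a few matrix products, which matches the speed advantage noted above; nothing is learned, which is one reason the recommendation quality can lag behind model-based methods.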
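The SLIM idea of learning a sparse, non-negative item-item matrix W with a zero diagonal can be sketched as follows. This is an illustrative sketch only: it uses the commonly cited SLIM objective (squared reconstruction error plus L2 and L1 penalties) and solves it by proximal gradient descent rather than the coordinate descent usually used in practice. The function name `slim_fit` and the hyperparameter values are assumptions.

```python
import numpy as np

def slim_fit(R, l1=0.01, l2=0.1, n_iter=500):
    """Learn W minimizing 0.5*||R - R W||_F^2 + 0.5*l2*||W||_F^2 + l1*||W||_1
    subject to W >= 0 and diag(W) = 0, by proximal gradient descent."""
    R = np.asarray(R, dtype=float)
    n_items = R.shape[1]
    G = R.T @ R                                     # item-item Gram matrix
    step = 1.0 / (np.linalg.eigvalsh(G)[-1] + l2)   # 1 / Lipschitz constant of the smooth part
    W = np.zeros((n_items, n_items))
    for _ in range(n_iter):
        grad = G @ W - G + l2 * W                   # gradient of the squared-error and L2 terms
        W = W - step * grad
        W = np.maximum(W - step * l1, 0.0)          # soft-threshold for L1 and keep W >= 0
        np.fill_diagonal(W, 0.0)                    # an item may not explain itself
    return W
```

Predicted scores are then `R @ W`, and the Top-N candidates for a user are the unseen items with the largest predicted scores.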
The SSLIM algorithm combines the SLIM algorithm with auxiliary information, using four combination methods and four text representation algorithms, so that recommendation quality improves. NSVD is combined with SLIM: when the object-object similarity is calculated, a matrix factorization method decomposes it into two low-dimensional latent factor matrices, which achieves good results at the cost of running time. The HOSLIM algorithm introduces high-order (High-Order) item information, learning a sparse item-item aggregation coefficient matrix S and an item-to-high-relevance-itemset matrix S'; the results show that the higher the relevance of the item sets, the better the recommendation effect. The GLSLIM algorithm divides users into several subsets of more closely related users, makes decisions from both a global and a local perspective, and optimizes the weighting between the two by minimizing the loss function.
To address the problems of data sparsity and model fitting, the present invention proposes an improvement based on the GLSLIM algorithm.
Fig. 1 is a flow chart of a method for determining a target object according to an embodiment of the present invention. As shown in fig. 1, the execution subject of the method is a server, and the method may include S101-S104, specifically as follows:
S101, acquiring evaluation information of a user on a first target object, wherein the evaluation information comprises comment text information and scoring information.
For example, the scoring information may be used to represent a user's preference for the first target object, the comment text information may include text information having an emotional color, and the comment text information may further include information embodying the user's preference and the item attribute.
S102, determining an evaluation matrix of the first target object according to the evaluation information.
Optionally, in one embodiment, the evaluation matrix includes a word vector matrix and a scoring matrix, and determining the evaluation matrix of the first target object according to the evaluation information includes: determining a word vector matrix of the first target object according to the comment text information; and determining a scoring matrix of the first target object according to the scoring information.
On the one hand, according to the scoring information of the first target object from the user, a first user-merchant scoring matrix, namely a scoring matrix of the first target object, is established; on the other hand, according to comment text information of the user on the first target object, a word vector matrix of comment text corresponding to the merchant, namely, a word vector matrix of the first target object is established.
Here, the evaluation matrix is determined from both the comment text information and the scoring information, so that the data sparsity problem is addressed and the recommendation effect is improved.
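The construction of the two matrices can be sketched as follows. This is a minimal sketch under stated assumptions: evaluations arrive as (user, merchant, score, comment text) tuples, and the word vector matrix is represented by simple word counts per merchant, since the embodiment does not fix a particular text representation. The function name `build_evaluation_matrices` is hypothetical.

```python
import numpy as np

def build_evaluation_matrices(evaluations):
    """Build a user-merchant scoring matrix R and a word-merchant count matrix D.

    evaluations: iterable of (user, merchant, score, comment_text) tuples (assumed format).
    """
    evaluations = list(evaluations)
    users = sorted({e[0] for e in evaluations})
    merchants = sorted({e[1] for e in evaluations})
    vocab = sorted({w for e in evaluations for w in e[3].lower().split()})
    u_idx = {u: i for i, u in enumerate(users)}
    m_idx = {m: j for j, m in enumerate(merchants)}
    w_idx = {w: k for k, w in enumerate(vocab)}

    R = np.zeros((len(users), len(merchants)))   # scoring matrix
    D = np.zeros((len(vocab), len(merchants)))   # word counts of all comments on each merchant
    for user, merchant, score, text in evaluations:
        R[u_idx[user], m_idx[merchant]] = score
        for word in text.lower().split():
            D[w_idx[word], m_idx[merchant]] += 1
    return R, D, users, merchants, vocab
```

Because every comment on a merchant contributes to that merchant's column of D, merchants with few scores can still be compared through their comment text, which is how the text helps with sparsity.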
The step of determining the word vector matrix of the first target object according to the comment text information may specifically include:
preprocessing the comment text information, and determining emotion word samples and attribute word samples, wherein the emotion word samples comprise emotion word samples of positive emotion types and emotion word samples of negative emotion types; and determining a word vector matrix of the first target object according to the emotion word samples and the attribute word samples.
Illustratively, the emotion word samples of the positive emotion type may include "happy", "beautiful", "like", etc.; the emotion word samples of the negative emotion type may include "unpalatable", "bad", "garbage", etc. The attribute word samples may include "skirt", "hot pot", "rock", etc.
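The preprocessing step can be illustrated with a small lexicon lookup. This is a sketch only: the three word sets below are hypothetical lexicons seeded with some of the example words above, and the tokenizer is a simple regular expression; a real system would likely use a proper word segmenter and larger dictionaries.

```python
import re

# Hypothetical lexicons for illustration; not taken from the embodiments.
POSITIVE = {"happy", "beautiful", "like"}
NEGATIVE = {"bad", "garbage"}
ATTRIBUTES = {"skirt", "rock"}

def extract_samples(comment_text):
    """Split a comment into positive emotion words, negative emotion words and attribute words."""
    tokens = re.findall(r"[a-z']+", comment_text.lower())
    return {
        "positive": [t for t in tokens if t in POSITIVE],
        "negative": [t for t in tokens if t in NEGATIVE],
        "attribute": [t for t in tokens if t in ATTRIBUTES],
    }
```

The extracted samples would then feed the word vector matrix of the first target object.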
S103, according to the linear sparse model and the evaluation matrix, determining the similarity between the target objects in a preset database, wherein the preset database comprises a first target object.
It will be appreciated that a plurality of target objects, including the first target object, are pre-stored in the preset database. The similarity between any two target objects in the preset database is determined according to the linear sparse model and the evaluation matrix of the first target object.
Optionally, in one embodiment, determining the similarity between the target objects in the preset database according to the linear sparse model and the evaluation matrix includes:
first, a first similarity aggregation coefficient matrix and a second similarity aggregation coefficient matrix are determined from the word vector matrix and the scoring matrix according to the linear sparse model.
The step of determining the first similarity aggregation coefficient matrix and the second similarity aggregation coefficient matrix from the word vector matrix and the scoring matrix according to the linear sparse model may specifically include:
determining a third similarity aggregation coefficient matrix and a fourth similarity aggregation coefficient matrix according to the word vector matrix and the scoring matrix; and iteratively updating the third similarity aggregation coefficient matrix and the fourth similarity aggregation coefficient matrix until a preset condition is met, so as to obtain a first similarity aggregation coefficient matrix and a second similarity aggregation coefficient matrix.
Specifically, a global merchant-merchant similarity aggregation coefficient matrix (i.e., the third similarity aggregation coefficient matrix) and a local merchant-merchant similarity aggregation coefficient matrix (i.e., the fourth similarity aggregation coefficient matrix) are predicted from the word vector matrix and the scoring matrix according to the linear sparse model. The users are initially clustered with the CLUTO tool (initial P groups) to form K user sets; this clustering is essentially a method of obtaining the similarity between two users. The third similarity aggregation coefficient matrix and the fourth similarity aggregation coefficient matrix are obtained by solving a minimization problem whose objective consists of the six parts described below:
subject to $s_i \ge 0$, $s_{ii} = 0$
In the above formula, the first part involves the true score $r_i$ and the predicted score, and the model is optimized by continuously reducing the difference between the two values; the second part and the third part are regularization constraints for feature selection on the global object similarity matrix S; the fourth part is a constraint based on the comment-text word vector matrix D; and the fifth part and the sixth part are regularization constraints for feature selection on the local object similarity matrix. The parameters are used to adjust the proportion of each part.
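The objective function itself does not survive in this text. One form consistent with the six parts described above is sketched below. It is an assumption modeled on the published GLSLIM objective, with an added SSLIM-style text term; it is not the formula of the embodiment. The symbols $g_u$ (per-user global weight), $p_u$ (the user set of user $u$), $d_i$ (the column of D for object $i$) and the parameters $\beta_g, \lambda_g, \alpha, \beta_l, \lambda_l$ are introduced here for illustration, and which of the two regularization terms in each pair is the L1 or the L2 norm is also assumed.

```latex
\begin{aligned}
\min_{s_i,\; \{s_i^{p_u}\}} \quad
  & \tfrac{1}{2} \sum_{u} \Bigl( r_{ui} - g_u\, r_u^{\top} s_i - (1 - g_u)\, r_u^{\top} s_i^{p_u} \Bigr)^{2}
    + \tfrac{\beta_g}{2}\, \lVert s_i \rVert_2^2 + \lambda_g\, \lVert s_i \rVert_1 \\
  & + \tfrac{\alpha}{2}\, \lVert d_i - D\, s_i \rVert_2^2
    + \sum_{p} \Bigl( \tfrac{\beta_l}{2}\, \lVert s_i^{p} \rVert_2^2 + \lambda_l\, \lVert s_i^{p} \rVert_1 \Bigr) \\
\text{subject to} \quad
  & s_i \ge 0, \qquad s_{ii} = 0
\end{aligned}
```

In this reading, the text term asks the same coefficients $s_i$ that reconstruct the scores of object $i$ to also reconstruct its comment-text vector from those of the other objects, so that objects with few scores are still tied to similar objects through their comments.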
The third similarity aggregation coefficient matrix and the fourth similarity aggregation coefficient matrix are iteratively updated N times, where N is a positive integer, to obtain the updated aggregation coefficient matrices, namely the first similarity aggregation coefficient matrix and the second similarity aggregation coefficient matrix.
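The iterative update can be sketched as a generic loop. This is a minimal sketch: the `update` callable stands in for one round of re-estimating the global and local coefficient matrices, and the preset condition is assumed to be either a maximum number of iterations or a change smaller than a tolerance. The names `iterate_until`, `max_iter` and `tol` are hypothetical.

```python
import numpy as np

def iterate_until(update, S_global, S_local, max_iter=50, tol=1e-4):
    """Repeat `update` until the coefficients stop changing or max_iter is reached.

    update: callable taking (S_global, S_local) and returning the updated pair.
    Returns the final matrices and the number of iterations N actually performed.
    """
    n_done = 0
    for n_done in range(1, max_iter + 1):
        new_global, new_local = update(S_global, S_local)
        # Largest absolute change in any coefficient during this round.
        change = max(np.abs(new_global - S_global).max(),
                     np.abs(new_local - S_local).max())
        S_global, S_local = new_global, new_local
        if change < tol:          # assumed preset condition: coefficients have converged
            break
    return S_global, S_local, n_done
```

Other stopping rules, such as the fraction of users that switch user set between rounds, could replace the tolerance test without changing the structure of the loop.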
Then, the similarity between target objects in the preset database is determined according to the first similarity aggregation coefficient matrix and the second similarity aggregation coefficient matrix.
Specifically, the column vectors of the updated global-model aggregation coefficient matrix are multiplied by the weight $g_u$, and the column vectors of the updated local-model aggregation coefficient matrix are multiplied by the weight $1 - g_u$, so that the similarity between the first target object and the target objects in the preset database is determined according to the first similarity aggregation coefficient matrix and the second similarity aggregation coefficient matrix.
Illustratively, the user's predicted score for each object is calculated from the weighted aggregation coefficient matrices. The higher the similarity between the first target object and a target object, the higher the predicted score of that target object. The predicted scores are arranged in descending order to obtain the recommendation ranking of the target objects for each user.
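The weighting and ranking described above can be sketched as follows. The prediction formula is not given in this text, so the sketch assumes the GLSLIM-style rule in which a user's score vector is multiplied by the blend of the global matrix and the local matrix of that user's set; the function names `predict_scores` and `rank_items` are hypothetical.

```python
import numpy as np

def predict_scores(r_u, S_global, S_local, g_u):
    """Blend global and local coefficients with weights g_u and 1 - g_u, then score all objects.

    r_u: the user's known scores (1-D array); S_local: the matrix of the user's own user set.
    """
    S_u = g_u * S_global + (1.0 - g_u) * S_local   # weighted combination of the two models
    return r_u @ S_u                                # predicted score for every object

def rank_items(scores, r_u):
    """Objects the user has not scored yet, ordered from highest to lowest predicted score."""
    order = np.argsort(-scores, kind="stable")
    return [int(j) for j in order if r_u[j] == 0]
```

A weight close to 1 means the user is served mainly by the global model; a weight close to 0 means the user follows the behavior of the user set more closely.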
S104, determining a second target object from a preset database according to the similarity.
After the similarity between the target objects in the preset database is determined, a second target object meeting a preset screening condition can be determined from the preset database according to the similarity. The preset screening condition may be, for example, that the object has a specific feature.
Optionally, in an embodiment, the second target object is determined from the preset database according to a preset recommended number and the similarity.
The second target objects with the preset recommended number can be selected from a plurality of objects included in the preset database according to the sequence of the similarity from large to small.
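Selecting a preset recommended number of objects in descending order of similarity can be sketched as follows. It assumes the similarities to the first target object are already available as a mapping from object identifier to similarity value; the function name `select_second_targets` is hypothetical.

```python
import heapq

def select_second_targets(similarity, first_target, n):
    """Pick the n objects most similar to the first target object.

    similarity: dict mapping object id -> similarity to the first target (assumed input format).
    """
    candidates = ((obj, s) for obj, s in similarity.items() if obj != first_target)
    best = heapq.nlargest(n, candidates, key=lambda pair: pair[1])
    return [obj for obj, _ in best]
```

For example, with similarities `{"m1": 1.0, "m2": 0.3, "m3": 0.8, "m4": 0.5}`, first target `"m1"` and a preset recommended number of 2, the call returns `["m3", "m4"]`.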
Illustratively, as shown in fig. 2, the horizontal axis represents the number of target objects recommended to the user, and the vertical axis represents recommendation accuracy, which is an index for evaluating object recommendation. The cases where the number of target objects is 5, 10, 15 and 20 are represented by four groups of bars in the figure. Each group includes three bars, respectively representing the recommendation accuracy of the target objects determined based on HFT, based on GLSLIM, and based on the target object determining method of formula (1) provided by the embodiment of the invention.
As is clear from fig. 2, the recommendation accuracy based on GLSLIM is greater than the recommendation accuracy based on HFT, and the recommendation accuracy based on the method for determining a target object provided by the embodiment of the present invention is greater than the recommendation accuracy based on GLSLIM. The gap among the three gradually widens as the number of recommended target objects increases. Therefore, in the process of generating the object-object similarity matrix, the similarity of the comment texts written by all users about a given object is taken into account: by analogy with the linear relation between unscored objects and scored objects in the traditional similarity model, the comment text word vector matrix is constrained in the same form, so that the data sparsity problem is addressed and the recommendation effect is improved.
As an implementation manner of the present invention, in order for the user to acquire the required information in time, after S104, the following steps may further be included:
and sending the second target object to the user equipment.
The determined second target object is pushed to the user equipment, so that the user can obtain target objects that may be of interest. The target object herein may include a merchant and/or an item.
According to the method for determining a target object provided by the embodiment of the invention, the similarity between the object to be recommended and the first target object is determined according to the linear sparse model and the user's evaluation information on the first target object, and the second target object to be recommended to the user is determined according to the similarity, so that the data sparsity problem is alleviated and the accuracy of information recommendation is improved.
In addition, based on the above method for determining the target object, the embodiment of the invention further provides a device for determining the target object, which is specifically described in detail with reference to fig. 3.
Fig. 3 is a schematic structural diagram of a target object determining apparatus according to an embodiment of the present invention, and as shown in fig. 3, the apparatus 300 may include:
the obtaining module 310 is configured to obtain evaluation information of the user on the first target object, where the evaluation information includes comment text information and scoring information.
The first determining module 320 is configured to determine an evaluation matrix of the first target object according to the evaluation information.
The second determining module 330 is configured to determine, according to the linear sparse model and the evaluation matrix, a similarity between the target objects in a preset database, where the preset database includes the first target object.
The third determining module 340 is configured to determine the second target object from the preset database according to the similarity.
As one example, the first determining module 320 is specifically configured to determine a word vector matrix of the first target object according to the comment text information; and determining a scoring matrix of the first target object according to the scoring information.
As an example, the second determining module 330 is specifically configured to determine the first similarity aggregation coefficient matrix and the second similarity aggregation coefficient matrix from the word vector matrix and the scoring matrix according to the linear sparse model; and determine the similarity between target objects in the preset database according to the first similarity aggregation coefficient matrix and the second similarity aggregation coefficient matrix.
As an example, the second determining module 330 is specifically configured to determine a third similarity aggregate coefficient matrix and a fourth similarity aggregate coefficient matrix according to the word vector matrix and the scoring matrix; and iteratively updating the third similarity aggregation coefficient matrix and the fourth similarity aggregation coefficient matrix until a preset condition is met, so as to obtain a first similarity aggregation coefficient matrix and a second similarity aggregation coefficient matrix.
As an example, the third determining module 340 is specifically configured to determine the second target object from the preset database according to a preset recommended number and the similarity.
The third determining module 340 is further configured to send the second target object to the user equipment.
The modules of the target object determining apparatus provided in this embodiment may implement the method in the example shown in fig. 1, and are not described herein for brevity.
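The division of the apparatus into modules can be sketched as a small class that chains four callables. This is an illustrative skeleton only: the class name `TargetObjectDeterminer`, its field names and the callable signatures are assumptions, and the real modules may be implemented in hardware, software or both.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

@dataclass
class TargetObjectDeterminer:
    """Hypothetical software analogue of apparatus 300; each field plays the role of one module."""
    obtain: Callable[[str], Any]                              # obtaining module 310
    build_matrix: Callable[[Any], Any]                        # first determining module 320
    compute_similarity: Callable[[Any], Dict[str, float]]     # second determining module 330
    select: Callable[[Dict[str, float]], List[str]]           # third determining module 340

    def run(self, user_id: str) -> List[str]:
        evaluation = self.obtain(user_id)                 # S101: evaluation information
        matrix = self.build_matrix(evaluation)            # S102: evaluation matrix
        similarity = self.compute_similarity(matrix)      # S103: similarity between objects
        return self.select(similarity)                    # S104: second target objects
```

Keeping each module behind a plain callable makes it straightforward to swap in a different similarity model or selection rule without touching the rest of the pipeline.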
According to the target object determining device provided by the embodiment of the invention, the similarity between the object to be recommended and the first target object is determined according to the linear sparse model and the user's evaluation information on the first target object, and the second target object to be recommended to the user is determined according to the similarity, so that the data sparsity problem is alleviated and the accuracy of information recommendation is improved.
Fig. 4 is a schematic diagram of a hardware architecture according to an embodiment of the present invention.
The processing device may include a processor 401 and a memory 402 in which computer program instructions are stored.
The processor 401 may include a central processing unit (Central Processing Unit, CPU), or an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), or may be configured as one or more integrated circuits implementing embodiments of the present invention.
Memory 402 may include mass storage for data or instructions. By way of example, and not limitation, memory 402 may comprise a hard disk drive (HDD), floppy disk drive, flash memory, optical disk, magneto-optical disk, magnetic tape, or universal serial bus (Universal Serial Bus, USB) drive, or a combination of two or more of the foregoing. Memory 402 may include removable or non-removable (or fixed) media, where appropriate. Memory 402 may be internal or external to the processing device, where appropriate. In a particular embodiment, the memory 402 is a non-volatile solid state memory. In a particular embodiment, the memory 402 includes read-only memory (ROM). The ROM may be mask-programmed ROM, programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), electrically alterable ROM (EAROM), or flash memory, or a combination of two or more of these, where appropriate.
The processor 401 implements the method of determining the target object in the example shown in fig. 1 described above by reading and executing the computer program instructions stored in the memory 402.
In one example, the processing device may also include a communication interface 403 and a bus 410. As shown in fig. 4, the processor 401, the memory 402, and the communication interface 403 are connected by the bus 410 and communicate with each other.
The communication interface 403 is mainly used to implement communication between each module, device, unit and/or apparatus in the embodiment of the present invention.
Bus 410 includes hardware, software, or both, coupling the components of the device to one another. By way of example, and not limitation, the bus may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association local bus (VLB), or other suitable bus, or a combination of two or more of the above. Bus 410 may include one or more buses, where appropriate. Although embodiments of the invention have been described and illustrated with respect to a particular bus, the invention contemplates any suitable bus or interconnect.
The processing device may perform the method of the embodiments of the invention, thereby implementing the method described in connection with the example shown in fig. 1.
In addition, in conjunction with the method in the above embodiments, embodiments of the present invention may be implemented by providing a computer storage medium. The computer storage medium has computer program instructions stored thereon; the computer program instructions, when executed by a processor, implement any of the methods of the above embodiments.
It should be understood that the invention is not limited to the particular arrangements and instrumentality described above and shown in the drawings. For the sake of brevity, a detailed description of known methods is omitted here. In the above embodiments, several specific steps are described and shown as examples. However, the method processes of the present invention are not limited to the specific steps described and shown, and those skilled in the art can make various changes, modifications and additions, or change the order between steps, after appreciating the spirit of the present invention.
Functional blocks shown in the above-described structural block diagrams may be implemented in software, in which case the elements of the present invention are the programs or code segments used to perform the required tasks. The programs or code segments may be stored in a machine-readable medium or transmitted over transmission media or communication links by a data signal carried in a carrier wave. A "machine-readable medium" may include any medium that can store or transfer information. Examples of machine-readable media include electronic circuits, semiconductor memory devices, ROM, flash memory, erasable ROM (EROM), floppy disks, CD-ROMs, optical disks, hard disks, fiber optic media, radio frequency (RF) links, and the like. The code segments may be downloaded via computer networks such as the internet, intranets, etc.
It should also be noted that the exemplary embodiments mentioned in this disclosure describe some methods or systems based on a series of steps or devices. However, the present invention is not limited to the order of the above-described steps, that is, the steps may be performed in the order mentioned in the embodiments, or may be performed in a different order from the order in the embodiments, or several steps may be performed simultaneously.
In the foregoing, only the specific embodiments of the present invention are described, and it will be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the systems, modules and units described above may refer to the corresponding processes in the foregoing method embodiments, which are not repeated herein. It should be understood that the scope of the present invention is not limited thereto, and any equivalent modifications or substitutions can be easily made by those skilled in the art within the technical scope of the present invention, and they should be included in the scope of the present invention.