CN117973545A - Recommendation method, device, equipment and storage medium based on large language model - Google Patents
- Publication number
- CN117973545A (application CN202410370340.7A)
- Authority
- CN
- China
- Prior art keywords: recommendation, data, training, models, sub
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/04—Inference or reasoning models
- G06N5/041—Abduction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/096—Transfer learning
Abstract
The application relates to a recommendation method and apparatus based on a large language model, an electronic device, and a storage medium. The recommendation method based on the large language model comprises the following steps: acquiring target data; invoking a plurality of pre-trained recommendation sub-models and assigning attention weights to the adapter of each recommendation sub-model according to semantic understanding of the target data, the adapters being preconfigured in the corresponding recommendation sub-models; aggregating the recommendation sub-models according to the attention weights to determine a target recommendation model; and predicting the target data through the target recommendation model and determining, according to the prediction result, whether to recommend the target data to a user. Because the inference process only needs to be executed once, the application is more efficient and solves the problem of the long inference time of recommendation models in the related art.
Description
Technical Field
The present application relates to the field of data recommendation technologies, and in particular, to a recommendation method and apparatus based on a large language model, an electronic device, and a storage medium.
Background
Large language models (referred to as large models) exhibit excellent capabilities in text understanding and generation and can be rapidly aligned to specific downstream tasks. Recommendation systems, as a main channel of personalized recommendation, can use large models to understand users and items, which has promoted the recommendation paradigm based on large language models. At present, parameter-efficient fine-tuning with recommendation data is the standard approach for large-language-model recommendation. However, integrating recommendation data into a large language model increases the risk that personal data leaks through vulnerabilities of the language model. To protect the privacy of users, especially vulnerable groups, forgetting learning (machine unlearning) for large-language-model recommendation becomes crucial.
At present, forgetting learning for large-model recommendation remains unsolved; existing research only addresses forgetting learning for large models in general. Some methods modify the labels of the data to be forgotten and fine-tune on the modified labels to perform forgetting learning for the large model, or achieve approximate forgetting learning by reversing the labels of the forgotten data using in-context forgetting-learning techniques. Other work uses gradient ascent to eliminate the effect of the forgotten data on the trained model. All of these methods are approximate forgetting learning and cannot guarantee that the influence of the forgotten data on the model is completely eliminated, whereas large-model recommendation forgetting learning requires exact forgetting: a certain segment of the user's historical interaction information must be completely removed from the model.
Conventional exact forgetting learning typically adopts a retraining paradigm based on data partitioning: the training set is divided into several groups, a corresponding sub-model is trained for each group, and the outputs of the sub-models are finally aggregated. However, because of the characteristics of large models, the inference stage takes much longer than for traditional models, so computing the outputs of multiple sub-models incurs serious time overhead.
For the problem of the long inference time of recommendation models in the related art, no effective solution has yet been proposed.
Disclosure of Invention
In this embodiment, a recommendation method, apparatus, electronic device and storage medium based on a large language model are provided, so as to solve the problem of the long inference time of recommendation models in the related art.
In a first aspect, the present invention provides a recommendation method based on a large language model, including:
Acquiring target data;
invoking a plurality of pre-trained recommendation sub-models, and distributing attention weights to the adapters of each recommendation sub-model according to semantic understanding of the target data; the adapters are preconfigured in the corresponding recommendation sub-models;
Aggregating the recommendation sub-models according to the attention weight to determine a target recommendation model;
and predicting the target data through the target recommendation model, and determining whether to recommend the target data to a user according to a prediction result.
In some of these embodiments, the step of training the recommended sub-model comprises:
Acquiring a training set;
dividing the training set to determine a plurality of training data sets;
Acquiring a plurality of initial recommendation sub-models, and training the corresponding initial recommendation sub-models through a plurality of training data sets to obtain a plurality of recommendation sub-models; the number of initial recommended sub-models is the same as the number of training data sets.
In some of these embodiments, the acquiring the training set includes:
Acquiring sample data;
Deleting the data needing to be forgotten from the sample data, and determining a training set.
In some of these embodiments, partitioning the training set to determine a plurality of training data sets includes:
Obtaining a large language model, wherein the large language model is instructed to output a corresponding hidden vector according to the model input, and the hidden vector is used for representing the large language model's semantic understanding of the model input;
inputting each sample data in the training set to the large language model to obtain a hidden vector of each sample data;
Dividing the training set according to the hidden vector of the sample data in the training set, and determining a plurality of training data sets.
In some embodiments, the training set is divided according to hidden vectors of sample data in the training set, and after determining a plurality of training data sets, the method includes:
For each sample data in the training set, determining a distance pair formed by the sample data and a central hidden vector of each training data set; the distance pairs are used for representing the distance between the sample data and the central hidden vector of the corresponding training data set;
And traversing all the distance pairs of the sample data from small to large according to the distance, and if the sample data in the current distance pair is not included in any training data set and the number of the sample data in the training data set corresponding to the central hidden vector in the current distance pair is smaller than the preset number, including the sample data in the current distance pair in the corresponding training data set.
In some of these embodiments, assigning an attention weight to the adapter of each of the recommendation sub-models based on semantic understanding of the target data includes:
obtaining sample data, and determining a similar verification set from the sample data; the similarity verification set comprises a plurality of sample data which are closest to semantic understanding of the target data;
and distributing attention weights to the adapters of each recommendation sub-model according to the prediction errors of each recommendation sub-model to the similarity verification set.
In some of these embodiments, the adapter includes a first low rank matrix and a second low rank matrix;
Aggregating the recommended submodels according to the attention weight, including:
Respectively aggregating the first low-rank matrix and the second low-rank matrix of the adapter in all the recommended submodels according to the attention weight;
Or, aggregating products of the first low-rank matrix and the second low-rank matrix of the adapters in all the recommended submodels according to the attention weight.
In a second aspect, the present invention provides a recommendation device based on a large language model, including:
the acquisition module is used for acquiring target data;
The distribution module is used for calling a plurality of pre-trained recommendation sub-models and distributing attention weights to the adapters of each recommendation sub-model according to semantic understanding of the target data; the adapters are preconfigured in the corresponding recommendation sub-models;
The aggregation module is used for aggregating the recommendation sub-models according to the attention weight and determining a target recommendation model;
And the prediction module is used for predicting the target data through the target recommendation model and determining whether to recommend the target data to the user according to a prediction result.
In a third aspect, the present invention provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the recommendation method based on the large language model according to the first aspect.
In a fourth aspect, the present invention provides a storage medium having stored thereon a computer program which, when executed by a processor, implements the large language model based recommendation method described in the first aspect.
Compared with the related art, the present application provides a recommendation method based on a large language model in which the training set is partitioned and a plurality of recommendation sub-models are trained correspondingly; attention weights are then assigned to the adapters of the plurality of recommendation sub-models according to semantic understanding of the target data, so that the recommendation sub-models that are good at processing data of the same or a similar type as the target data receive higher attention weights. The recommendation sub-models are then aggregated according to the assigned attention weights to obtain the target recommendation model. On the basis of retaining the traditional forgetting-learning paradigm of grouping training data by semantic understanding and training sub-models per group, the method exploits the fact that the recommendation sub-models use parameter-efficient adapters: the plurality of recommendation sub-models are aggregated into one target recommendation model with different weights, and inference on the target data is executed through the target recommendation model only once.
The details of one or more embodiments of the application are set forth in the accompanying drawings and the description below to provide a more thorough understanding of the other features, objects, and advantages of the application.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute a limitation on the application. In the drawings:
FIG. 1 is a block diagram of a terminal hardware architecture for performing a large language model based recommendation method provided in the present invention;
FIG. 2 is a flow chart of a large language model based recommendation method of the present invention;
Fig. 3 is a block diagram showing the structure of the recommendation device based on a large language model in accordance with the present invention.
Detailed Description
The present application will be described and illustrated with reference to the accompanying drawings and examples for a clearer understanding of the objects, technical solutions and advantages of the present application.
Unless defined otherwise, technical or scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terms "a," "an," "the," "these" and similar terms in this application are not intended to be limiting in number, but may be singular or plural. The terms "comprising," "including," "having," and any variations thereof, as used herein, are intended to encompass non-exclusive inclusion; for example, a process, method, and system, article, or apparatus that comprises a list of steps or modules (units) is not limited to the list of steps or modules (units), but may include other steps or modules (units) not listed or inherent to such process, method, article, or apparatus. The terms "connected," "coupled," and the like in this disclosure are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. The term "plurality" as used herein means two or more. "and/or" describes an association relationship of an association object, meaning that there may be three relationships, e.g., "a and/or B" may mean: a exists alone, A and B exist together, and B exists alone. Typically, the character "/" indicates that the associated object is an "or" relationship. The terms "first," "second," "third," and the like, as referred to in this disclosure, merely distinguish similar objects and do not represent a particular ordering for objects.
The method embodiments provided in the present invention may be performed in a terminal, a computer or similar computing device. For example, the method is run on a terminal, and fig. 1 is a block diagram of a terminal hardware structure for executing a recommendation method based on a large language model provided in the present invention. As shown in fig. 1, the terminal may include one or more (only one is shown in fig. 1) processors 120 and a memory 140 for storing data, wherein the processors 120 may include, but are not limited to, a processing device such as a microprocessor MCU or a programmable logic device FPGA. The terminal may further include a transmission device 160 for a communication function and an input-output device 180. It will be appreciated by those skilled in the art that the structure shown in fig. 1 is merely illustrative and is not intended to limit the structure of the terminal. For example, the terminal may also include more or fewer components than shown in fig. 1, or have a different configuration than shown in fig. 1.
The memory 140 may be used to store computer programs, for example, software programs of application software and modules, such as those corresponding to the large language model-based recommendation method in the present invention, and the processor 120 performs various functional applications and data processing, i.e., implements the above-described method, by running the computer programs stored in the memory 140. Memory 140 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, memory 140 may further include memory located remotely from processor 120, which may be connected to the terminal via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 160 is used to receive or transmit data via a network. The network includes a wireless network provided by a communication provider of the terminal. In one example, the transmission device 160 includes a network adapter (Network Interface Controller, simply referred to as a NIC) that may be connected to other network devices via a base station to communicate with the internet. In one example, the transmission device 160 may be a Radio Frequency (RF) module for communicating with the internet wirelessly.
In the present invention, there is provided a large language model based recommendation method, and fig. 2 is a flowchart of the large language model based recommendation method of the present invention, as shown in fig. 2, the flowchart includes the steps of:
step S201, target data is acquired.
Step S202, a plurality of pre-trained recommendation sub-models are called, and attention weights are distributed to the adapters of each recommendation sub-model according to semantic understanding of target data; the adapters are preconfigured in the corresponding recommendation sub-model.
Step S203, aggregating the recommendation sub-models according to the attention weight to determine a target recommendation model.
And S204, predicting the target data through the target recommendation model, and determining whether to recommend the target data to the user according to the prediction result.
In the above method, target data for a user is acquired; the target data represents an item for which it needs to be predicted whether to recommend it to the user. Then, a plurality of pre-trained recommendation sub-models are invoked, which may be trained large language recommendation models, where different recommendation sub-models are each good at predicting different types of data (the types being partitioned based on semantic understanding of the data). According to semantic understanding of the target data, attention weights are assigned to the adapters of the plurality of recommendation sub-models, so that recommendation sub-models good at handling data of the same or a similar type as the target data are assigned higher attention weights. The recommendation sub-models are then aggregated according to the assigned attention weights to obtain a target recommendation model (aggregating the recommendation sub-models is in essence aggregating their adapters; the final target recommendation model is formed from the aggregated adapter), and the target recommendation model is likewise good at processing data of the same or a similar type as the target data. Finally, the target data is predicted through the target recommendation model, and whether to recommend the target data to the user is determined according to the prediction result. On the basis of retaining the traditional forgetting-learning paradigm of grouping training data by semantic understanding and training sub-models per group, the method exploits the fact that the recommendation sub-models use parameter-efficient adapters: the plurality of recommendation sub-models are aggregated into one target recommendation model with different weights, and inference on the target data is executed through the target recommendation model only once.
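As a rough sketch of the flow just described (with hypothetical names and toy stand-ins for the large model: adapters as plain matrices, semantic understanding as hidden vectors, and per-group center vectors for similarity scoring), the aggregate-then-infer-once idea might look like:

```python
import numpy as np

def aggregate_and_predict(target_vec, sub_adapters, center_vecs, tau=1.0):
    """Sketch: score each sub-model by how close its group center is to the
    target's hidden vector, softmax the scores into attention weights,
    aggregate the adapters into one, and run a single inference pass."""
    # Semantic affinity of the target to each sub-model's training group.
    dists = np.array([np.linalg.norm(target_vec - c) for c in center_vecs])
    # Lower distance -> higher attention weight (temperature-scaled softmax).
    logits = -dists / tau
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    # Aggregate the adapters with the attention weights into one target model.
    target_adapter = sum(w * a for w, a in zip(weights, sub_adapters))
    # A toy bilinear score stands in for the single LLM forward pass.
    score = float(target_vec @ target_adapter @ target_vec)
    return weights, target_adapter, score
```

The key property is that `target_adapter` is formed once, so only one forward pass is needed regardless of the number of sub-models.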
In some of these embodiments, the step of training the recommended sub-model includes: acquiring a training set; dividing the training set to determine a plurality of training data sets; acquiring a plurality of initial recommendation sub-models, and training the corresponding initial recommendation sub-models through a plurality of training data sets to obtain a plurality of recommendation sub-models; the number of initial recommended sub-models is the same as the number of training data sets.
Specifically, dividing the training set to determine a plurality of training data sets includes: obtaining a large language model, wherein the large language model is instructed to output a corresponding hidden vector according to the model input, the hidden vector representing the large language model's semantic understanding of the model input; inputting each sample data in the training set into the large language model to obtain the hidden vector of each sample data; and dividing the training set according to the hidden vectors of the sample data in the training set to determine a plurality of training data sets.
The specific way of dividing the training set according to the hidden vectors of the sample data may include: clustering the hidden vectors of the sample data in the training set with the K-means algorithm to determine a plurality of training data sets.
In this embodiment, the training set used for model training is classified so that different recommendation sub-models become more adept at handling different types of data. Specifically, semantic recognition can be performed on the sample data in the training set by a conventional large language model to obtain the hidden vector of each sample data, and the sample data in the training set is then divided according to the hidden vectors. A corresponding number of initial recommendation sub-models, which may be conventional large language recommendation models, is trained on the divided training data sets. A corresponding number of recommendation sub-models is finally obtained; each sub-model achieves higher accuracy when processing data of the same or a similar type as the training data set used in its own training.
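The clustering step can be sketched with a minimal K-means over the hidden vectors. The patent does not fix the distance metric or initialization, so Euclidean distance and random initialization are assumptions here:

```python
import numpy as np

def kmeans_partition(hidden_vecs, k, iters=20, seed=0):
    """Minimal K-means over sample hidden vectors: a sketch of the
    semantic partitioning step (metric and init are assumed)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(hidden_vecs, dtype=float)
    # Random initialization: pick k distinct samples as starting centers.
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        # Assign each hidden vector to its nearest center.
        d = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Recompute centers; keep the old center if a cluster went empty.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = x[labels == j].mean(axis=0)
    return labels, centers
```

Each resulting label set defines one training data set, and the returned centers double as the "center hidden vectors" used later for balancing and for weighting at inference time.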
Further, acquiring the training set includes: acquiring sample data; and determining the data that needs to be forgotten in the sample data, deleting it, and thereby determining the training set.
The collection of acquired sample data is denoted D, and each sample in D is denoted (x, y), where x denotes the input instruction during training fine-tuning, including the user's historical interaction information and a sample item, and y denotes the output during training fine-tuning, including the user's real preference for the sample item. When the training set is acquired, the sample data is obtained first, the sample data that needs to be forgotten is determined, and then that sample data is deleted to obtain the training set D_r = D − D_{−r}, where D_{−r} denotes the set of sample data to be forgotten. In this scheme, the sample data to be forgotten is deleted directly before the recommendation sub-models are trained, which achieves the effect of complete forgetting: the segment of the user's historical interaction information is entirely removed from the recommendation sub-models.
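The exact-forgetting construction is just a set difference performed before any training; a minimal sketch, with tuples standing in for (x, y) samples:

```python
def build_training_set(all_samples, forget_samples):
    """Exact forgetting by construction: remove the samples to be
    forgotten from D before any training, giving D_r = D - D_{-r}."""
    forget = set(forget_samples)
    # Preserve original order of the retained samples.
    return [s for s in all_samples if s not in forget]
```

Because the forgotten samples never enter training, no post-hoc approximate unlearning step is needed for them.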
Further, after dividing the training set according to the hidden vectors of the sample data and determining a plurality of training data sets, the method further includes: for each sample data in the training set, determining the distance pairs formed by the sample data and the center hidden vector of each training data set, the distance pairs representing the distance between the sample data and the center hidden vector of the corresponding training data set; and traversing all distance pairs in ascending order of distance, and, if the sample data in the current distance pair is not yet included in any training data set and the number of samples in the training data set corresponding to the center hidden vector in the current distance pair is smaller than a preset number, including the sample data of the current distance pair in that training data set.
In the process of classifying the training set, the amounts of sample data of different types may be inconsistent, so that the amount of sample data used to train different recommendation sub-models is unbalanced and the training effects of different recommendation sub-models differ. Therefore, in this embodiment, after the training data sets are determined, a center hidden vector is determined for each training data set; the center hidden vector best reflects the type of the sample data in that training data set. The distance between the hidden vector of each sample in the training set and the center hidden vector of each training data set is then calculated as:

d_{i,k} = ‖ h(x_i) − c_k ‖

where x_i denotes the input instruction of the i-th sample data in the training set, h(x_i) its hidden vector, and c_k the center hidden vector of the k-th training data set. Taking the training set D_r as an example, it is divided into K training data groups G_1, …, G_K during classification, so the distance calculation produces |D_r| × K distance pairs, which are stored in a set F in ascending order of distance. Traversing from the first distance pair in F, when the size |G_k| of the k-th group is smaller than a preset number and the i-th sample data x_i does not yet belong to any group, that sample data is included in the most similar training data group G_k, i.e. G_k = G_k ∪ {x_i}. Through this operation, on the one hand, stray samples are merged into a training data set for training; on the other hand, the number of samples in training data sets that originally had fewer samples is gradually increased, so that all recommendation sub-models obtain a relatively balanced training effect.
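The balanced greedy assignment described above (all sample-to-center distance pairs, sorted ascending, with capacity-limited groups) can be sketched as follows; `cap` is an assumed parameter name for the preset group-size limit and should normally be at least ceil(n / k) so every sample finds a group:

```python
import numpy as np

def balance_assign(hidden_vecs, centers, cap):
    """Greedy balanced assignment: form all (sample, center) distance
    pairs, sort them ascending, and place each still-unassigned sample
    into its nearest group that is below the capacity cap."""
    x = np.asarray(hidden_vecs, dtype=float)
    c = np.asarray(centers, dtype=float)
    pairs = [(np.linalg.norm(x[i] - c[k]), i, k)
             for i in range(len(x)) for k in range(len(c))]
    groups = [[] for _ in range(len(c))]
    assigned = set()
    for _, i, k in sorted(pairs):
        if i not in assigned and len(groups[k]) < cap:
            groups[k].append(i)
            assigned.add(i)
    return groups
```

Sorting ascending means every sample is claimed by the closest group that still has room, which is how stray samples end up in the most similar non-full group.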
Different target data requires knowledge of different levels (types) from the adapters of different recommendation sub-models to make accurate predictions, which indicates that adaptive weight allocation is needed when aggregating the recommendation sub-models. To avoid introducing additional training and forgetting learning, in some of these embodiments, assigning attention weights to the adapters of each recommendation sub-model according to semantic understanding of the target data includes: acquiring sample data and determining a similarity verification set from the sample data, the similarity verification set comprising a plurality of sample data closest in semantic understanding to the target data; and assigning an attention weight to the adapter of each recommendation sub-model according to each recommendation sub-model's prediction error on the similarity verification set.
Specifically, for a certain target datum $x$, the top $N$ most similar samples are first identified from the verification set to obtain the similarity verification set $V$; the similarity is calculated by cosine similarity, analogously to the semantic-partition method described above. The verification set itself is drawn from the sample data. The error of each recommendation sub-model's adapter on the similarity verification set $V$ is then measured, calculated as follows:
$$e_k = \frac{1}{|V|}\sum_{(x_j,\,y_j)\in V} \mathcal{L}\big(f(x_j;\, W_0 + \Delta W_k),\, y_j\big)$$

where $|V|$ denotes the size of the similarity verification set $V$, $e_k$ denotes the average prediction error of the adapter in the $k$-th recommendation sub-model, $W_0$ denotes the original parameter weights of the large language model, and $\Delta W_k$ denotes the parameter weights of the adapter in the $k$-th recommendation sub-model. To improve the final prediction accuracy on the target data, a higher attention weight is assigned to the adapter of the recommendation sub-model with a lower prediction error, expressed as follows:
$$\alpha_k = \frac{\exp(-e_k/\tau)}{\sum_{k'=1}^{K}\exp(-e_{k'}/\tau)}$$

where $\alpha_k$ denotes the attention weight assigned to the $k$-th recommendation sub-model, and $\tau$ denotes a temperature parameter that controls the sharpness of the weighting: as $\tau \to \infty$, the adapters of all recommendation sub-models are assigned equal attention weights; $e_k$ denotes the average prediction error of the adapter in the $k$-th recommendation sub-model.
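The selection of the similarity verification set and the error-based weighting described above can be sketched as a cosine top-$N$ lookup followed by a softmax over negative average errors. This is a hedged sketch: the loss used to compute each error and the exact similarity measure are assumptions, and all names are illustrative.

```python
import numpy as np

def top_n_similar(target_vec, val_vecs, n):
    """Select the n validation samples whose hidden vectors are most
    cosine-similar to the target's hidden vector."""
    t = target_vec / np.linalg.norm(target_vec)
    v = val_vecs / np.linalg.norm(val_vecs, axis=1, keepdims=True)
    return np.argsort(-(v @ t))[:n]

def attention_weights(errors, tau=1.0):
    """Softmax over negative average errors: a lower error on the similarity
    verification set yields a higher attention weight; a large tau flattens
    the weights toward uniform."""
    z = np.exp(-np.asarray(errors, dtype=float) / tau)
    return z / z.sum()
```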
In some of these embodiments, the adapter employs a low-rank adapter (LoRA) comprising a first low-rank matrix $A \in \mathbb{R}^{r\times d_{in}}$ and a second low-rank matrix $B \in \mathbb{R}^{d_{out}\times r}$, where $d_{in}$ and $d_{out}$ respectively denote the original input and output dimensions of the model, and $r$ denotes the intermediate dimension of the low-rank matrices. The first and second low-rank matrices can act on each layer of the Transformer network and modify the input of a given layer, thereby achieving the effect of style migration. Aggregating the recommendation sub-models according to the attention weights includes: aggregating the first low-rank matrices and the second low-rank matrices of the adapters in all recommendation sub-models separately according to the attention weights; or aggregating the products of the first and second low-rank matrices of the adapters in all recommendation sub-models according to the attention weights.
Specifically, after the $K$ adapters are obtained, this embodiment proposes aggregation policies at two levels. The first is the decomposition level, at which each low-rank matrix of an adapter serves as the unit of model aggregation; the specific aggregation expression is as follows:
$$B_{agg} = \sum_{k=1}^{K}\alpha_k B_k, \qquad A_{agg} = \sum_{k=1}^{K}\alpha_k A_k$$

where $B_{agg}$ denotes the second low-rank matrix of the aggregated adapter, $B_k$ the second low-rank matrix of the adapter in the $k$-th recommendation sub-model, $A_{agg}$ the first low-rank matrix of the aggregated adapter, $A_k$ the first low-rank matrix of the adapter in the $k$-th recommendation sub-model, and $\alpha_k$ the attention weight assigned to the adapter in the $k$-th recommendation sub-model; the higher the attention weight, the more attention that recommendation sub-model receives.
For the target recommendation model obtained after aggregation, after the instruction x is input, the output result can be expressed as the following expression:
$$y = W_0 x + B_{agg} A_{agg}\, x$$

where $y$ denotes the output of the target recommendation model with the aggregated adapter added, and $W_0$ denotes the weights of the large model itself.
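The decomposition-level aggregation and the resulting forward pass can be sketched as follows, assuming standard LoRA shapes (first matrix of shape r × d_in, second matrix of shape d_out × r); all names are illustrative.

```python
import numpy as np

def aggregate_decomposed(As, Bs, alphas):
    """Decomposition level: weight-sum the first (A) and second (B)
    low-rank matrices separately across the K adapters."""
    A_agg = sum(a * A for a, A in zip(alphas, As))
    B_agg = sum(a * B for a, B in zip(alphas, Bs))
    return A_agg, B_agg

def target_model_output(x, W0, A_agg, B_agg):
    """Output of the aggregated target model: y = W0 x + B_agg (A_agg x)."""
    return W0 @ x + B_agg @ (A_agg @ x)
```

Note that the aggregated update here is the product of two weighted sums, which in general differs from the weighted sum of per-adapter products used at the non-decomposition level.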
The second aggregation level is the non-decomposition level, at which the full large-language-model weight update serves as the unit of adapter aggregation: the aggregation result is obtained by weighting and aggregating the outputs of the adapters of the different recommendation sub-models at a given layer. The expression of the aggregation process is as follows:
$$\Delta W_{agg} = \sum_{k=1}^{K}\alpha_k\, B_k A_k$$

where $B_k A_k$ denotes the product of the second and first low-rank matrices of the adapter in the $k$-th recommendation sub-model.
For the target recommendation model obtained after aggregation, after the instruction x is input, the output result can be expressed as the following expression:
$$y = W_0 x + \Delta W_{agg}\, x = W_0 x + \sum_{k=1}^{K}\alpha_k\, B_k A_k\, x$$

where $y$ denotes the output of the target recommendation model with the aggregated adapter added, and $W_0$ denotes the weights of the large model itself.
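The non-decomposition-level alternative can be sketched as follows: each adapter's full low-rank update is formed first as the product of its two matrices, and these updates are then weighted-summed. Names are illustrative.

```python
import numpy as np

def aggregate_full(As, Bs, alphas):
    """Non-decomposition level: weight-sum the full updates B_k A_k."""
    return sum(a * (B @ A) for a, A, B in zip(alphas, As, Bs))

def target_model_output(x, W0, dW):
    """Output of the aggregated target model: y = W0 x + dW x."""
    return W0 @ x + dW @ x
```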
To verify the effectiveness of the invention in forgetting learning, experiments were performed on the MovieLens-100K and BookCrossing public datasets. For both datasets, 1024 pieces of user data were randomly selected as the training set. The verification test adopts the TALLRec model as the backbone model, and the method is compared with other forgetting-learning methods: SISA, GraphEraser and RecEraser. SISA is the earliest data-partitioning forgetting-learning method; it randomly partitions the data and aggregates the sub-models by averaging, majority voting, or the like. GraphEraser is a forgetting-learning method designed specifically for graph-structured data, which adopts a node-clustering technique to partition the graph data. RecEraser is a forgetting-learning method designed specifically for recommendation systems; it is similar to SISA but employs a unique partitioning strategy to preserve collaborative information when partitioning the data. A further baseline is Retraining, which directly deletes the sample data that needs to be forgotten and retrains the recommendation sub-model, but performs inference with a single adapter. The final results are shown in Table 1:
TABLE 1 Performance of different forgetting learning recommendation methods on public data sets
In this experiment, the performance of each forgetting-learning method was evaluated using the AUC metric. In Table 1, Ours (D) denotes the target recommendation model obtained after aggregation at the decomposition level using the present method, and Ours (ND) denotes the target recommendation model obtained after aggregation at the non-decomposition level. As can be seen from Table 1, compared with the traditional methods, the recommendation method provided by the invention achieves a better recommendation effect in the scenario of forgetting learning for large-model recommendation.
In addition, the aggregation methods at the two levels provided by the invention have similar effects. Therefore, to verify the invention's improvement in inference efficiency, only the target recommendation model aggregated at one of the levels (the decomposition level) is compared with the other recommendation methods (SISA, etc.). The test uses 500 test samples to obtain statistics of the inference time; the results are shown in Table 2:
TABLE 2 Inference efficiency of different recommendation methods on public data sets
As can be seen from Table 2, the time consumed by the aggregated target recommendation model of the present invention is far less than that of the existing recommendation methods. Although it is longer than that of the Retraining method, which performs inference directly with a single adapter, the target recommendation model's latency is only about 0.02 s per sample, which is entirely acceptable in practical applications.
In conclusion, on the basis of grouping the training set, the recommendation sub-models are aggregated according to different attention weights to obtain the target recommendation model, and prediction and inference are carried out through the target recommendation model, which effectively reduces the time required for inference and improves inference efficiency. In addition, the invention directly deletes the sample data that needs to be forgotten and can thus achieve exact forgetting; on the premise of protecting user privacy, it can provide a more accurate and efficient forgetting-learning scheme for e-commerce platforms, social-media platforms, online video platforms, music-streaming platforms and other fields.
It should be noted that the steps illustrated in the above-described flow or flow diagrams of the figures may be performed in a computer system, such as a set of computer-executable instructions, and that, although a logical order is illustrated in the flow diagrams, in some cases, the steps illustrated or described may be performed in an order other than that illustrated herein.
The invention also provides a recommendation device based on a large language model, which is used to implement the above embodiments and preferred implementations; what has already been described is not repeated. The terms "module," "unit," "sub-unit," and the like as used below may refer to a combination of software and/or hardware that realizes a predetermined function. Although the devices described in the following embodiments are preferably implemented in software, implementations in hardware, or in a combination of software and hardware, are also possible and contemplated.
FIG. 3 is a block diagram of a large language model based recommendation apparatus of the present invention, as shown in FIG. 3, comprising:
an acquisition module 301, configured to acquire target data;
The allocation module 302 is configured to invoke a plurality of pre-trained recommendation sub-models, and allocate attention weights to the adapters of each recommendation sub-model according to semantic understanding of the target data; the adapters are preconfigured in the corresponding recommendation sub-models;
the aggregation module 303 is configured to aggregate the recommendation sub-models according to the attention weight, and determine a target recommendation model;
and the prediction module 304 is configured to predict the target data through the target recommendation model, and determine whether to recommend the target data to the user according to the prediction result.
In the device, a plurality of pre-trained recommendation sub-models are invoked for the target data input by a user. The recommendation sub-models may be trained large language recommendation models; after training, different recommendation sub-models are good at predicting data of different types (types are divided according to the semantic understanding of the data). According to the semantic understanding of the target data, attention weights are assigned to the adapters of the plurality of recommendation sub-models, so that the recommendation sub-models good at handling data of the same or a similar type as the target data receive higher attention weights. The recommendation sub-models are then aggregated according to the assigned attention weights to obtain a target recommendation model (aggregating the recommendation sub-models is in essence aggregating their adapters, the final target recommendation model being formed from the aggregated adapter), which is likewise good at processing data of the same or a similar type as the target data. Finally, the target data is predicted through the target recommendation model, and whether to recommend the target data to the user is determined according to the prediction result. In this way, on the basis of retaining the traditional practice in forgetting learning of training the recommendation sub-models in groups according to the semantic understanding of the data, the device exploits the fact that the recommendation sub-models use parameter-efficient adapters, aggregates the plurality of recommendation sub-models into one target recommendation model according to the different weights, and performs inference on the target data through the target recommendation model alone.
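The four modules can be sketched end to end as a single flow (acquire → allocate → aggregate → predict). The scalar score head, the recommendation threshold, and all class and method names below are illustrative assumptions, not part of the patent.

```python
import numpy as np

class RecommendPipeline:
    """Minimal end-to-end sketch of the four modules. The sub-models'
    adapters ((A_k, B_k) pairs) and their validation errors are supplied
    by the caller."""

    def __init__(self, W0, adapters):
        self.W0 = W0                 # frozen large-model weights
        self.adapters = adapters     # list of (A_k, B_k) pairs

    def allocate(self, errors, tau=1.0):
        # Lower error on the similarity verification set -> higher weight.
        z = np.exp(-np.asarray(errors, dtype=float) / tau)
        return z / z.sum()

    def aggregate(self, alphas):
        # Decomposition-level aggregation of the adapters.
        A = sum(a * Ak for a, (Ak, _) in zip(alphas, self.adapters))
        B = sum(a * Bk for a, (_, Bk) in zip(alphas, self.adapters))
        return A, B

    def predict(self, x, errors, threshold=0.5):
        alphas = self.allocate(errors)
        A, B = self.aggregate(alphas)
        score = (self.W0 @ x + B @ (A @ x)).item()  # assumed scalar score head
        return score, score >= threshold            # recommend iff above threshold
```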
The above-described respective modules may be functional modules or program modules, and may be implemented by software or hardware. For modules implemented in hardware, the various modules described above may be located in the same processor; or the above modules may be located in different processors in any combination.
There is also provided in the invention an electronic device comprising a memory in which a computer program is stored and a processor arranged to run the computer program to perform the steps of any of the method embodiments described above.
Optionally, the electronic device may further include a transmission device and an input/output device, where the transmission device is connected to the processor, and the input/output device is connected to the processor.
Alternatively, in one embodiment, the processor may be arranged to perform the following steps by a computer program:
S1, acquiring target data.
S2, a plurality of pre-trained recommendation sub-models are called, and attention weights are distributed to the adapters of each recommendation sub-model according to semantic understanding of target data; the adapters are preconfigured in the corresponding recommendation sub-model.
And S3, aggregating the recommendation sub-models according to the attention weight, and determining a target recommendation model.
S4, predicting the target data through the target recommendation model, and determining whether to recommend the target data to the user according to a prediction result.
It should be noted that, the specific examples of the present electronic device may refer to examples described in the embodiments and the optional implementations of the method, and are not described in detail in this embodiment.
In addition, in combination with the large language model-based recommendation method provided in the present invention, a computer-readable storage medium may be provided in the present invention. The computer readable storage medium has a computer program stored thereon; the computer program, when executed by a processor, implements any of the large language model based recommendation methods of the above embodiments.
It should be understood that the specific embodiments described herein are merely illustrative of this application and are not intended to be limiting. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure in accordance with the embodiments provided herein.
It is to be understood that the drawings are merely illustrative of some embodiments of the present application and that it is possible for those skilled in the art to adapt the present application to other similar situations without the need for inventive work. In addition, it should be appreciated that while the development effort might be complex and lengthy, it would nevertheless be a routine undertaking of design, fabrication, or manufacture for those of ordinary skill having the benefit of this disclosure, and thus should not be construed as a departure from the disclosure.
The term "embodiment" in this disclosure means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive. It will be clear or implicitly understood by those of ordinary skill in the art that the embodiments described in the present application can be combined with other embodiments without conflict.
Claims (10)
1. A large language model based recommendation method, comprising:
Acquiring target data;
invoking a plurality of pre-trained recommendation sub-models, and distributing attention weights to the adapters of each recommendation sub-model according to semantic understanding of the target data; the adapters are preconfigured in the corresponding recommendation sub-models;
Aggregating the recommendation sub-models according to the attention weight to determine a target recommendation model;
and predicting the target data through the target recommendation model, and determining whether to recommend the target data to a user according to a prediction result.
2. The large language model based recommendation method according to claim 1, wherein the step of training the recommendation sub-model comprises:
Acquiring a training set;
dividing the training set to determine a plurality of training data sets;
Acquiring a plurality of initial recommendation sub-models, and training the corresponding initial recommendation sub-models through a plurality of training data sets to obtain a plurality of recommendation sub-models; the number of initial recommended sub-models is the same as the number of training data sets.
3. The large language model based recommendation method according to claim 2, wherein said obtaining a training set comprises:
Acquiring sample data;
Deleting the data needing to be forgotten from the sample data, and determining a training set.
4. The large language model based recommendation method of claim 2, wherein dividing the training set to determine a plurality of training data sets comprises:
Obtaining a large language model, wherein the large language model outputs a corresponding hidden vector according to the model input, and the hidden vector is used for representing the large language model's semantic understanding of the model input;
inputting each sample data in the training set to the large language model to obtain a hidden vector of each sample data;
and dividing the training set according to the hidden vector of the sample data to determine a plurality of training data sets.
5. The large language model based recommendation method according to claim 4, wherein said dividing said training set by hidden vectors of said sample data, after determining a plurality of training data sets, further comprises:
For each sample data in the training set, determining a distance pair formed by the sample data and a central hidden vector of each training data set; the distance pairs are used for representing the distance between the sample data and the central hidden vector of the corresponding training data set;
And traversing all the distance pairs of the sample data from small to large according to the distance, and if the sample data in the current distance pair is not included in any training data set and the number of the sample data in the training data set corresponding to the central hidden vector in the current distance pair is smaller than the preset number, including the sample data in the current distance pair in the corresponding training data set.
6. The large language model based recommendation method according to claim 1, wherein assigning attention weights to the adapters of each of the recommendation sub-models according to semantic understanding of the target data comprises:
obtaining sample data, and determining a similar verification set from the sample data; the similarity verification set comprises a plurality of sample data which are closest to semantic understanding of the target data;
and distributing attention weights to the adapters of each recommendation sub-model according to the prediction errors of each recommendation sub-model to the similarity verification set.
7. The large language model based recommendation method of claim 1, wherein said adapter comprises a first low rank matrix and a second low rank matrix;
Aggregating the recommended submodels according to the attention weight, including:
Respectively aggregating the first low-rank matrix and the second low-rank matrix of the adapter in all the recommended submodels according to the attention weight;
Or, aggregating products of the first low-rank matrix and the second low-rank matrix of the adapters in all the recommended submodels according to the attention weight.
8. A large language model based recommendation device, comprising:
the acquisition module is used for acquiring target data;
The distribution module is used for calling a plurality of pre-trained recommendation sub-models and distributing attention weights to the adapters of each recommendation sub-model according to semantic understanding of the target data; the adapters are preconfigured in the corresponding recommendation sub-models;
The aggregation module is used for aggregating the recommendation sub-models according to the attention weight and determining a target recommendation model;
And the prediction module is used for predicting the target data through the target recommendation model and determining whether to recommend the target data to the user according to a prediction result.
9. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, the processor being arranged to run the computer program to perform the large language model based recommendation method of any one of claims 1 to 7.
10. A computer readable storage medium having stored thereon a computer program, characterized in that the computer program when executed by a processor realizes the steps of the large language model based recommendation method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410370340.7A CN117973545B (en) | 2024-03-29 | 2024-03-29 | Recommendation method, device, equipment and storage medium based on large language model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117973545A true CN117973545A (en) | 2024-05-03 |
CN117973545B CN117973545B (en) | 2024-08-06 |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115630136A (en) * | 2022-10-11 | 2023-01-20 | 合肥讯飞数码科技有限公司 | Semantic retrieval and question-answer processing method and device for long text and electronic equipment |
KR20230129875A (en) * | 2022-03-02 | 2023-09-11 | 네이버 주식회사 | Method and system for goods recommendation |
WO2023196758A1 (en) * | 2022-04-04 | 2023-10-12 | Bespoke Analytics, Llc | Evidence-referenced recommendation engine to provide lifestyle guidance and to define health metrics |
CN117194986A (en) * | 2023-09-22 | 2023-12-08 | 北京三快网络科技有限公司 | Information recommendation model training method and device, storage medium and electronic equipment |
CN117273173A (en) * | 2023-11-21 | 2023-12-22 | 中国科学技术大学 | Entity recommendation method, device, equipment and storage medium based on large language model |
Non-Patent Citations (2)
Title |
---|
EDWARD HU et al.: "LoRA: Low-Rank Adaptation of Large Language Models", arXiv:2106.09685v2, 16 October 2021, pages 1-26 *
LUO Shijie et al.: "Research on university basic-knowledge question answering using low-rank adaptation to optimize large language models" (采用低秩编码优化大语言模型的高校基础知识问答研究), Journal of Frontiers of Computer Science and Technology, 11 September 2023, pages 1-16 *
Legal Events

Date | Code | Title | Description
---|---|---|---
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||