CN115982425A - Recommendation model training method and device, recommendation method and device, electronic equipment and storage medium - Google Patents

Recommendation model training method and device, recommendation method and device, electronic equipment and storage medium

Info

Publication number
CN115982425A
Authority
CN
China
Prior art keywords
recommended
data
attribute
semantic
recommendation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211609979.3A
Other languages
Chinese (zh)
Inventor
于敬
陈运文
刘文海
石京京
李文聪
熊凡
纪达麒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Daguan Data Suzhou Co ltd
Original Assignee
Daguan Data Suzhou Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Daguan Data Suzhou Co ltd
Priority to CN202211609979.3A
Publication of CN115982425A

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the invention disclose a recommendation model training method and device, a recommendation method and device, an electronic device, and a storage medium. The recommendation model training method comprises: acquiring sample data of an object to be recommended, which includes multi-attribute dimension information of the object to be recommended, and current object data of a current object; performing semantic embedding processing on the sample data of the object to be recommended and the current object data to obtain comprehensive semantic embedding features, where the comprehensive semantic embedding features consist of basic attribute semantic embedding features of a plurality of single attributes and attribute cross semantic embedding features; generating a semantic embedding feature matrix of sample data according to the comprehensive semantic embedding features, where the sample data is formed by splicing the sample data of the object to be recommended with the sample data of the current object; and inputting the semantic embedding features into a recommendation model as training data for training. The technical solution of the embodiments improves the accuracy of the recommendation model, ensures the comprehensiveness of the recommendation results, and improves their overall effect.

Description

Recommendation model training method and device, recommendation method and device, electronic equipment and storage medium
Technical Field
The embodiments of the invention relate to the technical field of recommendation, and in particular to a recommendation model training method and device, a recommendation method and device, an electronic device, and a storage medium.
Background
With the rapid development of information technology and the rise of the mobile internet, people have entered the age of massive data. Faced every day with a dazzling variety of goods, movies, songs, videos, news flashes, and other services, users often cannot quickly find the information they are interested in. Intelligent recommendation technology uses an algorithm model to recommend objects whose content is similar or related to the object the user is currently browsing.
At present, technicians usually adopt a recommendation method based on text content matching, a method for calculating similarity based on text vectors, and a matching method for calculating correlation based on text features when performing intelligent recommendation.
In the process of implementing the invention, the inventors found the following problems in the prior art. Although the recommendation method based on text content matching ensures that recommended content is highly relevant, the high degree of content matching makes the recommendation results unsurprising, reliable recommendations cannot be provided for new users, and results are unsatisfactory when data is sparse or attribute processing is complex. The method of calculating similarity based on text vectors usually generates text vectors with methods such as topic models, mostly applying a single vector generation model uniformly; however, the text lengths of the attribute fields of a recommended object differ (generally, the body text is longer than the title, which is longer than the tags), so a differentiated generation approach is needed to produce better recommendation results. The matching method that calculates relevance based on text features considers only the text features of the recommended objects and ignores other related features.
Disclosure of Invention
The embodiments of the invention provide a recommendation model training method and device, a recommendation method and device, an electronic device, and a storage medium, which improve the accuracy of a recommendation model, ensure the comprehensiveness of relevant recommendation results, and improve the overall effect of the recommendation results.
According to an aspect of the present invention, there is provided a recommendation model training method, including:
acquiring sample data of an object to be recommended including multi-attribute dimensional information of the object to be recommended and current object data of a current object;
performing semantic embedding processing on the sample data of the object to be recommended and the current object data to obtain comprehensive semantic embedding characteristics; the comprehensive semantic embedded features consist of basic attribute semantic embedded features and attribute cross semantic embedded features of a plurality of single attributes;
generating a semantic embedding feature matrix of sample data according to the comprehensive semantic embedding feature, wherein the sample data is formed by splicing the sample data of the object to be recommended and the sample data of the current object;
and inputting the semantic embedded features serving as training data into a recommendation model for training.
According to another aspect of the present invention, there is provided a recommendation method including:
acquiring current object associated data of a current object and alternative object associated data of an alternative recommended object;
inputting the current object associated data of the current object and the alternative object associated data of the alternative recommended objects into a recommendation model to obtain the similarity of each alternative recommended object and the current object;
determining a target recommendation object matched with the current object according to the similarity of each candidate recommendation object and the current object;
the recommendation model is obtained by training through the recommendation model training method in any embodiment of the invention.
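The recommendation steps above can be illustrated with a minimal Python sketch. It assumes the trained recommendation model is available as a scoring function that returns a similarity for a (current object, candidate object) pair; the names `recommend`, `tag_overlap_score`, and the `top_k` parameter are hypothetical, and the tag-overlap scorer is only a stand-in for the trained model, not the model described in this document.

```python
from typing import Callable, Dict, List, Tuple

Item = Dict[str, object]


def tag_overlap_score(current: Item, candidate: Item) -> float:
    """Stand-in for the trained model (assumption): Jaccard overlap of tags."""
    a, b = set(current["tags"]), set(candidate["tags"])
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(
    current: Item,
    candidates: List[Item],
    score: Callable[[Item, Item], float],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Score every candidate against the current object and keep the top_k."""
    scored = [(str(c["itemid"]), score(current, c)) for c in candidates]
    # Highest similarity first; the first top_k entries are the target objects.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

Selecting the `top_k` highest scores is one possible way to determine the target recommendation objects; a similarity threshold would fit the same interface.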
According to another aspect of the present invention, there is provided a recommendation model training apparatus including:
a sample data acquisition module, configured to acquire sample data of an object to be recommended, which includes multi-attribute dimension information of the object to be recommended, and current object data of a current object;
the semantic embedding processing module is used for carrying out semantic embedding processing on the sample data of the object to be recommended and the current object data to obtain comprehensive semantic embedding characteristics; the comprehensive semantic embedded features consist of basic attribute semantic embedded features and attribute cross semantic embedded features of a plurality of single attributes;
the semantic embedded characteristic matrix generating module is used for generating a semantic embedded characteristic matrix of sample data according to the comprehensive semantic embedded characteristic, wherein the sample data is formed by splicing the sample data of the object to be recommended and the sample data of the current object;
and the recommendation model training module is used for inputting the semantic embedded features serving as training data into a recommendation model for training.
According to another aspect of the present invention, there is provided a recommendation apparatus including:
the object associated data acquisition module is used for acquiring current object associated data of a current object and alternative object associated data of alternative recommended objects;
a similarity obtaining module, configured to input current object association data of the current object and candidate object association data of the candidate recommended objects into a recommendation model, so as to obtain a similarity between each of the candidate recommended objects and the current object;
the target recommended object determining module is used for determining a target recommended object matched with the current object according to the similarity between each candidate recommended object and the current object;
the recommendation model is obtained by training through the recommendation model training method in any embodiment of the invention.
According to another aspect of the present invention, there is provided an electronic apparatus including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the recommendation model training method or recommendation method as described in any embodiment of the invention.
According to another aspect of the present invention, a computer-readable storage medium is provided, which stores computer instructions for causing a processor to implement the recommendation model training method or the recommendation method described in the embodiments of the present invention when executed.
According to the technical scheme of the embodiment of the invention, sample data of an object to be recommended and current object data of the current object are obtained, wherein the sample data of the object to be recommended comprises multi-attribute dimension information of the object to be recommended, semantic embedding processing is carried out on the sample data of the object to be recommended and the current object data to obtain comprehensive semantic embedding characteristics, a semantic embedding characteristic matrix of the sample data is generated according to the comprehensive semantic embedding characteristics, and the semantic embedding characteristics are input into a recommendation model as training data to be trained. Correspondingly, after the training of the recommendation model is completed, current object associated data of the current object and alternative object associated data of the alternative recommendation objects are obtained, the current object associated data of the current object and the alternative object associated data of the alternative recommendation objects are input into the recommendation model, the similarity between each alternative recommendation object and the current object is obtained, and finally, a target recommendation object matched with the current object is determined according to the similarity between each alternative recommendation object and the current object. 
According to the technical scheme of the embodiment of the invention, comprehensive semantic embedding learning representation is carried out on the sample data of the object to be recommended and the current object data of the current object, so that the expression capability of the characteristics is greatly expanded, the problems of single text characteristics and no consideration of text length difference in the prior art are solved, the accuracy of a recommendation model is improved, the comprehensiveness of relevant recommendation results is ensured, and the overall effect of the recommendation results is improved.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present invention, nor do they necessarily limit the scope of the invention. Other features of the present invention will become apparent from the following description.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings required to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the description below are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without creative efforts.
FIG. 1 is a flowchart of a recommendation model training method according to a first embodiment of the present invention;
FIG. 2 is a flowchart of another recommendation model training method according to a second embodiment of the present invention;
FIG. 3 is a flowchart of a recommendation method according to a third embodiment of the present invention;
FIG. 4 is an overall schematic diagram of a recommendation method according to the third embodiment of the present invention;
FIG. 5 is a schematic diagram of a recommendation module according to the third embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a recommendation model training apparatus according to a fourth embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a recommendation device according to a fifth embodiment of the present invention;
FIG. 8 is a schematic structural diagram of an electronic device according to a sixth embodiment of the present invention.
Detailed Description
In order to make those skilled in the art better understand the technical solutions of the present invention, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover non-exclusive inclusions, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example one
Fig. 1 is a flowchart of a recommendation model training method provided in an embodiment of the present invention, where this embodiment is applicable to a case where a recommendation model is trained using recommendation object attribute data in a semantic embedding form, and the method may be executed by a recommendation model training apparatus, and the apparatus may be implemented by software and/or hardware, and may be generally integrated in an electronic device, where the electronic device may be a terminal device or a server device, and the embodiment of the present invention does not limit a specific device type of the electronic device. Accordingly, as shown in fig. 1, the method comprises the following operations:
S110, acquiring sample data of an object to be recommended, which includes multi-attribute dimension information of the object to be recommended, and current object data of a current object.
The sample data of the object to be recommended may be data related to a resource that can be recommended and displayed, and may contain information in multiple attribute dimensions. For example, audio and video, pictures, songs, movies, commodities, and news flashes on internet web pages can serve as objects to be recommended, and the data related to them can correspondingly serve as sample data of the object to be recommended. The current object data may likewise be data related to a recommendable, displayable resource and contain information in multiple attribute dimensions. For example, the audio and video, pictures, songs, movies, commodities, and news flashes that a user is browsing on an internet web page can serve as the current object.
For example, the sample data of the object to be recommended may be commodity sample data of an e-commerce platform. Specifically, in the process of displaying the commodities on the e-commerce platform, the commodity information including multi-attribute descriptions such as labels, titles, texts, pictures and videos can be used as sample data of the object to be recommended. The current object data may be the commodity sample data that the user is browsing at the e-commerce platform. Specifically, in the browsing process of the user on the e-commerce platform, the commodity information including multi-attribute descriptions such as tags, titles, texts, pictures and videos can be used as current object data.
For example, the sample data of the object to be recommended may be news flash sample data in a webpage. Specifically, in the process of displaying news on an internet webpage, news newsletter information including multiple attribute descriptions such as titles, keywords and texts can be used as sample data of an object to be recommended. The current object data may be news flash sample data in a web page. Specifically, news information which is browsed by a user on an internet news page and contains multiple attribute descriptions such as labels, titles, texts, pictures and videos can be used as current object data.
For example, the sample data of the object to be recommended may be audio sample data in music software. Specifically, in the process of displaying the song ranking list in the music software, the audio information including the multiple attribute descriptions of the song title, the singer name, the lyrics and the like can be used as sample data of an object to be recommended. The current object data may be audio sample data in music software. Specifically, the song being played by the user in the music software includes audio information described by multiple attributes, such as the name of the song, the name of the singer, and the lyrics, and can be used as the current object data.
S120, performing semantic embedding processing on the sample data of the object to be recommended and the current object data to obtain comprehensive semantic embedding characteristics; the comprehensive semantic embedded features are composed of basic attribute semantic embedded features of a plurality of single attributes and attribute cross semantic embedded features.
The comprehensive semantic embedded feature can be a semantic embedded representation generated by performing feature processing on sample data of the object to be recommended. The basic attribute semantic embedded feature may be a feature that generates a basic attribute of the sample data of the object to be recommended through training and learning, and may include, for example and without limitation, a title attribute semantic embedded feature, a tag attribute semantic embedded feature, a text attribute semantic embedded feature, a picture attribute semantic embedded feature, a video attribute semantic embedded feature, and other attribute semantic embedded features. The attribute cross semantic embedded feature may be an attribute cross feature between sample data of two objects generated by training and learning, and may include, but is not limited to, similarity between titles of two objects, similarity of tags, and the like.
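As an illustration of attribute cross features, the sketch below computes two pairwise similarities between a current object and an object to be recommended: the cosine similarity of their title embeddings and the overlap of their tag sets. This is a hand-written approximation under stated assumptions, since the text describes the cross features as learned; the field names `title_vec` and `tags`, and the choice of cosine and Jaccard measures, are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def jaccard(a: Sequence[str], b: Sequence[str]) -> float:
    """Size of the intersection divided by size of the union of two tag sets."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def cross_features(current: Dict, candidate: Dict) -> List[float]:
    """Cross features for one (current object, object to be recommended) pair."""
    return [
        cosine_similarity(current["title_vec"], candidate["title_vec"]),
        jaccard(current["tags"], candidate["tags"]),
    ]
```

These values could be appended to the basic attribute embeddings of each object when assembling its comprehensive feature vector.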
In the embodiment of the invention, the sample data of an object to be recommended is taken as input data and is sent into a semantic embedded learning network for training and learning to obtain basic attribute semantic embedded features and attribute cross semantic embedded features of a plurality of single attributes of the sample data of the object to be recommended, and then comprehensive semantic embedded features of the sample data of the object to be recommended are obtained by the basic attribute semantic embedded features and the attribute cross semantic embedded features of the plurality of single attributes of the sample data of the object to be recommended; the current object data is used as input data and sent into a semantic embedding learning network for training and learning to obtain basic attribute semantic embedding features and attribute cross semantic embedding features of a plurality of single attributes of the current object data, and then comprehensive semantic embedding features of the current object data are obtained through the basic attribute semantic embedding features and the attribute cross semantic embedding features of the plurality of single attributes of the current object data.
S130, generating a semantic embedding feature matrix of sample data according to the comprehensive semantic embedding feature, wherein the sample data is formed by splicing the sample data of the object to be recommended and the sample data of the current object.
The semantic embedding feature matrix may be a matrix composed of the comprehensive semantic embedding features.
In the embodiment of the invention, assuming that the comprehensive semantic embedding features of the current object and of the object to be recommended are each M-dimensional, splicing them gives each sample a semantic embedding feature of dimension 2M; for N samples, this yields an N × 2M semantic embedding feature matrix S.
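The splicing step can be sketched in a few lines of Python. The function name `build_feature_matrix` is hypothetical, and placing the current object's vector before the recommended object's vector is an assumption, since the text does not fix the order.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def build_feature_matrix(pairs: Sequence[Tuple[Vector, Vector]]) -> List[List[float]]:
    """Concatenate each (current, recommended) pair of M-dim vectors into a
    2M-dim row; N pairs give an N x 2M matrix."""
    matrix: List[List[float]] = []
    for current_vec, recommended_vec in pairs:
        if len(current_vec) != len(recommended_vec):
            raise ValueError("both embeddings must have the same dimension M")
        # Assumed order: current object first, object to be recommended second.
        matrix.append(list(current_vec) + list(recommended_vec))
    return matrix
```

With N = 3 samples and M = 2, the result has 3 rows of 4 values each.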
S140, inputting the semantic embedded features serving as training data into a recommendation model for training.
The recommendation model may be a model for recommending objects. Optionally, the recommendation model may be a recommendation method based on text content matching, which matches text content using techniques from the field of information retrieval, such as the KNN (k-nearest neighbor) algorithm and the relevance feedback method of the Rocchio algorithm, and then generates a recommendation result.
Optionally, the recommendation model may also be a method for calculating similarity based on a text vector, and usually, the text vector is generated based on a method such as a topic model, and similarity calculation is performed on the text vector, so as to generate a recommendation result.
Optionally, the recommendation model may also be a method for calculating a correlation based on a text feature, and the method performs correlation evaluation on the text feature to generate a recommendation result.
In the embodiment of the invention, the semantic embedded features S are used as input data of a recommendation model and are sent into the recommendation model for training and learning, so that a recommendation result is generated.
According to the technical scheme of the embodiment of the invention, firstly, sample data of an object to be recommended including multi-attribute dimension information of the object to be recommended and current object data of the current object are obtained, then semantic embedding processing is carried out on the sample data of the object to be recommended and the current object data of the current object to obtain comprehensive semantic embedding characteristics, a semantic embedding characteristic matrix of the sample data is generated according to the comprehensive semantic embedding characteristics, and the semantic embedding characteristics are used as training data to be input into a recommendation model for training. According to the technical scheme of the embodiment of the invention, comprehensive semantic embedding learning representation is carried out on the sample data of the object to be recommended and the current object data of the current object, so that the expression capability of the characteristics is greatly expanded, the problems of single text characteristics and no consideration of text length difference in the prior art are solved, the accuracy of a recommendation model is improved, the comprehensiveness of relevant recommendation results is ensured, and the overall effect of the recommendation results is improved.
Example two
FIG. 2 is a flowchart of another recommendation model training method provided in the second embodiment of the present invention. This embodiment builds on the above embodiment and provides a specific optional implementation of acquiring the sample data of the object to be recommended, which includes multi-attribute dimension information of the object to be recommended, and the current object data of the current object, and of performing semantic embedding processing on the sample data of the object to be recommended and the current object data to obtain the comprehensive semantic embedding features. Correspondingly, as shown in FIG. 2, the method of this embodiment may include:
S210, using the recommendation log data of the object to be recommended, the user click behavior data and the basic attribute data of the object to be recommended as original sample data.
The object to be recommended can be a movie resource, a news flash, a music resource, an article, or any other resource that can be recommended. When the object to be recommended is a movie resource, the recommendation log data may record what type of movie was recommended to the user at what time. When the object to be recommended is a music resource, the recommendation log data may record what genre of music was recommended to the user at what time. The user click behavior data may record at what time and in what scenario the user clicked on which recommended items. The basic attribute data of the object to be recommended may be the basic field information that identifies and describes the object to be recommended, and it differs between recommendation scenarios.
In the embodiment of the present invention, the recommendation log data may include fields such as the recommended user unique identification ID (rec_userid), the current object unique identification ID (cur_itemid), the recommended object unique identification ID (rec_itemid), the recommendation date (rec_date), and the recommendation identification ID (recid), and may be partitioned by day and stored row by row in the form <rec_userid, src_itemid, rec_itemid, recid, rec_date>. For example, assuming that the object to be recommended is an item, the current object unique identification ID may be the unique identification ID of the currently displayed item, and the recommended object unique identification ID may be the unique identification ID of the recommended item.
In the embodiment of the present invention, the user click behavior data may include fields such as the user unique identification ID (click_userid), the clicked recommended object unique identification ID (click_item), the source recommendation log identification ID of the clicked object (src_record), and the behavior occurrence date (click_date), and may be partitioned by day and stored row by row in the form <click_userid, click_item, src_record, click_date>. For example, assuming that the object to be recommended is an item, the clicked recommended object unique identification ID may be the unique identification ID of the recommended item, and the source recommendation log identification ID of the clicked object may be the source recommendation log identification ID of the clicked item.
In the embodiment of the present invention, the basic attribute data of the object to be recommended may include the unique identification ID (itemid), title (title), tag (tag), category (category), body text (parent), picture information (picture), video information (video), and the like of the object to be recommended, where there may be multiple tags, pictures, and so on. For example, assuming that the object to be recommended is an item, the unique identification ID of the object to be recommended may be the unique identification ID of the item to be recommended.
S220, performing data preprocessing on the original sample data to obtain preprocessed original sample data.
The preprocessed original sample data can be data obtained by data cleaning, examining and verifying the original sample data.
In the embodiment of the invention, original sample data is cleaned, examined and checked, recognizable errors in the data are found and corrected, wrong or conflicting data are removed according to a certain rule, meanwhile, the consistency of the data is checked, invalid values and missing values are processed, and finally, preprocessed original sample data is obtained.
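A minimal sketch of such a cleaning pass is shown below, assuming each record is a Python dictionary. The specific rules (drop rows with a missing or empty required field, drop exact repeats on the required fields) and the name `preprocess` are illustrative assumptions; the text only says that data is cleaned "according to a certain rule".

```python
from typing import Dict, Iterable, List, Sequence

Row = Dict[str, object]


def preprocess(rows: Iterable[Row], required: Sequence[str]) -> List[Row]:
    """Remove rows with missing required fields and duplicate rows."""
    seen = set()
    cleaned: List[Row] = []
    for row in rows:
        # Treat None and the empty string as missing values (assumed rule).
        if any(row.get(field) in (None, "") for field in required):
            continue
        key = tuple(row[field] for field in required)
        if key in seen:
            continue  # the same record was already kept once
        seen.add(key)
        cleaned.append(row)
    return cleaned
```

Real pipelines would typically add scenario-specific checks, such as validating date formats or resolving conflicting records.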
S230, extracting key sample data according to the preprocessed original sample data to obtain the sample data of the object to be recommended.
The key sample data may be data composed of key fields in the original sample data.
In the embodiment of the invention, taking an item as the object to be recommended, a joint query is performed on the recommendation log data and the user click behavior data using the query condition "rec_userid = click_userid and rec_itemid = click_itemid and rec_date = click_date and recid = src_recid". If a matching record is found in the user click behavior data, the label is set to 1; otherwise it is set to 0. The results are then partitioned by rec_date and stored row by row, giving key sample data with fields such as the user unique identification ID (rec_userid), the current item unique identification ID (cur_itemid), the recommended item unique identification ID (rec_itemid), the recommendation date (rec_date), and the click label (1 for clicked, 0 for not clicked). The data of the first N days is selected as training samples and the data of day N + 1 as test samples, finally yielding the sample data of the object to be recommended.
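The labeling join and the day-based split can be sketched as follows. This is an in-memory approximation that assumes records are dictionaries keyed by the field names used in the query condition and that dates are ISO strings (so they sort chronologically); the function names `label_samples` and `split_by_day` are hypothetical.

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]


def label_samples(rec_logs: List[Row], clicks: List[Row]) -> List[Dict[str, object]]:
    """Label each recommendation log row 1 if a matching click exists, else 0."""
    clicked = {
        (c["click_userid"], c["click_itemid"], c["click_date"], c["src_recid"])
        for c in clicks
    }
    samples = []
    for r in rec_logs:
        key = (r["rec_userid"], r["rec_itemid"], r["rec_date"], r["recid"])
        samples.append({
            "rec_userid": r["rec_userid"],
            "cur_itemid": r["cur_itemid"],
            "rec_itemid": r["rec_itemid"],
            "rec_date": r["rec_date"],
            "label": 1 if key in clicked else 0,
        })
    return samples


def split_by_day(samples: List[Dict[str, object]], n_train_days: int) -> Tuple[list, list]:
    """First n_train_days distinct dates form the training set, the next date the test set."""
    days = sorted({s["rec_date"] for s in samples})
    train_days = set(days[:n_train_days])
    test_day = days[n_train_days] if len(days) > n_train_days else None
    train = [s for s in samples if s["rec_date"] in train_days]
    test = [s for s in samples if s["rec_date"] == test_day]
    return train, test
```

Building a set of click keys first keeps the join linear in the size of the two logs, which matters when the logs are partitioned by day and processed in bulk.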
S240, performing semantic embedding processing on the single-attribute object sample data of each attribute dimension of the object sample data to be recommended to obtain basic attribute semantic embedding characteristics of a plurality of single attributes of the object sample data to be recommended.
The single-attribute object sample data may represent the sample data of the object to be recommended with a single attribute.
In the embodiment of the present invention, the single attribute object sample data may be title attribute object sample data of the object to be recommended. Specifically, firstly, preprocessing a title text of an object to be recommended, eliminating punctuations and stop words in the title text, then counting the length of the title text of the object, and if the length of the title text is less than L 1 Processing by padding, if it exceeds L 1 If the deletion exceeds L 1 Token of (2) ensures that the title text length is maintained at L 1 . Then the length is L 1 As BERT (bidirectionality)l Encoder reproduction from transformations, bidirectional coding Representation) model input data are sent into the BERT model for training and learning to obtain a title attribute semantic embedding feature E with the dimension K title1 . In the training process of the BERT model, the model is usually trained by means of randomly removing part of tokens, and meanwhile, the model is applied to an actual business scene recommended by an article, and the processing of the mask focuses more on removing tokens.
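The title length normalization described above can be sketched as below. It assumes the title is already tokenized, treats any token without a letter or digit as punctuation, and uses a hypothetical `[PAD]` filler token; the BERT encoding step itself is not shown.

```python
from typing import Iterable, List, Set


def normalize_title(
    tokens: Iterable[str],
    max_len: int,
    stop_words: Set[str],
    pad_token: str = "[PAD]",
) -> List[str]:
    """Drop punctuation and stop words, then truncate or pad to exactly max_len."""
    kept = [
        t for t in tokens
        if t not in stop_words and any(ch.isalnum() for ch in t)
    ]
    kept = kept[:max_len]  # delete tokens beyond the target length L_1
    return kept + [pad_token] * (max_len - len(kept))  # pad short titles
```

For example, with `max_len=4` and the stop word `"the"`, the tokens `["the", "red", ",", "shoes", "!"]` become `["red", "shoes", "[PAD]", "[PAD]"]`.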
In the embodiment of the present invention, the single attribute object sample data may also be tag attribute object sample data of the object to be recommended. The tag text data of the object sample to be recommended is a vocabulary list in which service personnel have summarized the semantic information of the article at a high level based on the material data. In the embodiment of the invention, a Word2Vec (Word to Vector) model is used to obtain the tag attribute semantic embedding feature. Specifically, the tag attribute object sample data is fed as input into the Word2Vec model for training and learning to obtain a vectorized representation of the tag attribute. Because the number of tags depends on how the tag dimensions are configured, and in order to balance dimension reduction against semantic representation capability, when the number of tags is $N_1$ the output vector dimension of the tag is determined as:
[Formula image BDA0003999113850000101; not reproduced in this text]
Practice shows that tags differ in importance. The vectorized representation of the tag attributes is therefore fed as input into a self-attention network, which is trained to learn the weights of the different tag vectors, finally yielding a tag attribute semantic embedding feature of dimension K, denoted $E_{tag1}$.
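To make the weighting of tag vectors concrete, the following is a minimal sketch of scaled dot-product self-attention followed by mean pooling. It is a simplification under stated assumptions: the query, key, and value projections are taken to be the identity (in a trained network they would be learned), the function names are hypothetical, and the output keeps the input dimension rather than being projected to K.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def self_attention_pool(vectors):
    """Attend every vector to all vectors, then average the attended vectors."""
    d = len(vectors[0])
    scale = math.sqrt(d)
    attended = []
    for q in vectors:
        # Attention weights of this vector over all vectors (identity projections).
        weights = softmax([dot(q, k) / scale for k in vectors])
        attended.append(
            [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(d)]
        )
    n = len(attended)
    return [sum(row[i] for row in attended) / n for i in range(d)]
```

The same pooling pattern would apply to the picture vectors described later, since both cases reduce a variable number of vectors to one weighted representation.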
In the embodiment of the present invention, the single attribute object sample data may also be text attribute object sample data of the object to be recommended. The body text data of the object sample to be recommended contains richer and more detailed semantic content; compared with the title text and tag text data, the body text is generally longer, ranging from hundreds to thousands of words. In view of the text length, in the embodiment of the invention an LDA (Latent Dirichlet Allocation) model is used to obtain the text attribute semantic embedding feature. Specifically, the text attribute object sample data is fed as input into the LDA model for training and learning, yielding a text attribute semantic embedding feature of dimension K, denoted $E_{content1}$.
In the embodiment of the present invention, the single attribute object sample data may also be picture attribute object sample data of the object to be recommended. The picture data of the object sample to be recommended contains a large amount of valuable feature information, and making full use of it helps improve the recommendation effect. Specifically, data enhancement (including rotation, cropping, and highlighting) is first performed on the existing pictures of the object sample to be recommended (including its cover and detail display pictures) to obtain $k_1$ pictures. These pictures are fed as input into a ResNet (Residual Network) for training and learning to obtain a vector representation of each picture. The vectorized representations of the pictures are then fed into a self-attention network, which is trained to learn the weights of the different picture vectors, finally generating a picture attribute semantic embedding feature of dimension K, denoted $E_{picture1}$.
In the embodiment of the present invention, the single attribute object sample data may also be video attribute object sample data of the object to be recommended. Specifically, the first frame of the video is taken as the first picture, and the last frame of the video is taken as the second picture. Suppose the total duration of the video is $T_1$ (in seconds) and the maximum number of pictures the machine allows to process is $k_1$ ($k_1 > 2$). If the condition
[Formula image BDA0003999113850000111; not reproduced in this text]
holds, frames are cut once per second, the minimum unit, to obtain $T_1$ pictures, and the remaining $k_1 - T_1 - 2$ pictures are filled by data enhancement. If the condition
[Formula image BDA0003999113850000112; not reproduced in this text]
holds, frames are cut at intervals of
[Formula image BDA0003999113850000113; not reproduced in this text]
integer seconds to obtain $k_1$ pictures. After the frame extraction of the video is completed, embeddings are learned from the obtained pictures according to the picture embedding method described above, finally generating a video attribute semantic embedding feature of dimension K, denoted $E_{video1}$.
In the embodiment of the present invention, the single attribute object sample data may also be associated attribute object sample data of the object to be recommended. Besides attribute features such as the title, tag, text, picture, and video, the object to be recommended also has a unique identifier (ID) and possibly attributes such as publisher and category, which characterize the object to be recommended from other dimensions. Such attribute information is generally processed with one-hot or multi-hot encoding. When the attribute of the article takes a single value (for example, the publisher: an object to be recommended has only one publisher, although the converse does not hold, since one publisher can publish multiple objects to be recommended), one-hot encoding is used. When the attribute of the article is multi-valued (for example, the category: one object to be recommended may belong to multiple categories), multi-hot encoding is used. Each kind of data is processed with one-hot or multi-hot encoding according to the mapping between materials and attributes to obtain a vector representation of the associated attribute information. This vector representation is then fed into an embedding layer, finally generating an associated attribute semantic embedding feature of dimension K, denoted $E_{other1}$.
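The two encodings can be illustrated with a short sketch. The helper names and the example attribute values are hypothetical; the vocabulary is simply built from the sorted distinct values, which is one reasonable convention among several.

```python
def build_vocab(values):
    """Map each distinct attribute value to a fixed index."""
    return {v: i for i, v in enumerate(sorted(set(values)))}


def one_hot(value, vocab):
    """Encode a single-valued attribute, e.g. the publisher of an article."""
    vec = [0] * len(vocab)
    vec[vocab[value]] = 1
    return vec


def multi_hot(values, vocab):
    """Encode a multi-valued attribute, e.g. the categories of an article."""
    vec = [0] * len(vocab)
    for v in values:
        vec[vocab[v]] = 1
    return vec
```

With a category vocabulary built from `news`, `sports`, and `tech`, an article in both `tech` and `news` encodes as `[1, 0, 1]`. These sparse vectors would then pass through the embedding layer mentioned above to obtain a dense K-dimensional feature.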
S250, carrying out semantic embedding processing on the single attribute object data of each attribute dimension of the current object data to obtain basic attribute semantic embedding characteristics of a plurality of single attributes of the current object.
In the embodiment of the present invention, the single attribute object sample data may be title attribute object sample data of the current object. Specifically, the title text of the current object is preprocessed by removing punctuation and stop words, and the length of the title text is then counted: if it is less than $L_2$, the text is padded; if it exceeds $L_2$, the tokens beyond $L_2$ are deleted, so that the title text length is kept at $L_2$. The title text data of length $L_2$ is then fed as input into a BERT (Bidirectional Encoder Representations from Transformers) model for training and learning, yielding a title attribute semantic embedding feature of dimension K, denoted $E_{title2}$.
In the embodiment of the present invention, the single attribute object sample data may also be tag attribute object sample data of the current object. The tag text data of the current object is a vocabulary list in which service personnel have summarized the semantic information of the article at a high level based on the material data. In the embodiment of the invention, a Word2Vec (Word to Vector) model is used to obtain the tag attribute semantic embedding feature. Specifically, the tag attribute object sample data is fed as input into the Word2Vec model for training and learning to obtain a vectorized representation of the tag attribute. Because the number of tags depends on how the tag dimensions are configured, and in order to balance dimension reduction against semantic representation capability, when the number of tags is $N_2$ the output vector dimension of the tag is determined as:
[Formula image BDA0003999113850000121; not reproduced in this text]
Practice shows that tags differ in importance. The vectorized representation of the tag attributes is therefore fed as input into a self-attention network, which is trained to learn the weights of the different tag vectors, finally yielding a tag attribute semantic embedding feature of dimension K, denoted $E_{tag2}$.
In the embodiment of the present invention, the single attribute object sample data may also be text attribute object sample data of the current object. The body text data of the current object contains rich and detailed semantic content; compared with the title text and tag text data, the body text is generally longer, ranging from hundreds to thousands of words. In view of the text length, in the embodiment of the invention an LDA model is used to obtain the text attribute semantic embedding feature. Specifically, the text attribute object sample data is fed as input into the LDA model for training and learning, yielding a text attribute semantic embedding feature of dimension K, denoted $E_{content2}$.
In the embodiment of the present invention, the single attribute object sample data may also be picture attribute object sample data of the current object. The picture data of the current object contains a large amount of valuable feature information, and making full use of it helps improve the recommendation effect. Specifically, data enhancement (including rotation, cropping, and highlighting) is first performed on the existing pictures of the current object (including its cover and detail display pictures) to obtain $k_2$ pictures. These pictures are fed as input into a ResNet network for training and learning to obtain a vector representation of each picture. The vectorized representations of the pictures are then fed into a self-attention network, which is trained to learn the weights of the different picture vectors, finally generating a picture attribute semantic embedding feature of dimension K, denoted $E_{picture2}$.
In the embodiment of the present invention, the single attribute object sample data may also be video attribute object sample data of the current object. Specifically, the first frame of the video is taken as the first picture, and the last frame of the video is taken as the second picture. Suppose the total duration of the video is $T_2$ (in seconds) and the maximum number of pictures the machine allows to process is $k$ ($k > 2$). If the condition
[Formula image BDA0003999113850000131; not reproduced in this text]
holds, frames are cut once per second, the minimum unit, to obtain $T_2$ pictures, and the remaining $k - T_2 - 2$ pictures are filled by data enhancement. If the condition
[Formula image BDA0003999113850000132; not reproduced in this text]
holds, frames are cut at intervals of
[Formula image BDA0003999113850000133; not reproduced in this text]
integer seconds to obtain $k$ pictures. After the frame extraction of the video is completed, embeddings are learned from the obtained pictures according to the picture embedding method described above, finally generating a video attribute semantic embedding feature of dimension K, denoted $E_{video2}$.
In the embodiment of the present invention, the single attribute object sample data may also be other attribute object sample data of the current object. Besides attribute features such as the title, tag, text, picture, and video, the current object also has a unique identifier (ID) and possibly attributes such as publisher and category, which characterize the current object from other dimensions. Such attribute information is generally processed with one-hot or multi-hot encoding. When the attribute of the article takes a single value (for example, the publisher: a current object has only one publisher, although the converse does not hold, since one publisher can publish multiple current objects), one-hot encoding is used. When the attribute of the article is multi-valued (for example, the category: a current object may belong to multiple categories), multi-hot encoding is used. Each kind of data is processed with one-hot or multi-hot encoding according to the mapping between materials and attributes to obtain a vector representation of the other attribute information. This vector representation is then fed into an embedding layer, finally generating an other attribute semantic embedding feature of dimension K, denoted $E_{other2}$.
S260, calculating attribute cross semantic embedding characteristics of the object to be recommended according to the basic attribute semantic embedding characteristics of the object to be recommended and the basic attribute semantic embedding characteristics of the current object.
In the embodiment of the invention, the m basic attribute semantic embedding features of the object to be recommended and the m basic attribute semantic embedding features of the current object are fed as input into an activation network for training and learning, finally generating m K-dimensional attribute cross semantic embedding features, denoted $E_{cross}$.
S270, inputting the attribute cross semantic embedding features and the basic attribute semantic embedding features of the current object into a self-attention network for weight learning to obtain comprehensive semantic embedding features of the current object.
Wherein the self-attention network may be a model for learning the weight of each attribute.
In the embodiment of the invention, each basic attribute semantic embedding feature of the current object and the attribute cross semantic embedding features are fed as input into the self-attention network for weight learning, finally yielding the comprehensive semantic embedding feature of the current object, $E_{total1}$.
S280, inputting the attribute cross semantic embedding features and the basic attribute semantic embedding features of the object to be recommended into a self-attention network for weight learning to obtain comprehensive semantic embedding features of the object to be recommended.
The basic attribute semantic embedding features comprise at least one of title attribute semantic embedding features, label attribute semantic embedding features, text attribute semantic embedding features, picture attribute semantic embedding features, video attribute semantic embedding features and associated attribute semantic embedding features.
In the embodiment of the invention, each basic attribute semantic embedding feature of the object to be recommended and the attribute cross semantic embedding features are fed as input into the self-attention network for weight learning, finally yielding the comprehensive semantic embedding feature of the object to be recommended, $E_{total2}$.
S290, generating a semantic embedding feature matrix of the sample data of the object to be recommended according to the comprehensive semantic embedding feature.
S2100, inputting the semantic embedded features serving as training data into a recommendation model for training.
In the embodiment of the invention, the current object and the object to be recommended in each sample are trained and learned according to the semantic embedding learning method above to obtain, respectively, the comprehensive semantic embedding feature of the current object, $E_{total1}$, and the comprehensive semantic embedding feature of the object to be recommended, $E_{total2}$. Assume that $E_{total1}$ and $E_{total2}$ are both M-dimensional. Concatenating them gives a 2M-dimensional semantic embedding feature for each sample, so N samples yield an N × 2M semantic embedding feature matrix S. The semantic embedding feature matrix S is used as the input data of the input layer of the deep neural network, expressed as follows:
$$h_0 = S$$
where $h_0$ represents the input data of the deep neural network.
Further, the data is fed into the subsequent hidden layers, and an article relevance calculation function is learned through $p$ hidden layers. The hidden layers are denoted $h_1, h_2, \ldots, h_p$ and are defined as follows:
$$h_l = \delta_l\left(W_l^T h_{l-1} + b_l\right), \quad l = 1, 2, \ldots, p$$
where $W_l$ and $b_l$ represent the weight matrix and the bias vector of the $l$-th layer perceptron, respectively, and $\delta_l$ represents the activation function of the $l$-th layer.
Finally, the similarity between the two is calculated as follows:
[Formula image BDA0003999113850000141; not reproduced in this text]
where σ is the sigmoid activation function, whose output lies in (0, 1) and quantifies the degree of similarity between the two, and W represents a weight matrix.
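The forward computation described above can be sketched in plain Python. This is an illustrative sketch only: the function names are hypothetical, ReLU is assumed as the hidden activation, and because the output formula is given only as an image, the output layer is assumed to be a sigmoid applied to a weighted sum of the last hidden layer.

```python
import math


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def build_feature_matrix(pairs):
    """Concatenate (E_total1, E_total2) per sample into rows of an N x 2M matrix S."""
    return [list(e_cur) + list(e_rec) for e_cur, e_rec in pairs]


def dense(weights, bias, inputs, activation):
    """Compute activation(W^T h + b); weights[i][j] links input i to output j."""
    return [
        activation(
            bias[j] + sum(weights[i][j] * inputs[i] for i in range(len(inputs)))
        )
        for j in range(len(bias))
    ]


def similarity(row, hidden_layers, w_out):
    """Score one row of S; hidden_layers is a list of (weights, bias) pairs."""
    h = row  # h_0 is one row of S
    for weights, bias in hidden_layers:
        h = dense(weights, bias, h, relu)
    # ASSUMPTION: the output is sigmoid(W^T h_p); the source formula is an image.
    return sigmoid(sum(w * x for w, x in zip(w_out, h)))
```

In practice the weights would be learned by a deep learning framework; the sketch only shows how the concatenated row flows through the p hidden layers to a score in (0, 1).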
Meanwhile, in the model training stage, the following parameters need to be continuously adjusted for model optimization:
1) L1_regular, L1 regularization, default 0.001;
2) L2_regular, L2 regularization, default 0.001;
3) hidden_layers, number of neurons in each hidden layer of the network, comma-separated, default 64,32;
4) learning_rate, learning rate, default 0.01;
5) embedding_size, embedding dimension size, default 32;
6) activation, activation function, taking one of the values elu, gelu, hard_sigmoid, linear, relu, selu, sigmoid, softmax, softplus, softsign, and swish, with the default selected from among these values;
7) epochs, number of training epochs, default 12;
8) loss, loss function, taking one of the values BinaryCrossentropy, CategoricalCrossentropy, CategoricalHinge, CosineSimilarity, Hinge, Huber, KLDivergence, LogCosh, MeanAbsoluteError, MeanAbsolutePercentageError, MeanSquaredError, MeanSquaredLogarithmicError, Poisson, Reduction, SparseCategoricalCrossentropy, and SquaredHinge, default BinaryCrossentropy;
9) optimizer, optimizer, taking one of the values Adadelta, Adagrad, Adam, Adamax, FTRL, NAdam, and SGD, default Adam;
10) batch_size, number of samples processed per batch, default 64.
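The parameter list above could be captured in a small configuration object. The class and helper names are hypothetical, and field names are lowercased to follow Python conventions. Because the text does not legibly state a default for the activation function, that field is left without a default and must be supplied by the caller.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TrainingConfig:
    """Tunable training parameters with the defaults listed in the text."""
    activation: str                      # no default: not legible in the source
    l1_regular: float = 0.001            # L1 regularization
    l2_regular: float = 0.001            # L2 regularization
    hidden_layers: List[int] = field(default_factory=lambda: [64, 32])
    learning_rate: float = 0.01
    embedding_size: int = 32
    epochs: int = 12
    loss: str = "BinaryCrossentropy"
    optimizer: str = "Adam"
    batch_size: int = 64


def parse_hidden_layers(text):
    """Parse the comma-separated form, e.g. '64,32' -> [64, 32]."""
    return [int(part) for part in text.split(",") if part.strip()]
```

A caller might write `TrainingConfig(activation="relu", hidden_layers=parse_hidden_layers("128,64"))` to override the layer sizes while keeping the other defaults.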
The method comprises the steps of firstly using recommended log data of an object to be recommended, user click behavior data and basic attribute data of the object to be recommended as original sample data, carrying out data preprocessing on the original sample data to obtain preprocessed original sample data, and extracting key sample data according to the preprocessed original sample data to obtain sample data of the object to be recommended. And then carrying out semantic embedding processing on the single-attribute object sample data of each attribute dimension of the sample data of the object to be recommended to obtain basic attribute semantic embedding characteristics of a plurality of single attributes of the object to be recommended, carrying out semantic embedding processing on the single-attribute object data of each attribute dimension of the current object data to obtain basic attribute semantic embedding characteristics of a plurality of single attributes of the current object, and calculating attribute cross semantic embedding characteristics according to each basic attribute semantic embedding characteristic of the object to be recommended and each basic attribute semantic embedding characteristic of the current object. And inputting the attribute cross semantic embedding feature and the basic attribute semantic embedding feature of the current object into a self-attention network for weight learning to obtain the comprehensive semantic embedding feature of the current object. And inputting the attribute cross semantic embedding feature and the basic attribute semantic embedding feature of the object to be recommended into a self-attention network for weight learning to obtain the comprehensive semantic embedding feature of the object to be recommended. 
And finally, generating a semantic embedding feature matrix of the sample data according to the comprehensive semantic embedding features, and inputting the semantic embedding features serving as training data into the recommendation model for training. By performing semantic embedding processing of single attributes and cross attributes on the current object and the object to be recommended, the accuracy of the recommendation model is improved, the comprehensiveness of relevant recommendation results is ensured, and the overall effect of the recommendation results is improved.
Example three
Fig. 3 is a flowchart of a recommendation method provided in a third embodiment of the present invention, where this embodiment may be used in a case of performing recommendation by using a recommendation model obtained by training in the foregoing embodiment, and the method may be executed by a recommendation apparatus, where the apparatus may be implemented by software and/or hardware, and may be generally integrated in an electronic device, where the electronic device may be a terminal device or a server device, and the embodiment of the present invention does not limit a specific device type of the electronic device. Correspondingly, as shown in fig. 3, the method of the present embodiment may include:
S310, obtaining the current object associated data of the current object and the alternative object associated data of the alternative recommended object.
S320, inputting the current object associated data of the current object and the alternative object associated data of the alternative recommended objects into a recommendation model to obtain the similarity of each alternative recommended object and the current object.
The current object may be the object the user is browsing in the current state. The alternative recommended object may be an object available for recommendation to the user. The current object associated data may be data related to the object the user is currently browsing, including but not limited to recommendation log data of the current object, user click behavior data, and basic attribute data of the current object. The alternative object associated data may be data related to the alternative recommended object, including but not limited to recommendation log data of the alternative recommended object, user click behavior data, and basic attribute data of the alternative recommended object. The similarity may be used to indicate the degree of association between the alternative recommended object and the current object.
S330, determining a target recommended object matched with the current object according to the similarity between each candidate recommended object and the current object.
The recommendation model is obtained by training through the recommendation model training method in any embodiment of the invention.
In this embodiment of the present invention, determining a target recommended object matched with the current object according to the similarity between each candidate recommended object and the current object may further include: sequencing the alternative recommended objects according to the sequence of the similarity between the alternative recommended objects and the current object from high to low; and selecting a set number of candidate recommendation objects as the target recommendation objects according to the sequencing result of each candidate recommendation object.
Wherein, the target recommendation object may be an object with higher similarity to the current object.
Specifically, the similarity between the alternative recommended objects and the current object is obtained through calculation, the similarity between each alternative recommended object and the current object is ranked in a descending order, and Top M alternative recommended objects are selected as target recommended objects according to ranking results.
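The ranking and Top M selection can be sketched in a few lines. The function name and the dictionary input (candidate identifier mapped to its similarity score) are assumptions made for illustration.

```python
def top_m(similarities, m):
    """Return the m candidate IDs with the highest similarity, best first."""
    ranked = sorted(similarities.items(), key=lambda kv: kv[1], reverse=True)
    return [item_id for item_id, _ in ranked[:m]]
```

For instance, with scores `{"a": 0.2, "b": 0.9, "c": 0.5}` and M = 2, the selected target recommended objects are `b` and `c`, in that order.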
In a specific embodiment, the item recommendation is specifically described as an application scenario. Firstly, comprehensively considering multi-dimensional multi-modal attribute data of an article, and selecting a proper embedding generation model to obtain low-dimensional and dense semantic embedding representation, thereby improving the expression capability of the features; and a deep neural network is selected for model training and prediction, and the generalization capability of a prediction model is improved, so that the recommendation method can be widely applied to more recommendation scenes. Secondly, for differences in importance of multiple labels, multiple pictures, multiple features and the like, learning importance weights by introducing a self-attention mechanism, optimizing by combining model targets, and automatically learning to obtain the optimal combination of the importance weights. Furthermore, semantic understanding and embedding generation are respectively carried out on multi-mode attributes of the article, in the aspect of text characteristics, BERT, LDA and Word2Vec are selected for differentiation processing based on the consideration of length difference, and compared with a single model for processing all text data, the method is more comprehensive in relevance calculation and better in effect. Meanwhile, consideration on picture and video information is added, so that not only is the effect improved, but also the application scene of the recommendation method is expanded. 
Finally, in the aspect of feature processing, in consideration of the improvement effect of the cross combination of multiple features on the model effect, an independent activation network is introduced to learn the multiple cross features, semantic embedded expression is generated, and the semantic embedded expression is input into a deep neural network, so that the feature learning capability of the model is further improved, and the comprehensiveness of calculation of related recommendation results is improved.
Fig. 4 is an overall schematic diagram of a recommendation method according to a third embodiment of the present invention, as shown in fig. 4, first, a current article and an article to be recommended respectively perform embedding learning of basic attributes through an embedding layer, so as to obtain a title attribute embedding feature, a tag attribute embedding feature, a text attribute embedding feature, a picture attribute embedding feature, a video attribute embedding feature, and an associated attribute embedding feature associated with the current article and the article to be recommended; performing cross attribute embedding learning on the basic attribute to obtain a cross attribute embedding characteristic; the title attribute embedding feature, the tag attribute embedding feature, the text attribute embedding feature, the picture attribute embedding feature, the video attribute embedding feature and the associated attribute embedding feature of the current article are sent to a self-attention network for training, the importance weight of multiple features is obtained through learning, and the similarity between the current article and the article to be recommended is finally obtained.
Fig. 5 is a schematic diagram of a recommendation module according to a third embodiment of the present invention, and as shown in fig. 5, first, relevant recommendation log data, user click behavior data, and basic attribute data of an article are sent to a data preprocessing module to implement preprocessing operation on original data; then sending the data to a semantic embedding generation module to generate semantic embedding characteristics in the data; then, the model is sent to a model training and predicting module to realize the training of a recommendation model; and finally, sending the data to an associated recommendation result generation module to complete similarity calculation between the two articles.
According to the technical scheme of the embodiment of the invention, sample data of an object to be recommended and current object data of the current object are obtained, wherein the sample data of the object to be recommended comprises multi-attribute dimension information of the object to be recommended, semantic embedding processing is carried out on the sample data of the object to be recommended and the current object data to obtain comprehensive semantic embedding characteristics, a semantic embedding characteristic matrix of the sample data is generated according to the comprehensive semantic embedding characteristics, and the semantic embedding characteristics are input into a recommendation model as training data to be trained. Correspondingly, after the training of the recommendation model is completed, current object association data of the current object and alternative object association data of the alternative recommendation objects are obtained, the current object association data of the current object and the alternative object association data of the alternative recommendation objects are input into the recommendation model, the similarity between each alternative recommendation object and the current object is obtained, and finally the target recommendation object matched with the current object is determined according to the similarity between each alternative recommendation object and the current object. 
According to the technical scheme of the embodiment of the invention, comprehensive semantic embedding learning representation is carried out on the sample data of the object to be recommended and the current object data of the current object, so that the expression capability of the characteristics is greatly expanded, the problems of single text characteristics and no consideration of text length difference in the prior art are solved, the accuracy of a recommendation model is improved, the comprehensiveness of relevant recommendation results is ensured, and the overall effect of the recommendation results is improved.
Example four
Fig. 6 is a schematic structural diagram of a recommendation model training apparatus according to a fourth embodiment of the present invention. As shown in fig. 6, the apparatus includes: a to-be-recommended object sample data acquisition module 410, a semantic embedding processing module 420, a semantic embedding feature matrix generation module 430, and a recommendation model training module 440, wherein:
The to-be-recommended object sample data acquisition module 410 is configured to obtain sample data of the object to be recommended, which includes multi-attribute dimension information of the object to be recommended, and current object data of the current object.
A semantic embedding processing module 420, configured to perform semantic embedding processing on the sample data of the object to be recommended and the current object data to obtain a comprehensive semantic embedding feature; the comprehensive semantic embedded features are composed of basic attribute semantic embedded features of a plurality of single attributes and attribute cross semantic embedded features.
And a semantic embedded feature matrix generating module 430, configured to generate a semantic embedded feature matrix of the sample data according to the integrated semantic embedded feature.
And the recommendation model training module 440 is configured to input the semantic embedded features as training data to a recommendation model for training.
In the technical solution of this embodiment of the invention, sample data of an object to be recommended, which includes multi-attribute dimension information of the object to be recommended, and current object data of a current object are first acquired. Semantic embedding processing is then performed on the sample data of the object to be recommended and the current object data to obtain comprehensive semantic embedding features, a semantic embedding feature matrix of the sample data is generated from the comprehensive semantic embedding features, and the semantic embedding features are input into a recommendation model as training data for training. Because a comprehensive semantic embedding representation is learned for the sample data of the object to be recommended and the current object data of the current object, the expressive power of the features is greatly expanded, and two shortcomings of the prior art are addressed, namely reliance on a single text feature and failure to account for differences in text length. As a result, the accuracy of the recommendation model is improved, the comprehensiveness of the related recommendation results is ensured, and the overall quality of the recommendation results is improved.
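As an illustration only, the matrix generation step handled by module 430 might be sketched as follows. The sketch assumes that each row of the matrix is the comprehensive embedding of the current object spliced with the comprehensive embedding of one object to be recommended, and that a click indicator serves as the training label. The function name `build_feature_matrix` and the label convention are hypothetical and are not taken from this embodiment.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def build_feature_matrix(
    current_embedding: Sequence[float],
    candidate_embeddings: Sequence[Sequence[float]],
    clicked: Sequence[int],
) -> Tuple[List[Vector], List[int]]:
    """Splice the current-object embedding with each candidate embedding.

    Each row is [current_embedding | candidate_embedding]. The label is an
    assumed click indicator (1 = clicked, 0 = not clicked) for that pair.
    """
    if len(candidate_embeddings) != len(clicked):
        raise ValueError("one label is required per candidate")
    rows = [list(current_embedding) + list(c) for c in candidate_embeddings]
    return rows, list(clicked)


# Toy 2-dimensional embeddings, for illustration only.
current = [0.1, 0.2]
candidates = [[0.3, 0.4], [0.5, 0.6]]
matrix, labels = build_feature_matrix(current, candidates, clicked=[1, 0])
```

The resulting rows and labels could then be passed to whatever recommendation model is being trained; the model itself is outside the scope of this sketch.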
Optionally, the to-be-recommended object sample data acquisition module 410 is specifically configured to: take recommendation log data of the object to be recommended, user click behavior data, and basic attribute data of the object to be recommended as original sample data; preprocess the original sample data to obtain preprocessed original sample data; and extract key sample data from the preprocessed original sample data to obtain the sample data of the object to be recommended.
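A minimal sketch of these three steps is given below, under stated assumptions: the recommendation log is taken to be a list of (current object, recommended object) impressions, preprocessing is reduced to dropping duplicate pairs and pairs with missing attribute data, and key sample extraction keeps only a title field and a click label. The data layout and the name `build_samples` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Pair = Tuple[str, str]  # (current object id, recommended object id)


def build_samples(
    recommendation_log: List[Pair],
    clicks: Set[Pair],
    attributes: Dict[str, Dict[str, str]],
) -> List[dict]:
    """Combine log, click, and attribute data into labeled samples."""
    samples = []
    seen: Set[Pair] = set()
    for current_id, rec_id in recommendation_log:
        pair = (current_id, rec_id)
        # Preprocessing (assumed): skip duplicates and incomplete records.
        if pair in seen or current_id not in attributes or rec_id not in attributes:
            continue
        seen.add(pair)
        # Key sample extraction (assumed): keep the title and the click label.
        samples.append({
            "current": current_id,
            "candidate": rec_id,
            "candidate_title": attributes[rec_id].get("title", ""),
            "label": 1 if pair in clicks else 0,
        })
    return samples


log = [("a", "b"), ("a", "b"), ("a", "c"), ("a", "z")]
clicked_pairs = {("a", "b")}
item_attributes = {"a": {"title": "A"}, "b": {"title": "B"}, "c": {"title": "C"}}
samples = build_samples(log, clicked_pairs, item_attributes)
```

In this toy input the repeated impression and the pair whose attribute data is missing are both discarded, leaving one clicked and one unclicked sample.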
Optionally, the semantic embedding processing module 420 is specifically configured to: perform semantic embedding processing on the single-attribute object sample data of each attribute dimension of the sample data of the object to be recommended to obtain basic attribute semantic embedding features of a plurality of single attributes of the object to be recommended; perform semantic embedding processing on the single-attribute object data of each attribute dimension of the current object data to obtain basic attribute semantic embedding features of a plurality of single attributes of the current object; calculate attribute cross semantic embedding features according to the basic attribute semantic embedding features of the object to be recommended and the basic attribute semantic embedding features of the current object; input the attribute cross semantic embedding features and the basic attribute semantic embedding features of the current object into a self-attention network for weight learning to obtain comprehensive semantic embedding features of the current object; and input the attribute cross semantic embedding features and the basic attribute semantic embedding features of the object to be recommended into a self-attention network for weight learning to obtain comprehensive semantic embedding features of the object to be recommended. The basic attribute semantic embedding features comprise at least one of title attribute semantic embedding features, label attribute semantic embedding features, text attribute semantic embedding features, picture attribute semantic embedding features, video attribute semantic embedding features, and associated attribute semantic embedding features.
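The following sketch illustrates one possible reading of the cross-feature and weight-learning steps. It is not the network described in this embodiment: the crossing operator is assumed to be an element-wise product of matching attribute embeddings, the self-attention is an unparameterized scaled dot-product attention with no learned projections, and the attended vectors are mean-pooled into a single comprehensive embedding. All function names are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def cross_features(candidate_attrs: List[Vector], current_attrs: List[Vector]) -> List[Vector]:
    """Assumed crossing operator: element-wise product per attribute."""
    return [[x * y for x, y in zip(c, u)] for c, u in zip(candidate_attrs, current_attrs)]


def self_attention_pool(features: List[Vector]) -> Vector:
    """Scaled dot-product self-attention (no learned weights), then mean pooling."""
    dim = len(features[0])
    scale = math.sqrt(dim)
    attended = []
    for query in features:
        weights = softmax([dot(query, key) / scale for key in features])
        attended.append([sum(w * v[i] for w, v in zip(weights, features)) for i in range(dim)])
    return [sum(row[i] for row in attended) / len(attended) for i in range(dim)]


# Toy basic attribute embeddings (e.g. title and label), for illustration only.
candidate_attrs = [[1.0, 0.0], [0.0, 1.0]]
current_attrs = [[0.5, 0.5], [1.0, 0.0]]
cross = cross_features(candidate_attrs, current_attrs)
comprehensive = self_attention_pool(candidate_attrs + cross)
```

A trained self-attention network would learn query, key, and value projections so that the attribute weights reflect the training data; the fixed attention used here only shows how basic and cross features can be combined into one vector.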
Optionally, the object to be recommended includes an item to be recommended.
The recommendation model training device can execute the recommendation model training method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method. For technical details that are not described in detail in this embodiment, reference may be made to a recommendation model training method provided in any embodiment of the present invention.
Example five
Fig. 7 is a schematic structural diagram of a recommendation apparatus according to a fifth embodiment of the present invention, and as shown in fig. 7, the recommendation apparatus includes: an object associated data obtaining module 510, a similarity obtaining module 520 and a target recommended object determining module 530, wherein:
The object associated data acquisition module 510 is configured to acquire current object associated data of a current object and candidate object associated data of candidate recommended objects.
The similarity obtaining module 520 is configured to input the current object associated data of the current object and the candidate object associated data of the candidate recommended objects into a recommendation model to obtain the similarity between each candidate recommended object and the current object.
The target recommended object determination module 530 is configured to determine a target recommended object matching the current object according to the similarity between each candidate recommended object and the current object; the recommendation model is obtained by training with the recommendation model training method provided by the embodiments of the invention.
Optionally, the target recommended object determination module 530 is specifically configured to: sort the candidate recommended objects in descending order of their similarity to the current object; and select a set number of candidate recommended objects as the target recommended objects according to the sorting result.
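The sorting and selection step can be sketched as follows, assuming the recommendation model has already produced one similarity score per candidate. The function name `select_targets`, the identifiers, and the scores are illustrative only.

```python
from typing import Dict, List


def select_targets(similarities: Dict[str, float], set_number: int) -> List[str]:
    """Return the ids of the `set_number` candidates most similar to the current object."""
    if set_number < 0:
        raise ValueError("set_number must be non-negative")
    # Sort candidate ids by similarity, highest first.
    ranked = sorted(similarities, key=lambda cid: similarities[cid], reverse=True)
    return ranked[:set_number]


scores = {"item_a": 0.42, "item_b": 0.91, "item_c": 0.67}
targets = select_targets(scores, set_number=2)
```

If the set number exceeds the number of candidates, the slice simply returns every candidate in ranked order.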
In the technical solution of this embodiment of the invention, sample data of an object to be recommended, which includes multi-attribute dimension information of the object to be recommended, and current object data of a current object are first acquired. Semantic embedding processing is then performed on the sample data of the object to be recommended and the current object data to obtain comprehensive semantic embedding features, a semantic embedding feature matrix of the sample data is generated from the comprehensive semantic embedding features, and the semantic embedding features are input into a recommendation model as training data for training. Correspondingly, after training of the recommendation model is complete, current object associated data of the current object and candidate object associated data of the candidate recommended objects are acquired and input into the recommendation model to obtain the similarity between each candidate recommended object and the current object. Finally, a target recommended object matching the current object is determined according to these similarities.
In the technical solution of this embodiment of the invention, a comprehensive semantic embedding representation is learned for the sample data of the object to be recommended and the current object data of the current object. This greatly expands the expressive power of the features and addresses two shortcomings of the prior art, namely reliance on a single text feature and failure to account for differences in text length. As a result, the accuracy of the recommendation model is improved, the comprehensiveness of the related recommendation results is ensured, and the overall quality of the recommendation results is improved.
The recommendation device can execute the recommendation method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method. For technical details that are not described in detail in this embodiment, reference may be made to the recommendation method provided in any embodiment of the present invention.
Example six
FIG. 8 illustrates a schematic diagram of an electronic device 10 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 8, the electronic device 10 includes at least one processor 11, and a memory communicatively connected to the at least one processor 11, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, and the like, wherein the memory stores a computer program executable by the at least one processor, and the processor 11 can perform various suitable actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from a storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data necessary for the operation of the electronic apparatus 10 can also be stored. The processor 11, the ROM 12, and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
A number of components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, or the like; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
Processor 11 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, or the like. The processor 11 performs the various methods and processes described above, such as the recommendation model training method or recommendation method.
In some embodiments, the recommendation model training method or recommendation method may be implemented as a computer program tangibly embodied in a computer-readable storage medium, such as storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into the RAM 13 and executed by the processor 11, one or more steps of the recommendation model training method or recommendation method described above may be performed. Alternatively, in other embodiments, the processor 11 may be configured to perform the recommendation model training method or recommendation method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, and which receives data and instructions from, and transmits data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for implementing the methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be performed. A computer program can execute entirely on a machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. A computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user may provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user can be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The client-server relationship arises by virtue of computer programs that run on the respective computers and have a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or cloud host, which is a host product in a cloud computing service system and overcomes the drawbacks of high management difficulty and weak service scalability found in traditional physical hosts and VPS services.

Claims (10)

1. A method for training a recommendation model, comprising:
acquiring sample data of an object to be recommended including multi-attribute dimensional information of the object to be recommended and current object data of a current object;
performing semantic embedding processing on the sample data of the object to be recommended and the current object data to obtain comprehensive semantic embedding characteristics; the comprehensive semantic embedded features consist of basic attribute semantic embedded features and attribute cross semantic embedded features of a plurality of single attributes;
generating a semantic embedding feature matrix of sample data according to the comprehensive semantic embedding feature, wherein the sample data is formed by splicing the sample data of the object to be recommended and the sample data of the current object;
and inputting the semantic embedded features serving as training data into a recommendation model for training.
2. The method according to claim 1, wherein the obtaining sample data of the object to be recommended including multi-attribute dimensional information of the object to be recommended comprises:
taking recommendation log data of an object to be recommended, user click behavior data and basic attribute data of the object to be recommended as original sample data;
performing data preprocessing on the original sample data to obtain preprocessed original sample data;
and extracting key sample data according to the preprocessed original sample data to obtain the sample data of the object to be recommended.
3. The method according to claim 1, wherein performing semantic embedding processing on the sample data of the object to be recommended and the current object data to obtain the comprehensive semantic embedding features comprises:
performing semantic embedding processing on the single attribute object sample data of each attribute dimension of the object sample data to be recommended to obtain basic attribute semantic embedding characteristics of a plurality of single attributes of the object to be recommended;
performing semantic embedding processing on the single attribute object data of each attribute dimension of the current object data to obtain basic attribute semantic embedding characteristics of a plurality of single attributes of the current object;
calculating attribute cross semantic embedding characteristics according to the basic attribute semantic embedding characteristics of the object to be recommended and the basic attribute semantic embedding characteristics of the current object;
inputting the attribute cross semantic embedding feature and the basic attribute semantic embedding feature of the current object into a self-attention network for weight learning to obtain a comprehensive semantic embedding feature of the current object;
inputting the attribute cross semantic embedding feature and the basic attribute semantic embedding feature of the object to be recommended into a self-attention network for weight learning to obtain a comprehensive semantic embedding feature of the object to be recommended;
the basic attribute semantic embedding features comprise at least one of title attribute semantic embedding features, label attribute semantic embedding features, text attribute semantic embedding features, picture attribute semantic embedding features, video attribute semantic embedding features and associated attribute semantic embedding features.
4. The method according to any one of claims 1 to 3, wherein the object to be recommended comprises an item to be recommended.
5. A recommendation method, comprising:
acquiring current object associated data of a current object and alternative object associated data of an alternative recommended object;
inputting the current object associated data of the current object and the alternative object associated data of the alternative recommended objects into a recommendation model to obtain the similarity between each alternative recommended object and the current object;
determining a target recommendation object matched with the current object according to the similarity of each candidate recommendation object and the current object;
wherein the recommendation model is trained by the recommendation model training method of any one of claims 1-4.
6. The method of claim 5, wherein the determining the target recommended object matching the current object according to the similarity between each of the candidate recommended objects and the current object comprises:
sorting the candidate recommended objects in descending order of their similarity to the current object;
and selecting a set number of candidate recommended objects as the target recommended objects according to the sorting result.
7. A recommendation model training apparatus, comprising:
a to-be-recommended object sample data acquisition module, configured to acquire sample data of an object to be recommended, which includes multi-attribute dimension information of the object to be recommended, and current object data of a current object;
the semantic embedding processing module is used for carrying out semantic embedding processing on the sample data of the object to be recommended and the current object data to obtain comprehensive semantic embedding characteristics; the comprehensive semantic embedded features consist of basic attribute semantic embedded features and attribute cross semantic embedded features of a plurality of single attributes;
the semantic embedded characteristic matrix generating module is used for generating a semantic embedded characteristic matrix of the sample data of the object to be recommended according to the comprehensive semantic embedded characteristic;
and the recommendation model training module is used for inputting the semantic embedded features as training data into a recommendation model for training.
8. A recommendation device, comprising:
the object associated data acquisition module is used for acquiring current object associated data of a current object and alternative object associated data of alternative recommended objects;
a similarity obtaining module, configured to input current object association data of the current object and candidate object association data of the candidate recommended objects into a recommendation model, so as to obtain a similarity between each of the candidate recommended objects and the current object;
the target recommended object determining module is used for determining a target recommended object matched with the current object according to the similarity between each candidate recommended object and the current object;
wherein the recommendation model is trained by the recommendation model training method of any one of claims 1-4.
9. An electronic device, characterized in that the electronic device comprises:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the recommendation model training method of any of claims 1-4 or to implement the recommendation method of any of claims 5-6.
10. A computer storage medium, characterized in that the computer-readable storage medium stores computer instructions for causing a processor, when executed, to implement the recommendation model training method of any of claims 1-4, or to implement the recommendation method of any of claims 5-6.
CN202211609979.3A 2022-12-14 2022-12-14 Recommendation model training method and device, recommendation method and device, electronic equipment and storage medium Pending CN115982425A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211609979.3A CN115982425A (en) 2022-12-14 2022-12-14 Recommendation model training method and device, recommendation method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211609979.3A CN115982425A (en) 2022-12-14 2022-12-14 Recommendation model training method and device, recommendation method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115982425A (en) 2023-04-18

Family

ID=85975117

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211609979.3A Pending CN115982425A (en) 2022-12-14 2022-12-14 Recommendation model training method and device, recommendation method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115982425A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination