CN113792163B

CN113792163B - Multimedia recommendation method and device, electronic equipment and storage medium

Info

Publication number: CN113792163B
Application number: CN202110908672.2A
Authority: CN
Inventors: 刘艺语; 牛亚男; 王昶平; 宋洋; 田雨
Original assignee: Beijing Dajia Internet Information Technology Co Ltd
Current assignee: Beijing Dajia Internet Information Technology Co Ltd
Priority date: 2021-08-09
Filing date: 2021-08-09
Publication date: 2022-12-27
Anticipated expiration: 2041-08-09
Also published as: CN113792163A

Abstract

The disclosure relates to a multimedia recommendation method and apparatus, an electronic device, and a storage medium. The method comprises: acquiring an association graph; determining respective multimedia features of a plurality of multimedia to be recommended according to first keywords that have an association relationship with the plurality of multimedia to be recommended in the association graph; pruning the association graph to obtain a target subgraph, the target subgraph comprising a target object, the plurality of multimedia to be recommended, target historical multimedia, and target keywords; determining a first object feature of the target object based on the target historical multimedia and the target keywords; determining recommendation information for the plurality of multimedia to be recommended according to the multimedia features and the first object feature; and recommending the plurality of multimedia to be recommended to the target object based on the recommendation information. The technical solution provided by the disclosure can improve the precision and efficiency of multimedia recommendation.

Description

Multimedia recommendation method and device, electronic equipment and storage medium

Technical Field

The present disclosure relates to the field of internet application technologies, and in particular, to a multimedia recommendation method and apparatus, an electronic device, and a storage medium.

Background

In the related art, to improve recommendation accuracy, a multi-modal data fusion method is generally used to extract multimedia content in real time so that the multimedia can be understood accurately. However, real-time extraction places heavy processing pressure on the server, resulting in low recommendation efficiency and poor extensibility to other application scenarios.

Disclosure of Invention

The present disclosure provides a multimedia recommendation method, apparatus, electronic device and storage medium, so as to at least solve the problem of how to recommend multimedia accurately and efficiently in the related art. The technical scheme of the disclosure is as follows:

According to a first aspect of an embodiment of the present disclosure, there is provided a multimedia recommendation method, including:

acquiring an association graph, wherein the association graph comprises three types of nodes, namely a target object, a plurality of multimedia and a plurality of keywords, and edges between adjacent nodes; the plurality of multimedia comprises a plurality of multimedia to be recommended and a plurality of historical multimedia on which the target object has performed a preset behavior; and each edge represents an association relationship between the adjacent nodes it connects;

determining respective multimedia features of the plurality of multimedia to be recommended according to first keywords that have an association relationship with the plurality of multimedia to be recommended in the association graph;

pruning the association graph to obtain a target subgraph, wherein the target subgraph comprises the target object, the plurality of multimedia to be recommended, target historical multimedia and target keywords, the target historical multimedia is at least one of the plurality of historical multimedia, and the target keywords are at least one of the plurality of keywords;

determining a first object feature of the target object based on the target historical multimedia and the target keywords;

determining recommendation information for the plurality of multimedia to be recommended according to the multimedia features and the first object feature;

and recommending the plurality of multimedia to be recommended to the target object based on the recommendation information.

In a possible implementation, determining the respective multimedia features of the plurality of multimedia to be recommended according to the first keywords that have an association relationship with the plurality of multimedia to be recommended in the association graph includes:

inputting the plurality of multimedia to be recommended and the first keywords into a first attention network to obtain first weight information of the first keywords;

and weighting the first keywords corresponding to each of the plurality of multimedia to be recommended according to the first weight information to obtain the multimedia features of the plurality of multimedia to be recommended.

In a possible implementation manner, the association degree of the target historical multimedia and the target object meets a first preset condition; and the association degree of the target keyword and the target historical multimedia meets a second preset condition.

In a possible implementation, pruning the association graph to obtain the target subgraph includes:

determining first relevance information of each of the plurality of historical multimedia with respect to the target object, and second relevance information of second keywords with respect to the plurality of historical multimedia, the second keywords being keywords that have an association relationship with the plurality of historical multimedia;

and pruning edges and/or nodes in the association graph based on the first relevance information and the second relevance information to obtain the target subgraph.

In a possible implementation, determining the first relevance information of each of the plurality of historical multimedia with respect to the target object and the second relevance information of the second keywords with respect to the plurality of historical multimedia includes:

determining a second object feature of the target object based on the plurality of historical multimedia and the second keywords;

inputting the plurality of historical multimedia and the second object feature into a first relevance prediction model, and predicting the relevance between the target object and the plurality of historical multimedia, to obtain the first relevance information and intermediate feature information output by a gated recurrent unit (GRU) in the first relevance prediction model;

and inputting the plurality of historical multimedia, the second keywords and the intermediate feature information into a second relevance prediction model, and predicting the relevance between the plurality of historical multimedia and the second keywords, to obtain the second relevance information.

In one possible implementation, the determining a second object feature of the target object based on the plurality of historical multimedia and the second keyword comprises:

inputting the plurality of historical multimedia and the second keyword into a second attention network to obtain second weight information of the second keyword;

inputting the target object and the plurality of historical multimedia into a third attention network to obtain third weight information of the plurality of historical multimedia;

and determining the second object feature according to the second keyword, the second weight information and the third weight information.

In a possible implementation, acquiring the association graph includes:

acquiring the plurality of multimedia to be recommended, the plurality of historical multimedia on which the target object has performed the preset behavior, and first text information and second text information respectively associated with the plurality of multimedia to be recommended and the plurality of historical multimedia;

performing word segmentation processing on the first text information and the second text information respectively to obtain a plurality of first initial keywords and a plurality of second initial keywords;

determining first importance information of the plurality of first initial keywords for the plurality of multimedia to be recommended and second importance information of the plurality of second initial keywords for the plurality of historical multimedia;

selecting the plurality of keywords from the plurality of first initial keywords and the plurality of second initial keywords based on the first importance information and the second importance information;

and constructing the association graph based on the association relationships among the target object, the plurality of historical multimedia, the plurality of multimedia to be recommended and the plurality of keywords.
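The keyword screening steps above can be illustrated with a minimal sketch. The disclosure does not name the word segmentation method or the importance measure, so the sketch assumes whitespace splitting in place of word segmentation and a smoothed TF-IDF score in place of the importance information. The function name `select_keywords` and the parameter `top_k` are hypothetical.

```python
import math
from collections import Counter


def select_keywords(texts, top_k=2):
    """Return, for each multimedia id, its top_k tokens by a TF-IDF style score.

    texts: dict mapping a multimedia id to its associated text.
    Assumptions (not from the disclosure): whitespace tokenization and
    smoothed TF-IDF as the importance measure.
    """
    tokens = {mid: text.lower().split() for mid, text in texts.items()}
    n_docs = len(tokens)
    # Document frequency: number of texts in which each token occurs.
    doc_freq = Counter(tok for toks in tokens.values() for tok in set(toks))

    selected = {}
    for mid, toks in tokens.items():
        counts = Counter(toks)
        scores = {
            tok: (cnt / len(toks))
            * (math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1)
            for tok, cnt in counts.items()
        }
        # Sort by descending score; break ties alphabetically for determinism.
        ranked = sorted(scores, key=lambda tok: (-scores[tok], tok))
        selected[mid] = ranked[:top_k]
    return selected


texts = {
    "v1": "cat video funny cat",
    "v2": "dog video funny",
    "v3": "cooking video recipe",
}
keywords = select_keywords(texts)
```

In this sketch the same function could be applied separately to the text of the multimedia to be recommended and to the text of the historical multimedia, and the union of the results used as the keyword nodes.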

In one possible implementation, the method further includes:

acquiring a sample association graph and label information, wherein the sample association graph comprises three types of sample nodes, namely a plurality of sample objects, a plurality of sample multimedia and a plurality of sample words, and edges between adjacent sample nodes; the sample multimedia are multimedia on which the sample objects have performed a preset behavior; each edge between adjacent sample nodes represents an association relationship between those sample nodes; and the label information represents the degree of association between the sample multimedia and the sample objects;

determining sample object characteristics of each of the plurality of sample objects based on the plurality of sample multimedia and the sample words;

inputting the sample multimedia and the sample object characteristics into a first preset model, and performing prediction processing on the association degree between the sample object and the sample multimedia to obtain sample prediction information;

repeating the inputting step a preset number of times, and performing statistical processing on the sample prediction information obtained over the preset number of times to obtain target sample prediction information;

determining loss information according to the target sample prediction information and the label information;

and training the first preset model based on the loss information to obtain the first relevance prediction model.
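The repeat-and-aggregate part of the training steps above can be sketched as follows. The disclosure does not state why repeated passes differ, which statistic is computed, or which loss is used. This sketch assumes a toy logistic scorer with random feature dropout as the source of variation, the mean as the statistic, and binary cross-entropy as the loss. All function names are hypothetical, and the parameter update itself is omitted.

```python
import math
import random


def predict_once(weights, features, drop_prob, rng):
    """One stochastic forward pass of a toy logistic scorer (assumed model)."""
    total = 0.0
    for w, x in zip(weights, features):
        if rng.random() >= drop_prob:  # keep this feature
            total += w * x / (1.0 - drop_prob)  # inverted-dropout scaling
    return 1.0 / (1.0 + math.exp(-total))


def averaged_prediction(weights, features, repeats, drop_prob=0.2, seed=0):
    """Repeat the forward pass `repeats` times and average the predictions."""
    rng = random.Random(seed)
    preds = [predict_once(weights, features, drop_prob, rng)
             for _ in range(repeats)]
    return sum(preds) / len(preds)


def bce_loss(pred, label, eps=1e-12):
    """Binary cross-entropy between an averaged prediction and a 0/1 label."""
    pred = min(max(pred, eps), 1.0 - eps)
    return -(label * math.log(pred) + (1 - label) * math.log(1.0 - pred))


target_pred = averaged_prediction([0.5, -0.3], [1.0, 2.0], repeats=10)
loss = bce_loss(target_pred, label=1)
```

Averaging several stochastic passes gives a prediction that is less sensitive to any single pass; other statistics, such as the median, would fit the same step.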

According to a second aspect of an embodiment of the present disclosure, there is provided a multimedia recommendation apparatus including:

an association graph obtaining module configured to obtain an association graph, wherein the association graph comprises three types of nodes, namely a target object, a plurality of multimedia and a plurality of keywords, and edges between adjacent nodes; the plurality of multimedia comprises a plurality of multimedia to be recommended and a plurality of historical multimedia on which the target object has performed a preset behavior; and each edge represents an association relationship between the adjacent nodes it connects;

the multimedia characteristic determining module is configured to determine respective multimedia characteristics of the multiple multimedia to be recommended according to a first keyword which has an incidence relation with the multiple multimedia to be recommended in the incidence relation graph;

the pruning module is configured to perform pruning processing on the incidence relation graph to obtain a target subgraph; the target sub-graph comprises the target object, the multiple multimedia to be recommended, target historical multimedia and target keywords, the target historical multimedia is at least one of the multiple historical multimedia, and the target keywords are at least one of the multiple keywords;

a first object feature determination module configured to perform determining a first object feature of the target object based on the target historical multimedia and the target keyword;

a recommendation information determination module configured to determine recommendation information of the plurality of multimedia to be recommended according to the multimedia feature and the first object feature;

a recommending module configured to recommend the plurality of multimedia to be recommended to the target object based on the recommending information.

In one possible implementation, the multimedia feature determination module includes:

a first weight information obtaining unit configured to perform input of the plurality of multimedia to be recommended and the first keyword into a first attention network to obtain first weight information of the first keyword;

and the multimedia characteristic determining unit is configured to perform weighting processing on the first keywords corresponding to the multiple to-be-recommended multimedia according to the first weight information to obtain the multimedia characteristics of the multiple to-be-recommended multimedia.

In a possible implementation manner, the association degree of the target historical multimedia and the target object meets a first preset condition; and the relevance of the target keyword and the target historical multimedia meets a second preset condition.

In one possible implementation, the pruning module includes:

a relevance determining unit configured to determine first relevance information of each of the plurality of historical multimedia with respect to the target object, and second relevance information of second keywords with respect to the plurality of historical multimedia, the second keywords being keywords that have an association relationship with the plurality of historical multimedia;

and a target subgraph obtaining unit configured to prune edges and/or nodes in the association graph based on the first relevance information and the second relevance information to obtain the target subgraph.

In one possible implementation, the relevance determining unit includes:

a second object feature determination subunit configured to determine a second object feature of the target object based on the plurality of historical multimedia and the second keywords;

a first relevance information and intermediate feature information obtaining subunit configured to input the plurality of historical multimedia and the second object feature into a first relevance prediction model and predict the relevance between the target object and the plurality of historical multimedia, to obtain the first relevance information and intermediate feature information output by a gated recurrent unit (GRU) in the first relevance prediction model;

and a second relevance information obtaining subunit configured to input the plurality of historical multimedia, the second keywords and the intermediate feature information into a second relevance prediction model and predict the relevance between the plurality of historical multimedia and the second keywords, to obtain the second relevance information.

In one possible implementation manner, the second object feature determination subunit includes:

a second weight information obtaining subunit, configured to perform input of the plurality of historical multimedia and the second keyword into a second attention network, to obtain second weight information of the second keyword;

a third weight information obtaining subunit, configured to perform inputting the target object and the plurality of historical multimedia into a third attention network, to obtain third weight information of the plurality of historical multimedia;

a second object feature obtaining subunit configured to perform determining the second object feature according to the second keyword, the second weight information, and the third weight information.

In a possible implementation manner, the association relationship graph obtaining module includes:

a node information obtaining unit configured to obtain the plurality of multimedia to be recommended, the plurality of historical multimedia on which the target object has performed the preset behavior, and first text information and second text information respectively associated with the plurality of multimedia to be recommended and the plurality of historical multimedia;

a word segmentation processing unit configured to perform word segmentation processing on the first text information and the second text information respectively to obtain a plurality of first initial keywords and a plurality of second initial keywords;

an importance information determining unit configured to perform determining first importance information of the plurality of first initial keywords for the plurality of multimedia to be recommended and second importance information of the plurality of second initial keywords for the plurality of historical multimedia;

a keyword screening unit configured to select the plurality of keywords from the plurality of first initial keywords and the plurality of second initial keywords based on the first importance information and the second importance information;

and an association graph building unit configured to construct the association graph based on the association relationships among the target object, the plurality of historical multimedia, the plurality of multimedia to be recommended and the plurality of keywords.

In one possible implementation, the apparatus further includes:

a sample data acquisition module configured to acquire a sample association graph and label information, wherein the sample association graph comprises three types of sample nodes, namely a plurality of sample objects, a plurality of sample multimedia and a plurality of sample words, and edges between adjacent sample nodes; the sample multimedia are multimedia on which the sample objects have performed a preset behavior; each edge between adjacent sample nodes represents an association relationship between those sample nodes; and the label information represents the degree of association between the sample multimedia and the sample objects;

a sample object feature determination module configured to perform a sample object feature determination for each of the plurality of sample objects based on the plurality of sample multimedia and the sample word;

the sample prediction information acquisition module is configured to input the sample multimedia and the sample object characteristics into a first preset model, and perform prediction processing on the association degree between the sample object and the sample multimedia to obtain sample prediction information;

a target sample prediction information obtaining module configured to repeat the inputting step a preset number of times and perform statistical processing on the sample prediction information obtained over the preset number of times to obtain target sample prediction information;

a loss information determination module configured to determine loss information according to the target sample prediction information and the label information;

a first relevance prediction model obtaining module configured to perform training of the first preset model based on the loss information to obtain the first relevance prediction model.

According to a third aspect of an embodiment of the present disclosure, there is provided an electronic apparatus including: a processor; a memory for storing the processor-executable instructions; wherein the processor is configured to execute the instructions to implement the method of any of the first aspects above.

According to a fourth aspect of the embodiments of the present disclosure, there is provided a computer-readable storage medium, wherein instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the method of any one of the first aspect of the embodiments of the present disclosure.

According to a fifth aspect of embodiments of the present disclosure, there is provided a computer program product comprising computer instructions which, when executed by a processor, cause a computer to perform the method of any one of the first aspects of the embodiments of the present disclosure.

The technical scheme provided by the embodiment of the disclosure at least brings the following beneficial effects:

Because the association graph includes three types of nodes, one of which is keywords, the text data associated with the multimedia is fully utilized and the understanding of the multimedia is improved; and because multimedia content does not need to be extracted in real time, processing pressure is reduced and the efficiency of multimedia understanding is improved. In addition, pruning the association graph yields a target subgraph in which the three types of nodes are more strongly associated and the graph structure is simpler. This effectively denoises the association relationships in the graph and improves the precision and efficiency with which the target object's points of interest are captured. As a result, the precision and efficiency of multimedia recommendation are improved, and the recommendation processing extends more readily to other application scenarios.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.

FIG. 1 is a schematic diagram illustrating an application environment in accordance with an exemplary embodiment.

FIG. 2 is a flow chart illustrating a method of multimedia recommendation, according to an example embodiment.

FIG. 3a is a diagram illustrating an association graph, according to an example embodiment.

FIG. 3b is a schematic diagram of a target sub-graph shown in accordance with an exemplary embodiment.

FIG. 4 is a flowchart illustrating a method for determining respective multimedia features of a plurality of multimedia to be recommended according to first keywords that have an association relationship with the plurality of multimedia to be recommended in an association graph, according to an exemplary embodiment.

FIG. 5 is a flowchart illustrating a method of obtaining an association graph, according to an example embodiment.

FIG. 6 is a flowchart illustrating a method for pruning an association graph to obtain a target subgraph, according to an exemplary embodiment.

FIG. 7 is a flowchart illustrating a method for determining first relevance information of each of a plurality of historical multimedia with respect to a target object and second relevance information of second keywords with respect to the plurality of historical multimedia, according to an exemplary embodiment.

FIG. 8 is a schematic diagram illustrating a pruning network, according to an exemplary embodiment.

FIG. 9 is a flowchart illustrating a method for determining a second object feature of a target object based on a plurality of historical multimedia and a second keyword, according to an example embodiment.

FIG. 10 is a diagram illustrating a first relevance prediction model and a second relevance prediction model, according to an example embodiment.

FIG. 11 is a flowchart illustrating a method for training a first relevance prediction model, according to an example embodiment.

FIG. 12 is a block diagram illustrating a multimedia recommendation device according to an example embodiment.

FIG. 13 is a block diagram illustrating an electronic device for multimedia recommendation, according to an example embodiment.

Detailed Description

In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.

It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in other sequences than those illustrated or described herein. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.

Artificial Intelligence (AI) refers to the theories, methods, techniques and application systems that use a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results. Artificial intelligence software technology mainly comprises computer vision, speech processing, natural language processing, machine learning/deep learning, and the like.

In recent years, with the research and development of artificial intelligence technology, it has been widely applied in many fields. The scheme provided by the embodiments of the present disclosure relates to technologies such as machine learning/deep learning, and is explained through the following embodiments.

FIG. 1 is a schematic diagram illustrating an application environment according to an exemplary embodiment. As shown in FIG. 1, the application environment may include a server 01 and a terminal 02.

In an alternative embodiment, the server 01 may be used for multimedia recommendation processing. Specifically, the server 01 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a network service, cloud communication, a middleware service, a domain name service, a security service, a CDN (Content Delivery Network), a big data and artificial intelligence platform, and the like.

In an alternative embodiment, the terminal 02 may be used to present recommended multimedia. Specifically, the terminal 02 may include, but is not limited to, a smart phone, a desktop computer, a tablet computer, a notebook computer, a smart speaker, a digital assistant, an Augmented Reality (AR)/Virtual Reality (VR) device, a smart wearable device, and other types of electronic devices. Optionally, the operating system running on the electronic device may include, but is not limited to, Android, iOS, Linux, Windows, and the like.

In addition, it should be noted that FIG. 1 illustrates only one possible application environment of the multimedia recommendation method provided by the present disclosure.

In the embodiments of this specification, the server 01 and the terminal 02 may be connected directly or indirectly through wired or wireless communication, which is not limited by the present disclosure.

It should be noted that the following figures show a possible sequence of steps, and in fact do not limit the order that must be followed. Some steps may be performed in parallel without being dependent on each other. User information (including but not limited to user device information, user personal information, user behavior information, etc.) and data (including but not limited to data for presentation, training, etc.) to which the present disclosure relates are both information and data that are authorized by the user or sufficiently authorized by various parties.

FIG. 2 is a flowchart illustrating a multimedia recommendation method, according to an example embodiment. As shown in FIG. 2, the method may include the following steps.

In step S201, an association graph is acquired.

The association graph may include three types of nodes, namely a target object, a plurality of multimedia and a plurality of keywords, and edges between adjacent nodes. The plurality of multimedia may include a plurality of multimedia to be recommended and a plurality of historical multimedia on which the target object has performed a preset behavior, and each edge may represent an association relationship between the adjacent nodes it connects, as shown in FIG. 3a.

In this embodiment, the target object may refer to a target user, and the target user may be one of multiple users in the multimedia recommendation platform. The preset behavior may include actions of approval, forwarding, presentation, etc., which are not limited by this disclosure.

In practical applications, a plurality of historical multimedia on which the target object has performed the preset behavior, and multimedia to be recommended that match the interests of the target object, may be obtained, together with the keywords associated with each historical multimedia and each multimedia to be recommended. A heterogeneous graph over the three types of nodes, namely the target object, the multimedia (the historical multimedia and the multimedia to be recommended) and the keywords, may then be constructed from these nodes and the association relationships among them, and used as the association graph. A multimedia node may be a first-order neighbor of the target object; a keyword node may be a first-order neighbor of a multimedia node and a second-order neighbor of the target object, as shown in FIG. 3a.
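The graph structure described above can be sketched with a plain adjacency map. This is an illustrative sketch only: the node encoding as `(type, id)` tuples, the function names, and the choice to connect the target object to both historical multimedia and multimedia to be recommended are assumptions, not details given in the disclosure.

```python
from collections import defaultdict


def build_association_graph(user, history, candidates, keywords_by_media):
    """Build an undirected heterogeneous graph as an adjacency map.

    Nodes are (type, id) tuples with type in
    {"object", "multimedia", "keyword"} (a hypothetical encoding).
    """
    graph = defaultdict(set)

    def add_edge(a, b):
        graph[a].add(b)
        graph[b].add(a)

    user_node = ("object", user)
    # Assumption: the object is linked to history and candidates alike.
    for media in list(history) + list(candidates):
        media_node = ("multimedia", media)
        add_edge(user_node, media_node)
        for word in keywords_by_media.get(media, []):
            add_edge(media_node, ("keyword", word))
    return graph


def second_order_neighbors(graph, node):
    """Nodes exactly two hops from `node`."""
    first = set(graph[node])
    second = set()
    for neighbor in first:
        second |= graph[neighbor]
    return second - first - {node}


graph = build_association_graph(
    "u1", ["h1"], ["c1"], {"h1": ["cat"], "c1": ["cat", "dog"]}
)
```

With this encoding, the multimedia nodes are the first-order neighbors of the target object and the keyword nodes are its second-order neighbors, matching the neighbor relationships described in the text.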

In step S203, the respective multimedia features of the plurality of multimedia to be recommended are determined according to the first keywords that have an association relationship with the plurality of multimedia to be recommended in the association graph.

In an embodiment of this specification, to improve the interpretability of the multimedia, each multimedia may be characterized by the keywords associated with it. On this basis, the multimedia feature of each multimedia to be recommended may be determined according to the first keywords that have an association relationship with that multimedia in the association graph. For example, the word features of the first keywords may be concatenated or weighted to obtain the multimedia feature of the multimedia to be recommended. The first keywords may be at least one of the plurality of keywords.
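The weighting option mentioned above can be sketched as an attention-weighted sum of keyword vectors. The disclosure does not specify the form of the attention network, so this sketch assumes simple dot-product scoring followed by a softmax; the function names and the example vectors are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a non-empty list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def multimedia_feature(media_vec, keyword_vecs):
    """Return (feature, weights) for one multimedia item.

    weights: softmax of dot products between the multimedia vector and
    each keyword vector (assumed scoring function).
    feature: weighted sum of the keyword vectors.
    """
    scores = [sum(a * b for a, b in zip(media_vec, kv)) for kv in keyword_vecs]
    weights = softmax(scores)
    dim = len(keyword_vecs[0])
    feature = [
        sum(w * kv[i] for w, kv in zip(weights, keyword_vecs))
        for i in range(dim)
    ]
    return feature, weights


feature, weights = multimedia_feature([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

Because the feature is built only from keyword vectors and their weights, the weights indicate which keywords contribute most to the representation of a given multimedia item.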

It should be noted that the word features and the multimedia features may be in vector form, as may the other model inputs and features described below. The vector representations may be obtained using a pre-trained embedding layer network, which is not limited by this disclosure.

In step S205, pruning is performed on the association graph to obtain a target subgraph; the target sub-graph comprises a target object, a plurality of multimedia to be recommended, target historical multimedia and a target keyword, wherein the target historical multimedia is at least one of the plurality of historical multimedia, and the target keyword can be at least one of the plurality of keywords.

In the embodiments of this specification, the historical multimedia on which the target object has performed the preset behavior may include multimedia in which the target object is not actually interested. For example, in a single-column streaming recommendation scenario for videos, a video that a user clicks is not necessarily one the user likes. The interaction information between the user and the multimedia in the association graph is therefore redundant, which adds noise when capturing the user's interests. For this reason, the association graph is pruned, and multimedia recommendation is performed using the pruned target subgraph.

In one example, the association degree between the target historical multimedia and the target object satisfies a first preset condition, and the association degree between the target keyword and the target historical multimedia satisfies a second preset condition. The association degree between the target historical multimedia and the target object may be represented by interaction information between the target object and the target historical multimedia. Accordingly, the first preset condition may be that the interaction information is greater than or equal to an interaction threshold, for example, that the number of clicks of the target object on the target historical multimedia is greater than or equal to a first count threshold. On this basis, the historical multimedia for which the number of clicks by the target object is lower than the first count threshold can be deleted, and the remaining historical multimedia are taken as the target historical multimedia.

Similarly, the association degree between the target keyword and the target historical multimedia can be characterized by the frequency of the target keyword in the target historical multimedia. Accordingly, the second preset condition may be that the frequency of the keyword is greater than or equal to a frequency threshold, for example, that the number of times the target keyword appears in the target historical multimedia is greater than or equal to a second count threshold. On this basis, the number of times each keyword associated with a historical multimedia (that is, each keyword having an edge with that historical multimedia) appears under that historical multimedia can be acquired, and the keywords whose number of appearances is smaller than the second count threshold can be deleted, the undeleted keywords being taken as the target keywords. The present disclosure does not limit the first and second count thresholds. Pruning based on the association degree between adjacent nodes can effectively denoise the association relationship graph.
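The count-based pruning described above can be sketched as follows. This is a minimal illustration only: the function name, the dictionary layout of the click counts and keyword counts, and the sample values are all assumptions made for the example and are not taken from the disclosure.

```python
from typing import Dict, Set, Tuple


def prune_by_thresholds(
    click_counts: Dict[str, int],
    keyword_counts: Dict[Tuple[str, str], int],
    first_threshold: int,
    second_threshold: int,
) -> Tuple[Set[str], Set[Tuple[str, str]], Set[str]]:
    """Keep historical multimedia and keywords whose counts meet the thresholds."""
    # First preset condition: click count >= first count threshold.
    target_history = {m for m, n in click_counts.items() if n >= first_threshold}
    # Second preset condition: keyword count under a retained multimedia
    # >= second count threshold.
    target_edges = {
        (m, k)
        for (m, k), n in keyword_counts.items()
        if m in target_history and n >= second_threshold
    }
    target_keywords = {k for _, k in target_edges}
    return target_history, target_edges, target_keywords


# Hypothetical data: clicks per historical multimedia, keyword counts per edge.
clicks = {"m1": 3, "m2": 1, "m3": 5}
freqs = {("m1", "e1"): 4, ("m1", "e2"): 1, ("m2", "e1"): 6, ("m3", "e3"): 2}
history, edges, keywords = prune_by_thresholds(
    clicks, freqs, first_threshold=2, second_threshold=2
)
```

In this example m2 is removed because it was clicked only once, and the edge between m1 and e2 is removed because e2 appears only once under m1.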

Further, the target subgraph can be obtained based on the association relationships among the three types of nodes, namely the target object, the multimedia (the plurality of multimedia to be recommended and the target historical multimedia) and the target keywords, as shown in fig. 3b, thereby completing the pruning of the association relationship graph.

In step S207, a first object feature of the target object is determined based on the target history multimedia and the target keyword.

In order to effectively represent the features of the target object, the information of the target keywords is passed to the target historical multimedia, and the information of the target historical multimedia is passed to the target object. The first object feature of the target object can thus be determined based on the target historical multimedia and the target keywords, so that the first object feature covers richer information. In one example, the target historical multimedia and the target keywords may be aggregated to obtain the first object feature; for example, the multimedia feature of each target historical multimedia may be determined based on the word features of its target keywords, and the multimedia features of the target historical multimedia may then be weighted or concatenated to obtain the first object feature, which is not limited by the present disclosure. The aggregation may be based on an attention mechanism, for which reference may be made to steps S901 to S905 below, but the present disclosure is not limited thereto.

In step S209, recommendation information of a plurality of multimedia to be recommended is determined according to the multimedia feature and the first object feature.

In practical applications, the recommendation information may be a recommendation probability, or may be recommendation classification information, such as recommendation or non-recommendation, which is not limited in this disclosure. In one example, the recommendation information is determined using a pre-trained recommendation prediction model, for example, the step S209 may include the steps of:

performing dot product processing on the multimedia feature and the first object feature to obtain an association feature;

and determining recommendation information of the plurality of multimedia to be recommended according to the association feature.

In practical application, the recommendation prediction model can perform dot product processing on the multimedia feature and the first object feature to obtain the association feature, and the association feature can then be processed with a preset activation function to obtain the recommendation information of the plurality of multimedia to be recommended. Determining the recommendation information with a machine learning model can improve the efficiency and precision with which it is determined.
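The dot product and activation step can be illustrated with the sketch below. The disclosure only refers to a "preset activation function"; the sigmoid used here, the function names and the feature values are assumptions chosen so that the output can be read as a recommendation probability.

```python
import math
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))


def sigmoid(x: float) -> float:
    """Assumed activation function mapping the association feature to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def recommendation_probabilities(
    multimedia_features: Sequence[Sequence[float]],
    object_feature: Sequence[float],
) -> List[float]:
    """One probability per multimedia to be recommended."""
    return [sigmoid(dot(f, object_feature)) for f in multimedia_features]


# Hypothetical 2-dimensional features for three multimedia to be recommended.
probs = recommendation_probabilities(
    [[1.0, 0.0], [0.0, -2.0], [0.0, 0.0]],
    [2.0, 1.0],
)
```

A multimedia whose feature is orthogonal to the object feature receives a probability of 0.5 under this choice of activation, and larger dot products map to higher probabilities.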

In step S211, a plurality of multimedia to be recommended are recommended to the target object based on the recommendation information.

In the embodiment of the specification, a target multimedia can be selected from the plurality of multimedia to be recommended based on the recommendation information, and the target multimedia is recommended to the target object. Alternatively, the plurality of multimedia to be recommended may be ranked based on the recommendation information to obtain a ranking result, and recommended to the target object in sequence based on the ranking result. The present disclosure does not limit this, as long as the multimedia to be recommended that have a high matching degree with the target object are recommended to the target object preferentially.
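Both options, ranking all candidates and selecting a subset as target multimedia, can be sketched as follows. The function names, the tie-breaking rule (by identifier) and the `top_k` parameter are assumptions for illustration.

```python
from typing import Dict, List


def rank_candidates(recommendation_info: Dict[str, float]) -> List[str]:
    """Order candidates from highest to lowest score; ties broken by identifier."""
    return sorted(recommendation_info, key=lambda m: (-recommendation_info[m], m))


def select_targets(recommendation_info: Dict[str, float], top_k: int) -> List[str]:
    """Select the top_k candidates as target multimedia."""
    return rank_candidates(recommendation_info)[:top_k]


# Hypothetical recommendation probabilities for three candidates.
scores = {"V1": 0.42, "V2": 0.91, "V3": 0.67}
ranking = rank_candidates(scores)
targets = select_targets(scores, top_k=2)
```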

By including three types of nodes, among them keywords, in the association relationship graph, the text data attached to the multimedia is fully utilized and the understanding of the multimedia is improved; at the same time, since the multimedia content does not need to be extracted in real time, the processing load can be reduced and the efficiency of multimedia understanding improved. In addition, pruning the association relationship graph makes the association among the three types of nodes in the pruned target subgraph stronger and the graph structure simpler, so that the association relationships in the graph are effectively denoised. This can improve the precision and efficiency with which the interest points of the target object are captured, and hence the precision and efficiency of multimedia recommendation, and makes the multimedia recommendation processing easier to extend to other application scenes.

Fig. 4 is a flowchart illustrating a method for determining respective multimedia features of a plurality of multimedia to be recommended according to a first keyword having an association relationship with the plurality of multimedia to be recommended in an association relationship diagram, according to an exemplary embodiment. As shown in fig. 4, in a possible implementation manner, the step S203 may include:

in step S401, a plurality of multimedia to be recommended and a first keyword are input into a first attention network, and first weight information of the first keyword is obtained;

in step S403, the first keywords corresponding to each of the plurality of multimedia to be recommended are weighted according to the first weight information, so as to obtain the multimedia feature of each of the plurality of multimedia to be recommended.

In practical application, the attention mechanism neural network can be used for carrying out aggregation processing on the first keyword to obtain the multimedia features characterized based on the first keyword. For example, a plurality of multimedia to be recommended and a first keyword may be input into the first attention network, and first weight information of the first keyword may be obtained. The plurality of multimedia to be recommended input herein may be identification features of the plurality of multimedia to be recommended, and the first keyword may be a word feature of the first keyword, which is not limited in this disclosure.

Further, the first keywords corresponding to each of the plurality of multimedia to be recommended can be weighted according to the first weight information, so as to obtain the multimedia feature of each multimedia to be recommended. For example, suppose the first keywords having an association relationship with the multimedia to be recommended V1 are e1, e2 and e5, the first weight information of e1, e2 and e5 under V1 is 0.3, 0.8 and 0.5, and the word features corresponding to e1, e2 and e5 are c1, c2 and c5. The multimedia feature of V1 may then be 0.3 × c1 + 0.8 × c2 + 0.5 × c5.

The first attention network may be a graph attention network (GAT), and the aggregation performed by the weighting process may be implemented based on an aggregation function (AGG), which is not limited in this disclosure.

The first weight information of the first keyword under the multimedia to be recommended is obtained through the attention mechanism, so that the precision and the efficiency of the first weight information can be improved, and the precision and the efficiency of the multimedia features can be improved.
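The weighting of step S403 can be reproduced numerically with the weights 0.3, 0.8 and 0.5 from the example for V1. The sketch below assumes the first weight information has already been produced by the first attention network; the two-dimensional word features and the function name are hypothetical.

```python
from typing import Dict, List, Sequence


def aggregate(
    weights: Dict[str, float],
    word_features: Dict[str, Sequence[float]],
) -> List[float]:
    """Weighted sum of the word features of the keywords linked to one multimedia."""
    dim = len(next(iter(word_features.values())))
    feature = [0.0] * dim
    for keyword, weight in weights.items():
        for d, value in enumerate(word_features[keyword]):
            feature[d] += weight * value
    return feature


# Hypothetical word features c1, c2, c5 for keywords e1, e2, e5.
word_features = {"e1": [1.0, 0.0], "e2": [0.0, 1.0], "e5": [1.0, 1.0]}
# First weight information of e1, e2, e5 under V1, as in the example above.
v1_weights = {"e1": 0.3, "e2": 0.8, "e5": 0.5}
# 0.3 * c1 + 0.8 * c2 + 0.5 * c5, approximately [0.8, 1.3]
v1_feature = aggregate(v1_weights, word_features)
```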

FIG. 5 is a flowchart illustrating a method of obtaining an association graph, according to an example embodiment. As shown in fig. 5, in a possible implementation manner, the step S201 may include:

in step S501, a plurality of multimedia to be recommended, a plurality of historical multimedia on which the target object has performed a preset behavior, first text information associated with the plurality of multimedia to be recommended, and second text information associated with the plurality of historical multimedia are obtained.

In this embodiment of the present specification, a plurality of historical multimedia that the target object has interacted with may be obtained, for example, a plurality of historical multimedia on which the target object has performed a preset behavior. A plurality of multimedia to be recommended that match the target object may also be obtained; for example, multimedia matching the interests of the target object may be obtained through recall as the plurality of multimedia to be recommended. Further, the first text information and the second text information can be obtained from the themes, tags, topics and comments of the plurality of multimedia to be recommended and of the plurality of historical multimedia, respectively.

In step S503, word segmentation processing is performed on the first text information and the second text information respectively, to obtain a plurality of first initial keywords and a plurality of second initial keywords.

In practical application, a word segmentation tool can be used to perform word segmentation processing on the first text information and the second text information respectively, to obtain the plurality of first initial keywords and the plurality of second initial keywords. The word segmentation tool may be the jieba word segmentation tool, which is not limited by the present disclosure. Optionally, a lexicon for the multimedia recommendation scene may be constructed and used in the word segmentation processing, so as to improve the word segmentation accuracy.

In step S505, first importance degree information of a plurality of first initial keywords for a plurality of multimedia to be recommended and second importance degree information of a plurality of second initial keywords for a plurality of historical multimedia are determined.

In the embodiment of the present specification, the first importance degree information and the second importance degree information may be determined based on TF-IDF (term frequency-inverse document frequency). For example, the first importance degree information $E_t(i, j)$ may be determined using the following formula (1):

where $E_t(i, j)$ may refer to the first importance degree information, for the $j$-th multimedia to be recommended, of the $i$-th first initial keyword in the first text information associated with the $j$-th multimedia to be recommended;

the occurrence count may refer to the number of times that the $i$-th first initial keyword appears in the first text information associated with the $j$-th multimedia to be recommended; $c_j$ may refer to the total number of first initial keywords in the first text information associated with the $j$-th multimedia to be recommended; $V$ may refer to the total number of multimedia among the multimedia to be recommended and the historical multimedia (i.e., the total number of texts of the first text information and the second text information); $V^{i}$ may refer to the number of multimedia, among the multimedia to be recommended and the historical multimedia, that include the $i$-th first initial keyword, that is, the number of texts in the first text information and the second text information that include the $i$-th first initial keyword.
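Under the standard TF-IDF definition, and with the quantities defined above, formula (1) would plausibly take the form below. The symbol $n_{i,j}$ for the occurrence count of the $i$-th first initial keyword in the text of the $j$-th multimedia to be recommended is a hypothetical name introduced here, and the unsmoothed logarithm is an assumption; an implementation may add smoothing to the denominator.

```latex
E_t(i, j) \;=\; \frac{n_{i,j}}{c_j} \cdot \log\frac{V}{V^{i}}
```

The first factor is the term frequency of the keyword within one text, and the second factor decreases as the keyword appears in more of the $V$ texts.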

Accordingly, the second importance information may also be determined in the same manner as the first importance information, and is not described herein again.

In step S507, a plurality of keywords are screened out from the plurality of first initial keywords and the plurality of second initial keywords based on the first importance degree information and the second importance degree information.

In one example, the plurality of first initial keywords and the plurality of second initial keywords may be ranked based on the first importance information and the second importance information, respectively, and the first initial keywords and the second initial keywords ranked in the top may be taken as the plurality of keywords.

In another example, a first initial keyword and a second initial keyword, of which the first importance degree information and the second importance degree information are greater than an importance degree threshold, may be screened out from a plurality of first initial keywords and a plurality of second initial keywords as the plurality of keywords.
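Steps S505 and S507 can be sketched together as follows. The sketch assumes the standard unsmoothed TF-IDF form (term frequency multiplied by the logarithm of the inverse document frequency), takes already segmented texts as input, and screens keywords by rank; the function names and sample texts are hypothetical.

```python
import math
from typing import Dict, List


def tfidf(texts: Dict[str, List[str]]) -> Dict[str, Dict[str, float]]:
    """Importance of each initial keyword for each multimedia (assumed TF-IDF form)."""
    total_texts = len(texts)
    # Number of texts containing each word.
    doc_freq: Dict[str, int] = {}
    for words in texts.values():
        for word in set(words):
            doc_freq[word] = doc_freq.get(word, 0) + 1
    scores: Dict[str, Dict[str, float]] = {}
    for media_id, words in texts.items():
        if not words:
            scores[media_id] = {}
            continue
        scores[media_id] = {
            word: (words.count(word) / len(words))
            * math.log(total_texts / doc_freq[word])
            for word in set(words)
        }
    return scores


def screen_keywords(
    scores: Dict[str, Dict[str, float]], top_n: int
) -> Dict[str, List[str]]:
    """Keep the top_n most important initial keywords per multimedia."""
    return {
        media_id: sorted(ws, key=lambda w: (-ws[w], w))[:top_n]
        for media_id, ws in scores.items()
    }


# Hypothetical segmented texts: V1 is to be recommended, m1 and m2 are historical.
texts = {
    "V1": ["cat", "cat", "funny", "video"],
    "m1": ["dog", "funny", "video"],
    "m2": ["cooking", "video"],
}
importance = tfidf(texts)
keywords = screen_keywords(importance, top_n=1)
```

The word "video" appears in every text and therefore receives zero importance, which is the behavior that makes this kind of score useful for discarding uninformative keywords. A threshold-based screen, as in the second example above, would replace the slice with a comparison against an importance threshold.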

In step S509, an association relationship graph is constructed based on the association relationships among the target object, the plurality of historical multimedia, the plurality of multimedia to be recommended, and the plurality of keywords.

In the embodiment of the present specification, the target object, the multimedia (the plurality of historical multimedia and the plurality of multimedia to be recommended) and the plurality of keywords may be set as three types of nodes. As shown in fig. 3a, the multimedia may serve as intermediate nodes, and the keywords may serve as second-order neighbor nodes of the target object and first-order neighbor nodes of the multimedia. Edges between adjacent nodes are then constructed based on the association relationships among the three types of nodes, so that the association relationship graph can be constructed.
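The three-node-type structure can be sketched as an undirected adjacency map. In this sketch the node encoding as `(type, identifier)` pairs, the function name and the sample data are assumptions; in particular, it is assumed that the target object is linked by edges only to the historical multimedia, while the multimedia to be recommended are linked only to their keywords.

```python
from typing import Dict, Iterable, List, Set, Tuple

Node = Tuple[str, str]  # (node type, identifier)


def build_association_graph(
    target_object: str,
    history: Iterable[str],
    candidates: Iterable[str],
    keywords_by_multimedia: Dict[str, List[str]],
) -> Dict[Node, Set[Node]]:
    """Build an undirected graph over object, multimedia and keyword nodes."""
    graph: Dict[Node, Set[Node]] = {}

    def add_edge(a: Node, b: Node) -> None:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)

    history = list(history)
    user = ("object", target_object)
    graph.setdefault(user, set())
    # Assumed: object-multimedia edges exist only for historical multimedia.
    for media_id in history:
        add_edge(user, ("multimedia", media_id))
    # Multimedia-keyword edges for both historical and candidate multimedia.
    for media_id in history + list(candidates):
        node = ("multimedia", media_id)
        graph.setdefault(node, set())
        for keyword in keywords_by_multimedia.get(media_id, []):
            add_edge(node, ("keyword", keyword))
    return graph


graph = build_association_graph(
    "u1",
    history=["m1", "m2"],
    candidates=["V1"],
    keywords_by_multimedia={"m1": ["e1", "e2"], "m2": ["e1", "e3"], "V1": ["e1", "e5"]},
)
user = ("object", "u1")
first_order = graph[user]
# Keywords are reached from the target object in two hops, via the multimedia.
second_order = {n for m in first_order for n in graph[m] if n != user}
```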

Through the establishment of the incidence relation graph and the screening of the initial keywords based on the importance degree of the initial keywords to the multimedia, the screened keywords have stronger incidence relation with the multimedia, and the guarantee is provided for accurately representing the characteristics of the multimedia through the keywords in the follow-up process.

Fig. 6 is a flowchart illustrating a method for pruning an association graph to obtain a target subgraph according to an exemplary embodiment. As shown in fig. 6, in a possible implementation manner, the step S205 may include:

in step S601, determining first relevance information of each of a plurality of historical multimedia under a target object and second relevance information of a second keyword under the plurality of historical multimedia; the second keyword is a keyword having an association relation with a plurality of historical multimedia;

in step S603, based on the first relevance information and the second relevance information, pruning is performed on the edges and/or nodes in the relevance graph to obtain a target subgraph.

In the embodiment of the present specification, the first preset condition and the second preset condition may be the same; for example, both may require the association degree information to be greater than or equal to an association threshold. Alternatively, the two conditions may be different; for example, the first preset condition may require that the first association degree information rank within the top first preset number in the ranking of the first association degree information, and the second preset condition may require that the second association degree information rank within the top second preset number in the ranking of the second association degree information.

On this basis, the edges whose first association degree information or second association degree information is smaller than the association threshold can be cut off. Alternatively, the plurality of historical multimedia may be sorted based on the first association degree information, and a first preset number of historical multimedia may be retained based on the sorting result, for example, the first preset number of historical multimedia taken in order from high to low, where a higher rank indicates a stronger association between the target object and the historical multimedia. Likewise, the plurality of second keywords may be sorted based on the second association degree information, and a second preset number of second keywords may be retained based on the sorting result, for example, the second preset number of second keywords taken in order from high to low, where a higher rank indicates a stronger association between the second keyword and the historical multimedia. The edges corresponding to the historical multimedia and second keywords that are not retained are then deleted.

Further, after pruning the edges, the historical multimedia without edges with the target object and the second keyword in the plurality of historical multimedia may be deleted, and the keyword without edges with the historical multimedia and the multimedia to be recommended in the plurality of keywords may be deleted, so as to obtain a target sub-graph, as shown in fig. 3 b.
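The ranking-based variant of step S603, followed by the node clean-up, can be sketched as below. Several points are assumptions: the second preset number is applied per historical multimedia rather than globally, a historical multimedia left without any keyword edge is dropped, and the association degree values are made up for the example.

```python
from typing import Dict, List, Set, Tuple


def prune_graph(
    first_relevance: Dict[str, float],
    second_relevance: Dict[Tuple[str, str], float],
    candidate_keywords: Dict[str, List[str]],
    keep_history: int,
    keep_keywords: int,
) -> Tuple[List[str], Dict[str, List[str]], Set[str]]:
    """Keep top-ranked historical multimedia and keywords, then drop orphan nodes."""
    # Rank historical multimedia by association with the target object.
    ranked = sorted(first_relevance, key=lambda m: (-first_relevance[m], m))
    target_edges: Dict[str, List[str]] = {}
    for media_id in ranked[:keep_history]:
        # Rank the keywords of this multimedia by their association with it.
        kept = sorted(
            (k for (m, k) in second_relevance if m == media_id),
            key=lambda k: (-second_relevance[(media_id, k)], k),
        )[:keep_keywords]
        if kept:  # assumed: multimedia with no remaining keyword edge is dropped
            target_edges[media_id] = kept
    target_history = list(target_edges)
    # A keyword survives if it still has an edge to a retained historical
    # multimedia or to a multimedia to be recommended.
    target_keywords = {k for kws in target_edges.values() for k in kws}
    for kws in candidate_keywords.values():
        target_keywords.update(kws)
    return target_history, target_edges, target_keywords


first_relevance = {"m1": 0.9, "m2": 0.2, "m3": 0.7}
second_relevance = {
    ("m1", "e1"): 0.8, ("m1", "e2"): 0.1,
    ("m2", "e1"): 0.9,
    ("m3", "e3"): 0.6, ("m3", "e4"): 0.7,
}
target_history, target_edges, target_keywords = prune_graph(
    first_relevance, second_relevance, {"V1": ["e1", "e5"]},
    keep_history=2, keep_keywords=1,
)
```

Here e2 and e3 lose all their edges and are deleted, whereas e5 is kept because it remains connected to the multimedia to be recommended V1.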

The incidence relation graph is pruned according to the incidence degree among the nodes, so that the incidence relation graph can be effectively denoised, a target subgraph obtained through pruning can be better used for capturing user interests, and the recommendation precision can be further improved.

Fig. 7 is a flowchart illustrating a method for determining first relevancy information of a plurality of historical multimedia objects and second relevancy information of a second keyword under the plurality of historical multimedia objects according to an exemplary embodiment. As shown in fig. 7, in a possible implementation manner, the step S601 may include:

in step S701, a second object feature of the target object is determined based on the plurality of historical multimedia and the second keyword.

In practical application, as shown in fig. 3b, the word features of the keywords can be used to represent the multimedia features of the associated multimedia, and the object features of the associated objects can be represented by the multimedia features, so as to realize information transmission from right to left. For example, the sum of the features of the second keyword associated with one historical multimedia can be used as the multimedia feature of the one historical multimedia, and then the sum of the multimedia features of all historical multimedia can be used as the second object feature of the target object.

In one example, a pruning network as shown in fig. 8 may include a second attention network, a third attention network, a first relevance prediction model, and a second relevance prediction model. It should be noted that the second attention network and the third attention network may be both graph attention mechanism networks.

As shown in fig. 8 and 9, the step S701 may include:

in step S901, a plurality of historical multimedia and a second keyword are input to a second attention network, and second weight information of the second keyword is obtained;

in step S903, inputting the target object and the plurality of historical multimedia into a third attention network, obtaining third weight information of the plurality of historical multimedia;

in step S905, a second object feature is determined according to the second keyword, the second weight information, and the third weight information.

As an example, the second object feature U2 may be determined using the following equation (2):

$$U_2 = W^{m_1}\,(W_1 c_1 + W_2 c_2) + W^{m_2}\,(W_3 c_1 + W_4 c_3 + W_5 c_5) \tag{2}$$

where the historical multimedia comprise $m_1$ and $m_2$; $W_1$ and $W_2$ are the second weight information corresponding to the second keywords $e_1$ and $e_2$ associated with $m_1$; $W_3$, $W_4$ and $W_5$ are the second weight information corresponding to the second keywords $e_1$, $e_3$ and $e_5$ associated with $m_2$; $W^{m_1}$ and $W^{m_2}$ are the third weight information of the historical multimedia $m_1$ and $m_2$, respectively; and $c_1$, $c_2$, $c_3$ and $c_5$ are the word features corresponding to $e_1$, $e_2$, $e_3$ and $e_5$, respectively.

Determining the weight of each keyword under the multimedia and the weight of each multimedia under the target object through an attention mechanism captures the multimedia and keywords associated with the target object more effectively, so that the second object feature can be more accurate. In addition, with pruning based on the attention mechanism, when the same multimedia is retained for different users, the different users may attend to different keywords of that multimedia, so that the interests of the users can be effectively distinguished.
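The two-level weighting of step S905 can be checked numerically with the following sketch, which mirrors the structure of the example with historical multimedia m1 (keywords e1, e2) and m2 (keywords e1, e3, e5). The weight values and two-dimensional word features are hypothetical, and both sets of weights are assumed to have been produced already by the second and third attention networks.

```python
from typing import Dict, List, Sequence


def second_object_feature(
    third_weights: Dict[str, float],
    second_weights: Dict[str, Dict[str, float]],
    word_features: Dict[str, Sequence[float]],
) -> List[float]:
    """Sum over multimedia of (multimedia weight * weighted sum of keyword features)."""
    dim = len(next(iter(word_features.values())))
    feature = [0.0] * dim
    for media_id, media_weight in third_weights.items():
        for keyword, keyword_weight in second_weights[media_id].items():
            for d, value in enumerate(word_features[keyword]):
                feature[d] += media_weight * keyword_weight * value
    return feature


# Hypothetical word features for e1, e2, e3, e5.
word_features = {"e1": [1.0, 0.0], "e2": [0.0, 1.0], "e3": [1.0, 1.0], "e5": [2.0, 0.0]}
# Second weight information: keyword weights under each historical multimedia.
second_weights = {
    "m1": {"e1": 0.5, "e2": 0.5},
    "m2": {"e1": 0.2, "e3": 0.3, "e5": 0.5},
}
# Third weight information: weights of m1 and m2 under the target object.
third_weights = {"m1": 0.6, "m2": 0.4}
# 0.6 * [0.5, 0.5] + 0.4 * [1.5, 0.3], approximately [0.9, 0.42]
u2 = second_object_feature(third_weights, second_weights, word_features)
```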

In step S703, the plurality of historical multimedia and the second object feature are input into a first relevance prediction model, and the relevance between the target object and the plurality of historical multimedia is predicted, to obtain the first relevance information and intermediate feature information output by a gated recurrent unit in the first relevance prediction model;

in step S705, the plurality of historical multimedia, the second keyword, and the intermediate feature information are input into the second relevance prediction model, and the relevance between the plurality of historical multimedia and the second keyword is predicted to obtain second relevance information.

In one example, the first relevance prediction model may include a gated recurrent unit (GRU), a multi-layer perceptron (MLP) and a classifier, which are connected in sequence, and the classifier may implement classification based on Gumbel Softmax, which is not limited by the present disclosure. The structure of the second relevance prediction model may be the same as the structure of the first relevance prediction model.

As shown in fig. 10, the first relevance prediction model may include a gated recurrent unit 1, a multi-layer perceptron 1 and a classifier 1 connected in sequence; the second relevance prediction model may include a gated recurrent unit 2, a multi-layer perceptron 2 and a classifier 2 connected in sequence. The first association degree information and the second association degree information may each be a numerical value between 0 and 1, which facilitates comparison of the association degree between the multimedia and the keyword and the association degree between the target object and the multimedia. The intermediate feature information may be the output of the gated recurrent unit 1.

The association degree among the three types of nodes is determined by neural network models, which makes determining the association degree information more efficient, adapts well to scenes with large data volumes and strict real-time requirements, and further improves the scalability of the multimedia recommendation processing. In addition, the intermediate feature information of the first relevance prediction model is used as an input of the second relevance prediction model. Because the intermediate feature information can represent the first relevance of each historical multimedia under the target object, the information about the historical multimedia in the input of the second relevance prediction model is richer, and the accuracy of the second relevance information can be improved.

FIG. 11 is a flowchart illustrating a method for training a first relevance prediction model, according to an example embodiment. As shown in fig. 11, in one possible implementation, the method may include:

in step S1101, a sample association relationship graph and label information are obtained, where the sample association relationship graph may include three types of sample nodes, namely a plurality of sample objects, a plurality of sample multimedia and a plurality of sample words, and edges between adjacent sample nodes; the plurality of sample multimedia are multimedia on which the sample objects have performed a preset behavior, an edge between adjacent sample nodes represents that an association relationship exists between them, and the label information represents the association degree between the sample multimedia and the sample objects.

In practical applications, a plurality of sample objects and multimedia with the plurality of sample objects performing preset behaviors may be obtained, for example, a record that a hundred thousand users click on multimedia within two consecutive days may be obtained, and a plurality of sample objects and a plurality of sample multimedia may be obtained from the record. The obtaining manner of the corresponding sample word and the construction of the sample association relationship diagram may refer to step S201, which is not described herein again. The label information may be a pre-labeled numerical value between 0 and 1, which is not limited by this disclosure.

In step S1103, based on the plurality of sample multimedia and the sample words, a sample object feature of each of the plurality of sample objects is determined; the implementation manner of this step may refer to step S207 described above, and is not described herein again.

In step S1105, the plurality of sample multimedia and the sample object features are input into a first preset model, and the association between the sample object and the plurality of sample multimedia is predicted, to obtain sample prediction information; the implementation manner of this step may refer to step S703 described above, and is not described herein again.

In step S1107, the above input steps are repeated until the preset number of times, and the sample prediction information corresponding to the preset number of times is subjected to statistical processing to obtain the target sample prediction information.

In practical applications, in order to improve the stability of the first relevance prediction model, the input step S1105 may be repeated until the preset number of times is reached, and the sample prediction information obtained over the preset number of times is subjected to statistical processing, for example, averaging, to obtain the target sample prediction information.

In step S1109, loss information is determined from the target sample prediction information and the label information.

In the embodiments of the present specification, the loss information may be determined by using a cross entropy loss function, which is not limited in this disclosure.
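Steps S1107 and S1109 can be sketched as follows, taking averaging as the statistical processing and binary cross entropy as the loss. The function names, the clipping constant `eps` and the prediction values are assumptions; the predictions of each repetition are given directly rather than produced by a model.

```python
import math
from typing import List, Sequence


def average_predictions(runs: Sequence[Sequence[float]]) -> List[float]:
    """Average the sample prediction information over the repeated runs."""
    count = len(runs)
    return [sum(column) / count for column in zip(*runs)]


def binary_cross_entropy(
    predictions: Sequence[float], labels: Sequence[float], eps: float = 1e-12
) -> float:
    """Mean cross entropy between predictions and labels in [0, 1]."""
    total = 0.0
    for p, y in zip(predictions, labels):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total -= y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return total / len(predictions)


# Hypothetical predictions for two sample multimedia over three repetitions.
runs = [[0.9, 0.2], [0.7, 0.4], [0.8, 0.3]]
target_prediction = average_predictions(runs)  # approximately [0.8, 0.3]
loss = binary_cross_entropy(target_prediction, [1.0, 0.0])
```

Averaging over repetitions is useful when the model contains a stochastic component such as Gumbel Softmax sampling, since a single forward pass then gives a noisy estimate.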

In step S1111, a first preset model is trained based on the loss information, so as to obtain a first relevance prediction model.

In this embodiment of the present specification, gradient information may be determined based on the loss information, so that a model parameter of the first preset model is updated based on the gradient information until a preset condition is reached, and the first preset model when the preset condition is met is used as the first relevance prediction model.

In one example, the preset condition may be that the verification result fails to improve for a certain number of consecutive verifications. For example, an index such as AUC (area under the curve), which reflects the performance of the model, may be calculated. Index verification is performed once in each round (S1105 to S1107), and when the verification result has not improved for 5 consecutive verifications, the model stops iterating.
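The stopping rule can be sketched as below, together with a simple pairwise AUC computation. The function names and the validation values are hypothetical, and binary labels (0 or 1) are assumed for the AUC; a tie in score between a positive and a negative sample counts as half a correct pair.

```python
from typing import List, Optional, Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def stopping_round(validation_aucs: List[float], patience: int = 5) -> Optional[int]:
    """Return the 1-based round at which training stops, or None if it never does."""
    best = float("-inf")
    stale = 0
    for round_index, value in enumerate(validation_aucs, start=1):
        if value > best:
            best = value
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return round_index
    return None


# Hypothetical validation AUC per round: the best value is reached in round 3.
history = [0.70, 0.74, 0.76, 0.75, 0.76, 0.74, 0.75, 0.73, 0.72]
stop_at = stopping_round(history, patience=5)
example_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```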

Alternatively, in the case that the first preset model comprises Gumbel Softmax, a higher temperature may be set at the initial stage of the model training, and the temperature may be controlled to gradually decrease as the number of iterations increases, which may facilitate efficient learning of the model.
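The effect of the temperature can be illustrated with a small sketch of Gumbel Softmax sampling and an annealing schedule. The exponential decay schedule, its parameters and the function names are assumptions; the disclosure only states that the temperature starts high and decreases as the number of iterations grows.

```python
import math
import random
from typing import List, Sequence


def gumbel_softmax(
    logits: Sequence[float], temperature: float, rng: random.Random
) -> List[float]:
    """Sample a relaxed one-hot vector: softmax((logits + Gumbel noise) / temperature)."""
    noisy = []
    for logit in logits:
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep log() finite
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / temperature)
    peak = max(noisy)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def annealed_temperature(
    step: int, initial: float = 5.0, decay: float = 0.9, minimum: float = 0.5
) -> float:
    """Assumed schedule: exponential decay from `initial`, floored at `minimum`."""
    return max(minimum, initial * decay ** step)


logits = [1.0, 0.5, -0.5]
# The same seed gives the same noise, so only the temperature differs.
hot = gumbel_softmax(logits, annealed_temperature(0), random.Random(0))
cold = gumbel_softmax(logits, annealed_temperature(1000), random.Random(0))
schedule = [annealed_temperature(step) for step in range(30)]
```

With the same noise, the low-temperature sample is at least as peaked as the high-temperature one, so early training sees smoother outputs and later training sees nearly discrete decisions.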

During training, the prediction of the association between the sample object and the plurality of sample multimedia is repeated a preset number of times, and the resulting sample prediction information is statistically processed to obtain the target sample prediction information used to calculate the loss information. This makes the loss information more accurate, so that the first relevance prediction model trained on it yields more accurate predictions.

FIG. 12 is a block diagram illustrating a multimedia recommendation device according to an example embodiment. Referring to fig. 12, the apparatus may include:

an association relationship graph obtaining module 1201, configured to obtain an association relationship graph, where the association relationship graph includes three types of nodes, namely a target object, a plurality of multimedia and a plurality of keywords, and edges between adjacent nodes; the plurality of multimedia includes a plurality of multimedia to be recommended and a plurality of historical multimedia on which the target object has performed a preset behavior, and an edge represents that an association relationship exists between adjacent nodes;

a multimedia feature determining module 1203, configured to perform determining respective multimedia features of the multiple multimedia to be recommended according to a first keyword having an association relationship with the multiple multimedia to be recommended in the association relationship diagram;

a pruning module 1205 configured to perform pruning processing on the association relation graph to obtain a target sub-graph; the target sub-graph comprises a target object, a plurality of multimedia to be recommended, target historical multimedia and target keywords, the target historical multimedia is at least one of the plurality of historical multimedia, and the target keywords are at least one of the plurality of keywords;

a first object feature determination module 1207 configured to perform determining a first object feature of the target object based on the target history multimedia and the target keyword;

a recommendation information determining module 1209 configured to perform determining recommendation information of a plurality of multimedia to be recommended according to the multimedia feature and the first object feature;

and a recommending module 1211 configured to perform recommending a plurality of multimedia to be recommended to the target object based on the recommending information.

In one possible implementation, the multimedia feature determination module 1203 may include:

a first weight information acquisition unit, configured to input the plurality of multimedia to be recommended and the first keyword into a first attention network to obtain first weight information of the first keyword;

and a multimedia feature determining unit, configured to weight the first keywords corresponding to each of the plurality of multimedia to be recommended according to the first weight information, to obtain the respective multimedia features of the plurality of multimedia to be recommended.
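By way of illustration only, the weighting performed by these two units can be sketched as follows. The sketch assumes that the first attention network reduces to dot-product scoring followed by a softmax, which the present disclosure does not specify; the function names `softmax` and `multimedia_feature` and the vector inputs are hypothetical.

```python
import math


def softmax(scores):
    """Normalize raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def multimedia_feature(item_vec, keyword_vecs):
    """Weight the first keywords of one multimedia item and sum them.

    Assumption: attention score = dot product between the item embedding
    and each keyword embedding. A real attention network may differ.
    """
    scores = [sum(a * b for a, b in zip(item_vec, kv)) for kv in keyword_vecs]
    weights = softmax(scores)  # plays the role of the first weight information
    dim = len(item_vec)
    feature = [
        sum(w * kv[i] for w, kv in zip(weights, keyword_vecs))
        for i in range(dim)
    ]
    return weights, feature


# Usage: the keyword aligned with the item embedding receives the larger weight.
weights, feature = multimedia_feature([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

In this sketch the multimedia feature is simply the weighted sum of the keyword embeddings, so keywords that the attention step scores as more relevant contribute more to the feature.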

In a possible implementation manner, the association degree of the target historical multimedia and the target object meets a first preset condition; and the degree of association between the target keyword and the target historical multimedia meets a second preset condition.

In one possible implementation, the pruning module 1205 can include:

an association degree determining unit, configured to determine first association degree information of each of the plurality of historical multimedia with respect to the target object, and second association degree information of a second keyword with respect to the plurality of historical multimedia; the second keyword is a keyword having an association relationship with the plurality of historical multimedia;

and a target subgraph acquisition unit, configured to prune edges and/or nodes in the association relationship graph based on the first association degree information and the second association degree information to obtain the target subgraph.
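By way of illustration only, pruning based on the two kinds of association degree information can be sketched as follows. The sketch assumes that the first and second preset conditions are simple score thresholds; the function name `prune_graph`, the dictionary layout and the threshold values are hypothetical.

```python
def prune_graph(history_scores, keyword_scores,
                item_threshold=0.5, keyword_threshold=0.5):
    """Keep only strongly associated historical multimedia and keywords.

    history_scores: {item_id: first association degree with the target object}
    keyword_scores: {(item_id, keyword): second association degree}
    Assumption: both preset conditions are "score >= threshold".
    """
    # Node pruning: drop historical multimedia weakly tied to the target object.
    kept_items = {m for m, s in history_scores.items() if s >= item_threshold}
    # Edge pruning: keep an item-keyword edge only if the item survived
    # and the keyword is strongly associated with it.
    kept_edges = {
        (m, k)
        for (m, k), s in keyword_scores.items()
        if m in kept_items and s >= keyword_threshold
    }
    kept_keywords = {k for _, k in kept_edges}
    return kept_items, kept_keywords, kept_edges


# Usage with made-up scores.
items, keywords, edges = prune_graph(
    {"h1": 0.9, "h2": 0.2},
    {("h1", "cat"): 0.8, ("h1", "dog"): 0.1, ("h2", "cat"): 0.9},
)
```

The multimedia to be recommended and the target object are not passed in because, per the text above, they are retained in the target subgraph regardless of the scores.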

In one possible implementation manner, the association degree determining unit may include:

a first association degree information and intermediate feature information acquisition subunit, configured to input the plurality of historical multimedia and a second object feature into a first relevance prediction model, and predict the degree of association between the target object and the plurality of historical multimedia, to obtain the first association degree information and intermediate feature information output by a gated recurrent unit (GRU) in the first relevance prediction model;

and a second association degree information acquisition subunit, configured to input the plurality of historical multimedia, the second keyword and the intermediate feature information into a second relevance prediction model, and predict the degree of association between the plurality of historical multimedia and the second keyword, to obtain the second association degree information.
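By way of illustration only, the flow of data between the two prediction models can be sketched as follows. Everything beyond "a gated recurrent unit produces intermediate feature information that the second model consumes" is an assumption: the scoring heads, the use of the second object feature as the initial hidden state, the parameter dictionary `p` and all function names are hypothetical, and GRU gate conventions vary between references.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(matrix, vec):
    return [dot(row, vec) for row in matrix]


def gru_step(x, h, p):
    """One GRU update (biases omitted for brevity)."""
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], x), matvec(p["Uz"], h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], x), matvec(p["Ur"], h))]
    gated_h = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(a + b)
            for a, b in zip(matvec(p["Wh"], x), matvec(p["Uh"], gated_h))]
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def first_relevance(history_vecs, object_vec, p):
    """Score each historical item; also return the GRU states.

    Assumption: the second object feature initializes the hidden state and
    the score is a sigmoid of the state-item dot product.
    """
    h = list(object_vec)
    scores, states = [], []
    for item in history_vecs:
        h = gru_step(item, h, p)
        states.append(h)  # intermediate feature information
        scores.append(sigmoid(dot(h, item)))
    return scores, states


def second_relevance(history_vecs, keyword_vecs_per_item, states):
    """Score each keyword of each item, reusing the GRU states."""
    return [
        [sigmoid(dot(state, kw) + dot(item, kw)) for kw in kws]
        for item, kws, state in zip(history_vecs, keyword_vecs_per_item, states)
    ]
```

The point of the sketch is the hand-off: `first_relevance` returns both the item scores and the per-item GRU states, and `second_relevance` takes those states as an extra input instead of recomputing them.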

In a possible implementation manner, the second object feature determination subunit may include:

a second weight information obtaining subunit, configured to input the plurality of historical multimedia and the second keyword into a second attention network to obtain second weight information of the second keyword;

a third weight information obtaining subunit, configured to input the target object and the plurality of historical multimedia into a third attention network to obtain third weight information of the plurality of historical multimedia;

and a second object feature acquisition subunit, configured to determine the second object feature according to the second keyword, the second weight information and the third weight information.
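By way of illustration only, one way of combining the second keyword with the second and third weight information is a two-level weighted sum, sketched below. The sketch assumes the weights have already been produced by the second and third attention networks; the function name `second_object_feature` and the dictionary layout are hypothetical.

```python
def second_object_feature(keyword_vecs, second_weights, third_weights):
    """Aggregate keywords into items, then items into an object feature.

    keyword_vecs:   {item_id: [keyword embedding, ...]}
    second_weights: {item_id: [weight of each keyword for that item]}
    third_weights:  {item_id: weight of that item for the target object}
    Assumption: plain weighted sums at both levels.
    """
    dim = len(next(iter(keyword_vecs.values()))[0])
    feature = [0.0] * dim
    for item_id, vecs in keyword_vecs.items():
        # Level 1: keyword-level aggregation with the second weight information.
        item_vec = [
            sum(w * v[i] for w, v in zip(second_weights[item_id], vecs))
            for i in range(dim)
        ]
        # Level 2: item-level aggregation with the third weight information.
        for i in range(dim):
            feature[i] += third_weights[item_id] * item_vec[i]
    return feature


# Usage with made-up embeddings and weights.
feature = second_object_feature(
    {"m1": [[1.0, 0.0], [0.0, 1.0]], "m2": [[2.0, 2.0]]},
    {"m1": [0.75, 0.25], "m2": [1.0]},
    {"m1": 0.5, "m2": 0.5},
)
# feature == [1.375, 1.125]
```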

In a possible implementation manner, the association relationship graph obtaining module 1201 may include:

a node information acquisition unit, configured to acquire the plurality of multimedia to be recommended, the plurality of historical multimedia on which the target object has performed a preset behavior, first text information associated with the plurality of multimedia to be recommended, and second text information associated with the plurality of historical multimedia;

a word segmentation processing unit, configured to perform word segmentation on the first text information and the second text information respectively to obtain a plurality of first initial keywords and a plurality of second initial keywords;

an importance degree information determining unit, configured to determine first importance degree information of the plurality of first initial keywords with respect to the plurality of multimedia to be recommended, and second importance degree information of the plurality of second initial keywords with respect to the plurality of historical multimedia;

a keyword screening unit, configured to screen the plurality of keywords from the plurality of first initial keywords and the plurality of second initial keywords based on the first importance degree information and the second importance degree information;

and an association relationship graph building unit, configured to build the association relationship graph based on the association relationships among the three types of nodes, namely the target object, the plurality of multimedia (the plurality of historical multimedia and the plurality of multimedia to be recommended) and the plurality of keywords.
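By way of illustration only, the segmentation, importance scoring, screening and graph building steps can be sketched as follows. The sketch assumes whitespace tokenization in place of a real word segmenter, a smoothed TF-IDF score as the importance degree information, and top-k selection as the screening rule; none of these choices is stated in this section, and the names `top_keywords` and `build_graph` are hypothetical.

```python
import math
from collections import Counter


def top_keywords(texts, k=2):
    """Return the k highest-scoring words per item.

    texts: {item_id: text}. Assumption: importance = smoothed TF-IDF
    computed within the given collection of texts.
    """
    docs = {item: text.lower().split() for item, text in texts.items()}
    n = len(docs)
    doc_freq = Counter(w for words in docs.values() for w in set(words))
    result = {}
    for item, words in docs.items():
        if not words:
            result[item] = []
            continue
        tf = Counter(words)
        score = {
            w: (c / len(words)) * (math.log((1 + n) / (1 + doc_freq[w])) + 1.0)
            for w, c in tf.items()
        }
        # Sort by descending score, then alphabetically for a stable result.
        ranked = sorted(score, key=lambda w: (-score[w], w))
        result[item] = ranked[:k]
    return result


def build_graph(user, history_texts, candidate_texts, k=2):
    """Build an edge set over object, multimedia and keyword nodes."""
    edges = {(user, item) for item in history_texts}  # object-history edges
    keyword_map = {}
    keyword_map.update(top_keywords(candidate_texts, k))  # first keywords
    keyword_map.update(top_keywords(history_texts, k))    # second keywords
    for item, words in keyword_map.items():
        for word in words:
            edges.add((item, word))  # multimedia-keyword edges
    return edges


# Usage with made-up titles.
graph = build_graph("u1", {"h1": "cat cat dog"}, {"c1": "dog bird"}, k=1)
```

Importance is computed separately for the texts of the multimedia to be recommended and the texts of the historical multimedia, mirroring the first and second importance degree information described above.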

In one possible implementation, the apparatus may further include:

a sample data acquisition module, configured to acquire a sample association relationship graph and label information, where the sample association relationship graph includes three types of sample nodes, namely a plurality of sample objects, a plurality of sample multimedia and a plurality of sample words, and edges between adjacent sample nodes; the plurality of sample multimedia are multimedia on which the target object has performed a preset behavior, each edge between adjacent sample nodes represents that an association relationship exists between those sample nodes, and the label information represents the degree of association between the sample multimedia and the sample objects;

a sample object feature determining module, configured to determine a sample object feature of each of the plurality of sample objects based on the plurality of sample multimedia and the sample words;

a sample prediction information acquisition module, configured to input the plurality of sample multimedia and the sample object features into a first preset model, and predict the degree of association between the sample objects and the plurality of sample multimedia, to obtain sample prediction information;

and a first relevance prediction model obtaining module, configured to train the first preset model based on the loss information to obtain the first relevance prediction model.
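By way of illustration only, training a preset model from label information and sample prediction information can be sketched as follows. The sketch assumes that the first preset model is a small logistic scorer and that the loss information is the binary cross-entropy between labels and predictions; this section does not state either, and the names `predict`, `bce_loss` and `train` are hypothetical.

```python
import math


def predict(weights, item_vec, object_vec):
    """Predicted association degree in (0, 1) for one (item, object) pair.

    Assumption: the input feature is the element-wise product of the
    sample multimedia embedding and the sample object feature.
    """
    z = sum(w * a * b for w, a, b in zip(weights, item_vec, object_vec))
    return 1.0 / (1.0 + math.exp(-z))


def bce_loss(weights, samples):
    """Mean binary cross-entropy; stands in for the loss information."""
    eps = 1e-12
    total = 0.0
    for item_vec, object_vec, label in samples:
        p = predict(weights, item_vec, object_vec)
        total -= label * math.log(p + eps) + (1 - label) * math.log(1 - p + eps)
    return total / len(samples)


def train(samples, dim, lr=0.5, epochs=200):
    """Full-batch gradient descent on the cross-entropy loss."""
    weights = [0.0] * dim
    for _ in range(epochs):
        grad = [0.0] * dim
        for item_vec, object_vec, label in samples:
            p = predict(weights, item_vec, object_vec)
            for i in range(dim):
                # d(loss)/d(w_i) = (p - label) * feature_i
                grad[i] += (p - label) * item_vec[i] * object_vec[i]
        weights = [w - lr * g / len(samples) for w, g in zip(weights, grad)]
    return weights


# Usage with two made-up labelled samples: (item, object feature, label).
samples = [([1.0, 0.0], [1.0, 0.0], 1), ([0.0, 1.0], [0.0, 1.0], 0)]
trained = train(samples, dim=2)
```

After training, the scorer assigns a higher association degree to the positively labelled pair than to the negatively labelled one, which is the behavior the first relevance prediction model is expected to learn.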

With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.

FIG. 13 is a block diagram illustrating an electronic device for multimedia recommendation according to an exemplary embodiment. The electronic device may be a server, and its internal structure may be as shown in FIG. 13. The electronic device includes a processor, a memory, and a network interface connected by a system bus. The processor of the electronic device is configured to provide computing and control capabilities. The memory of the electronic device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The network interface of the electronic device is used to communicate with an external terminal through a network connection. The computer program, when executed by the processor, implements a multimedia recommendation method.

Those skilled in the art will appreciate that the architecture shown in FIG. 13 is merely a block diagram of some of the structures associated with the disclosed aspects and does not constitute a limitation on the electronic devices to which the disclosed aspects apply; a particular electronic device may include more or fewer components than those shown, combine certain components, or have a different arrangement of components.

In an exemplary embodiment, there is also provided an electronic device including: a processor; a memory for storing the processor-executable instructions; wherein the processor is configured to execute the instructions to implement the multimedia recommendation method as in the embodiments of the present disclosure.

In an exemplary embodiment, there is also provided a computer-readable storage medium, in which instructions, when executed by a processor of an electronic device, enable the electronic device to perform a multimedia recommendation method in an embodiment of the present disclosure. The computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.

In an exemplary embodiment, a computer program product containing instructions that, when run on a computer, cause the computer to perform the method of multimedia recommendation in embodiments of the present disclosure is also provided.

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above may be implemented by a computer program instructing relevant hardware. The computer program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice in the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims

1. A method for multimedia recommendation, comprising:

acquiring an association relationship graph, wherein the association relationship graph comprises three types of nodes, namely a target object, a plurality of multimedia and a plurality of keywords, and edges between adjacent nodes; the plurality of multimedia comprises a plurality of multimedia to be recommended, which are acquired by recall and matched with the target object, and a plurality of historical multimedia on which the target object has performed a preset behavior, and the edges represent that an association relationship exists between the adjacent nodes; the target object refers to a target user, and the target user is one of a plurality of users of a multimedia recommendation platform;

pruning the association relationship graph to obtain a target subgraph; the target subgraph comprises the target object, the plurality of multimedia to be recommended, target historical multimedia and target keywords, the target historical multimedia is at least one of the plurality of historical multimedia whose interaction information with the target object is greater than or equal to an interaction threshold, and the target keywords are at least one of the plurality of keywords;

2. The method according to claim 1, wherein the step of determining the respective multimedia features of the plurality of to-be-recommended multimedia according to the first keyword in the association relationship graph, which has an association relationship with the plurality of to-be-recommended multimedia, comprises:

and carrying out weighting processing on the first keywords corresponding to the multiple to-be-recommended multimedia respectively according to the first weight information to obtain the multimedia characteristics of the multiple to-be-recommended multimedia respectively.

3. The multimedia recommendation method according to claim 1, wherein the degree of association between the target historical multimedia and the target object satisfies a first preset condition; and the relevance of the target keyword and the target historical multimedia meets a second preset condition.

4. The method according to claim 1 or 3, wherein the step of pruning the association graph to obtain a target subgraph comprises:

determining first association degree information of each of the plurality of historical multimedia under the target object and second association degree information of a second keyword under the plurality of historical multimedia; the second keyword is a keyword having an association relationship with the plurality of historical multimedia;

and pruning edges and/or nodes in the association relationship graph based on the first association degree information and the second association degree information to obtain the target subgraph.

5. The method of claim 4, wherein the step of determining a first association degree information of each of the plurality of historical multimedia under the target object and a second association degree information of a second keyword under the plurality of historical multimedia comprises:

6. The multimedia recommendation method according to claim 5, wherein said determining a second object feature of said target object based on said plurality of historical multimedia and said second keyword comprises:

7. The multimedia recommendation method according to claim 1, wherein said obtaining an association graph step comprises:

and constructing the incidence relation graph based on the incidence relation among the target object, the plurality of historical multimedia, the plurality of multimedia to be recommended and the plurality of key words.

8. The method of claim 5, further comprising:

acquiring a sample association relationship graph and label information, wherein the sample association relationship graph comprises three types of sample nodes, namely a plurality of sample objects, a plurality of sample multimedia and a plurality of sample words, and edges between adjacent sample nodes; the plurality of sample multimedia are multimedia on which the target object has performed a preset behavior, the edges between the adjacent sample nodes represent that the adjacent sample nodes have an association relationship, and the label information represents the degree of association between the sample multimedia and the sample objects;

determining sample object features of each of the plurality of sample objects based on the plurality of sample multimedia and the sample words;

repeating the step of inputting the plurality of sample multimedia and the sample object characteristics into a first preset model until preset times, and performing statistical processing on sample prediction information corresponding to the preset times to obtain target sample prediction information;

and training the first preset model based on the loss information to obtain the first relevance prediction model.

9. A multimedia recommendation apparatus, comprising:

an association relationship graph acquisition module, configured to acquire an association relationship graph, wherein the association relationship graph comprises three types of nodes, namely a target object, a plurality of multimedia and a plurality of keywords, and edges between adjacent nodes; the plurality of multimedia comprises a plurality of multimedia to be recommended, which are acquired by recall and matched with the target object, and a plurality of historical multimedia on which the target object has performed a preset behavior, and the edges represent that an association relationship exists between the adjacent nodes; the target object refers to a target user, and the target user is one of a plurality of users of a multimedia recommendation platform;

a pruning module, configured to prune the association relationship graph to obtain a target subgraph; the target subgraph comprises the target object, the plurality of multimedia to be recommended, target historical multimedia and target keywords, the target historical multimedia is at least one of the plurality of historical multimedia whose interaction information with the target object is greater than or equal to an interaction threshold, and the target keywords are at least one of the plurality of keywords;

10. The multimedia recommendation apparatus according to claim 9, wherein the multimedia feature determination module comprises:

and a multimedia feature determining unit, configured to weight the first keywords corresponding to each of the plurality of multimedia to be recommended according to the first weight information, to obtain the respective multimedia features of the plurality of multimedia to be recommended.

11. The multimedia recommendation device according to claim 9, wherein the degree of association between the target historical multimedia and the target object satisfies a first preset condition; and the relevance of the target keyword and the target historical multimedia meets a second preset condition.

12. The multimedia recommendation device according to claim 9 or 11, wherein the pruning module comprises:

an association degree determining unit, configured to determine first association degree information of each of the plurality of historical multimedia under the target object and second association degree information of a second keyword under the plurality of historical multimedia; the second keyword is a keyword having an association relationship with the plurality of historical multimedia;

and a target subgraph acquisition unit, configured to prune edges and/or nodes in the association relationship graph based on the first association degree information and the second association degree information to obtain the target subgraph.

13. The multimedia recommendation apparatus according to claim 12, wherein the association degree determining unit comprises:

14. The multimedia recommendation device according to claim 13, wherein the second object feature determination subunit comprises:

15. The multimedia recommendation device according to claim 9, wherein the association relationship graph acquisition module comprises:

a node information obtaining unit, configured to obtain the plurality of multimedia to be recommended, the plurality of historical multimedia on which the target object has performed a preset behavior, first text information associated with the plurality of multimedia to be recommended, and second text information associated with the plurality of historical multimedia;

and an association relationship graph building unit, configured to build the association relationship graph based on the association relationships among the target object, the plurality of historical multimedia, the plurality of multimedia to be recommended and the plurality of keywords.

16. The multimedia recommendation device of claim 13, further comprising:

a sample object feature determination module configured to perform a determination of sample object features for each of the plurality of sample objects based on the plurality of sample multimedia and the sample word;

the target sample prediction information acquisition module is configured to repeat the step of inputting the plurality of sample multimedia and the sample object characteristics into the first preset model until preset times, and perform statistical processing on sample prediction information corresponding to the preset times to obtain target sample prediction information;

17. An electronic device, comprising:

a processor;

a memory for storing the processor-executable instructions;

wherein the processor is configured to execute the instructions to implement the multimedia recommendation method of any of claims 1-8.

18. A computer-readable storage medium, wherein instructions in the computer-readable storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the multimedia recommendation method of any of claims 1-8.