CN106484777B - Multimedia data processing method and device - Google Patents

Info

Publication number
CN106484777B
Authority
CN
China
Prior art keywords
multimedia data
user
target
hidden
operation behavior
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610821282.0A
Other languages
Chinese (zh)
Other versions
CN106484777A (en
Inventor
黄昕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201610821282.0A
Publication of CN106484777A
Application granted
Publication of CN106484777B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40 Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/43 Querying
    • G06F16/435 Filtering based on additional data, e.g. user or group profiles

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

An embodiment of the invention discloses a multimedia data processing method and device. The method comprises: generating a multimedia data operation behavior matrix according to the operation behaviors of a historical user group on a plurality of multimedia data in a preset multimedia database; calculating, based on a sparse self-coding neural network and according to the multimedia data operation behavior matrix, the implicit feature vectors corresponding to the respective multimedia data and the user feature vectors corresponding to the respective historical users; and, when a recommendation request corresponding to a target user is received and the historical user group includes the target user, acquiring a plurality of multimedia data in the personal operation behavior information of the target user and performing recommendation processing on them according to the user feature vector corresponding to the target user and the implicit feature vector corresponding to each multimedia data in the personal operation behavior information. With the invention, the recommended songs can be ensured to be songs liked by the user, so that the recommendation effect is improved.

Description

Multimedia data processing method and device
Technical Field
The present invention relates to the field of internet technologies, and in particular, to a multimedia data processing method and apparatus.
Background
With the development of internet technology, a wide variety of applications have emerged, such as instant messaging applications, gaming applications, and multimedia data applications. Taking a multimedia data application as an example, a user can listen to various songs through the application, and the application can infer which songs the user likes and recommend corresponding songs to the user. Currently, songs that a user likes are typically inferred as follows: songs collected (or downloaded) by the user are regarded as songs the user likes, so songs similar to the collected (or downloaded) songs are presumed to be liked as well and are recommended to the user. When the user has not collected (or downloaded) any songs, songs that were played in full are regarded as songs the user likes, and similar songs are recommended on that basis. However, a song that was played in full is not necessarily a song the user actually listened to (for example, the user may have temporarily left the computer while its music player kept playing), and therefore does not necessarily represent a song the user likes. If completely played songs are directly identified as liked songs, it cannot be guaranteed that the recommended songs are songs the user likes, and the recommendation effect is poor.
Disclosure of Invention
The embodiment of the invention provides a method and a device for processing multimedia data, which can ensure that recommended songs are songs liked by a user so as to improve the recommendation effect.
The embodiment of the invention provides a multimedia data processing method, which comprises the following steps:
generating a multimedia data operation behavior matrix according to the operation behaviors of a historical user group on a plurality of multimedia data in a preset multimedia database;
based on a sparse self-coding neural network, calculating implicit feature vectors corresponding to the respective multimedia data and user feature vectors corresponding to the respective historical users according to the multimedia data operation behavior matrix; an implicit feature vector characterizes the preference degree information of the historical user group for one multimedia data; a user feature vector characterizes the preference degree information of one historical user for the plurality of multimedia data;
when a recommendation request corresponding to a target user is received and the historical user group comprises the target user, acquiring a plurality of multimedia data in the personal operation behavior information of the target user, and recommending the plurality of multimedia data in the personal operation behavior information according to a user feature vector corresponding to the target user and an implicit feature vector corresponding to each multimedia data in the personal operation behavior information.
Correspondingly, an embodiment of the present invention further provides a multimedia data processing apparatus, including:
the matrix generation module is used for generating a multimedia data operation behavior matrix according to the operation behaviors of the historical user group on a plurality of multimedia data in a preset multimedia database;
the feature calculation module is used for calculating, based on the sparse self-coding neural network and according to the multimedia data operation behavior matrix, implicit feature vectors corresponding to the respective multimedia data and user feature vectors corresponding to the respective historical users; an implicit feature vector characterizes the preference degree information of the historical user group for one multimedia data; a user feature vector characterizes the preference degree information of one historical user for the plurality of multimedia data;
and the recommending module is used for acquiring a plurality of multimedia data in the personal operation behavior information of the target user when a recommending request corresponding to the target user is received and the historical user group comprises the target user, and recommending the plurality of multimedia data in the personal operation behavior information according to the user characteristic vector corresponding to the target user and the implicit characteristic vector corresponding to each multimedia data in the personal operation behavior information.
An embodiment of the invention generates a multimedia data operation behavior matrix according to the operation behaviors of the historical user group on a plurality of multimedia data in a preset multimedia database and, based on a sparse self-coding neural network, calculates from that matrix the implicit feature vectors corresponding to the respective multimedia data and the user feature vectors corresponding to the respective historical users. An implicit feature vector can therefore accurately represent the preference degree information of the historical user group for one multimedia data, and a user feature vector can accurately represent the preference degree information of one historical user for the plurality of multimedia data. With the implicit feature vectors and the user feature vectors, accurate personalized recommendation can be realized for the target user and the recommended songs can be ensured to be songs liked by the target user, so that the recommendation effect is improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. Obviously, the drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a network architecture according to an embodiment of the present invention;
FIG. 2 is a flow chart illustrating a multimedia data processing method according to an embodiment of the present invention;
FIG. 3 is a flow chart of another multimedia data processing method according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a multimedia data processing apparatus according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a feature calculation module according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a recommendation module according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of an implicit feature generation unit according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a user feature generation unit according to an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of another multimedia data processing apparatus according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
FIG. 1 is a schematic diagram of a network architecture according to an embodiment of the present invention. The network architecture may include a background server and a plurality of clients, and each client may be connected to the background server through a network. The background server may treat the users corresponding to the clients as a historical user group, and may further collect the operation behaviors corresponding to each client (for example, the song listening behavior of a client, that is, the songs listened to by the user corresponding to that client). The background server can therefore generate a multimedia data operation behavior matrix according to the operation behaviors of the historical user group on a plurality of multimedia data in a preset multimedia database and, based on the sparse self-coding neural network, calculate from that matrix the implicit feature vectors corresponding to the respective multimedia data and the user feature vectors corresponding to the respective historical users; an implicit feature vector characterizes the preference degree information of the historical user group for one multimedia data; a user feature vector characterizes the preference degree information of one historical user for the plurality of multimedia data. When a client sends a recommendation request to the background server, the background server may obtain a plurality of multimedia data in the personal operation behavior information of the target user corresponding to that client, and perform recommendation processing on them according to the user feature vector corresponding to the target user and the implicit feature vector corresponding to each multimedia data in the personal operation behavior information.
Referring to fig. 2, a flow chart of a multimedia data processing method according to an embodiment of the present invention is shown, where the method includes:
S101, generating a multimedia data operation behavior matrix according to operation behaviors of a historical user group on a plurality of multimedia data in a preset multimedia database;
Specifically, the background server of the multimedia data application may obtain the operation behaviors of the historical user group on a plurality of multimedia data in a preset multimedia database. The multimedia database may be a music library and each multimedia data may be a song in the music library, so the operation behaviors may include the listening behavior of each historical user in the historical user group toward each multimedia data. The listening behaviors include a listened behavior and a not-listened behavior, and the background server may set different feature values for different listening behaviors to generate the multimedia data operation behavior matrix corresponding to the operation behaviors. Table 1 below shows the feature values of the operation behaviors of a historical user group on a plurality of multimedia data in a preset multimedia database:
          Song 1   Song 2   Song 3   Song 4
User 1    1        0        0        1
User 2    1        1        1        0
User 3    0        0        1        0
User 4    1        1        1        1
TABLE 1
In Table 1, the feature value "1" indicates that the user has listened to the song and the feature value "0" indicates that the user has not. For example, the feature value between user 1 and song 1 is "1", which indicates that user 1 has listened to song 1; the feature value between user 3 and song 2 is "0", which indicates that user 3 has not listened to song 2. A multimedia data operation behavior matrix corresponding to the operation behaviors can therefore be generated from all the feature values in Table 1; that is, each element $P_{ui}$ of the matrix is the feature value corresponding to the listening behavior of user u toward song i ($P_{ui} = 1$ indicates that user u has listened to song i; $P_{ui} = 0$ indicates that user u has not listened to song i).
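As an illustration only, the construction of the operation behavior matrix from Table 1 can be sketched in Python as follows. The log format (a list of user/song pairs) and the names listen_log and build_behavior_matrix are assumptions made for this sketch and do not come from the embodiment.

```python
# Hypothetical listening log: each (user, song) pair means "user listened to song".
listen_log = [
    ("User 1", "Song 1"), ("User 1", "Song 4"),
    ("User 2", "Song 1"), ("User 2", "Song 2"), ("User 2", "Song 3"),
    ("User 3", "Song 3"),
    ("User 4", "Song 1"), ("User 4", "Song 2"),
    ("User 4", "Song 3"), ("User 4", "Song 4"),
]
users = ["User 1", "User 2", "User 3", "User 4"]
songs = ["Song 1", "Song 2", "Song 3", "Song 4"]


def build_behavior_matrix(log, users, songs):
    """Return P where P[u][i] is 1 if user u listened to song i, else 0."""
    listened = set(log)
    return [[1 if (u, s) in listened else 0 for s in songs] for u in users]


P = build_behavior_matrix(listen_log, users, songs)
for user, row in zip(users, P):
    print(user, row)
```

Each row of P corresponds to one historical user and each column to one song, matching the layout of Table 1.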
S102, calculating implicit feature vectors corresponding to the respective multimedia data and user feature vectors corresponding to the respective historical users according to the multimedia data operation behavior matrix based on a sparse self-coding neural network; an implicit feature vector characterizes the preference degree information of the historical user group for one multimedia data; a user feature vector characterizes the preference degree information of one historical user for the plurality of multimedia data;
Specifically, the background server may input the multimedia data operation behavior matrix to the input layer of a sparse self-encoder corresponding to the sparse self-coding neural network, that is, input the feature value corresponding to each listening behavior in the matrix to the input layer. The sparse self-encoder comprises the input layer, a hidden layer, an output layer, and target parameters between the hidden layer and the output layer. The hidden layer comprises a preset number of hidden nodes; this number can be chosen (possibly from empirical values) by balancing the computational efficiency of the background server against the accuracy with which users and songs are characterized. The dimension of the input layer is the same as the dimension of the output layer.
The sparse self-encoder can perform partial derivative training on the target parameters and the hidden parameters of the hidden nodes according to the parameters (namely the multimedia data operation behavior matrix) in the input layer and a preset target function for training the target parameters and the hidden parameters of the hidden nodes; wherein the objective function may be:
[Equation image BDA0001113170140000051: objective function]
where x is the multimedia data operation behavior matrix. A is the matrix composed of $W_1^{(1)}, W_2^{(1)}, W_3^{(1)}, \ldots, W_{K+1}^{(1)}$, which are the hidden parameters of the K+1 hidden nodes respectively (for example, $W_1^{(1)}$ is the hidden parameter of the first hidden node); the last of the K+1 hidden nodes is an intercept term whose hidden parameter is 1, and retaining the intercept term allows the output layer to be reconstructed. s is the matrix composed of $b_{N1}, b_{N2}, b_{N3}, \ldots, b_{NK}$. In the embodiment of the present invention, the number K+1 of hidden nodes has been specified, so the sparsity requirement on the hidden layer can be relaxed and the 1-norm of s is approximated by the 2-norm. A user not having listened to a song does not mean that the user dislikes it, and likewise a user having listened to a song does not directly indicate that the user likes it. The embodiment of the present invention therefore adds a user interest factor item C to the objective function to increase the correlation between the trained s matrix in the hidden layer and the users' preference for songs. The user interest factor item C comprises the interest values $c_{ui}$ of the respective historical users in the respective multimedia data in the multimedia database ($c_{ui}$ represents the interest value of user u in song i); an interest value is calculated from the operation behavior type, the number of operations, and the complete operation rate of one historical user on one multimedia data. Each interest value $c_{ui}$ in the user interest factor item C may be calculated as $c_{ui} = 1 + \alpha \log(1 + r_{ui})$, where: when user u has directly collected/downloaded song i, $r_{ui} = 1$; when user u has only listened to song i (i.e., without collecting/downloading it), $r_{ui} = [\min(n_{ui}, 5)/5] \cdot f_{ui}$, where $n_{ui}$ is the number of times user u has listened to song i and $f_{ui}$ is the full listening rate (i.e., the complete operation rate), $f_{ui}$ = (number of people who listened to song i in full) / (number of all people who listened to song i) (i.e., within the historical user group); when user u has not listened to song i, $c_{ui} = 0$.
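The interest value calculation can be illustrated with a short Python sketch. The function name interest_value, its parameter names, the default alpha of 1.0, and the use of the natural logarithm are assumptions made here; the description does not fix the value of α or the base of the logarithm.

```python
import math


def interest_value(collected, play_count, full_listen_rate, alpha=1.0):
    """Interest value c_ui of user u in song i.

    collected        -- True if the user collected/downloaded the song
    play_count       -- n_ui, number of times the user listened to the song
    full_listen_rate -- f_ui, share of listeners who played the song in full
    alpha            -- weighting constant (1.0 is an assumed value)
    """
    if not collected and play_count == 0:
        return 0.0  # the user never listened to the song: c_ui = 0
    if collected:
        r = 1.0  # collected/downloaded: r_ui = 1
    else:
        # only listened: play count is capped at 5, then scaled by f_ui
        r = min(play_count, 5) / 5 * full_listen_rate
    return 1 + alpha * math.log(1 + r)


print(interest_value(True, 3, 0.4))    # collected song
print(interest_value(False, 2, 0.5))   # listened twice, never collected
print(interest_value(False, 0, 0.5))   # never listened
```

Because the play count is capped at 5, listening to a song more than five times does not raise its interest value further in this sketch.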
During partial derivative training, the sparse self-encoder may repeatedly alternate between step S1 and step S2. Step S1: fix A, take the partial derivative with respect to s, and obtain the optimal solution by the least squares method. Step S2: fix s, take the partial derivative with respect to A, and obtain the optimal solution by the least squares method.
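The alternation between step S1 and step S2 can be illustrated with a deliberately simplified Python sketch. It assumes a single hidden factor per user and per song and an assumed objective (squared reconstruction error plus a 2-norm penalty with a hypothetical weight lam); it does not include the user interest factor item C and is a stand-in for illustration, not the training procedure of the embodiment. The names objective and als_rank_one are illustrative.

```python
def objective(x, a, s, lam):
    """Assumed objective: squared reconstruction error plus an L2 penalty."""
    err = sum((x[u][i] - s[u] * a[i]) ** 2
              for u in range(len(s)) for i in range(len(a)))
    return err + lam * (sum(v * v for v in s) + sum(v * v for v in a))


def als_rank_one(x, lam=0.1, steps=20):
    """Alternate closed-form least-squares updates for s (users) and a (songs)."""
    n_users, n_songs = len(x), len(x[0])
    a = [1.0] * n_songs
    s = [1.0] * n_users
    history = [objective(x, a, s, lam)]
    for _ in range(steps):
        # Step S1: fix a and solve the regularized least-squares problem for s.
        aa = sum(v * v for v in a)
        s = [sum(a[i] * x[u][i] for i in range(n_songs)) / (aa + lam)
             for u in range(n_users)]
        # Step S2: fix s and solve the regularized least-squares problem for a.
        ss = sum(v * v for v in s)
        a = [sum(s[u] * x[u][i] for u in range(n_users)) / (ss + lam)
             for i in range(n_songs)]
        history.append(objective(x, a, s, lam))
    return a, s, history


x = [[1, 0, 0, 1], [1, 1, 1, 0], [0, 0, 1, 0], [1, 1, 1, 1]]
a, s, history = als_rank_one(x)
print(history[0], history[-1])
```

Since each step solves its sub-problem exactly while the other factor is held fixed, the assumed objective never increases from one iteration to the next, which is what makes the alternating scheme usable with a convergence check.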
Further, when the parameters in the output layer of the sparse self-encoder are similar to the parameters in the input layer (that is, the matching degree between them reaches a preset matching degree threshold), it is determined that the target parameters and the hidden parameters of the hidden nodes satisfy a convergence condition. At this point the sparse self-encoder stops executing step S1 and step S2 and combines the hidden parameters of the hidden nodes that satisfy the convergence condition into an implicit feature matrix; that is, the implicit feature matrix is the trained matrix A. To ensure the working efficiency of the background server, the dimension of the hidden layer in the sparse self-encoder is smaller than the dimension of the input layer, so the hidden parameters in the hidden layer can be regarded as a compressed form of the parameters in the input layer; that is, the implicit feature matrix is a compressed matrix corresponding to the multimedia data operation behavior matrix. For example, the first column of the implicit feature matrix may be determined as the implicit feature vector of song 1, the second column as the implicit feature vector of song 2, and so on, where the implicit feature vector of song 1 characterizes the preference degree information of the historical user group for song 1.
Furthermore, because there is only one sparse self-encoder, the user feature vectors corresponding to the historical users in the historical user group can be extracted from the parameter matrix s corresponding to the target parameters trained in that sparse self-encoder. After the background server has calculated the user feature vectors corresponding to the historical users in the historical user group and the implicit feature vectors corresponding to the multimedia data in the multimedia database, it can store them to facilitate subsequent personalized song recommendation based on these vectors.
S103, when a recommendation request corresponding to a target user is received and the historical user group comprises the target user, acquiring a plurality of multimedia data in the personal operation behavior information of the target user, and performing recommendation processing on the plurality of multimedia data in the personal operation behavior information according to a user feature vector corresponding to the target user and implicit feature vectors corresponding to the multimedia data in the personal operation behavior information;
Specifically, when a recommendation request corresponding to a target user is received and the historical user group includes the target user, the background server can detect whether the personal operation behavior information corresponding to the target user contains collected multimedia data. If it does, first similar multimedia data corresponding to the collected multimedia data (i.e., the recommendation source) is acquired and used as the recommendation data of the target user. If it does not, the background server further determines whether the personal operation behavior information contains completely operated multimedia data (i.e., songs listened to in full). When it does, a dot product operation is performed on the user feature vector corresponding to the target user and the implicit feature vector corresponding to the completely operated multimedia data to obtain a personalized feature value; when the personalized feature value is greater than a preset feature value threshold, second similar multimedia data corresponding to the completely operated multimedia data (i.e., the recommendation source) is acquired and used as the recommendation data of the target user.
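The selection of recommendation sources described above can be sketched in Python as follows. The function name pick_recommendation_sources, the data layout (lists of song names and a mapping from song name to implicit feature vector), and the example numbers are assumptions for illustration; how similar multimedia data is then looked up for each source is outside this sketch.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def pick_recommendation_sources(collected, fully_played, user_vec,
                                song_vecs, threshold):
    """Return the songs to use as recommendation sources.

    collected    -- songs the target user collected/downloaded
    fully_played -- songs the target user listened to in full
    user_vec     -- user feature vector of the target user
    song_vecs    -- mapping: song -> implicit feature vector
    threshold    -- preset feature value threshold
    """
    if collected:
        # Collected songs are used directly as recommendation sources.
        return list(collected)
    # Otherwise keep only fully played songs whose personalized feature
    # value (dot product) exceeds the threshold.
    return [song for song in fully_played
            if dot(user_vec, song_vecs[song]) > threshold]


user_vec = [0.5, 1.0]
song_vecs = {"Song A": [1.0, 1.0], "Song B": [0.2, 0.1]}
sources = pick_recommendation_sources([], ["Song A", "Song B"],
                                      user_vec, song_vecs, threshold=0.5)
print(sources)
```

In this example Song B was played in full but its personalized feature value falls below the threshold, so it is not treated as a song the user likes.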
Optionally, when the personal operation behavior information corresponding to the target user does not contain completely operated multimedia data, a plurality of candidate multimedia data are obtained, and a dot product operation is performed on the user feature vector corresponding to the target user and the implicit feature vector corresponding to each candidate multimedia data to obtain the personalized feature value corresponding to each candidate multimedia data. The candidate multimedia data are then sorted in descending order of personalized feature value, and a preset recommendation number of candidate multimedia data is taken as the recommendation data of the target user according to the sorting result. For example, if the personal operation behavior information of user A does not contain any song listened to in full and the candidate songs are song A, song B, and song C, the user feature vector of user A and the implicit feature vector of song A are dot-multiplied to obtain personalized feature value a; the user feature vector of user A and the implicit feature vector of song B are dot-multiplied to obtain personalized feature value b; and the user feature vector of user A and the implicit feature vector of song C are dot-multiplied to obtain personalized feature value c. If sorting the personalized feature values gives b > c > a and the preset recommendation number is 2, song B (corresponding to personalized feature value b) and song C (corresponding to personalized feature value c) can be recommended to user A.
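The ranking of candidate multimedia data can be illustrated with the following Python sketch. The function name recommend_top_n and the vector values are hypothetical and were chosen so that the ordering matches the example above (song B first, then song C, then song A).

```python
def recommend_top_n(user_vec, candidate_vecs, n):
    """Rank candidates by dot product with the user vector; keep the top n."""
    scores = {
        song: sum(u * v for u, v in zip(user_vec, vec))
        for song, vec in candidate_vecs.items()
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n]


user_a = [1.0, 2.0]                      # hypothetical user feature vector
candidates = {
    "Song A": [0.1, 0.1],                # personalized feature value a
    "Song B": [0.5, 1.0],                # personalized feature value b
    "Song C": [1.0, 0.2],                # personalized feature value c
}
top = recommend_top_n(user_a, candidates, n=2)
print(top)
```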
Optionally, an explicit feature value may also be taken into account when calculating the personalized feature value; that is, the personalized feature value of user u for song i = the explicit feature value of user u for song i + the dot product of the user feature vector of user u and the implicit feature vector of song i. The explicit feature value can be calculated from the matching degree between the tags of the song and the tags of the user. The tags of a song may include the music style, language, rhythm type, etc. of the song; the tags of a user may include the music styles, languages, rhythm types, etc. that the user likes. Adding the explicit feature value allows the personalized feature value to describe more accurately how much user u likes song i.
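A minimal Python sketch of combining the explicit and implicit parts is given below. The description does not specify how the matching degree between tags is computed; the Jaccard overlap used here is an assumption, as are the names tag_match and personalized_value and the example tags.

```python
def tag_match(song_tags, user_tags):
    """Assumed matching degree: Jaccard overlap between two tag sets."""
    union = song_tags | user_tags
    return len(song_tags & user_tags) / len(union) if union else 0.0


def personalized_value(user_vec, song_vec, song_tags, user_tags):
    """Explicit feature value plus dot product of user and song vectors."""
    implicit = sum(u * v for u, v in zip(user_vec, song_vec))
    return tag_match(song_tags, user_tags) + implicit


song_tags = {"rock", "english", "fast"}   # hypothetical tags of song i
user_tags = {"rock", "english", "slow"}   # hypothetical tags of user u
value = personalized_value([1.0, 0.0], [0.25, 0.9], song_tags, user_tags)
print(value)
```

With these inputs two of the four distinct tags match, so the explicit part contributes 0.5 on top of the dot product.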
Optionally, when a recommendation request corresponding to a target user is received and the historical user group does not include the target user, the target user has not yet listened to any song on the multimedia data platform provided by the background server. In this case the candidate multimedia data may be sorted in descending order of the vector values of their implicit feature vectors, and a preset recommendation number of candidate multimedia data is taken as the recommendation data of the target user according to the sorting result.
Optionally, the background server may periodically recalculate and update the user feature vectors corresponding to the respective historical users and the implicit feature vectors corresponding to the respective multimedia data according to new operation behaviors, so that the implicit feature vectors and the user feature vector obtained when recommending to a target user always track the user's changing interests, which ensures the accuracy of the calculated personalized feature value. The recommendation source of the target user can be updated each time the target user listens to a song, so that when songs are subsequently recommended to the target user, multimedia data similar to the most recently updated recommendation source can be acquired directly and recommended to the user, which improves recommendation efficiency. The user interface of the client can also display "Song B is recommended based on song A, which you listened to", where song A is the recommendation source and song B is multimedia data similar to the recommendation source.
An embodiment of the invention generates a multimedia data operation behavior matrix according to the operation behaviors of the historical user group on a plurality of multimedia data in a preset multimedia database and, based on a sparse self-coding neural network, calculates from that matrix the implicit feature vectors corresponding to the respective multimedia data and the user feature vectors corresponding to the respective historical users. An implicit feature vector can therefore accurately represent the preference degree information of the historical user group for one multimedia data, and a user feature vector can accurately represent the preference degree information of one historical user for the plurality of multimedia data. With the implicit feature vectors and the user feature vectors, accurate personalized recommendation can be realized for the target user and the recommended songs can be ensured to be songs liked by the target user, so that the recommendation effect is improved.
Referring to fig. 3 again, a flow chart of another multimedia data processing method according to an embodiment of the present invention is shown, where the method includes:
S201, generating a multimedia data operation behavior matrix according to operation behaviors of a historical user group on a plurality of multimedia data in a preset multimedia database;
Specifically, for a specific implementation of step S201, refer to S101 in the embodiment corresponding to FIG. 2; details are not repeated here.
S202, inputting the multimedia data operation behavior matrix to an input layer of a sparse self-encoder corresponding to the sparse self-coding neural network; the sparse self-encoder comprises the input layer, a hidden layer, an output layer and target parameters between the hidden layer and the output layer; the hidden layer comprises a preset number of hidden nodes;
Specifically, the background server may input the multimedia data operation behavior matrix to the input layer of a sparse self-encoder corresponding to the sparse self-coding neural network, that is, input the feature value corresponding to each listening behavior in the matrix to the input layer. The sparse self-encoder comprises the input layer, a hidden layer, an output layer, and target parameters between the hidden layer and the output layer. The hidden layer comprises a preset number of hidden nodes; this number can be chosen (possibly from empirical values) by balancing the computational efficiency of the background server against the accuracy with which users and songs are characterized. The dimension of the input layer is the same as the dimension of the output layer.
S203, the sparse self-encoder performs partial derivative training on the target parameters and the hidden parameters of the hidden nodes according to the parameters in the input layer and a preset objective function for training the target parameters and the hidden parameters of the hidden nodes;
The sparse self-encoder can perform partial derivative training on the target parameters and the hidden parameters of the hidden nodes according to the parameters in the input layer (namely the multimedia data operation behavior matrix) and a preset objective function for training the target parameters and the hidden parameters of the hidden nodes; the objective function may be:
Figure BDA0001113170140000091
wherein x is the multimedia data operation behavior matrix; A is a matrix composed of $W_1^{(1)}, W_2^{(1)}, W_3^{(1)}, \ldots, W_{K+1}^{(1)}$, which are the hidden parameters of the K+1 hidden nodes respectively (e.g., $W_1^{(1)}$ represents the hidden parameter of the first hidden node); the last of the K+1 hidden nodes is an intercept term whose hidden parameter is 1, and retaining the intercept term allows the output layer to be reconstructed. s is a matrix composed of $b_{N1}, b_{N2}, b_{N3}, \ldots, b_{NK}$. In the embodiment of the present invention, the number K+1 of hidden nodes has been specified, so the sparsity requirement on the hidden layer can be relaxed and the 1-norm of s is approximated by the 2-norm. The fact that a user has not listened to a song does not mean that the user dislikes it, and, similarly, the fact that a user has listened to a song does not directly indicate that the user likes it. The embodiment of the present invention therefore adds a user interest factor item C to the objective function to increase the correlation between the trained s matrix in the hidden layer and the users' preference for songs. The user interest factor item C comprises the interest values $c_{ui}$ of the historical users in the multimedia data in the multimedia database ($c_{ui}$ represents the interest of user u in song i); an interest value is calculated based on the operation behavior type, the number of operations and the complete operation rate of a historical user on a multimedia data. Each interest value $c_{ui}$ in the user interest factor item C may be calculated as $c_{ui} = 1 + \alpha \log(1 + r_{ui})$, wherein: when user u directly collects/downloads song i, $r_{ui} = 1$; when user u has only listened to song i (i.e., no collection/download), $r_{ui} = [\min(n_{ui}, 5)/5] \cdot f_{ui}$, where $n_{ui}$ is the number of times user u has listened to song i and $f_{ui}$ is the full listening rate (i.e., the complete operation rate), $f_{ui}$ = (number of people who listened to song i in full) / (number of all people who listened to song i, i.e., within the historical user group); when user u has not listened to song i, $c_{ui} = 0$.
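The interest value rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and parameter names are hypothetical, the default value of α is arbitrary, and the logarithm is taken as the natural logarithm because the text does not specify a base.

```python
import math


def interest_value(collected, listen_count, full_listen_rate, alpha=1.0):
    """Sketch of c_ui = 1 + alpha * log(1 + r_ui) for one (user, song) pair.

    collected:        True if the user collected/downloaded the song.
    listen_count:     n_ui, number of times the user listened to the song.
    full_listen_rate: f_ui, share of listeners who played the song in full.
    alpha:            weighting constant (value assumed, not from the source).
    """
    if collected:
        r = 1.0                                   # collected/downloaded: r_ui = 1
    elif listen_count > 0:
        r = (min(listen_count, 5) / 5) * full_listen_rate
    else:
        return 0.0                                # never listened: c_ui = 0
    return 1.0 + alpha * math.log(1.0 + r)        # natural log is an assumption
```

Capping the listen count at 5 means repeated plays beyond the fifth no longer raise the interest value, which matches the min(n_ui, 5) term in the formula.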
During partial derivative training, the sparse self-encoder may repeatedly alternate between step S1 and step S2. Step S1: fix A, take the derivative with respect to s, and obtain the optimal solution by the least squares method. Step S2: fix s, take the derivative with respect to A, and obtain the optimal solution by the least squares method.
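The alternation between S1 and S2 can be illustrated with a deliberately simplified sketch. The objective function of the embodiment is given only as a figure, so the loss used here is an assumption: a weighted squared reconstruction error plus a 2-norm penalty, with each side reduced to one scalar per row or column so that every least squares step has a closed form. All names are hypothetical.

```python
def loss(x, c, a, s, lam):
    """Assumed objective: sum of c_ui * (x_ui - a_u * s_i)^2 plus an L2 penalty."""
    err = sum(c[u][i] * (x[u][i] - a[u] * s[i]) ** 2
              for u in range(len(a)) for i in range(len(s)))
    return err + lam * (sum(v * v for v in a) + sum(v * v for v in s))


def als_rank1(x, c, lam=0.1, iters=20):
    """Alternate closed-form least squares updates of s (A fixed) and a (s fixed)."""
    n_rows, n_cols = len(x), len(x[0])
    a = [1.0] * n_rows
    s = [1.0] * n_cols
    history = [loss(x, c, a, s, lam)]
    for _ in range(iters):
        # Step S1: fix A, set the derivative with respect to each s_i to zero.
        for i in range(n_cols):
            num = sum(c[u][i] * x[u][i] * a[u] for u in range(n_rows))
            den = sum(c[u][i] * a[u] ** 2 for u in range(n_rows)) + lam
            s[i] = num / den
        # Step S2: fix s, set the derivative with respect to each a_u to zero.
        for u in range(n_rows):
            num = sum(c[u][i] * x[u][i] * s[i] for i in range(n_cols))
            den = sum(c[u][i] * s[i] ** 2 for i in range(n_cols)) + lam
            a[u] = num / den
        history.append(loss(x, c, a, s, lam))
    return a, s, history
```

Because each half-step exactly minimizes the assumed loss over one block of variables, the recorded loss never increases from one iteration to the next, which is the property that makes the alternating scheme attractive.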
S204, when the parameters in the output layer of the sparse self-encoder are similar to the parameters in the input layer, determining that the target parameters and the hidden parameters of the hidden nodes meet convergence conditions, and determining the hidden parameters of all the hidden nodes meeting the convergence conditions as target input sources;
specifically, the background server may also preset at least two sparse self-encoders, and when the parameters in the output layer of the first sparse self-encoder are close to the parameters in the input layer, it is determined that the target parameters and the hidden parameters of the hidden nodes satisfy the convergence condition, that is, the first sparse self-encoder completes training, and at this time, the hidden parameters of each hidden node satisfying the convergence condition in the first sparse self-encoder may be determined as the target input source.
S205, inputting the target input source into an input layer of a next sparse self-encoder according to a preset number of sparse self-encoders, training hidden parameters corresponding to the target input source by the next sparse self-encoder according to the target function, taking the trained hidden parameters in the next sparse self-encoder as the target input source, and repeatedly executing the step until the last sparse self-encoder trains the hidden parameters;
specifically, the background server may further input the target input source into an input layer of a next sparse self-encoder according to a preset number of sparse self-encoders, the next sparse self-encoder trains a hidden parameter corresponding to the target input source according to the target function, and the hidden parameter trained in the next sparse self-encoder is used as the target input source, and this step is repeatedly executed until a last sparse self-encoder trains a hidden parameter. The hidden parameters trained by the nth sparse self-encoder can be called n-order characteristic parameters, and the larger the n value is, the more accurate the n-order characteristic parameters can describe the preference degree of the user on the multimedia data. Wherein the objective function in the first sparse self-encoder may contain the user interest factor item C, and the objective functions in other sparse self-encoders except the first sparse self-encoder may not contain the user interest factor item C. Wherein the convergence condition of each sparse self-encoder is the same, that is, the convergence condition is that the parameters in the output layer are similar to the parameters in the input layer.
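The chaining of encoders described in S204 to S205 can be sketched as a simple loop. The training of a single sparse self-encoder is not reproduced here; `train_encoder` is a hypothetical callable supplied by the caller, assumed to train one encoder to convergence and return its hidden parameters.

```python
def stack_encoders(behavior_matrix, train_encoder, num_encoders):
    """Feed each encoder's trained hidden parameters into the next encoder.

    train_encoder(source, use_interest_term) is a hypothetical callable that
    trains one sparse self-encoder on `source` until its output layer is close
    to its input layer, then returns the trained hidden parameters.
    """
    if num_encoders < 1:
        raise ValueError("at least one sparse self-encoder is required")
    source = behavior_matrix
    for n in range(1, num_encoders + 1):
        # Only the first encoder's objective may contain the interest item C.
        source = train_encoder(source, use_interest_term=(n == 1))
    return source  # n-order feature parameters of the last encoder
```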
S206, combining the hidden parameters trained by the last sparse self-encoder into a hidden feature matrix;
S207, when the number of the sparse self-encoders is at least two, acquiring personal operation behavior information corresponding to each historical user, respectively performing vector average calculation on implicit feature vectors corresponding to operated multimedia data in each personal operation behavior information, and respectively taking each calculated average vector as a user feature vector corresponding to each historical user;
specifically, when the number of the sparse self-encoders is at least two, the personal operation behavior information corresponding to each historical user can be obtained, the implicit feature vectors corresponding to the operated multimedia data in each personal operation behavior information are respectively subjected to vector average calculation, and each calculated average vector is respectively used as the user feature vector corresponding to each historical user. For example, the personal operation behavior information corresponding to the historical user a may include songs listened to by the user a, and the songs listened to by the user a may include song 1, song 2, and song 3, and then the user feature vector corresponding to the historical user a may be an average vector of the implicit feature vectors of song 1, the implicit feature vector of song 2, and the implicit feature vector of song 3.
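The averaging in this example can be written out as a short sketch. The function name and the dictionary layout are assumptions made for illustration, and the vector values in the usage note are made up.

```python
def user_feature_vector(listened_songs, implicit_vectors):
    """Average the implicit feature vectors of the songs a user has operated on.

    listened_songs:   song identifiers from the user's personal operation behavior.
    implicit_vectors: dict mapping song identifier -> implicit feature vector.
    """
    vectors = [implicit_vectors[song] for song in listened_songs]
    if not vectors:
        raise ValueError("user has no operated multimedia data")
    dim = len(vectors[0])
    # Component-wise mean over all operated songs.
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]
```

For historical user A with song 1, song 2 and song 3, calling `user_feature_vector(["song1", "song2", "song3"], vectors)` returns the component-wise mean of the three implicit feature vectors.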
S208, when a recommendation request corresponding to a target user is received and the historical user group comprises the target user, acquiring a plurality of multimedia data in the personal operation behavior information of the target user, and recommending the plurality of multimedia data in the personal operation behavior information according to a user feature vector corresponding to the target user and implicit feature vectors corresponding to the multimedia data in the personal operation behavior information;
For the specific implementation of step S208, refer to S103 in the embodiment corresponding to Fig. 2; details are not repeated here.
The embodiment of the invention generates a multimedia data operation behavior matrix according to the operation behaviors of the historical user group on a plurality of multimedia data in a preset multimedia database and, based on a sparse self-coding neural network, calculates from this matrix the implicit feature vector corresponding to each multimedia data and the user feature vector corresponding to each historical user. The implicit feature vector can therefore accurately represent the preference degree information of the historical user group for a multimedia data, and the user feature vector can accurately represent the preference degree information of a historical user for a plurality of multimedia data. Through the implicit feature vectors and the user feature vectors, accurate personalized recommendation can be realized for the target user, which helps ensure that the recommended songs are songs the target user likes and thereby improves the recommendation effect.
Fig. 4 is a schematic structural diagram of a multimedia data processing apparatus according to an embodiment of the present invention. The multimedia data processing apparatus 1 may be applied to a background server, and the multimedia data processing apparatus 1 may include: a matrix generation module 10, a feature calculation module 20 and a recommendation module 30;
the matrix generating module 10 is configured to generate a multimedia data operation behavior matrix according to operation behaviors of a historical user group on a plurality of multimedia data in a preset multimedia database;
Specifically, the matrix generation module 10 may obtain the operation behaviors of a historical user group on a plurality of multimedia data in a preset multimedia database. The multimedia database may be a music library, and each multimedia data may be a song in the music library, so the operation behavior may include the listening behavior of each historical user in the historical user group for each multimedia data, where the listening behavior covers both having listened and not having listened. The matrix generation module 10 may set different feature values for different listening behaviors to generate a multimedia data operation behavior matrix corresponding to the operation behavior. Taking Table 1 in the embodiment corresponding to Fig. 2 as an example, a feature value "1" in Table 1 indicates that the user has listened to the song and a feature value "0" indicates that the user has not listened to the song: if the feature value between user 1 and song 1 is "1", user 1 has listened to song 1; if the feature value between user 3 and song 2 is "0", user 3 has not listened to song 2. Therefore, the matrix generation module 10 can generate the multimedia data operation behavior matrix corresponding to the operation behavior according to all the feature values in Table 1, that is, an element $P_{ui}$ in the multimedia data operation behavior matrix is the feature value corresponding to the listening behavior of user u for song i ($P_{ui} = 1$ indicates that user u has listened to song i; $P_{ui} = 0$ indicates that user u has not listened to song i).
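A minimal sketch of how such a 0/1 matrix could be built from listening records follows. The function name and the input layout (a list of (user, song) pairs) are assumptions for illustration and are not taken from Table 1.

```python
def build_behavior_matrix(users, songs, listened_pairs):
    """Return P where P[u][i] = 1 if user u listened to song i, else 0.

    users, songs:   ordered lists of identifiers (rows and columns of P).
    listened_pairs: iterable of (user, song) pairs taken from listening logs.
    """
    listened = set(listened_pairs)
    return [[1 if (u, i) in listened else 0 for i in songs] for u in users]
```

With `users = ["user1", "user3"]`, `songs = ["song1", "song2"]` and the single record `("user1", "song1")`, the entry for user 1 and song 1 is 1 and the entry for user 3 and song 2 is 0, matching the two cases described above.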
The feature calculation module 20 is configured to calculate, based on the sparse self-coding neural network, implicit feature vectors corresponding to each multimedia data and user feature vectors corresponding to each historical user according to the multimedia data operation behavior matrix; an implicit feature vector characterizes the preference information of the historical user group for a multimedia data; a user feature vector characterizing the preference degree information of a historical user for the plurality of multimedia data;
specifically, please refer to fig. 5, which is a schematic structural diagram of the feature calculating module 20, where the feature calculating module 20 may include: an input unit 201, a training unit 202, an implicit feature generation unit 203, and a user feature generation unit 204;
the input unit 201 is configured to input the multimedia data operation behavior matrix to an input layer of a sparse self-encoder corresponding to the sparse self-coding neural network; the sparse self-encoder comprises the input layer, a hidden layer, an output layer and a target parameter between the hidden layer and the output layer; the hidden layer comprises a preset number of hidden nodes;
the training unit 202 is configured to control the sparse autoencoder to perform partial derivative training on the target parameter and the hidden parameter of the hidden node according to the parameter in the input layer and a preset target function for training the target parameter and the hidden parameter of the hidden node;
the implicit characteristic generating unit 203 is configured to determine that the target parameter and the hidden parameter of the hidden node satisfy a convergence condition when the parameter in the output layer of the sparse self-encoder is similar to the parameter in the input layer, and combine the hidden parameters of the hidden nodes that satisfy the convergence condition into an implicit characteristic matrix; the implicit feature matrix is a compression matrix corresponding to the multimedia data operation behavior matrix, and comprises implicit feature vectors corresponding to the multimedia data respectively;
the user feature generating unit 204 is configured to calculate, according to the trained target parameter or the implicit feature matrix, user feature vectors corresponding to the historical users in the historical user group respectively;
when the number of the sparse self-encoders is one, the calculated hidden parameters are first-order feature parameters, that is, the hidden feature matrix includes the first-order feature parameters. The specific implementation manners of the input unit 201, the training unit 202, the implicit feature generating unit 203, and the user feature generating unit 204 may refer to S102 in the embodiment corresponding to fig. 2, which is not described herein again.
The recommending module 30 is configured to, when a recommending request corresponding to a target user is received and the historical user group includes the target user, obtain a plurality of multimedia data in the personal operation behavior information of the target user, and recommend the plurality of multimedia data in the personal operation behavior information according to a user feature vector corresponding to the target user and implicit feature vectors respectively corresponding to each multimedia data in the personal operation behavior information;
specifically, please refer to fig. 6, which is a schematic structural diagram of the recommending module 30, where the recommending module 30 may include: a detection unit 301, a similarity recommendation unit 302, an acquisition judgment unit 303, a dot product operation unit 304, and a sorting recommendation unit 305;
the detecting unit 301 is configured to detect whether collected multimedia data is included in personal operation behavior information corresponding to a target user when a recommendation request corresponding to the target user is received and the historical user group includes the target user;
the similar recommending unit 302 is configured to, if the detecting unit 301 detects that the collected multimedia data is included, obtain first similar multimedia data corresponding to the collected multimedia data, and use the first similar multimedia data as recommended data of the target user;
the acquisition judgment unit 303 is configured to further determine whether the personal operation behavior information includes completely operated multimedia data if the detection unit 301 detects that the personal operation behavior information does not include collected multimedia data;
the dot product operation unit 304 is configured to, when the personal operation behavior information includes completely operated multimedia data, perform a dot product operation on the user feature vector corresponding to the target user and the implicit feature vector corresponding to the completely operated multimedia data to obtain a personalized feature value;
the similar recommending unit 302 is further configured to, when the personalized feature value is greater than a preset feature value threshold, obtain second similar multimedia data corresponding to the completely operated multimedia data, and use the second similar multimedia data as recommended data of the target user;
the dot product operation unit 304 is further configured to, when the personal operation behavior information does not include completely operated multimedia data, obtain a plurality of candidate multimedia data, and perform dot product operation on the user feature vector corresponding to the target user and the implicit feature vector corresponding to each candidate multimedia data, to obtain personalized feature values corresponding to each candidate multimedia data;
the sorting recommendation unit 305 is configured to sort the plurality of candidate multimedia data according to each personalized feature value, and use a preset recommended number of candidate multimedia data as recommended data of the target user according to a sorting result;
For the specific implementation of the detection unit 301, the similarity recommendation unit 302, the acquisition judgment unit 303, the dot product operation unit 304 and the sorting recommendation unit 305, refer to S103 in the embodiment corresponding to Fig. 2; details are not repeated here.
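The decision flow carried out by units 301 to 305 can be sketched as one function. This is an illustration under assumptions: all names are hypothetical, `similar_to` stands in for whatever similar-item lookup the server provides, and the behavior when a completely operated song scores at or below the threshold is not specified in the text, so the sketch simply contributes nothing for that song.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def recommend(user_vec, collected, fully_played, candidates,
              item_vecs, similar_to, threshold, top_n):
    """Three-stage recommendation flow for a target user in the historical group."""
    # Stage 1: collected songs exist, so recommend songs similar to them.
    if collected:
        return [s for song in collected for s in similar_to(song)]
    # Stage 2: no collected songs, but some songs were played in full.
    if fully_played:
        recs = []
        for song in fully_played:
            # Personalized feature value = user vector . implicit vector.
            if dot(user_vec, item_vecs[song]) > threshold:
                recs.extend(similar_to(song))
            # At or below the threshold: unspecified in the source; skipped here.
        return recs
    # Stage 3: rank candidate songs by personalized feature value, keep top_n.
    ranked = sorted(candidates,
                    key=lambda s: dot(user_vec, item_vecs[s]),
                    reverse=True)
    return ranked[:top_n]
```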
Optionally, when calculating the personalized feature value, the recommendation module 30 may further take an explicit feature value into account, that is, the personalized feature value of user u for song i equals the explicit feature value of user u for song i plus the dot product of the user feature vector of user u and the implicit feature vector of song i. The explicit feature value can be calculated according to the matching degree between the tags of the song and the tags of the user; the tags of the song may include the music style, language type, rhythm type, etc. of the song; the tags of the user may include the music styles, language types, rhythm types, etc. that the user likes. By adding the explicit feature value, the personalized feature value can more accurately describe the preference degree of user u for song i.
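The combination of the explicit feature value and the dot product can be sketched as follows. The text does not say how the matching degree between tags is computed, so the Jaccard overlap used here is purely an assumption, as are the function names.

```python
def explicit_feature_value(song_tags, user_tags):
    """Tag matching degree, assumed here to be the Jaccard overlap of the tag sets."""
    song_tags, user_tags = set(song_tags), set(user_tags)
    union = song_tags | user_tags
    return len(song_tags & user_tags) / len(union) if union else 0.0


def personalized_feature_value(user_vec, song_vec, song_tags, user_tags):
    """Explicit feature value plus the dot product of user and implicit vectors."""
    implicit_part = sum(a * b for a, b in zip(user_vec, song_vec))
    return explicit_feature_value(song_tags, user_tags) + implicit_part
```

In practice the two terms may need to be brought to a comparable scale before being added; the text does not state how, so no weighting is applied in this sketch.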
Optionally, when a recommendation request corresponding to a target user is received and the historical user group does not include the target user, which indicates that the target user has not listened to any song on the multimedia data platform provided by the background server, the recommendation module 30 may sort the plurality of candidate multimedia data in descending order of the vector values of the implicit feature vectors corresponding to the candidate multimedia data, and use a preset recommended number of candidate multimedia data as recommended data of the target user according to the sorting result.
Optionally, the feature calculation module 20 may periodically recalculate and update the user feature vectors corresponding to the historical users and the implicit feature vectors corresponding to the multimedia data according to new operation behaviors, so that when a recommendation is made for a target user, the implicit feature vectors and user feature vector obtained for the target user keep up with changes in the user's interests, that is, the accuracy of the personalized feature values calculated by the recommendation module 30 can be ensured. The recommendation module 30 may update the recommendation source of the target user each time the target user listens to a song, so that when songs are subsequently recommended to the target user, the recommendation module 30 can directly acquire multimedia data similar to the most recently updated recommendation source and recommend it to the user, thereby improving recommendation efficiency. The user interface of the client may display, for example, "Song B is recommended based on song A that you listened to", wherein song A is the recommendation source and song B is multimedia data similar to the recommendation source.
Further, please refer to fig. 7 together, which is a schematic structural diagram of an implicit feature generation unit 203 according to an embodiment of the present invention, where the implicit feature generation unit 203 includes: a determination subunit 2031, a deep learning subunit 2032, and a combination subunit 2033;
the determining subunit 2031, configured to determine, when the parameter in the output layer of the sparse self-encoder is similar to the parameter in the input layer, that the target parameter and the hidden parameter of the hidden node satisfy a convergence condition, and determine the hidden parameter of each hidden node that satisfies the convergence condition as a target input source;
the deep learning subunit 2032 is configured to input the target input source to an input layer of a next sparse self-encoder according to a preset number of sparse self-encoders, where the next sparse self-encoder trains a hidden parameter corresponding to the target input source according to the target function, and uses the hidden parameter trained in the next sparse self-encoder as a target input source, and this step is repeatedly performed until a last sparse self-encoder trains a hidden parameter;
the combining subunit 2033 is configured to combine the hidden parameters trained by the last sparse self-encoder into a hidden feature matrix;
the hidden parameters trained by the nth sparse self-encoder can be called n-order characteristic parameters, and the larger the n value is, the more accurate the n-order characteristic parameters can describe the preference degree of the user on the multimedia data. For specific implementation of the determining subunit 2031, the deep learning subunit 2032, and the combining subunit 2033, reference may be made to S204 to S206 in the embodiment corresponding to fig. 3, which is not described herein again.
Further, please refer to fig. 8, which is a schematic structural diagram of a user feature generating unit 204 according to an embodiment of the present invention, where the user feature generating unit 204 includes: an extraction subunit 2041 and an average calculation subunit 2042;
the extracting subunit 2041 is configured to, when the number of the sparse autoencoder is one, extract a user feature vector corresponding to each historical user in the historical user group from a parameter matrix corresponding to a target parameter trained in the sparse autoencoder;
the average calculating subunit 2042 is configured to, when the number of the sparse autoencoders is at least two, obtain personal operation behavior information corresponding to each historical user, perform vector average calculation on implicit feature vectors corresponding to operated multimedia data in each personal operation behavior information, and use each calculated average vector as a user feature vector corresponding to each historical user;
specifically, when the number of the sparse autoencoder is at least two, the average calculating subunit 2042 may obtain the individual operation behavior information corresponding to each historical user, perform vector average calculation on the implicit feature vectors corresponding to the operated multimedia data in each individual operation behavior information, and use each calculated average vector as the user feature vector corresponding to each historical user. For example, the personal operation behavior information corresponding to the historical user a may include songs listened to by the user a, and the songs listened to by the user a may include song 1, song 2, and song 3, and then the user feature vector corresponding to the historical user a may be an average vector of the implicit feature vectors of song 1, the implicit feature vector of song 2, and the implicit feature vector of song 3.
The embodiment of the invention generates a multimedia data operation behavior matrix according to the operation behaviors of the historical user group on a plurality of multimedia data in a preset multimedia database and, based on a sparse self-coding neural network, calculates from this matrix the implicit feature vector corresponding to each multimedia data and the user feature vector corresponding to each historical user. The implicit feature vector can therefore accurately represent the preference degree information of the historical user group for a multimedia data, and the user feature vector can accurately represent the preference degree information of a historical user for a plurality of multimedia data. Through the implicit feature vectors and the user feature vectors, accurate personalized recommendation can be realized for the target user, which helps ensure that the recommended songs are songs the target user likes and thereby improves the recommendation effect.
Fig. 9 is a schematic structural diagram of another multimedia data processing apparatus according to an embodiment of the present invention. As shown in Fig. 9, the multimedia data processing apparatus 1000 may be applied to a background server, and the multimedia data processing apparatus 1000 may include: at least one processor 1001, such as a CPU, at least one network interface 1004, a user interface 1003, a memory 1005, and at least one communication bus 1002. The communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display screen (Display) and a keyboard (Keyboard); optionally, the user interface 1003 may also include a standard wired interface and a standard wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a WI-FI interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory, such as at least one disk memory. The memory 1005 may optionally be at least one storage device located remotely from the processor 1001. As shown in Fig. 9, the memory 1005, which is a kind of computer storage medium, may include an operating system, a network communication module, a user interface module, and a device control application program.
In the multimedia data processing apparatus 1000 shown in Fig. 9, the network interface 1004 is mainly used for connecting a client to recommend multimedia data to the client; the user interface 1003 is mainly used for providing an input interface for a user and acquiring data output by the user; and the processor 1001 may be used to invoke the device control application program stored in the memory 1005 to implement the following:
generating a multimedia data operation behavior matrix according to the operation behaviors of a historical user group on a plurality of multimedia data in a preset multimedia database;
based on a sparse self-coding neural network, calculating implicit feature vectors corresponding to the respective multimedia data and user feature vectors corresponding to the respective historical users according to the multimedia data operation behavior matrix; an implicit feature vector characterizes the preference degree information of the historical user group for a multimedia data; a user feature vector characterizes the preference degree information of a historical user for the plurality of multimedia data;
when a recommendation request corresponding to a target user is received and the historical user group comprises the target user, acquiring a plurality of multimedia data in the personal operation behavior information of the target user, and recommending the plurality of multimedia data in the personal operation behavior information according to a user feature vector corresponding to the target user and an implicit feature vector corresponding to each multimedia data in the personal operation behavior information.
In one embodiment, when the processor 1001 performs the calculating, based on the sparse self-coding neural network, of the implicit feature vectors corresponding to the respective multimedia data and the user feature vectors corresponding to the respective historical users according to the multimedia data operation behavior matrix, the following steps are specifically executed:
inputting the multimedia data operation behavior matrix to an input layer of a sparse self-encoder corresponding to the sparse self-coding neural network; the sparse self-encoder comprises the input layer, a hidden layer, an output layer and a target parameter between the hidden layer and the output layer; the hidden layer comprises a preset number of hidden nodes;
controlling the sparse self-encoder to perform partial derivative training on the target parameters and hidden parameters of the hidden nodes according to parameters in the input layer and a preset target function for training the target parameters and the hidden parameters of the hidden nodes;
when the parameters in the output layer of the sparse self-encoder are similar to the parameters in the input layer, determining that the target parameters and the hidden parameters of the hidden nodes meet the convergence condition, and combining the hidden parameters of all the hidden nodes meeting the convergence condition into an implicit feature matrix; the implicit feature matrix is a compression matrix corresponding to the multimedia data operation behavior matrix, and comprises implicit feature vectors corresponding to the multimedia data respectively;
and calculating user feature vectors corresponding to the historical users in the historical user group according to the trained target parameters or the implicit feature matrix.
In one embodiment, the processor 1001 specifically performs the following steps when performing determining that the target parameter and the hidden parameters of the hidden nodes satisfy a convergence condition when the parameters in the output layer of the sparse self-encoder are similar to the parameters in the input layer, and combining the hidden parameters of the hidden nodes satisfying the convergence condition into an implicit feature matrix:
when the parameters in the output layer of the sparse self-encoder are similar to the parameters in the input layer, determining that the target parameters and the hidden parameters of the hidden nodes meet convergence conditions, and determining the hidden parameters of all the hidden nodes meeting the convergence conditions as target input sources;
inputting the target input source into an input layer of a next sparse self-encoder according to a preset number of sparse self-encoders, training hidden parameters corresponding to the target input source by the next sparse self-encoder according to the target function, taking the trained hidden parameters in the next sparse self-encoder as the target input source, and repeatedly executing the step until the last sparse self-encoder trains the hidden parameters;
and combining the hidden parameters trained by the last sparse self-encoder into a hidden feature matrix.
In one embodiment, the objective function includes a preset user interest factor item, where the user interest factor item includes an interest value of each historical user in each multimedia data in the multimedia database;
an interest value is calculated based on the operation behavior type, the operation times and the complete operation rate of a historical user on multimedia data.
In an embodiment, when the processor 1001 calculates the user feature vectors corresponding to the historical users in the historical user group according to the trained target parameters or the implicit feature matrix, the following steps are specifically performed:
when the number of the sparse self-encoders is one, extracting user feature vectors respectively corresponding to historical users in the historical user group from a parameter matrix corresponding to a target parameter trained in the sparse self-encoders;
when the number of the sparse self-encoders is at least two, personal operation behavior information corresponding to each historical user is obtained, vector average calculation is carried out on the implicit feature vectors corresponding to operated multimedia data in each personal operation behavior information, and each calculated average vector is used as the user feature vector corresponding to each historical user.
In one embodiment, when a recommendation request corresponding to a target user is received and the historical user group includes the target user, the processor 1001 acquires a plurality of multimedia data in the personal operation behavior information of the target user and performs recommendation processing on them according to the user feature vector corresponding to the target user and the implicit feature vector corresponding to each multimedia data in the personal operation behavior information. In doing so, it specifically performs the following steps:
when a recommendation request corresponding to a target user is received and the historical user group comprises the target user, detecting whether collected multimedia data is contained in personal operation behavior information corresponding to the target user or not;
if the collected multimedia data is detected to be contained, acquiring first similar multimedia data corresponding to the collected multimedia data, and using the first similar multimedia data as recommended data of the target user;
if the collected multimedia data is not contained, further judging whether the completely operated multimedia data is contained in the personal operation behavior information;
when the personal operation behavior information contains completely operated multimedia data, performing a dot product operation on the user characteristic vector corresponding to the target user and the implicit characteristic vector corresponding to the completely operated multimedia data to obtain a personalized characteristic value;
and when the personalized characteristic value is larger than a preset characteristic value threshold value, second similar multimedia data corresponding to the completely operated multimedia data are obtained, and the second similar multimedia data are used as the recommended data of the target user.
In one embodiment, the processor 1001 further performs the steps of:
when the personal operation behavior information does not contain completely operated multimedia data, obtaining a plurality of candidate multimedia data, and performing a dot product operation on the user characteristic vector corresponding to the target user and the implicit characteristic vector corresponding to each candidate multimedia data to obtain the personalized characteristic value corresponding to each candidate multimedia data;
and sequencing the plurality of candidate multimedia data according to each personalized characteristic value, and taking the candidate multimedia data with preset recommendation quantity as the recommendation data of the target user according to the sequencing result.
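The three-tier recommendation flow described in the two embodiments above can be sketched as below. This is a simplified illustration, not the patented implementation: `similar_items` is a hypothetical lookup that returns the similar multimedia data for one item, items are identified by row index into the implicit feature matrix, and the case where completely operated data exists but no personalized characteristic value exceeds the threshold is not covered by the text above, so the sketch simply returns an empty list there.

```python
import numpy as np


def recommend(user_vec, implicit, collected, completed, candidates,
              similar_items, threshold, top_n):
    """Return recommended item ids for one target user.

    implicit: array with one implicit feature vector per row.
    collected / completed / candidates: lists of item ids (row indices).
    similar_items: hypothetical callable, item id -> list of similar item ids.
    """
    # Tier 1: collected multimedia data exists, recommend similar data.
    if collected:
        return [s for item in collected for s in similar_items(item)]

    # Tier 2: completely operated data exists, gate on the dot product.
    if completed:
        recs = []
        for item in completed:
            value = float(np.dot(user_vec, implicit[item]))
            if value > threshold:
                recs.extend(similar_items(item))
        return recs

    # Tier 3: rank candidates by personalized characteristic value.
    scores = {c: float(np.dot(user_vec, implicit[c])) for c in candidates}
    ranked = sorted(candidates, key=lambda c: scores[c], reverse=True)
    return ranked[:top_n]
```

The ordering of the tiers mirrors the text: explicit signals (collected data) are used first, and the dot product ranking over candidates is the fallback when the user has neither collected nor completely operated any multimedia data.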
In this embodiment of the invention, a multimedia data operation behavior matrix is generated according to the operation behaviors of the historical user group on a plurality of multimedia data in a preset multimedia database. Based on a sparse self-coding neural network, the implicit feature vector corresponding to each multimedia data and the user feature vector corresponding to each historical user are then calculated from the multimedia data operation behavior matrix. As a result, an implicit feature vector can accurately represent the preference degree information of the historical user group for one multimedia data, and a user feature vector can accurately represent the preference degree information of one historical user for the plurality of multimedia data. Using the implicit feature vectors and the user feature vectors, accurate personalized recommendation can be made for the target user, so that the recommended songs are songs the target user likes, which improves the recommendation effect.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The above disclosure describes only preferred embodiments of the present invention and is not intended to limit the scope of the claims; equivalent changes made in accordance with the claims of the present invention therefore remain within the scope of the invention.

Claims (16)

1. A method for processing multimedia data, comprising:
generating a multimedia data operation behavior matrix according to the operation behaviors of a plurality of multimedia data in a preset multimedia database by a historical user group;
based on a sparse self-coding neural network, calculating implicit feature vectors respectively corresponding to the multimedia data and user feature vectors respectively corresponding to the historical users according to the multimedia data operation behavior matrix; an implicit feature vector characterizes the preference degree information of the historical user group for one multimedia data; a user feature vector characterizes the preference degree information of one historical user for the plurality of multimedia data; the implicit feature vector is a vector in an implicit feature matrix, and the implicit feature matrix is obtained by combining hidden parameters of hidden nodes in the sparse self-coding neural network meeting a convergence condition; the user feature vector is calculated based on the implicit feature vectors corresponding to operated multimedia data in the personal operation behavior information of the historical user;
when a recommendation request corresponding to a target user is received and the historical user group comprises the target user, acquiring a plurality of multimedia data in the personal operation behavior information of the target user, and recommending the plurality of multimedia data in the personal operation behavior information according to a user feature vector corresponding to the target user and an implicit feature vector corresponding to each multimedia data in the personal operation behavior information.
2. The method of claim 1, wherein the calculating, based on the sparse self-coding neural network, of implicit feature vectors respectively corresponding to the multimedia data and user feature vectors respectively corresponding to the historical users according to the multimedia data operation behavior matrix comprises:
inputting the multimedia data operation behavior matrix to an input layer of a sparse self-encoder corresponding to the sparse self-coding neural network; the sparse self-encoder comprises the input layer, a hidden layer, an output layer and a target parameter between the hidden layer and the output layer; the hidden layer comprises a preset number of hidden nodes;
the sparse self-encoder performs partial derivative training on the target parameters and the hidden parameters of the hidden nodes according to the parameters in the input layer and a preset objective function for training the target parameters and the hidden parameters of the hidden nodes;
when the parameters in the output layer of the sparse self-encoder are similar to the parameters in the input layer, determining that the target parameters and the hidden parameters of the hidden nodes meet the convergence condition, and combining the hidden parameters of all the hidden nodes meeting the convergence condition into an implicit feature matrix; the implicit feature matrix is a compression matrix corresponding to the multimedia data operation behavior matrix, and comprises implicit feature vectors corresponding to the multimedia data respectively;
and calculating user characteristic vectors corresponding to the historical users in the historical user group according to the trained target parameters or the implicit characteristic matrix.
3. The method of claim 2, wherein the determining that the target parameter and the hidden parameters of the hidden nodes satisfy a convergence condition when the parameters in the output layer of the sparse self-encoder are similar to the parameters in the input layer, and combining the hidden parameters of the hidden nodes satisfying the convergence condition into an implicit feature matrix comprises:
when the parameters in the output layer of the sparse self-encoder are similar to the parameters in the input layer, determining that the target parameters and the hidden parameters of the hidden nodes meet convergence conditions, and determining the hidden parameters of all the hidden nodes meeting the convergence conditions as target input sources;
inputting the target input source into an input layer of a next sparse self-encoder according to a preset number of sparse self-encoders, training hidden parameters corresponding to the target input source by the next sparse self-encoder according to the objective function, taking the trained hidden parameters in the next sparse self-encoder as the target input source, and repeating this step until the last sparse self-encoder has trained its hidden parameters;
and combining the hidden parameters trained by the last sparse self-encoder into an implicit feature matrix.
4. The method of claim 2, wherein the objective function comprises a preset user interest factor item, and the user interest factor item comprises interest values of the historical users in the multimedia data in the multimedia database respectively;
an interest value is calculated based on the operation behavior type, the operation times and the complete operation rate of a historical user on multimedia data.
5. The method of claim 3, wherein the calculating the user feature vector corresponding to each historical user in the historical user group according to the trained target parameter or the implicit feature matrix comprises:
when the number of the sparse self-encoders is one, extracting user feature vectors respectively corresponding to historical users in the historical user group from a parameter matrix corresponding to a target parameter trained in the sparse self-encoders;
when the number of the sparse self-encoders is at least two, personal operation behavior information corresponding to each historical user is obtained, vector average calculation is carried out on implicit characteristic vectors corresponding to operated multimedia data in each personal operation behavior information, and each calculated average vector is used as a user characteristic vector corresponding to each historical user.
6. The method of claim 1, wherein when a recommendation request corresponding to a target user is received and the historical user group includes the target user, acquiring a plurality of multimedia data in the personal operation behavior information of the target user, and performing recommendation processing on the plurality of multimedia data in the personal operation behavior information according to a user feature vector corresponding to the target user and an implicit feature vector corresponding to each multimedia data in the personal operation behavior information, respectively, comprises:
when a recommendation request corresponding to a target user is received and the historical user group comprises the target user, detecting whether collected multimedia data is contained in personal operation behavior information corresponding to the target user or not;
if the collected multimedia data is detected to be contained, acquiring first similar multimedia data corresponding to the collected multimedia data, and using the first similar multimedia data as recommended data of the target user;
if the collected multimedia data is not contained, further judging whether the completely operated multimedia data is contained in the personal operation behavior information;
when the personal operation behavior information contains completely operated multimedia data, performing a dot product operation on the user characteristic vector corresponding to the target user and the implicit characteristic vector corresponding to the completely operated multimedia data to obtain a personalized characteristic value;
and when the personalized characteristic value is larger than a preset characteristic value threshold value, second similar multimedia data corresponding to the completely operated multimedia data are obtained, and the second similar multimedia data are used as the recommended data of the target user.
7. The method of claim 6, further comprising:
when the personal operation behavior information does not contain completely operated multimedia data, obtaining a plurality of candidate multimedia data, and performing a dot product operation on the user characteristic vector corresponding to the target user and the implicit characteristic vector corresponding to each candidate multimedia data to obtain the personalized characteristic value corresponding to each candidate multimedia data;
and sequencing the plurality of candidate multimedia data according to each personalized characteristic value, and taking the candidate multimedia data with preset recommendation quantity as the recommendation data of the target user according to the sequencing result.
8. A multimedia data processing apparatus, comprising:
the matrix generation module is used for generating a multimedia data operation behavior matrix according to the operation behaviors of the historical user group on a plurality of multimedia data in a preset multimedia database;
the characteristic calculation module is used for calculating implicit characteristic vectors corresponding to the multimedia data and user characteristic vectors corresponding to the historical users respectively based on the sparse self-coding neural network and according to the multimedia data operation behavior matrix; an implicit feature vector characterizes the preference information of the historical user group for a multimedia data; a user feature vector characterizing the preference degree information of a historical user for the plurality of multimedia data; the implicit feature vector is a vector in an implicit feature matrix, and the implicit feature matrix is obtained by combining hidden parameters of hidden nodes in the sparse self-coding neural network meeting a convergence condition; the user characteristic vector is obtained by calculation based on an implicit characteristic vector corresponding to operated multimedia data in the personal operation behavior information of the historical user;
and the recommending module is used for acquiring a plurality of multimedia data in the personal operation behavior information of the target user when a recommending request corresponding to the target user is received and the historical user group comprises the target user, and recommending the plurality of multimedia data in the personal operation behavior information according to the user characteristic vector corresponding to the target user and the implicit characteristic vector corresponding to each multimedia data in the personal operation behavior information.
9. The apparatus of claim 8, wherein the feature calculation module comprises:
the input unit is used for inputting the multimedia data operation behavior matrix to an input layer of a sparse self-encoder corresponding to the sparse self-coding neural network; the sparse self-encoder comprises the input layer, a hidden layer, an output layer and a target parameter between the hidden layer and the output layer; the hidden layer comprises a preset number of hidden nodes;
the training unit is used for controlling the sparse self-encoder to perform partial derivative training on the target parameters and the hidden parameters of the hidden nodes according to the parameters in the input layer and a preset objective function for training the target parameters and the hidden parameters of the hidden nodes;
an implicit feature generation unit, configured to determine that the target parameter and hidden parameters of the hidden nodes satisfy a convergence condition when a parameter in the output layer of the sparse self-encoder is similar to a parameter in the input layer, and combine hidden parameters of hidden nodes that satisfy the convergence condition into an implicit feature matrix; the implicit feature matrix is a compression matrix corresponding to the multimedia data operation behavior matrix, and comprises implicit feature vectors corresponding to the multimedia data respectively;
and the user characteristic generating unit is used for calculating user characteristic vectors corresponding to all the historical users in the historical user group according to the trained target parameters or the implicit characteristic matrix.
10. The apparatus of claim 9, wherein the implicit feature generation unit comprises:
a determining subunit, configured to determine that the target parameter and the hidden parameter of the hidden node satisfy a convergence condition when a parameter in the output layer of the sparse self-encoder is similar to a parameter in the input layer, and determine the hidden parameter of each hidden node that satisfies the convergence condition as a target input source;
the deep learning subunit is used for inputting the target input source into an input layer of a next sparse self-encoder according to a preset number of sparse self-encoders, the next sparse self-encoder trains hidden parameters corresponding to the target input source according to the objective function, the hidden parameters trained in the next sparse self-encoder are used as the target input source, and this step is repeated until the last sparse self-encoder has trained its hidden parameters;
and the combining subunit is used for combining the hidden parameters trained by the last sparse self-encoder into an implicit feature matrix.
11. The apparatus of claim 9, wherein the objective function comprises a preset user interest factor item, and the user interest factor item comprises interest values of the historical users in the multimedia data in the multimedia database respectively;
an interest value is calculated based on the operation behavior type, the operation times and the complete operation rate of a historical user on multimedia data.
12. The apparatus of claim 10, wherein the user characteristic generating unit comprises:
the extraction subunit is configured to, when the number of the sparse self-encoders is one, extract a user feature vector corresponding to each historical user in the historical user group from a parameter matrix corresponding to a target parameter trained in the sparse self-encoder;
and the average calculation subunit is used for acquiring the personal operation behavior information corresponding to each historical user when the number of the sparse self-encoders is at least two, performing vector average calculation on implicit characteristic vectors corresponding to the operated multimedia data in each piece of personal operation behavior information, and taking each calculated average vector as a user characteristic vector corresponding to each historical user.
13. The apparatus of claim 8, wherein the recommendation module comprises:
the detection unit is used for detecting whether collected multimedia data is contained in the personal operation behavior information corresponding to the target user when a recommendation request corresponding to the target user is received and the historical user group contains the target user;
the similar recommending unit is used for acquiring first similar multimedia data corresponding to the collected multimedia data if the detecting unit detects that the collected multimedia data are included, and taking the first similar multimedia data as the recommended data of the target user;
the acquisition judging unit is used for further judging whether the personal operation behavior information contains completely operated multimedia data or not if the detecting unit detects that the collected multimedia data is not contained;
the dot multiplication operation unit is used for performing dot multiplication operation on the user characteristic vector corresponding to the target user and the implicit characteristic vector corresponding to the completely operated multimedia data to obtain a personalized characteristic value when the completely operated multimedia data is contained in the personal operation behavior information;
the similar recommending unit is further configured to acquire second similar multimedia data corresponding to the completely operated multimedia data when the personalized feature value is larger than a preset feature value threshold, and use the second similar multimedia data as the recommended data of the target user.
14. The apparatus of claim 13, wherein the recommendation module further comprises:
the dot multiplication operation unit is further configured to, when the personal operation behavior information does not include completely operated multimedia data, obtain a plurality of candidate multimedia data, and perform dot multiplication operation on the user feature vector corresponding to the target user and the implicit feature vector corresponding to each candidate multimedia data, to obtain personalized feature values corresponding to each candidate multimedia data;
and the sequencing recommendation unit is used for sequencing the candidate multimedia data according to each personalized characteristic value and taking the candidate multimedia data with preset recommendation quantity as the recommendation data of the target user according to the sequencing result.
15. A multimedia data processing apparatus, comprising: a processor, a memory, and a network interface;
the processor is connected to the memory and the network interface, wherein the network interface is configured to provide data communication functions, the memory is configured to store program code, and the processor is configured to call the program code to perform the method according to any one of claims 1 to 7.
16. A computer-readable storage medium, characterized in that the computer storage medium stores a computer program comprising program instructions which, when executed by a processor, perform the method according to any of claims 1-7.
CN201610821282.0A 2016-09-12 2016-09-12 Multimedia data processing method and device Active CN106484777B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610821282.0A CN106484777B (en) 2016-09-12 2016-09-12 Multimedia data processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610821282.0A CN106484777B (en) 2016-09-12 2016-09-12 Multimedia data processing method and device

Publications (2)

Publication Number Publication Date
CN106484777A CN106484777A (en) 2017-03-08
CN106484777B true CN106484777B (en) 2020-09-08

Family

ID=58273774

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610821282.0A Active CN106484777B (en) 2016-09-12 2016-09-12 Multimedia data processing method and device

Country Status (1)

Country Link
CN (1) CN106484777B (en)

Also Published As

Publication number Publication date
CN106484777A (en) 2017-03-08

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant