WO2021169451A1 - Content recommendation method and apparatus based on adversarial learning, and computer device - Google Patents

Content recommendation method and apparatus based on adversarial learning, and computer device

Info

Publication number
WO2021169451A1
Authority
WO
WIPO (PCT)
Prior art keywords
vector
user
discriminator
generator
feature
Application number
PCT/CN2020/132592
Other languages
English (en)
Chinese (zh)
Inventor
方聪
张旭
郑越
旷雄
黄宇星
Original Assignee
平安科技(深圳)有限公司
Application filed by 平安科技(深圳)有限公司
Publication of WO2021169451A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • This application relates to the field of artificial intelligence, in particular to content recommendation methods, devices and computer equipment based on adversarial learning.
  • Existing content recommendation systems are generally based on manual feature extraction, collaborative filtering and decomposition techniques to achieve automatic recommendation.
  • By collecting user behavior data, system log data and other information, the user's preferences and interests are modeled, users are clustered and grouped by preference and interest, and the same kind of content is recommended to users with similar preferences and interests.
  • The existing content recommendation system treats the collected user behavior data as statistical features but cannot take into account the temporal logic by which user preferences and interests develop and change, so the recommended content is not automatically updated to keep pace with the times.
  • The main purpose of this application is to provide content recommendation based on adversarial learning, which aims to solve the technical problems that the temporal logic of the development and change of user preferences and interests cannot be considered and that the recommended content is not automatically updated with the times.
  • This application proposes a content recommendation method based on adversarial learning, including:
  • This application also provides a content recommendation device based on adversarial learning, including:
  • the obtaining module is used to obtain the weighted compression vector corresponding to the user's historical behavior feature through weighted compression of the user characteristics constructed in advance;
  • the modeling module is used to model the generator and the discriminator according to the weighted compression vector
  • the adversarial learning module is used to combine the modeled generator and the discriminator to conduct adversarial learning under the adversarial model;
  • the first judgment module is used to judge whether the adversarial learning of the generator and the discriminator meets the preset condition;
  • the determining module is used to input the historical information of the current user into the generator after the adversarial learning if the preset condition is met, and to combine the feedback value of the discriminator after the adversarial learning to determine the interest preference characteristics of the current user;
  • the recommendation module is used to recommend to the current user content information that matches the current user's interest preference characteristics according to the current user's interest preference characteristics.
  • the present application also provides a computer device, including a memory and a processor, the memory stores a computer program, and the processor implements the steps of the above method when the computer program is executed.
  • the present application also provides a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the steps of the above method are realized.
  • This application uses weighted compression to model the user's historical behavior characteristics so as to capture how those characteristics change over time; based on adversarial learning, the generator can then obtain the interest preference characteristics of online users and recommend content information accurately.
  • FIG. 1 is a schematic flowchart of a content recommendation method based on adversarial learning according to an embodiment of the present application
  • Fig. 2 is a schematic structural diagram of a content recommendation device based on adversarial learning according to an embodiment of the present application
  • Fig. 3 is a schematic diagram of the internal structure of a computer device according to an embodiment of the present application.
  • a content recommendation method based on adversarial learning includes:
  • the user characteristics constructed in the embodiments of the present application include user attribute characteristics P, historical click characteristics T, behavior cue characteristics Q, and user click behavior c.
  • User attribute features P include but are not limited to user profile information such as age and occupation; behavior cue features Q include but are not limited to promoted information types, preferential policies, etc.; historical click features T include, but are not limited to, the user's historical personal information and historically clicked content information; user click behavior c is a truth value: if it is true, the click behavior occurred, otherwise it did not occur.
  • This application uses weighted compression to encode the time series features in the above user features into a time series feature matrix, and uses the time series feature matrix and the user attribute features to model the generator and the discriminator and train them adversarially, so that the generator after learning can identify the time series features in the user characteristics, obtain the user's interest preference feature carrying the time series change feature, and then recommend content information according to that interest preference feature.
  • As a result, the recommendations of this application are more in line with the interest preferences of current users, and the recommended content is more accurate and targeted.
  • step S1 of obtaining the weighted compression vector corresponding to the user's historical behavior feature by weighting and compressing the pre-built user characteristics includes:
  • the user characteristics are coded according to the time sequence in the two-dimensional space of the time sequence dimension and the feature dimension to obtain the time sequence feature matrix corresponding to the user characteristics.
  • the above-mentioned user historical behavior characteristics are characteristic representations of historical data of user characteristics, and are a combination of user characteristics and historical time series characteristics.
  • This application processes the time series feature matrix through first-level weighted compression to obtain the embedding vector corresponding to the user's historical behavior feature.
  • The first-level weighted compression that yields the embedding vector is computed as follows (the equation itself is missing from the extracted text), where St represents the embedding vector, h represents the operator applied to the time series feature matrix, vec represents the operator that flattens a matrix into a vector, σ represents the sigmoid function, W represents the feature weight matrix, that is, the aforementioned first compression weight matrix, and B represents the feature bias vector, that is, the aforementioned first bias vector.
  • The second-level weighted compression process, i.e.
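The two levels of weighted compression can be sketched from the unit-by-unit description elsewhere in this document (multiply by the first compression weight matrix, add the first bias vector, apply the sigmoid, splice with the time series feature at the specified time, multiply by the second compression weight matrix, add the second bias vector). The sketch below is an illustration under stated assumptions, not the applicant's implementation: the function names, the argument names (`X`, `W1`, `B1`, `f_t`, `W2`, `B2`), the matrix shapes and the row-wise broadcasting of the first bias vector are all hypothetical, since the original equations are missing from this text.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def matmul(a, b):
    """Multiply matrix a (n x p) by matrix b (p x q); both are lists of rows."""
    return [[sum(row[k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for row in a]


def weighted_compression(X, W1, B1, f_t, W2, B2):
    """Two-level weighted compression (assumed shapes).

    X  : time series feature matrix, m time steps x d features
    W1 : first compression weight matrix, d x k
    B1 : first bias vector, length k (added to every row; an assumption)
    f_t: time series feature at the specified time, length d
    W2 : second compression weight matrix, out x (m*k + d)
    B2 : second bias vector, length out
    """
    # Level 1: product matrix, bias correction, sigmoid, then flatten (vec).
    product1 = matmul(X, W1)
    corrected1 = [[v + B1[j] for j, v in enumerate(row)] for row in product1]
    embedding = [sigmoid(v) for row in corrected1 for v in row]

    # Level 2: splice embedding with f_t, multiply by W2, add B2.
    spliced = embedding + list(f_t)
    product2 = [sum(w * s for w, s in zip(w_row, spliced)) for w_row in W2]
    return [p + b for p, b in zip(product2, B2)]
```

The output length is set by the number of rows of `W2`, which is how the second level compresses the spliced vector to a fixed size.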
  • the step S2 of modeling the generator and the discriminator according to the weighted compression vector includes:
  • S21 Perform vector splicing on user attribute characteristics, historical click characteristics, and behavior cue characteristics to obtain a second splicing vector
  • the second stitching vector [P; T; Q] is obtained by performing vector stitching of user attribute characteristics, historical click features, and behavior cue features.
  • The sample training data is constructed first. Specifically, the second splicing vector [P;T;Q] is spliced with the cpred output by the generator to form the negative sample feature vector, and the second splicing vector [P;T;Q] is spliced with the user's real click c to form the positive sample feature vector.
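The sample construction described above amounts to two concatenations. A minimal sketch, assuming the features are plain Python lists of numbers and using the hypothetical function name `build_samples`:

```python
def build_samples(P, T, Q, c_real, c_pred):
    """Build the positive and negative sample feature vectors.

    P, T, Q : user attribute, historical click and behavior cue features
    c_real  : the user's real click value c
    c_pred  : the click value cpred output by the generator
    """
    second_splice = list(P) + list(T) + list(Q)    # [P; T; Q]
    positive = second_splice + [float(c_real)]     # [P; T; Q; c]
    negative = second_splice + [float(c_pred)]     # [P; T; Q; cpred]
    return positive, negative
```

Both vectors share the same [P;T;Q] prefix, so the discriminator can only tell them apart through the final click component.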
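  • In the corresponding loss function (the equation is missing from the extracted text), one symbol, lost in extraction, denotes a strategy model based on a multi-layer convolutional neural network; R(⁇) is a regularization term; a further lost symbol is the regularization parameter; and r represents a discriminator with fixed parameters.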
  • the parameters of the multi-layer convolutional neural network of this application are optimized by the Adam algorithm.
  • Before step S21 of performing vector splicing of the user attribute characteristics, historical click characteristics and behavior cue characteristics to obtain the second splicing vector, the method includes:
  • v^T represents the parameter of the reward function.
  • Step S3 of combining the modeled generator with the discriminator and performing adversarial learning under the adversarial model includes:
  • S31 splicing the second splicing vector with the modeling result of the generator to form a negative sample feature vector, splicing the second splicing vector and the user click real value corresponding to the second splicing vector into a positive sample feature vector;
  • The formula of the adversarial model of this application is missing from the extracted text; in it, one parameter symbol (garbled here as ⁇) represents the optimized parameters of the discriminator in the adversarial learning, and another represents the parameters of the generator in the adversarial learning.
  • The learning goal of the generator is to generate, from the constructed vector of user characteristics, a click behavior cpred that resembles the user's real click behavior as closely as possible, while the learning goal of the discriminator is to distinguish the real user click behavior from the similar click behavior produced by the generator.
  • The parameters of the discriminator and the generator are fixed alternately. First, the discriminator parameters are fixed and the generator is trained through loss_g; a drop in loss_g means that the cpred produced by the generator has successfully deceived the discriminator. Then the generator parameters are fixed and the discriminator is trained under the constraint of loss_d; a drop in loss_d means that the discriminator has successfully distinguished cpred from c.
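The alternating schedule can be illustrated with a deliberately small toy. Everything specific here is an assumption made for the sketch: the generator and discriminator are single-layer logistic models rather than the multi-layer convolutional network the text mentions, plain gradient descent replaces Adam, a fixed number of rounds replaces the "both losses reach the minimum" stopping condition, and loss_g and loss_d are taken to be the usual cross-entropy forms -log D(x, cpred) and -[log D(x, c) + log(1 - D(x, cpred))].

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def generator_predict(wg, x):
    """Toy generator: predicted click value cpred in (0, 1)."""
    return sigmoid(dot(wg, x))


def discriminator_score(wd, x, c):
    """Toy discriminator on the spliced vector [x; c]; the last weight multiplies c."""
    return sigmoid(dot(wd[:-1], x) + wd[-1] * c)


def train_adversarially(data, dim, rounds=50, lr=0.1, seed=0):
    """data is a list of (x, c) pairs: feature list x and real click value c."""
    rng = random.Random(seed)
    wg = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
    wd = [rng.uniform(-0.1, 0.1) for _ in range(dim + 1)]
    for _ in range(rounds):
        # Phase 1: discriminator fixed; descend loss_g = -log D(x, cpred).
        for x, _c in data:
            c_pred = generator_predict(wg, x)
            d_neg = discriminator_score(wd, x, c_pred)
            scale = (d_neg - 1.0) * wd[-1] * c_pred * (1.0 - c_pred)
            wg = [w - lr * scale * xi for w, xi in zip(wg, x)]
        # Phase 2: generator fixed; descend
        # loss_d = -[log D(x, c) + log(1 - D(x, cpred))].
        for x, c in data:
            c_pred = generator_predict(wg, x)
            d_pos = discriminator_score(wd, x, c)
            d_neg = discriminator_score(wd, x, c_pred)
            pos = list(x) + [c]
            neg = list(x) + [c_pred]
            grad = [(d_pos - 1.0) * p + d_neg * n for p, n in zip(pos, neg)]
            wd = [w - lr * g for w, g in zip(wd, grad)]
    return wg, wd
```

Within each phase only one set of weights is updated, which mirrors the "fix one, train the other" procedure in the text.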
  • the step S5 of determining the interest preference feature of the current user includes:
  • S51 Input the current user's historical information and designated marketing activity information into the generator after adversarial learning
  • The embodiments of this application take the selection of marketing activity information as an example for detailed description.
  • the above-mentioned marketing information includes, but is not limited to, red envelopes, discount coupons, rebates, etc.
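As an illustration of how candidate marketing activities might be screened, the sketch below feeds each candidate together with the user's history to a trained generator and keeps the candidates for which the discriminator's feedback value equals 1, as this document describes for the determining step. The function name, the callable signatures, the dictionary of candidates and the floating-point tolerance `tol` are assumptions made for the example; the text does not specify how the generator output is passed to the discriminator.

```python
def preferred_activities(history, activities, generator, discriminator, tol=1e-6):
    """Return the names of activities whose discriminator feedback equals 1.

    history       : list of numbers describing the current user's history
    activities    : dict mapping activity name -> list of activity features
    generator     : callable(features) -> predicted click value
    discriminator : callable(features, click_value) -> feedback value
    tol           : tolerance for the floating-point comparison with 1 (assumed)
    """
    preferred = []
    for name, activity_features in activities.items():
        features = list(history) + list(activity_features)
        c_pred = generator(features)
        feedback = discriminator(features, c_pred)
        if abs(feedback - 1.0) <= tol:
            preferred.append(name)
    return preferred
```

The activities that pass the check would then be treated as the current user's interest preference features for recommendation.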
  • the method includes:
  • S61 Acquire a designated feature that affects the user's click action, where the designated feature is any one of all the features that affect the user's click action;
  • the user's historical characteristics and real click behavior are input into the discriminator, and the discriminator feedback output value is 1, indicating that it is a real click behavior.
  • the specified feature is time.
  • The feature data range is then the time span. If the discriminator output value changes significantly as the time span changes, the user is sensitive to the time feature, and the time feature is determined to be a sensitive feature of the user.
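The sensitivity check can be sketched as a sweep over one feature while the others stay fixed. The function name, the way the "output value change range" is measured (maximum minus minimum of the discriminator outputs) and the discriminator's call signature are assumptions for illustration.

```python
def is_sensitive_feature(discriminator, base_input, feature_index,
                         feature_values, preset_range):
    """Vary one feature and test whether the discriminator output moves enough.

    discriminator  : callable(list of numbers) -> output value
    base_input     : feature vector whose other entries are held fixed
    feature_index  : position of the specified feature (e.g. time)
    feature_values : values spanning the feature data range to probe
    preset_range   : threshold for the output value change range
    """
    if not feature_values:
        raise ValueError("feature_values must not be empty")
    outputs = []
    for value in feature_values:
        probe = list(base_input)
        probe[feature_index] = value
        outputs.append(discriminator(probe))
    change_range = max(outputs) - min(outputs)   # assumed measure of change
    return change_range > preset_range, change_range
```

A feature whose sweep leaves the output nearly flat is reported as not sensitive, so the same routine can be applied to each candidate feature in turn.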
  • a content recommendation device based on adversarial learning includes:
  • the obtaining module 1 is used to obtain the weighted compression vector corresponding to the user's historical behavior feature through weighted compression of the user characteristics constructed in advance;
  • the modeling module 2 is used to model the generator and the discriminator according to the weighted compression vector;
  • the adversarial learning module 3 is used to combine the modeled generator and the discriminator to conduct adversarial learning under the adversarial model;
  • the first judging module 4 is used to judge whether the adversarial learning of the generator and the discriminator reaches the preset condition;
  • the determination module 5 is used to input the historical information of the current user into the generator after the adversarial learning if the preset condition is met, and to combine the feedback value of the discriminator after the adversarial learning to determine the interest preference characteristics of the current user;
  • the recommendation module 6 is used to recommend to the current user content information that matches the current user's interest preference characteristics according to the current user's interest preference characteristics.
  • the obtaining module 1 includes:
  • the coding unit is used to code the user characteristics in a time series in the two-dimensional space of the time series dimension and the feature dimension to obtain a time series feature matrix corresponding to the user characteristics;
  • the first multiplication unit is configured to multiply the time series feature matrix and the first compression weight matrix to obtain the first product matrix after data compression;
  • the first correction unit is configured to obtain the first correction matrix by correcting the first product matrix with the first bias vector;
  • the first input unit is configured to input the first correction matrix into the sigmoid function to obtain the embedding vector corresponding to the user's historical behavior feature;
  • the first splicing unit is used to splice the embedding vector corresponding to the user's historical behavior feature with the time series feature corresponding to the specified time to form the first splicing vector;
  • the second multiplication unit is configured to multiply the first splicing vector and the second compression weight matrix to obtain a second product matrix after data compression;
  • the second correction unit is used to correct the second product matrix with the second bias vector to obtain the weighted compression vector corresponding to the user's historical behavior feature.
  • Modeling module 2 includes:
  • the second splicing unit is used to perform vector splicing of user attribute characteristics, historical click characteristics and behavior cue characteristics to obtain a second splicing vector;
  • the first modeling unit is used to input the second splicing vector into the generator model with the model parameters of the discriminator fixed, and to train the generator model under the constraint of the first cross-entropy loss function;
  • the first judging unit is used to judge whether the first cross-entropy loss function reaches the minimum value;
  • the obtaining unit is used to obtain the generator model if the minimum value is reached.
  • the second splicing unit includes:
  • the input subunit is used to input the weighted compression vector into the sigmoid function to obtain the output result of the weighted compression vector;
  • v^T represents the parameter of the reward function.
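This document describes the discriminator's model as a reward computation: the weighted compression vector is passed through the sigmoid function and the result is multiplied by the reward function parameter v^T. A minimal sketch, assuming the multiplication is an inner product between v and the element-wise sigmoid output; the function names are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def reward(weighted_compression_vector, v):
    """Reward value r = v^T * sigmoid(z), with z the weighted compression vector."""
    if len(weighted_compression_vector) != len(v):
        raise ValueError("v must have the same length as the weighted compression vector")
    return sum(v_i * sigmoid(z_i)
               for v_i, z_i in zip(v, weighted_compression_vector))
```

For instance, with z = [0, 0] and v = [1, 3] each sigmoid output is 0.5, so the reward is 0.5 + 1.5 = 2.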
  • the adversarial learning module 3 includes:
  • the third splicing unit is used to splice the second splicing vector with the modeling result of the generator to form a negative sample feature vector, and splice the second splicing vector and the user click real value corresponding to the second splicing vector into a positive sample feature vector;
  • the second modeling unit is used to input the negative sample feature vector and the positive sample feature vector into the discriminator, fix the generator parameters, and model the discriminator under the constraints of the second cross-entropy loss function;
  • the second judging unit is used to judge whether the second cross-entropy loss function reaches the minimum value
  • the determination unit is used to determine the parameters of the discriminator if the minimum value is reached;
  • the adversarial learning unit is used to train the generator and the discriminator adversarially through the adversarial model, following the modeling process of the generator and the discriminator, until the first cross-entropy loss function and the second cross-entropy loss function both reach the minimum.
  • the determining module 5 includes:
  • the second input unit is used for inputting the current user's historical information and designated marketing activity information into the generator after adversarial learning;
  • the third judgment unit is used to judge whether the feedback value of the discriminator after adversarial learning is equal to 1;
  • the determining unit is used for determining that the specified marketing activity information belongs to the current user's interest preference feature if it is equal to 1.
  • the content recommendation device based on adversarial learning includes:
  • the first acquisition module is used to acquire the designated feature that affects the user's click action, where the designated feature is any one of all the features that affect the user's click action;
  • the change module is used to change the range of feature data when the specified feature is input to the discriminator;
  • the second acquisition module is used to acquire the output value change range that follows the corresponding change of the characteristic data range
  • the second judgment module is used to judge whether the change range of the output value exceeds the preset range
  • the judging module is used for judging that the specified feature is a sensitive feature that affects the user's click action if it exceeds the preset range.
  • an embodiment of the present application also provides a computer device.
  • the computer device may be a server, and its internal structure may be as shown in FIG. 3.
  • the computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device is used to provide calculation and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, a computer program, and a database.
  • the memory provides an environment for the operation of the operating system and the computer program in the non-volatile storage medium.
  • the database of the computer equipment is used to store all the data required for the content recommendation process based on adversarial learning.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
  • the computer program is executed by the processor to realize the content recommendation method based on adversarial learning.
  • the processor executes the content recommendation method based on adversarial learning, including: obtaining a weighted compression vector corresponding to the user's historical behavior feature through weighted compression of pre-built user characteristics; modeling the generator and the discriminator according to the weighted compression vector; combining the modeled generator with the discriminator to conduct adversarial learning under the adversarial model; judging whether the adversarial learning of the generator and the discriminator meets the preset condition; if so, inputting the current user's historical information into the generator after adversarial learning and combining the feedback value of the discriminator after adversarial learning to determine the current user's interest preference feature; and, according to the current user's interest preference feature, recommending to the current user content information matching that feature.
  • the above-mentioned computer device models the user's historical behavior characteristics through weighted compression so as to capture how those characteristics change over time; based on adversarial learning, the generator can then obtain online users' interest preference characteristics and recommend content information accurately.
  • the above-mentioned processor obtains the weighted compression vector corresponding to the user's historical behavior feature by weighting and compressing the user characteristics constructed in advance, including: encoding the user characteristics in time sequence in the two-dimensional space of the time series dimension and the feature dimension to obtain the time series feature matrix corresponding to the user characteristics; multiplying the time series feature matrix by the first compression weight matrix to obtain the first product matrix after data compression; correcting the first product matrix with the first bias vector to obtain the first correction matrix; inputting the first correction matrix into the sigmoid function to obtain the embedding vector corresponding to the user's historical behavior feature; splicing the embedding vector corresponding to the user's historical behavior feature with the time series feature corresponding to the specified time to form the first splicing vector; multiplying the first splicing vector by the second compression weight matrix to obtain the second product matrix after data compression; and correcting the second product matrix with the second bias vector to obtain the weighted compression vector corresponding to the user's historical behavior feature.
  • the user characteristics include user attribute characteristics, historical click characteristics, and behavior cue characteristics.
  • the step of modeling the generator and the discriminator according to the weighted compression vector by the above-mentioned processor includes: performing vector splicing on the user attribute characteristics, the historical click characteristics and the behavior cue characteristics to obtain the second splicing vector; with the model parameters of the discriminator fixed, inputting the second splicing vector into the generator model and training the generator model under the constraint of the first cross-entropy loss function; determining whether the first cross-entropy loss function reaches the minimum value; and if so, obtaining the generator model.
  • before the step of performing vector splicing of the user attribute characteristics, historical click characteristics, and behavior cue characteristics to obtain the second splicing vector, the steps executed by the processor include: inputting the weighted compression vector into the sigmoid function to obtain the output result of the weighted compression vector; multiplying the output result of the weighted compression vector by the reward function parameter to obtain the reward value; and using the calculation method of the reward value as the model of the discriminator.
  • The step in which the above-mentioned processor combines the modeled generator with the discriminator and performs adversarial learning under the adversarial model includes: splicing the second splicing vector with the modeling result of the generator to form the negative sample feature vector, and splicing the second splicing vector with the real user click value corresponding to the second splicing vector to form the positive sample feature vector; inputting the negative sample feature vector and the positive sample feature vector into the discriminator and, with the generator parameters fixed, modeling the discriminator under the constraint of the second cross-entropy loss function; determining whether the second cross-entropy loss function has reached its minimum value; if so, determining the parameters of the discriminator; and, following the modeling processes of the generator and the discriminator, having the generator and the discriminator perform adversarial learning under the adversarial model until the first cross-entropy loss function and the second cross-entropy loss function both reach their minimum.
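The alternating scheme described above can be sketched in plain Python as follows. This is a minimal sketch under strong assumptions, not the patented implementation: both models are reduced to single-layer logistic units, the generator loss is taken as the cross-entropy of the discriminator output against label 1, and training stops after a fixed number of rounds rather than when both losses reach their minimum. The names `train_adversarially`, `g_w`, and `d_w` are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def bce(p, y, eps=1e-12):
    """Binary cross-entropy of probability p against label y."""
    return -(y * math.log(p + eps) + (1.0 - y) * math.log(1.0 - p + eps))


def train_adversarially(samples, clicks, rounds=50, lr=0.1):
    """Alternate discriminator and generator updates (illustrative only)."""
    dim = len(samples[0])
    g_w = [0.0] * dim        # generator parameters
    d_w = [0.0] * (dim + 1)  # discriminator parameters (features + click slot)
    g_loss = d_loss = 0.0
    for _ in range(rounds):
        # Discriminator step: generator parameters stay fixed.
        d_loss = 0.0
        for s, y in zip(samples, clicks):
            negative = list(s) + [sigmoid(dot(g_w, s))]  # spliced with generator output
            positive = list(s) + [float(y)]              # spliced with real click value
            for x, label in ((positive, 1.0), (negative, 0.0)):
                p = sigmoid(dot(d_w, x))
                d_loss += bce(p, label)
                # Gradient of the cross-entropy with respect to the logit is (p - label).
                d_w = [w - lr * (p - label) * xi for w, xi in zip(d_w, x)]
        # Generator step: discriminator parameters stay fixed.
        g_loss = 0.0
        for s in samples:
            g_out = sigmoid(dot(g_w, s))
            d_out = sigmoid(dot(d_w, list(s) + [g_out]))
            g_loss += bce(d_out, 1.0)
            # Chain rule through the discriminator's click slot and the generator sigmoid.
            grad = (d_out - 1.0) * d_w[-1] * g_out * (1.0 - g_out)
            g_w = [w - lr * grad * si for w, si in zip(g_w, s)]
    return g_w, d_w, g_loss, d_loss
```

A convergence check on `g_loss` and `d_loss` (for example, stopping when neither changes by more than a tolerance between rounds) would be one way to approximate the "both losses reach the minimum" condition; that tolerance would be a further assumption.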
  • The step in which the above-mentioned processor inputs the historical information of the current user into the generator after adversarial learning and determines the interest preference characteristics of the current user in combination with the feedback value of the discriminator after adversarial learning includes: inputting the historical information and the designated marketing activity information into the generator after adversarial learning; determining whether the feedback value of the discriminator after adversarial learning is equal to 1; and, if so, determining that the designated marketing activity information belongs to the current user's interest preference characteristics.
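The decision rule above (keep a marketing activity when the discriminator's feedback value equals 1) might look like the sketch below. The callables `generator` and `discriminator` and their argument order are hypothetical stand-ins for the trained models; the source does not specify their interfaces.

```python
def select_interest_activities(history, activities, generator, discriminator):
    """Return the activities whose discriminator feedback value equals 1."""
    preferred = []
    for activity in activities:
        # The trained generator receives the user's history and one candidate activity.
        generated = generator(history, activity)
        # The trained discriminator returns a feedback value for that output.
        feedback = discriminator(history, activity, generated)
        if feedback == 1:
            preferred.append(activity)
    return preferred
```

The returned list can then serve as the set of interest preference characteristics against which content is matched for recommendation.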
  • the step includes: acquiring a specified feature that affects the user's click action, where the specified feature is any one of the features that affect the user's click action; changing the feature data range of the specified feature as input to the discriminator; obtaining the range of change in the output value that follows the change in the feature data range; determining whether the range of change in the output value exceeds the preset range; and, if so, determining that the specified feature is a sensitive feature that affects the user's click action.
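The sensitivity check described above can be sketched as a simple sweep over one input feature. The function name `is_sensitive_feature`, the representation of the discriminator as a callable on a feature list, and the use of max minus min as the "output value change range" are assumptions for illustration.

```python
def is_sensitive_feature(discriminator, base_features, index, values, preset_range):
    """Sweep one feature over `values` and test how far the output moves."""
    outputs = []
    for value in values:
        probe = list(base_features)  # copy so other features stay unchanged
        probe[index] = value         # vary only the specified feature
        outputs.append(discriminator(probe))
    # Assumed definition of the change range: spread between extreme outputs.
    return (max(outputs) - min(outputs)) > preset_range
```

For example, with a toy discriminator that depends only on the first feature, sweeping the first feature flags it as sensitive while sweeping the second does not.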
  • FIG. 3 is only a block diagram of a part of the structure related to the solution of the present application, and does not constitute a limitation on the computer device to which the solution of the present application is applied.
  • An embodiment of the present application also provides a computer-readable storage medium.
  • the computer-readable storage medium may be non-volatile or volatile.
  • a computer program is stored thereon.
  • The content recommendation method based on adversarial learning includes: obtaining the weighted compression vector corresponding to the user's historical behavior feature by performing weighted compression on the pre-constructed user characteristics; modeling the generator and the discriminator according to the weighted compression vector; combining the modeled generator with the discriminator and performing adversarial learning under the adversarial model; determining whether the adversarial learning of the generator and the discriminator meets the preset condition; if so, inputting the current user's historical information into the generator after adversarial learning and determining the interest preference characteristics of the current user in combination with the feedback value of the discriminator after adversarial learning; and, according to the interest preference characteristics of the current user, recommending to the current user content information that matches those characteristics.
  • The above-mentioned computer-readable storage medium models the user's historical behavior characteristics through weighted compression to capture how those characteristics change over the time series and, based on adversarial learning, enables the generator to obtain the interest preference characteristics of online users and to recommend content information accurately.
  • The step in which the above-mentioned processor obtains the weighted compression vector corresponding to the user's historical behavior feature by performing weighted compression on the pre-constructed user characteristics includes: performing time series encoding on the user characteristics in the two-dimensional space formed by the time series dimension and the feature dimension to obtain the time series feature matrix corresponding to the user characteristics; multiplying the time series feature matrix by the first compression weight matrix to obtain the first product matrix after data compression; correcting the first product matrix with the first bias vector to obtain the first correction matrix; inputting the first correction matrix into the sigmoid function to obtain the embedding vector corresponding to the user's historical behavior feature; splicing that embedding vector with the time series feature corresponding to the specified time to form the first splicing vector; multiplying the first splicing vector by the second compression weight matrix to obtain the second product matrix after data compression; and correcting the second product matrix with the second bias vector to obtain the weighted compression vector corresponding to the user's historical behavior feature.
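The two-stage weighted compression described above can be sketched with plain nested lists. The shapes are assumptions, since the source does not state them: the time series feature matrix `x` is T x F, the first compression weight matrix `w1` is 1 x T (compressing along time), and the second compression weight matrix `w2` is 2F x K. All names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matmul(a, b):
    """Plain nested-list matrix product: (n x m) times (m x p)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def weighted_compression(x, w1, b1, x_t, w2, b2):
    """Compress a time series feature matrix into a weighted compression vector."""
    product1 = matmul(w1, x)[0]                         # first product, length F
    corrected1 = [p + b for p, b in zip(product1, b1)]  # corrected by first bias vector
    embedding = [sigmoid(v) for v in corrected1]        # embedding vector
    spliced = embedding + list(x_t)                     # first splicing vector, length 2F
    product2 = matmul([spliced], w2)[0]                 # second product, length K
    return [p + b for p, b in zip(product2, b2)]        # corrected by second bias vector
```

With an all-zero `x` and zero first bias, the embedding is 0.5 in every position, which makes the second stage easy to verify by hand.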
  • the user characteristics include user attribute characteristics, historical click characteristics, and behavior cue characteristics.
  • The step in which the above-mentioned processor models the generator and the discriminator according to the weighted compression vector includes: performing vector splicing on the user attribute characteristics, the historical click characteristics, and the behavior cue characteristics to obtain the second splicing vector; with the model parameters of the discriminator fixed, inputting the second splicing vector into the generator model and modeling the generator under the constraint of the first cross-entropy loss function; determining whether the first cross-entropy loss function has reached its minimum value; and, if so, obtaining the generator model.
  • Before the step of performing vector splicing on the user attribute characteristics, the historical click characteristics, and the behavior cue characteristics to obtain the second splicing vector, the processor performs the following: inputting the weighted compression vector into the sigmoid function to obtain the output result of the weighted compression vector; multiplying that output result by the reward function parameter to obtain the reward value; and using the calculation method of the reward value as the model of the discriminator.
  • The step in which the above-mentioned processor combines the modeled generator with the discriminator and performs adversarial learning under the adversarial model includes: splicing the second splicing vector with the modeling result of the generator to form the negative sample feature vector, and splicing the second splicing vector with the real user click value corresponding to the second splicing vector to form the positive sample feature vector; inputting the negative sample feature vector and the positive sample feature vector into the discriminator and, with the generator parameters fixed, modeling the discriminator under the constraint of the second cross-entropy loss function; determining whether the second cross-entropy loss function has reached its minimum value; if so, determining the parameters of the discriminator; and, following the modeling processes of the generator and the discriminator, having the generator and the discriminator perform adversarial learning under the adversarial model until the first cross-entropy loss function and the second cross-entropy loss function both reach their minimum.
  • The step in which the above-mentioned processor inputs the historical information of the current user into the generator after adversarial learning and determines the interest preference characteristics of the current user in combination with the feedback value of the discriminator after adversarial learning includes: inputting the historical information and the designated marketing activity information into the generator after adversarial learning; determining whether the feedback value of the discriminator after adversarial learning is equal to 1; and, if so, determining that the designated marketing activity information belongs to the current user's interest preference characteristics.
  • the step includes: acquiring a specified feature that affects the user's click action, where the specified feature is any one of the features that affect the user's click action; changing the feature data range of the specified feature as input to the discriminator; obtaining the range of change in the output value that follows the change in the feature data range; determining whether the range of change in the output value exceeds the preset range; and, if so, determining that the specified feature is a sensitive feature that affects the user's click action.
  • Non-volatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Software Systems (AREA)
  • Evolutionary Biology (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Algebra (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Disclosed is a content recommendation method based on adversarial learning. The method relates to the field of artificial intelligence and comprises the following steps: obtaining, by means of weighted compression of a pre-constructed user feature, a weighted compression vector corresponding to a historical behavior feature of a user (S1); modeling a generator and a discriminator according to the weighted compression vector (S2); combining the modeled generator and discriminator, and performing adversarial learning under an adversarial model (S3); determining whether the adversarial learning of the generator and the discriminator satisfies a preset condition (S4); if so, inputting historical information of the current user into the generator after the adversarial learning, and determining interest preference features of the current user in combination with a feedback value of the discriminator after the adversarial learning (S5); and, according to the interest preference features of the current user, recommending to the current user content information matching those interest preference features (S6). Behavior features are modeled by means of weighted compression, the time series variation of user behavior features is captured, and a generator can acquire interest preference features on the basis of adversarial learning, so as to recommend content information accurately.
PCT/CN2020/132592 2020-09-28 2020-11-30 Content recommendation method and apparatus based on adversarial learning, and computer device WO2021169451A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011044966.7 2020-09-28
CN202011044966.7A CN112182384B (zh) 2020-09-28 2020-09-28 Content recommendation method and apparatus based on adversarial learning, and computer device

Publications (1)

Publication Number Publication Date
WO2021169451A1 (fr) 2021-09-02

Family

ID=73945688

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/132592 WO2021169451A1 (fr) 2020-09-28 2020-11-30 Content recommendation method and apparatus based on adversarial learning, and computer device

Country Status (2)

Country Link
CN (1) CN112182384B (fr)
WO (1) WO2021169451A1 (fr)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
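CN113837805A (zh) * 2021-09-24 2021-12-24 深圳闪回科技有限公司 An xDeepFM-based second-hand mobile phone price prediction algorithm
CN114168845A (zh) * 2021-11-24 2022-03-11 电子科技大学 A sequential recommendation method based on multi-task learning
CN114841778A (zh) * 2022-05-23 2022-08-02 安徽农业大学 A commodity recommendation method based on a dynamic graph neural network
CN115098767A (zh) * 2022-05-29 2022-09-23 北京理工大学 A news recommendation method based on interest perception and user similarity
CN114841778B (zh) * 2022-05-23 2024-06-04 安徽农业大学 A commodity recommendation method based on a dynamic graph neural network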

Families Citing this family (2)

Publication number Priority date Publication date Assignee Title
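CN113434761B (zh) * 2021-06-25 2024-02-02 平安科技(深圳)有限公司 Recommendation model training method, apparatus, computer device, and storage medium
CN114493781A (zh) * 2022-01-25 2022-05-13 工银科技有限公司 User behavior prediction method, apparatus, electronic device, and storage medium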

Citations (5)

Publication number Priority date Publication date Assignee Title
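CN109360069A (zh) * 2018-10-29 2019-02-19 郑州大学 A recommendation model based on pairwise adversarial training
CN110442804A (zh) * 2019-08-13 2019-11-12 北京市商汤科技开发有限公司 Training method, apparatus, device, and storage medium for an object recommendation network
CN110727868A (zh) * 2019-10-12 2020-01-24 腾讯音乐娱乐科技(深圳)有限公司 Object recommendation method, apparatus, and computer-readable storage medium
CN111259264A (zh) * 2020-01-15 2020-06-09 电子科技大学 A time-series rating prediction method based on generative adversarial networks
CN111460130A (zh) * 2020-03-27 2020-07-28 咪咕数字传媒有限公司 Information recommendation method, apparatus, device, and readable storage medium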

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
US11721090B2 (en) * 2017-07-21 2023-08-08 Samsung Electronics Co., Ltd. Adversarial method and system for generating user preferred contents
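KR102629474B1 (ko) * 2018-05-09 2024-01-26 삼성전자주식회사 Electronic device for compressing and decompressing data and compression method thereof
KR102203252B1 (ko) * 2018-10-19 2021-01-14 네이버 주식회사 Method and system for collaborative filtering based on generative adversarial networks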
US11568260B2 (en) * 2018-10-29 2023-01-31 Google Llc Exponential modeling with deep learning features
US10715176B1 (en) * 2019-04-15 2020-07-14 EMC IP Holding Company LLC Recommending data compression scheme using machine learning and statistical attributes of the data
CN110162703A (zh) * 2019-05-13 2019-08-23 腾讯科技(深圳)有限公司 Content recommendation method, training method, apparatus, device, and storage medium

Patent Citations (5)

Publication number Priority date Publication date Assignee Title
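CN109360069A (zh) * 2018-10-29 2019-02-19 郑州大学 A recommendation model based on pairwise adversarial training
CN110442804A (zh) * 2019-08-13 2019-11-12 北京市商汤科技开发有限公司 Training method, apparatus, device, and storage medium for an object recommendation network
CN110727868A (zh) * 2019-10-12 2020-01-24 腾讯音乐娱乐科技(深圳)有限公司 Object recommendation method, apparatus, and computer-readable storage medium
CN111259264A (zh) * 2020-01-15 2020-06-09 电子科技大学 A time-series rating prediction method based on generative adversarial networks
CN111460130A (zh) * 2020-03-27 2020-07-28 咪咕数字传媒有限公司 Information recommendation method, apparatus, device, and readable storage medium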

Non-Patent Citations (1)

Title
KANDO NORIKO, SAKAI TETSUYA, JOHO HIDEO, LI HANG, DE VRIES ARJEN P., WHITE RYEN W., WANG JUN, YU LANTAO, ZHANG WEINAN, GONG YU, XU: "IRGAN : A Minimax Game for Unifying Generative and Discriminative Information Retrieval Models", RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, ACM, 2 PENN PLAZA, SUITE 701NEW YORKNY10121-0701USA, 7 August 2017 (2017-08-07), 2 Penn Plaza, Suite 701New YorkNY10121-0701USA, pages 515 - 524, XP055840571, ISBN: 978-1-4503-5022-8, DOI: 10.1145/3077136.3080786 *

Cited By (6)

Publication number Priority date Publication date Assignee Title
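CN113837805A (zh) * 2021-09-24 2021-12-24 深圳闪回科技有限公司 An xDeepFM-based second-hand mobile phone price prediction algorithm
CN114168845A (zh) * 2021-11-24 2022-03-11 电子科技大学 A sequential recommendation method based on multi-task learning
CN114168845B (zh) * 2021-11-24 2023-08-15 电子科技大学 A sequential recommendation method based on multi-task learning
CN114841778A (zh) * 2022-05-23 2022-08-02 安徽农业大学 A commodity recommendation method based on a dynamic graph neural network
CN114841778B (zh) * 2022-05-23 2024-06-04 安徽农业大学 A commodity recommendation method based on a dynamic graph neural network
CN115098767A (zh) * 2022-05-29 2022-09-23 北京理工大学 A news recommendation method based on interest perception and user similarity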

Also Published As

Publication number Publication date
CN112182384B (zh) 2023-08-25
CN112182384A (zh) 2021-01-05

Similar Documents

Publication Publication Date Title
WO2021169451A1 (fr) Content recommendation method and apparatus based on adversarial learning, and computer device
US20240046106A1 (en) Multi-task neural networks with task-specific paths
Krishnan et al. On the challenges of learning with inference networks on sparse, high-dimensional data
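CN109902753B (zh) User recommendation model training method, apparatus, computer device, and storage medium
CN110472060B (zh) Question pushing method, apparatus, computer device, and storage medium
CN111506820B (zh) Recommendation model, method, apparatus, device, and storage medium
CN113901327A (zh) Target recommendation model training method, recommendation method, apparatus, and electronic device
CN110705688B (zh) Neural network system, method, and apparatus for risk assessment of operation events
CN113220886A (zh) Text classification method, text classification model training method, and related device
CN110737730B (zh) User classification method, apparatus, device, and storage medium based on unsupervised learning
JP2012518834A (ja) Method and system for calculating an evaluation value of a website visitor
CN110796261A (zh) Feature extraction method, apparatus, and computer device based on reinforcement learning
CN111598213A (zh) Network training method, data recognition method, apparatus, device, and medium
CN111695084A (zh) Model generation method, credit score generation method, apparatus, device, and storage medium
CN111797320A (zh) Data processing method, apparatus, device, and storage medium
CN112817563B (zh) Method for determining target attribute configuration information, computer device, and storage medium
CN113536105A (zh) Recommendation model training method and apparatus
KR20220098698A (ko) Learning content recommendation system that predicts a user's probability of answering correctly using latent-factor-based collaborative filtering, and operating method thereof
CN109255389B (zh) Equipment evaluation method, apparatus, device, and readable storage medium
CN110807693A (zh) Album recommendation method, apparatus, device, and storage medium
CN113051468B (zh) Movie recommendation method and system based on knowledge graph and reinforcement learning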
Ardimansyah et al. Preprocessing matrix factorization for solving data sparsity on memory-based collaborative filtering
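CN115525782A (zh) Video summarization method with adaptive graph structure
CN110929163B (zh) Course recommendation method, apparatus, computer device, and storage medium
CN114118631B (zh) Method and apparatus for recommending loading and unloading points based on a graph neural network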

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application. Ref document number: 20921604; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 Ep: pct application non-entry in european phase. Ref document number: 20921604; Country of ref document: EP; Kind code of ref document: A1