CN112396099A - Click rate estimation method based on deep learning and information fusion - Google Patents

Click rate estimation method based on deep learning and information fusion

Info

Publication number
CN112396099A
Authority
CN
China
Prior art keywords
deep
layer
module
shallow
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011277167.4A
Other languages
Chinese (zh)
Other versions
CN112396099B (en)
Inventor
李静梅
黄海亮
代昕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Engineering University
Original Assignee
Harbin Engineering University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Engineering University
Priority to CN202011277167.4A
Publication of CN112396099A
Application granted
Publication of CN112396099B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0202 Market predictions or forecasting for commercial activities

Abstract

The invention provides a click rate estimation method based on deep learning and information fusion, characterized in that it is divided into three modules: a field-aware factorization machine (FFM) as the shallow extraction module, a convolutional neural network (CNN) as the deep extraction module, and a deep belief network (DBN) as the feature fusion module. The shallow module and the deep module are arranged in parallel and share the fixed dense vectors converted from the discrete features of users and commodities. The shallow module uses the FFM to automatically extract second-order feature combinations, the deep module uses the local receptive fields of the CNN to extract high-order nonlinear feature combinations, and the fusion module uses the DBN to fuse the outputs of the shallow FFM and the deep CNN, thereby realizing interaction between shallow and deep features. By combining shallow and deep feature interaction, the method mines the internal relations between features in depth, effectively solves the problems of gradient explosion and gradient vanishing, and improves click prediction capability.

Description

Click rate estimation method based on deep learning and information fusion
Technical Field
The invention relates to a click rate estimation method, in particular to a click rate estimation method based on deep learning and information fusion, and belongs to the field of recommendation systems.
Background
With the combination of deep learning and recommendation systems, click rate estimation methods have changed dramatically. From the initial combination of logistic regression and gradient boosting trees with manually crafted feature combinations, to the factorization machine (FM) with automatic combination of shallow features, to DeepFM, the deep learning model proposed by Huawei Noah's Ark, the accuracy of click rate prediction has improved remarkably. However, existing click rate estimation methods still have problems: as the number of layers increases, a deep DNN suffers from gradient explosion and gradient vanishing, which makes it difficult to train and to optimize.
In order to solve the problems of the current click rate estimation method, a more effective and accurate click rate estimation method needs to be researched.
Disclosure of Invention
In view of the problems described in the background, the invention provides CNN-FFM, a click rate estimation method based on deep learning and information fusion.
The purpose of the invention is realized as follows:
A click rate estimation method based on deep learning and information fusion, characterized in that it is divided into three modules: a field-aware factorization machine (FFM) as the shallow extraction module, a convolutional neural network (CNN) as the deep extraction module, and a deep belief network (DBN) as the feature fusion module. The shallow module and the deep module are arranged in parallel and share the fixed dense vectors converted from the discrete features of users and commodities. The shallow module uses the FFM to automatically extract second-order feature combinations, the deep module uses the local receptive fields of the CNN to extract high-order nonlinear feature combinations, and the fusion module uses the DBN to fuse the outputs of the shallow FFM and the deep CNN, thereby realizing interaction between shallow and deep features.
The invention also includes the following features:
The CNN-based high-order nonlinear feature extraction is as follows:
The convolutional neural network comprises 5 convolutional layers, 5 pooling layers and 2 fully connected layers. The convolutional layers extract high-order nonlinear features through local receptive fields to complete the deep feature combination, which reduces the number of model parameters while retaining the main features.
The DBN-based feature fusion is as follows:
The outputs of the shallow FFM module and the deep CNN module are used as the input of the feature fusion module, with the DBN as the fusion model. The DBN comprises three hidden layers and one Sigmoid layer. The DBN fusion model aims to capture the highly nonlinear relation between the shallow and deep features, and the click estimate is output in the interval (0, 1) through a Sigmoid function.
Compared with the prior art, the invention has the following beneficial effects:
By combining shallow and deep feature interaction, the method mines the internal relations between features in depth, effectively solves the problems of gradient explosion and gradient vanishing, and improves click prediction capability.
Drawings
FIG. 1 is a schematic diagram of a convolutional neural network process of the method of the present invention;
FIG. 2 is a process diagram of a deep belief network of the method of the present invention;
FIG. 3 is a flow chart of click through rate estimation of the method of the present invention;
FIG. 4 is a schematic diagram of deep learning and information fusion based on the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
1. The deep high-order nonlinear feature extraction model is a convolutional neural network (CNN). The fixed dense vectors produced by the embedding layer are fed into the CNN, and the convolutional layers extract high-order nonlinear features through local receptive fields to complete the deep feature combination.
2. The feature fusion module consists of a DBN and one Sigmoid layer. The outputs of the shallow FFM module and the deep CNN module are used as the input of the feature fusion module, with the DBN as the fusion model. The DBN fusion model aims to capture the highly nonlinear relation between the shallow and deep features, and the click estimate is output in the interval (0, 1) through a Sigmoid function.
The execution process of the method is divided into 8 steps:
1. Embedding of user-commodity data: the user-commodity data set contains many categorical features, so the raw data is one-hot encoded before input to obtain sparse features. To reduce the number of neural network parameters and the computational cost of the model, an embedding layer is added between the data and the model; it maps the sparse matrix to fixed dense vectors that serve as the input of the model. The i-th vector in the embedding layer is expressed as:
[equation image BDA0002780627930000021]
where e_i is the embedding vector of the feature, [symbol image BDA0002780627930000022] corresponds to the i-th feature field in which the feature is located, and x_field is the sample of the i-th feature field. The output of the embedding layer, [symbol image BDA0002780627930000023], can then be expressed as:
[equation image BDA0002780627930000024]
where n is the number of features and k is the dimension of the embedding vector.
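The embedding step can be illustrated with a small sketch. The equation images are not available in this text, so the form used here (each one-hot vector selects one row of a per-field table, giving an n x k output) is the usual embedding-lookup formulation and is an assumption, as are the field sizes, the table initialisation, and every function name.

```python
import random

def build_embedding_tables(field_sizes, k, seed=0):
    """One table of k-dimensional vectors per feature field (hypothetical layout)."""
    rng = random.Random(seed)
    return [
        [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(size)]
        for size in field_sizes
    ]

def one_hot(index, size):
    """Sparse one-hot encoding of a categorical value."""
    return [1.0 if j == index else 0.0 for j in range(size)]

def embed(sample, tables):
    """Map one sample (one category index per field) to an n x k dense matrix."""
    return [tables[f][idx] for f, idx in enumerate(sample)]

def embed_by_matmul(sample, tables):
    """Same result computed as (one-hot vector) x (table), to show the equivalence."""
    out = []
    for f, idx in enumerate(sample):
        table = tables[f]
        x = one_hot(idx, len(table))
        k = len(table[0])
        out.append([sum(x[r] * table[r][c] for r in range(len(table)))
                    for c in range(k)])
    return out

# Made-up example: three fields with 4, 3 and 5 categories, embedding size k = 2.
field_sizes = [4, 3, 5]
tables = build_embedding_tables(field_sizes, k=2)
dense = embed([2, 0, 4], tables)          # n = 3 rows, k = 2 columns
dense_check = embed_by_matmul([2, 0, 4], tables)
```

The direct row lookup avoids materialising the one-hot vectors, which is why embedding layers are normally implemented as table lookups rather than matrix products.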
2. The shallow module and the deep module are arranged in parallel, and both take the output of the embedding layer as their input.
3. The shallow module uses the field-aware factorization machine (FFM), a derivative of the traditional factorization machine that places more emphasis on the correlation of features within fields. The shallow FFM mainly performs automatic combination of shallow features; to keep the computation feasible, the FFM module considers only the interactions between second-order features. The output of the FFM layer can be expressed as:
[equation image BDA0002780627930000031]
where y_ffm is the output of the FFM layer, w_i is the weight of the first-order feature, w_0 is the bias term, and [symbol image BDA0002780627930000032] is the weight of the second-order feature interaction.
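As an illustration of the FFM step, the sketch below evaluates the standard field-aware factorization machine output. Since the equation image is not reproduced in this text, the formula in the docstring is the commonly published FFM form and is assumed to match the patent; the variable names, the toy weights, and the latent vectors are all hypothetical.

```python
def ffm_output(x, fields, w0, w, v):
    """Standard FFM output (assumed form):

        y = w0 + sum_i w[i] * x[i]
               + sum_{i<j} <v[i][field(j)], v[j][field(i)]> * x[i] * x[j]

    x      : feature values
    fields : field index of each feature
    w0, w  : bias and first-order weights
    v      : v[i][f] is the latent vector feature i uses against field f
    """
    n = len(x)
    y = w0 + sum(w[i] * x[i] for i in range(n))
    for i in range(n):
        if x[i] == 0.0:
            continue                      # zero features contribute nothing
        for j in range(i + 1, n):
            if x[j] == 0.0:
                continue
            vi = v[i][fields[j]]          # feature i's vector for j's field
            vj = v[j][fields[i]]          # feature j's vector for i's field
            y += sum(a * b for a, b in zip(vi, vj)) * x[i] * x[j]
    return y

# Toy example: 3 features, 2 fields, latent dimension 2.
x = [1.0, 1.0, 1.0]
fields = [0, 1, 1]
w0 = 0.5
w = [0.1, 0.2, 0.3]
v = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 1.0], [2.0, 0.0]],
    [[0.0, 2.0], [1.0, 1.0]],
]
y_ffm = ffm_output(x, fields, w0, w, v)
```

In this toy case the three pairwise inner products are 1, 2 and 2, so the second-order part is 5, the first-order part is 0.6, and the total with the bias is 6.1.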
4. The deep module uses a convolutional neural network (CNN) to extract high-order nonlinear features. The sample weights w = [w_1, w_2, ..., w_n] are initialized and normalized, where n is the number of samples. The number of convolutional layers, the number of convolution kernels in each layer, the number of fully connected layers, and the weight w and bias b of each network layer are initialized.
5. The input is the fixed-dimension vector output by the embedding layer, and the output is a set of reduced-size feature maps, each of which is a combination of convolved values of the feature maps of the previous layer. The input sample set is X = (x_1, x_2, x_3, ..., x_n), where n is the number of samples and x denotes a sample. The CNN module trains the feature fusion network on all feature data and learns the weights and biases by minimizing the following objective:
[equation image BDA0002780627930000033]
where n is the number of samples and [symbol image BDA0002780627930000034] is the cross-entropy loss function between the input data Y_i and the reconstructed data [symbol image BDA0002780627930000035].
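The way stacked convolution and pooling stages shrink the input into smaller feature maps can be sketched as follows. The patent states 5 convolutional layers and 5 pooling layers but gives no kernel sizes, strides, or dimensionality in this text, so the 1-D layout, kernel width 3, ReLU activation, pool size 2, and the 128-value input are assumptions chosen only to make the example runnable. The two fully connected layers are omitted.

```python
def conv1d(signal, kernel, bias=0.0):
    """Valid 1-D convolution (cross-correlation) followed by ReLU."""
    width = len(kernel)
    return [
        max(0.0, bias + sum(signal[i + j] * kernel[j] for j in range(width)))
        for i in range(len(signal) - width + 1)
    ]

def max_pool1d(signal, size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(signal[i:i + size])
            for i in range(0, len(signal) - size + 1, size)]

def conv_pool_stack(signal, kernels):
    """Apply one conv + pool stage per kernel and record the map length."""
    lengths = [len(signal)]
    for kernel in kernels:
        signal = max_pool1d(conv1d(signal, kernel))
        lengths.append(len(signal))
    return signal, lengths

# Flattened embedding output, e.g. n = 16 features x k = 8 dimensions (made up).
signal = [((i * 37) % 11) / 11.0 for i in range(128)]
kernels = [[0.25, 0.5, 0.25]] * 5          # five conv stages, one kernel each
features, lengths = conv_pool_stack(signal, kernels)
```

With these assumed sizes the map length goes 128, 63, 30, 14, 6, 2: each stage sees a local window of the previous map, so later stages combine information from progressively wider spans of the embedding vector.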
6. The outputs of the shallow FFM module and the deep CNN module are used as the input of the feature fusion module. A deep belief network (DBN) is selected as the fusion model; it comprises three hidden layers and one Sigmoid layer.
[equation image BDA0002780627930000036]
where y_ffm and y_cnn are the outputs of the shallow FFM and the deep CNN, w is the weight, b is the bias term, [symbol image BDA0002780627930000042] is the predicted value of the sample, and y_i is the true value of the sample.
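A forward pass through the fusion stage might look like the sketch below. It assumes the FFM contributes one scalar and the CNN a short vector, concatenates them, and passes them through three hidden layers and one Sigmoid output unit. A real DBN would first pretrain the hidden layers as stacked restricted Boltzmann machines; that pretraining is left out here, and the layer widths and random weights are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases):
    """Fully connected layer with sigmoid units: one weight row per output unit."""
    return [sigmoid(b + sum(w * x for w, x in zip(row, inputs)))
            for row, b in zip(weights, biases)]

def init_layer(n_in, n_out, rng):
    """Random weights and zero biases (illustrative initialisation only)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def fuse(y_ffm, y_cnn, layers):
    """Concatenate shallow and deep outputs and run them through the stack."""
    h = [y_ffm] + list(y_cnn)
    for weights, biases in layers:
        h = dense(h, weights, biases)
    return h[0]                      # the last layer has a single sigmoid unit

rng = random.Random(0)
sizes = [3, 8, 8, 4, 1]              # input, three hidden layers, Sigmoid output
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
p_click = fuse(0.7, [0.2, 1.3], layers)
```

Because the final unit is a sigmoid, `p_click` always lies strictly between 0 and 1 and can be read as an estimated click probability.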
7. The DBN is trained by minimizing the difference between the true and predicted values and back-propagating the error to update the parameter values. The overall loss function of the model can be defined as:
[equation image BDA0002780627930000041]
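The exact loss in the equation image is not available in this text. A common choice for click rate models, assumed here for illustration only, is the binary cross-entropy (log loss) averaged over the samples; the clipping constant `eps` is likewise an assumption added for numerical safety.

```python
import math

def log_loss(y_true, y_pred, eps=1e-12):
    """Mean binary cross-entropy:
    -(1/N) * sum( y * log(p) + (1 - y) * log(1 - p) )
    """
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)   # keep log() away from 0
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)

labels = [1, 0, 1]                         # 1 = clicked, 0 = not clicked
loss_good = log_loss(labels, [0.9, 0.1, 0.8])
loss_bad = log_loss(labels, [0.2, 0.7, 0.4])
```

Predictions close to the true labels give a smaller loss than predictions far from them, which is the quantity gradient-based training drives down.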
8. To prevent the model from overfitting, Dropout is applied at the hidden layers of the network. The DBN fusion model aims to capture the highly nonlinear relation between the shallow and deep features and outputs the click estimate in the interval (0, 1) through a Sigmoid function.
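Dropout on a hidden layer can be sketched as below. The patent does not state a drop rate or a variant, so the "inverted dropout" form (survivors are rescaled during training so nothing changes at inference) and the rate of 0.5 are assumptions.

```python
import random

def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero each unit with probability `rate`, rescale the rest."""
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)
hidden = [1.0, 2.0, 3.0, 4.0]
dropped = dropout(hidden, 0.5, rng)                    # training-time behaviour
unchanged = dropout(hidden, 0.5, rng, training=False)  # inference-time behaviour
```

Each training pass silences a different random subset of hidden units, which discourages the fusion network from relying on any single unit.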
Through the above 8 steps, the click rate estimation method CNN-FFM based on deep learning and information fusion is formed. The method effectively fuses the correlations between shallow and deep features and improves the accuracy of click rate estimation for recommendation.
The above embodiments are only examples, and the scope of protection of the present invention is not limited thereto; any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed herein shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be subject to the scope of protection of the claims.
In summary: the invention provides CNN-FFM, a click rate estimation method based on deep learning and information fusion. Most traditional click rate estimation methods use linear models, in which feature interactions depend on a large amount of manual effort; the automatic feature combination of the factorization machine (FM) model ignores the high-order nonlinear relations among features; and common deep learning models suffer from gradient vanishing, gradient explosion and similar problems as the network deepens. To address these problems, the invention provides CNN-FFM, a method that combines a convolutional neural network (CNN) with a field-aware factorization machine (FFM). First, the discrete features of users and commodities are converted into fixed dense vectors through one-hot encoding and mapping, and these serve as the input of the whole model. Then the shallow module uses the FFM to automatically combine second-order features, and the deep module uses the CNN to extract high-order nonlinear features. Finally, the outputs of the shallow and deep modules are used as the input of the feature fusion module to complete the fusion and interaction of shallow and deep features. By combining shallow and deep feature interaction, the method mines the internal relations between features in depth, effectively solves the problems of gradient explosion and gradient vanishing, and improves click estimation capability.

Claims (3)

1. A click rate estimation method based on deep learning and information fusion, characterized in that it is divided into three modules: a field-aware factorization machine (FFM) as the shallow extraction module, a convolutional neural network (CNN) as the deep extraction module, and a deep belief network (DBN) as the feature fusion module. The shallow module and the deep module are arranged in parallel and share the fixed dense vectors converted from the discrete features of users and commodities. The shallow module uses the FFM to automatically extract second-order feature combinations, the deep module uses the local receptive fields of the CNN to extract high-order nonlinear feature combinations, and the fusion module uses the DBN to fuse the outputs of the shallow FFM and the deep CNN, thereby realizing interaction between shallow and deep features.
2. The click rate estimation method based on deep learning and information fusion according to claim 1, characterized in that the CNN-based high-order nonlinear feature extraction is as follows:
the convolutional neural network comprises 5 convolutional layers, 5 pooling layers and 2 fully connected layers, wherein the convolutional layers extract high-order nonlinear features through local receptive fields to complete the deep feature combination, which reduces the number of model parameters while retaining the main features.
3. The click rate estimation method based on deep learning and information fusion according to claim 1, characterized in that the DBN-based feature fusion is as follows:
the outputs of the shallow FFM module and the deep CNN module are used as the input of the feature fusion module, with the DBN as the fusion model; the DBN comprises three hidden layers and one Sigmoid layer; the DBN fusion model aims to capture the highly nonlinear relation between the shallow and deep features, and the click estimate is output in the interval (0, 1) through a Sigmoid function.
CN202011277167.4A 2020-11-16 2020-11-16 Click rate estimation method based on deep learning and information fusion Active CN112396099B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011277167.4A CN112396099B (en) 2020-11-16 2020-11-16 Click rate estimation method based on deep learning and information fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011277167.4A CN112396099B (en) 2020-11-16 2020-11-16 Click rate estimation method based on deep learning and information fusion

Publications (2)

Publication Number Publication Date
CN112396099A (en) 2021-02-23
CN112396099B (en) 2022-12-09

Family

ID=74600345

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011277167.4A Active CN112396099B (en) 2020-11-16 2020-11-16 Click rate estimation method based on deep learning and information fusion

Country Status (1)

Country Link
CN (1) CN112396099B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113487351A (en) * 2021-07-05 2021-10-08 哈尔滨工业大学(深圳) Privacy protection advertisement click rate prediction method, device, server and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130346182A1 (en) * 2012-06-20 2013-12-26 Yahoo! Inc. Multimedia features for click prediction of new advertisements
WO2018212710A1 (en) * 2017-05-19 2018-11-22 National University Of Singapore Predictive analysis methods and systems
CN111506811A (en) * 2020-03-19 2020-08-07 上海理工大学 Click rate prediction method based on deep residual error network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
王倩倩: "基于深度学习的广告点击率预测方法研究", 《中国优秀硕博士学位论文全文数据库(博士) 信息科技辑》 *
胡勤生: "基于共现关系网络与深层神经网络的在线广告点击率预估模型研究", 《中国优秀硕博士学位论文全文数据库(硕士) 信息科技辑》 *

Also Published As

Publication number Publication date
CN112396099B (en) 2022-12-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant