CN111177579A - Integrated diversity enhanced ultra-deep factorization machine model and construction method and application thereof - Google Patents


Info

Publication number: CN111177579A (granted as CN111177579B)
Application number: CN201911304556.9A
Authority: CN (China)
Inventors: 陈岭 (Chen Ling), 施鸿裕 (Shi Hongyu)
Assignee (original and current): Zhejiang University (ZJU)
Legal status: Granted, active
Prior art keywords: diversity, integrated, cross, vector, deep
Other languages: Chinese (zh)

Classifications

    • G06F16/9536 — Information retrieval; querying by web search engines; search customisation based on social or collaborative filtering
    • G06F18/2411 — Pattern recognition; classification techniques based on the proximity to a decision surface, e.g. support vector machines
    • G06Q30/0241 — Commerce; marketing; advertisements


Abstract

The invention discloses an integrated diversity enhanced ultra-deep factorization machine model, a construction method thereof, and applications thereof. The construction method comprises the following steps: 1) construct a training data set; 2) obtain the low-dimensional embedded vector representation corresponding to each original feature vector using a fully connected network, and construct the initial feature matrix; 3) obtain the output matrix and output vector of each cross layer using the integrated diversity-enhanced cross network, and compute the diversity index; 4) compute weight values for the diversity indices of the different cross layers using a self-attention mechanism; 5) predict based on the output vector of the integrated diversity-enhanced cross network and output the predicted value; 6) train the model with an overall loss composed of an accuracy loss, a diversity loss, and a regularization term, obtaining the integrated diversity enhanced ultra-deep factorization machine model with optimized parameters. The model has broad application prospects in fields such as online advertising and recommendation systems.

Description

Integrated diversity enhanced ultra-deep factorization machine model and construction method and application thereof
Technical Field
The invention relates to the field of feature learning, in particular to an integrated diversity enhanced ultra-deep factorization machine model and application thereof.
Background
Feature learning is an important foundation of machine learning, and the extraction and construction of effective features play an important role in prediction tasks. Feature crossing is a widely used way of constructing features: two or more original features are crossed and combined to obtain a new feature. For example, in a house price prediction task, houses with a superior "geographic location" and a large "floor plan" are clearly more expensive. Crossing "geographic location" with "floor plan" therefore yields a new feature that plays a key role in predicting house prices. How to construct and select effective cross features in feature learning has thus become one of the research hotspots in machine learning, with broad application prospects.
Conventional methods for constructing cross features can be divided into feature-engineering-based methods and factorization-based methods. Feature-engineering-based approaches typically rely on engineers using domain knowledge to construct cross features manually. However, the feature space is often large, so this consumes considerable time and labor; moreover, manually constructed cross features are usually designed for specific tasks and are difficult to generalize to other application scenarios. Factorization-based methods use the idea of matrix factorization to model cross features as inner products of the latent vectors obtained by factorizing the feature weight matrix, thereby greatly reducing the number of model parameters. However, owing to computational complexity, factorization-based methods can exploit only low-order cross features, which limits model performance to some extent. To address this problem, researchers have proposed deep-learning-based methods to learn cross features.
Deep-learning-based methods typically feed the original features directly into a deep neural network to obtain high-order cross information, but they ignore the importance of low-order cross features. The ultra-deep factorization machine model is a state-of-the-art deep-learning-based method for constructing cross features. It combines a purpose-built Compressed Interaction Network with a deep neural network to learn cross features. Compared with traditional methods for constructing cross features, the ultra-deep factorization machine model considers low-order and high-order cross features simultaneously, achieves vector-wise crossing, and is more interpretable. However, its learning of multiple cross feature vectors can be regarded as an ensemble learning process: the model ignores the diversity information among different cross feature vectors and is driven solely by a single accuracy objective, which easily causes overfitting and limits the generalization ability of the model to some extent.
Disclosure of Invention
The invention aims to solve the technical problem of how to effectively utilize diversity information and design a diversity metric to obtain more diverse cross features, and provides an integrated diversity enhanced ultra-deep factorization machine model, a construction method thereof, and applications thereof.
The technical scheme of the invention is as follows:
a method of constructing an integrated diversity enhanced extremely deep factorisation model, the method comprising the steps of:
(1) dividing original data into category type characteristics and numerical type characteristics, and respectively coding the category type characteristics and the numerical type characteristics to obtain a training set;
(2) sending each high-dimensional sparse feature vector after coding into a single-layer full-connection network to obtain corresponding low-dimensional embedded vector representation, and constructing an initial feature matrix;
(3) inputting the initial characteristic matrix into the integrated diversity enhancement cross network, obtaining an output matrix of each cross layer of the integrated diversity enhancement cross network according to the initial characteristic matrix, summing and pooling row vectors in the output matrix of each cross layer respectively to obtain an output vector of each cross layer, splicing the output vectors of all cross layers to obtain an output vector of the integrated diversity enhancement cross network, and calculating a predicted value of the output vector of the integrated diversity enhancement cross network by using a sigmoid activation function;
(4) calculating the diversity index of the output matrix of each cross layer, and calculating the weight values of the diversity indexes of different cross layers by adopting a self-attention mechanism;
(5) constructing an overall loss according to the diversity index of the output matrix of each cross layer, the corresponding weight value, and the difference between the predicted value and the label value of the sample;
(6) and according to the overall loss, utilizing all samples in the training set to iteratively optimize parameters of a full-connection network, an integrated diversity enhancement cross network, an attention mechanism and a sigmoid activation function, and obtaining an integrated diversity enhancement ultra-deep factorization model when the parameters are determined.
The integrated diversity enhanced ultra-deep factorization machine model is constructed by the method for constructing the integrated diversity enhanced ultra-deep factorization machine model.
In the application of the integrated diversity enhanced ultra-deep factorization machine model to advertisement click-through rate prediction, user advertisement click data and the corresponding click labels are used as samples, and the above construction method is used to build an integrated diversity enhanced ultra-deep factorization machine model for predicting the advertisement click-through rate. In application, user advertisement click data is input into the model to predict whether the user will click the advertisement.
In the application of the integrated diversity enhanced ultra-deep factorization machine model to predicting user commodity purchases, user purchasing behavior data and the corresponding purchase labels are used as samples, and the above construction method is used to build an integrated diversity enhanced ultra-deep factorization machine model for purchase prediction. In application, user purchasing behavior data is input into the model to predict whether the user will purchase the commodity.
The invention considers diversity and accuracy simultaneously during the learning of cross features. Compared with the prior art, it has the following advantages:
1) An integrated diversity enhanced ultra-deep factorization machine model is provided, which introduces diversity indices and accounts for both diversity and accuracy in the objective function, alleviating the overfitting problem and improving generalization ability and model performance.
2) A self-attention mechanism is introduced to distinguish the importance of the diversity indices of the cross features of different cross layers, so as to fully mine and utilize the diversity information in the output vectors of the different cross layers.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
Fig. 1 is a frame diagram of an ultra-deep factorization model with integrated diversity enhancement provided by an embodiment of the invention.
Fig. 2 is a diagram of an integrated diversity-enhanced cross network structure according to an embodiment of the present invention.
Fig. 3 is a flow chart illustrating the construction of an ultra-deep factorization model with integrated diversity enhancement according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the detailed description and specific examples, while indicating the scope of the invention, are intended for purposes of illustration only and are not intended to limit the scope of the invention.
Fig. 1 is a frame diagram of an ultra-deep factorization model with integrated diversity enhancement provided by an embodiment of the invention. Fig. 2 is a diagram of an integrated diversity-enhanced cross network structure according to an embodiment of the present invention. Fig. 3 is a flow chart illustrating the construction of an ultra-deep factorization model with integrated diversity enhancement according to an embodiment of the present invention. Referring to fig. 1,2 and 3, the method for constructing the integrated diversity enhanced ultra-deep factorization model provided by the embodiment includes the following steps:
Step 1: divide the raw data into categorical features and numerical features, and encode each to obtain the complete training set.
Raw data can be divided into categorical and numerical features. Categorical features are encoded with one-hot encoding for single-valued features and multi-hot encoding for multi-valued features. Numerical features are first discretized by binning and then one-hot encoded.
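As a concrete illustration of the encoding in step 1, the sketch below one-hot encodes a single-valued categorical feature, multi-hot encodes a multi-valued one, and bins then one-hot encodes a numerical one. The field vocabularies and bin edges are hypothetical, not part of the invention:

```python
import numpy as np

def one_hot(value, vocabulary):
    """One-hot encode a single-valued categorical feature."""
    vec = np.zeros(len(vocabulary))
    vec[vocabulary.index(value)] = 1.0
    return vec

def multi_hot(values, vocabulary):
    """Multi-hot encode a multi-valued categorical feature."""
    vec = np.zeros(len(vocabulary))
    for v in values:
        vec[vocabulary.index(v)] = 1.0
    return vec

def bin_then_one_hot(value, bin_edges):
    """Discretize a numerical feature by binning, then one-hot encode the bin index."""
    idx = int(np.digitize(value, bin_edges))  # bin index in 0 .. len(bin_edges)
    vec = np.zeros(len(bin_edges) + 1)
    vec[idx] = 1.0
    return vec

# hypothetical toy fields
genders = ["male", "female"]
interests = ["sports", "music", "news"]
age_edges = [18, 30, 50]                # bins: <18, 18-29, 30-49, >=50

x_gender = one_hot("female", genders)                  # [0, 1]
x_interest = multi_hot(["music", "news"], interests)   # [0, 1, 1]
x_age = bin_then_one_hot(35, age_edges)                # bin 2 -> [0, 0, 1, 0]
```

The concatenation of such sparse vectors forms the high-dimensional input that step 2 embeds.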
Step 2: split the training set into batches of a fixed size; the total number of batches is N.
The training data set is split into batches according to an empirically chosen batch size S_bat. The total number of batches is computed as:

N = \lceil N_{sam} / S_{bat} \rceil    (1)

where N_{sam} is the total number of samples in the training data set.
Step 3: sequentially select the batch of training samples with index p from the training set, p = 1, 2, …, N. Steps 4-14 are repeated for each training sample in the batch.
Step 4: feed each encoded high-dimensional sparse feature vector x_{fea_i} into a single-layer fully connected network to obtain the corresponding low-dimensional embedded vector representation x_{emb_i}, and construct the initial feature matrix X^0.
First, each encoded high-dimensional sparse feature vector x_{fea_i} is fed into a single-layer fully connected network to obtain the corresponding d-dimensional dense embedded representation x_{emb_i} = W_{emb_i} x_{fea_i} \in R^d, where i = 1, 2, …, m, m is the number of feature vectors, and W_{emb_i} is the embedding weight matrix of the i-th feature.
Second, the embedded representations x_{emb_i} of all features are concatenated to obtain the initial feature vector representation x^0 = [x_{emb_1}, x_{emb_2}, …, x_{emb_m}] and the initial feature matrix X^0 \in R^{m \times d}, where row i of X^0 corresponds to x_{emb_i} in x^0.
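The embedding step can be sketched as follows; the field dimensions, the embedding size d = 4, and the random weights are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

d = 4                      # embedding dimension (illustrative)
field_dims = [3, 5, 2]     # sizes of the m = 3 encoded sparse vectors (hypothetical)
m = len(field_dims)

# one single-layer fully connected (embedding) matrix per field: W_i in R^{d x dim_i}
W_emb = [rng.normal(0.0, 0.1, size=(d, n)) for n in field_dims]

# toy one-hot encoded sparse feature vectors x_fea_i
x_fea = [np.eye(n)[0] for n in field_dims]

# low-dimensional embeddings x_emb_i = W_i @ x_fea_i, stacked into X0 in R^{m x d}
x_emb = [W_emb[i] @ x_fea[i] for i in range(m)]
X0 = np.stack(x_emb)

print(X0.shape)  # (3, 4): one d-dimensional row per original feature field
```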
Step 5: compute the output matrix X^1 of the 1st cross layer from two copies of the initial feature matrix X^0.
The output matrix X^1 \in R^{e_1 \times d} of the 1st cross layer is computed from two copies of the initial feature matrix X^0 as follows:

X^1_{l,*} = \sum_{i=1}^{m} \sum_{j=1}^{m} W^{1,l}_{i,j} (X^0_{i,*} \circ X^0_{j,*})    (2)

where X^1_{l,*} \in R^d denotes the l-th row vector of the output matrix of the 1st cross layer, 1 ≤ l ≤ e_1, and e_1 is the number of rows of X^1; W^{1,l} \in R^{m \times m} is the weight parameter of the l-th row of X^1; X^0_{i,*} and X^0_{j,*} denote the i-th and j-th row vectors of the initial feature matrix X^0, respectively; and \circ denotes the Hadamard product between vectors.
Step 6: compute the output matrix X^k of the k-th cross layer from the initial feature matrix X^0 and the output matrix X^{k-1} of the (k-1)-th cross layer, where k = 2, 3, …, K.
The output matrix of the k-th cross layer is computed from X^0 and the output matrix X^{k-1} of the previous cross layer as follows:

X^k_{l,*} = \sum_{i=1}^{e_{k-1}} \sum_{j=1}^{m} W^{k,l}_{i,j} (X^{k-1}_{i,*} \circ X^0_{j,*})    (3)

where X^k_{l,*} \in R^d denotes the l-th row vector of the output matrix of the k-th cross layer, 1 ≤ l ≤ e_k, and e_k is the number of rows of X^k; k = 2, 3, …, K, where K is the number of cross layers; X^{k-1}_{i,*} denotes the i-th row vector of X^{k-1}; and W^{k,l} \in R^{e_{k-1} \times m} is the weight parameter of the l-th row of X^k.
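A minimal sketch of the cross-layer recurrence above, written with explicit loops for clarity; the row counts e_k and the random weights are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

m, d = 3, 4               # number of fields and embedding dimension (illustrative)
e = [m, 5, 5]             # e_0 = m, then row counts e_1, e_2 for K = 2 cross layers

X0 = rng.normal(size=(m, d))

def cross_layer(X_prev, X0, W):
    """Row l of the output is sum over i, j of W[l, i, j] * (X_prev[i] ∘ X0[j]),
    where ∘ is the element-wise (Hadamard) product."""
    out = np.zeros((W.shape[0], X0.shape[1]))
    for l in range(W.shape[0]):
        for i in range(X_prev.shape[0]):
            for j in range(X0.shape[0]):
                out[l] += W[l, i, j] * (X_prev[i] * X0[j])
    return out

# weight tensor W^k for layer k has shape (e_k, e_{k-1}, m)
Ws = [rng.normal(0.0, 0.1, size=(e[k], e[k - 1], m)) for k in (1, 2)]

X_layers = []
X_prev = X0
for W in Ws:
    X_prev = cross_layer(X_prev, X0, W)
    X_layers.append(X_prev)

print([X.shape for X in X_layers])  # [(5, 4), (5, 4)]
```

In practice the triple loop would be vectorized (e.g. as a single einsum), but the loops mirror the summation in the formulas directly.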
Step 7: apply sum pooling to the row vectors X^k_{l,*} of the output matrix X^k of each cross layer, where 1 ≤ l ≤ e_k and k = 1, 2, …, K.
Sum pooling accumulates the elements of each row vector of X^k:

s^k_l = \sum_{j=1}^{d} X^k_{l,j}    (4)

In this way, the output matrix X^k of each cross layer is converted into an output vector s^k = [s^k_1, s^k_2, …, s^k_{e_k}], k = 1, 2, …, K.
Step 8: concatenate the output vectors s^k of all cross layers to obtain the output of the integrated diversity-enhanced cross network, x_{dcin} = [s^1, s^2, …, s^K].
The output x_{dcin} of the integrated diversity-enhanced cross network is composed of the output vectors s^k of the K cross layers; x_{dcin} = [s^1, s^2, …, s^K] contains the output vectors of the 1st to K-th cross layers.
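Steps 7 and 8 can be sketched as follows, with illustrative layer shapes:

```python
import numpy as np

rng = np.random.default_rng(2)

# output matrices of K = 3 cross layers, with row counts e_k and embedding dim d = 4
X_layers = [rng.normal(size=(e_k, 4)) for e_k in (3, 5, 5)]

# sum pooling: each row vector collapses to the scalar s^k_l = sum_j X^k[l, j]
s = [X.sum(axis=1) for X in X_layers]

# concatenation gives the cross-network output x_dcin = [s^1, s^2, ..., s^K]
x_dcin = np.concatenate(s)

print(x_dcin.shape)  # (13,) = e_1 + e_2 + e_3
```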
Step 9, output matrix X of each cross layerkThe diversity index Div is calculated.
Based on Negative Correlation Learning (Negative Correlation Learning) theory in ensemble Learning, the output matrix X of different cross layerskAnd (3) calculating the diversity index Div in the following specific calculation mode:
Figure BDA0002322735280000077
Figure BDA0002322735280000078
Figure BDA0002322735280000079
wherein the content of the first and second substances,
Figure BDA0002322735280000081
measure the row vector in each cross layer
Figure BDA0002322735280000082
And all row vector means of the cross layer
Figure BDA0002322735280000083
The euclidean distance between.
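A sketch of the diversity index under the mean-distance reading described above; the exact normalization (averaging over rows) is an assumption:

```python
import numpy as np

rng = np.random.default_rng(3)

def diversity_index(Xk):
    """Mean Euclidean distance of each row vector from the layer's row-vector mean
    (a negative-correlation-style diversity measure)."""
    mean_row = Xk.mean(axis=0)
    dists = np.linalg.norm(Xk - mean_row, axis=1)
    return dists.mean()

X_identical = np.ones((5, 4))         # all rows equal -> zero diversity
X_spread = rng.normal(size=(5, 4))    # spread-out rows -> positive diversity

print(diversity_index(X_identical))   # 0.0
print(diversity_index(X_spread) > 0)  # True
```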
Step 10: compute a weight value a_k for the diversity index Div^k of each cross layer using a self-attention mechanism.
A self-attention mechanism is introduced: the output vector s^k of each cross layer is fed into a multilayer perceptron to obtain the weight value a_k of the diversity index of that layer. The specific computation is:

a'_k = h^T ReLU(W s^k + b)    (8)

a_k = \frac{\exp(a'_k)}{\sum_{k'=1}^{K} \exp(a'_{k'})}    (9)

where h, W, and b are learnable parameters and ReLU is a nonlinear activation function.
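The attention weighting can be sketched with randomly initialized parameters standing in for the learned h, W, and b (the hidden size 8 and e_k = 5 are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)

K = 3                                          # number of cross layers
s = [rng.normal(size=5) for _ in range(K)]     # toy per-layer output vectors s^k

# parameters of the small attention MLP (randomly initialized stand-ins)
W = rng.normal(size=(8, 5))
b = rng.normal(size=8)
h = rng.normal(size=8)

relu = lambda z: np.maximum(z, 0.0)

a_raw = np.array([h @ relu(W @ sk + b) for sk in s])  # a'_k = h^T ReLU(W s^k + b)
a = np.exp(a_raw) / np.exp(a_raw).sum()               # softmax over the K layers

print(np.round(a.sum(), 6))  # 1.0 — the weights form a distribution over layers
```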
Step 11: feed the output x_{dcin} of the integrated diversity-enhanced cross network into a sigmoid activation function to obtain the predicted value \hat{y}.
The output x_{dcin} of the integrated diversity-enhanced cross network is fed into the sigmoid activation function to obtain the final predicted value of the model:

\hat{y} = \sigma(W_{dcin} x_{dcin})    (10)

where W_{dcin} is a weight parameter and σ(·) is the sigmoid activation function.
Step 12: compute the diversity loss L_div, i.e., the weighted sum of the diversity indices of the different cross layers.
From the diversity indices and weight values a_k obtained in steps 9 and 10, the diversity loss is computed as:

L_div = -\frac{1}{N} \sum_{(x,y) \in D} \sum_{k=1}^{K} a_k Div^k    (11)

where D is the set of all training samples in the batch and N here denotes the number of samples in the batch. The negative sign means that greater weighted diversity lowers the loss.
Step 13: compute the accuracy loss L_acc, i.e., the average logarithmic loss between all sample label values y and model predicted values \hat{y} in the batch.
The accuracy loss between the label values y and the predicted values \hat{y} of all samples in the batch is computed with the log-loss function:

L_acc = -\frac{1}{N} \sum_{(x,y) \in D} \left[ y \log \hat{y} + (1-y) \log(1-\hat{y}) \right]    (12)
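The average logarithmic loss can be sketched as below; the clipping constant eps is a numerical-stability assumption, not part of the patent:

```python
import numpy as np

def log_loss(y, y_hat, eps=1e-12):
    """Average logarithmic (binary cross-entropy) loss over a batch."""
    y = np.asarray(y, dtype=float)
    y_hat = np.clip(np.asarray(y_hat, dtype=float), eps, 1 - eps)  # avoid log(0)
    return -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

y = [1, 0, 1, 0]            # toy labels
y_hat = [0.9, 0.1, 0.8, 0.3]  # toy predicted click probabilities

print(round(log_loss(y, y_hat), 4))  # 0.1976
```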
Step 14: compute the overall loss L from the diversity loss L_div and the accuracy loss L_acc.
The overall loss L consists of three parts: the accuracy loss L_acc, the diversity loss L_div, and an L2 regularization term:

L = L_acc + \lambda_d L_div + \lambda_n \| \Theta \|_2^2    (13)

where λ_d is a parameter controlling the balance between diversity loss and accuracy loss, λ_n is the L2 regularization coefficient, and Θ denotes all model parameters, i.e., the parameters of the fully connected network, the integrated diversity-enhanced cross network, the attention mechanism, and the sigmoid activation function.
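The composition of the overall loss can be sketched as follows; the coefficient values and the sign convention for the diversity term (a negative L_div lowering the loss) are assumptions for illustration:

```python
import numpy as np

def overall_loss(l_acc, l_div, params, lambda_d=0.1, lambda_n=1e-4):
    """L = L_acc + lambda_d * L_div + lambda_n * ||Theta||_2^2.
    lambda_d and lambda_n are illustrative values, not the patent's."""
    l2 = sum(float((p ** 2).sum()) for p in params)
    return l_acc + lambda_d * l_div + lambda_n * l2

params = [np.ones((2, 2)), np.ones(3)]  # toy stand-ins for the model parameters Θ
# diversity loss of -0.2: the diversity term enters negatively, lowering the loss
loss = overall_loss(l_acc=0.5, l_div=-0.2, params=params)

print(round(loss, 4))  # 0.4807 = 0.5 + 0.1*(-0.2) + 1e-4*7
```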
Step 15: adjust the parameters of the entire model according to the overall loss L.
Step 16: repeat steps 3-15 until all batches of the training data set have participated in model training.
Step 17: repeat steps 3-16 until the specified number of iterations is reached, yielding the integrated diversity enhanced ultra-deep factorization machine model.
The integrated diversity enhanced ultra-deep factorization model can be applied to the field of online advertising and recommendation systems.
An application of the integrated diversity enhanced ultra-deep factorization machine model in advertisement click-through rate prediction: in the online advertising field, the raw data are user advertisement click data (including user demographic features, basic attribute features of the advertisements, and contextual features of the click behavior). After preprocessing and model training, the trained model can be used to predict the click-through rate of user advertisements, i.e., the probability that a user clicks a given advertisement.
An application of the integrated diversity enhanced ultra-deep factorization machine model in predicting user commodity purchases: in a recommendation system, the raw data are user purchasing behavior data (including user demographic features, basic attribute features of the commodities, and contextual features of the purchasing behavior), and the trained model can be used to predict whether a user will purchase a new commodity.
The above-mentioned embodiments are intended to illustrate the technical solutions and advantages of the present invention, and it should be understood that the above-mentioned embodiments are only the most preferred embodiments of the present invention, and are not intended to limit the present invention, and any modifications, additions, equivalents, etc. made within the scope of the principles of the present invention should be included in the scope of the present invention.

Claims (10)

1. A construction method of an integrated diversity enhanced ultra-deep factorization machine model, which is characterized by comprising the following steps:
(1) dividing original data into category type characteristics and numerical type characteristics, and respectively coding the category type characteristics and the numerical type characteristics to obtain a training set;
(2) sending each high-dimensional sparse feature vector after coding into a single-layer full-connection network to obtain corresponding low-dimensional embedded vector representation, and constructing an initial feature matrix;
(3) inputting the initial characteristic matrix into the integrated diversity enhancement cross network, obtaining an output matrix of each cross layer of the integrated diversity enhancement cross network according to the initial characteristic matrix, summing and pooling row vectors in the output matrix of each cross layer respectively to obtain an output vector of each cross layer, splicing the output vectors of all cross layers to obtain an output vector of the integrated diversity enhancement cross network, and calculating a predicted value of the output vector of the integrated diversity enhancement cross network by using a sigmoid activation function;
(4) calculating the diversity index of the output matrix of each cross layer, and calculating the weight values of the diversity indexes of different cross layers by adopting a self-attention mechanism;
(5) constructing an overall loss according to the diversity index of the output matrix of each cross layer, the corresponding weight value, and the difference between the predicted value and the label value of the sample;
(6) and according to the overall loss, utilizing all samples in the training set to iteratively optimize parameters of a full-connection network, an integrated diversity enhancement cross network, an attention mechanism and a sigmoid activation function, and obtaining an integrated diversity enhancement ultra-deep factorization model when the parameters are determined.
2. The method for constructing the integrated diversity enhanced ultra-deep factorization machine model according to claim 1, wherein the concrete process of the step (2) is as follows:
firstly, each encoded high-dimensional sparse feature vector x_{fea_i} is fed into a single-layer fully connected network to obtain the corresponding d-dimensional dense embedded representation x_{emb_i} \in R^d, wherein i = 1, 2, …, m and m is the number of feature vectors;
secondly, the embedded representations x_{emb_i} of all features are concatenated to obtain the initial feature vector representation x^0 = [x_{emb_1}, x_{emb_2}, …, x_{emb_m}] and the initial feature matrix X^0 \in R^{m \times d}, wherein row i of X^0 corresponds to x_{emb_i} in x^0.
3. The method for constructing the integrated diversity enhanced ultra-deep factorization machine model according to claim 1, wherein in the step (3),
the output matrix X^1 \in R^{e_1 \times d} of the 1st cross layer is computed from two copies of the initial feature matrix X^0 as follows:

X^1_{l,*} = \sum_{i=1}^{m} \sum_{j=1}^{m} W^{1,l}_{i,j} (X^0_{i,*} \circ X^0_{j,*})

wherein X^1_{l,*} denotes the l-th row vector of the output matrix of the 1st cross layer, 1 ≤ l ≤ e_1, e_1 is the number of rows of X^1, W^{1,l} is the weight parameter of the l-th row of X^1, X^0_{i,*} and X^0_{j,*} denote the i-th and j-th row vectors of X^0, respectively, and \circ denotes the Hadamard product between vectors;
the output matrix X^k of the k-th cross layer is computed from the initial feature matrix X^0 and the output matrix X^{k-1} of the previous cross layer as follows:

X^k_{l,*} = \sum_{i=1}^{e_{k-1}} \sum_{j=1}^{m} W^{k,l}_{i,j} (X^{k-1}_{i,*} \circ X^0_{j,*})

wherein X^k_{l,*} denotes the l-th row vector of the output matrix of the k-th cross layer, 1 ≤ l ≤ e_k, e_k is the number of rows of X^k, k = 2, 3, …, K, K is the number of cross layers, X^{k-1}_{i,*} denotes the i-th row vector of X^{k-1}, and W^{k,l} is the weight parameter of the l-th row of X^k.
4. The method for constructing an integrated diversity enhanced ultra-deep factorization machine model as claimed in claim 1, wherein in step (3), sum pooling is applied separately to the row vectors $X^k_{l,*}$ of the output matrix $X^k$ of each cross layer, i.e. the elements of each row vector are accumulated:

$$s^k_l = \sum_{j=1}^{d} X^k_{l,j}$$

where $d$ is the number of columns of $X^k$; the output matrix $X^k$ of each cross layer is thereby converted into an output vector $s_k = \left[ s^k_1, s^k_2, \ldots, s^k_{e_k} \right]$.
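The sum pooling of claim 4 reduces each $(e_k, d)$ cross-layer matrix to an $e_k$-dimensional vector. A one-line NumPy sketch (the helper name `sum_pool` is ours, not the patent's):

```python
import numpy as np

def sum_pool(xk: np.ndarray) -> np.ndarray:
    # s^k_l = sum over columns j of X^k_{l,j}: one scalar per row vector
    return xk.sum(axis=1)

xk = np.array([[1.0, 2.0, 3.0],
               [4.0, 5.0, 6.0]])
print(sum_pool(xk))  # [ 6. 15.]
```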
5. The method for constructing an integrated diversity enhanced ultra-deep factorization machine model as claimed in claim 1, wherein in step (4), based on the negative correlation learning theory in ensemble learning, the diversity index $\mathrm{Div}_k$ of the output matrix $X^k$ of each cross layer is computed as follows:

$$\bar{X}^k = \frac{1}{e_k} \sum_{l=1}^{e_k} X^k_{l,*}$$

$$d^k_l = \left\lVert X^k_{l,*} - \bar{X}^k \right\rVert_2$$

$$\mathrm{Div}_k = \sum_{l=1}^{e_k} d^k_l$$

where $d^k_l$ measures the Euclidean distance between each row vector $X^k_{l,*}$ of the cross layer and the mean $\bar{X}^k$ of all row vectors of that cross layer.
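A small NumPy sketch of the diversity index in claim 5. The per-row Euclidean distance to the mean row follows the claim; aggregating those distances by summation is our assumption, since the original aggregation formula is not legible in this text.

```python
import numpy as np

def diversity_index(xk: np.ndarray) -> float:
    # Mean of all row vectors of the cross layer
    mean = xk.mean(axis=0)
    # Euclidean distance of each row vector from that mean
    d = np.linalg.norm(xk - mean, axis=1)
    # Aggregate per-row distances (summation assumed)
    return float(d.sum())

# Identical rows carry no diversity; spread-out rows do
print(diversity_index(np.ones((3, 4))))                    # 0.0
print(diversity_index(np.array([[0.0, 0.0], [2.0, 0.0]])))  # 2.0
```

Larger values mean the row vectors of the layer capture more dissimilar feature interactions, which is what the negative correlation term later rewards.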
6. The method for constructing an integrated diversity enhanced ultra-deep factorization machine model as claimed in claim 1, wherein in step (4), a self-attention mechanism is introduced: the output vector $s_k$ of each cross layer is fed into a multilayer perceptron to obtain the weight value $a_k$ of the diversity index $\mathrm{Div}_k$ of each cross layer, computed as follows:

$$a'_k = h^T \mathrm{ReLU}(W s_k + b)$$

$$a_k = \frac{\exp(a'_k)}{\sum_{k'=1}^{K} \exp(a'_{k'})}$$

where $h$, $W$, and $b$ are learnable parameters and $\mathrm{ReLU}(\cdot)$ is a nonlinear activation function.
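The attention scoring in claim 6 can be sketched as follows. The first equation is taken from the claim; normalizing the scores with a softmax across layers is our reconstruction of the second (illegible) equation, and the function name `attention_weights` is an assumption.

```python
import numpy as np

def attention_weights(s_list, W, b, h):
    # a'_k = h^T ReLU(W s_k + b): one scalar score per cross layer
    scores = np.array([h @ np.maximum(W @ s + b, 0.0) for s in s_list])
    # Softmax across layers (assumed normalization) -> weights a_k
    e = np.exp(scores - scores.max())
    return e / e.sum()

rng = np.random.default_rng(1)
e_k, hidden = 5, 8
s_list = [rng.standard_normal(e_k) for _ in range(3)]   # one s_k per layer
W = rng.standard_normal((hidden, e_k))
b = rng.standard_normal(hidden)
h = rng.standard_normal(hidden)
a = attention_weights(s_list, W, b, h)
print(a.sum())  # weights are normalized to 1
```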
7. The method for constructing an integrated diversity enhanced ultra-deep factorization machine model as claimed in claim 1, wherein in step (5), the diversity loss $\mathcal{L}_{div}$ is computed from the diversity indexes and the weight values $a_k$ of the different cross layers, i.e. as the weighted sum of the diversity indexes:

$$\mathcal{L}_{div} = \frac{1}{N} \sum_{x \in D} \sum_{k=1}^{K} a_k \, \mathrm{Div}_k$$

where $D$ is the set of all training samples in the batch and $N$ is the total number of samples;

the accuracy loss $\mathcal{L}_{acc}$ is computed from the sample label value $y$ and the predicted value $\hat{y}$ as follows:

$$\mathcal{L}_{acc} = -\frac{1}{N} \sum_{(x, y) \in D} \left( y \log \hat{y} + (1 - y) \log(1 - \hat{y}) \right)$$

the total loss $\mathcal{L}$ consists of three parts: the accuracy loss $\mathcal{L}_{acc}$, the diversity loss $\mathcal{L}_{div}$, and an L2 regularization term weighted by $\lambda_n$:

$$\mathcal{L} = \mathcal{L}_{acc} - \lambda_d \, \mathcal{L}_{div} + \lambda_n \lVert \Theta \rVert_2^2$$

where $\lambda_d$ is a parameter that controls the balance between the diversity loss and the accuracy loss, and $\Theta$ denotes all model parameters.
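The loss computation of claim 7 can be sketched in NumPy. The binary cross-entropy form of the accuracy loss follows the claim; subtracting the weighted diversity term (so that training rewards diverse cross layers, as in negative correlation learning) is our assumption about the illegible combination formula, as is the function name `total_loss`.

```python
import numpy as np

def total_loss(y, y_hat, div, a, l2_theta, lam_d, lam_n):
    eps = 1e-12  # guard against log(0)
    # Accuracy loss: binary cross-entropy over the batch
    l_acc = -np.mean(y * np.log(y_hat + eps) + (1 - y) * np.log(1 - y_hat + eps))
    # Diversity loss: weighted sum of per-layer diversity indexes
    l_div = float(np.dot(a, div))
    # Diversity is subtracted (sign assumed); L2 term added with weight lam_n
    return l_acc - lam_d * l_div + lam_n * l2_theta

y = np.array([1.0, 0.0, 1.0])
y_hat = np.array([0.9, 0.2, 0.8])
loss = total_loss(y, y_hat,
                  div=np.array([0.5, 0.3]),   # Div_k per cross layer
                  a=np.array([0.6, 0.4]),     # attention weights a_k
                  l2_theta=0.1, lam_d=0.01, lam_n=0.001)
print(loss)
```

With $\lambda_d = 0$ this reduces to plain log loss plus L2 regularization, which is a useful sanity check when tuning the balance parameter.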
8. An integrated diversity enhanced ultra-deep factorization machine model, constructed by the method for constructing an integrated diversity enhanced ultra-deep factorization machine model according to any one of claims 1 to 7.
9. Application of an integrated diversity enhanced ultra-deep factorization machine model to advertisement click-through rate prediction, characterized in that user advertisement click data and the corresponding click labels are used as samples, and an integrated diversity enhanced ultra-deep factorization machine model for predicting advertisement click-through rate is constructed by the construction method according to any one of claims 1 to 7; in application, the user advertisement click data are input into the integrated diversity enhanced ultra-deep factorization machine model to predict whether a user will click an advertisement.
10. Application of an integrated diversity enhanced ultra-deep factorization machine model to user purchase prediction, characterized in that user purchase behavior data and the corresponding purchase labels are used as samples, and an integrated diversity enhanced ultra-deep factorization machine model for predicting commodity purchases is constructed by the construction method according to any one of claims 1 to 7; in application, the user purchase behavior data are input into the integrated diversity enhanced ultra-deep factorization machine model to predict whether a user will purchase a commodity.
CN201911304556.9A 2019-12-17 2019-12-17 Application method of integrated diversity enhanced ultra-deep factorization machine model Active CN111177579B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911304556.9A CN111177579B (en) 2019-12-17 2019-12-17 Application method of integrated diversity enhanced ultra-deep factorization machine model


Publications (2)

Publication Number Publication Date
CN111177579A true CN111177579A (en) 2020-05-19
CN111177579B CN111177579B (en) 2022-04-05

Family

ID=70650197

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911304556.9A Active CN111177579B (en) 2019-12-17 2019-12-17 Application method of integrated diversity enhanced ultra-deep factorization machine model

Country Status (1)

Country Link
CN (1) CN111177579B (en)


Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180095967A1 (en) * 2016-10-04 2018-04-05 Yahoo Holdings, Inc. Online ranking of queries for sponsored search
CN109711883A (en) * 2018-12-26 2019-05-03 西安电子科技大学 Internet advertising clicking rate predictor method based on U-Net network
CN110263243A (en) * 2019-01-23 2019-09-20 腾讯科技(深圳)有限公司 Media information recommending method, apparatus, storage medium and computer equipment


Non-Patent Citations (1)

Title
WAN MAN: "Research on an Advertisement Click-Through Rate Prediction Model Based on Convolutional Neural Networks", CNKI Outstanding Master's Theses Database *

Cited By (9)

Publication number Priority date Publication date Assignee Title
CN111737578A (en) * 2020-06-22 2020-10-02 陕西师范大学 Recommendation method and system
CN111737578B (en) * 2020-06-22 2024-04-02 陕西师范大学 Recommendation method and system
CN112115371A (en) * 2020-09-30 2020-12-22 山东建筑大学 Neural attention mechanism mobile phone application recommendation model based on factorization machine
CN112884513A (en) * 2021-02-19 2021-06-01 上海数鸣人工智能科技有限公司 Marketing activity prediction model structure and prediction method based on depth factorization machine
CN113076944A (en) * 2021-03-11 2021-07-06 国家电网有限公司 Document detection and identification method based on artificial intelligence
CN113889217A (en) * 2021-10-19 2022-01-04 天津大学 Medicine recommendation method based on twin neural network and depth factorization machine
CN113889217B (en) * 2021-10-19 2024-06-04 天津大学 Drug recommendation method based on twin neural network and depth factor decomposition machine
CN114282687A (en) * 2021-12-31 2022-04-05 复旦大学 Multi-task time sequence recommendation method based on factorization machine
CN114282687B (en) * 2021-12-31 2023-03-07 复旦大学 Multi-task time sequence recommendation method based on factorization machine

Also Published As

Publication number Publication date
CN111177579B (en) 2022-04-05

Similar Documents

Publication Publication Date Title
CN111177579B (en) Application method of integrated diversity enhanced ultra-deep factorization machine model
Wu et al. Evolving RBF neural networks for rainfall prediction using hybrid particle swarm optimization and genetic algorithm
Donate et al. Time series forecasting by evolving artificial neural networks with genetic algorithms, differential evolution and estimation of distribution algorithm
Chau Application of a PSO-based neural network in analysis of outcomes of construction claims
Hong et al. SVR with hybrid chaotic genetic algorithms for tourism demand forecasting
Chou et al. Shear strength prediction of reinforced concrete beams by baseline, ensemble, and hybrid machine learning models
CN111538761A (en) Click rate prediction method based on attention mechanism
CN111737578B (en) Recommendation method and system
CN109272332B (en) Client loss prediction method based on recurrent neural network
Gad et al. A robust deep learning model for missing value imputation in big NCDC dataset
CN110619540A (en) Click stream estimation method of neural network
CN110955826A (en) Recommendation system based on improved recurrent neural network unit
CN110110372B (en) Automatic segmentation prediction method for user time sequence behavior
CN110796499A (en) Advertisement conversion rate estimation model and training method thereof
CN110175689A (en) A kind of method of probabilistic forecasting, the method and device of model training
Peng et al. An automatic hyperparameter optimization DNN model for precipitation prediction
CN110738314A (en) click rate prediction method and device based on deep migration network
Sugasawa Grouped heterogeneous mixture modeling for clustered data
CN115510322A (en) Multi-objective optimization recommendation method based on deep learning
CN115080868A (en) Product pushing method, product pushing device, computer equipment, storage medium and program product
CN114511387A (en) Product recommendation method and device, electronic equipment and storage medium
Xue et al. Machine learning embedded semiparametric mixtures of regressions with covariate-varying mixing proportions
CN113869943A (en) Article recommendation method, device, equipment and storage medium
Kuo et al. An application of differential evolution algorithm-based restricted Boltzmann machine to recommendation systems
CN115564532A (en) Training method and device of sequence recommendation model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant