Disclosure of Invention
To solve the above technical problems, embodiments of the present application provide a data recommendation method and apparatus that improve the accuracy of the data to be recommended. The technical solution is as follows:
a data recommendation method, comprising:
acquiring sample characteristics required by a data recommendation task;
processing the sample features required by the data recommendation task by using a Pade approximation model to obtain a first predicted value, processing the sample features required by the data recommendation task by using a neural network learning model to obtain a second predicted value, and obtaining a third predicted value based on the first predicted value and the second predicted value;
acquiring data to be recommended based on the third predicted value;
the Pade approximation model is a model obtained by training a Pade approximation algorithm on the basis of a mapping relation between a first input feature and a first target value in advance, the neural network learning model is a model obtained by training a neural network learning algorithm on the basis of a mapping relation between a second input feature and a second target value in advance, the first input feature is an input feature with a dimension higher than second order, and the second input feature is an input feature with a dimension higher than the first input feature.
Preferably, the Pade approximation model includes:
Pade approximation relation one
where $y_{pade}$ represents the first predicted value, $x_i$ represents one of the sample features required by the data recommendation task, $\sum$ represents the summation function, $m$ represents the number of sample features required by the data recommendation task, $p_{0i}, p_{1i}, \ldots, p_{ni}$ respectively represent weight coefficients of different dimensions, $q_{0i}, q_{1i}, \ldots, q_{ni}$ respectively represent weight coefficients of different dimensions, $p_{0i}, p_{1i}, \ldots, p_{ni}$ differ from $q_{0i}, q_{1i}, \ldots, q_{ni}$, and $n$ is an integer greater than 2.
Preferably, the Pade approximation model includes:
Pade approximation relation two
where $y_{pade}$ represents the first predicted value, $x_i$ represents one of the sample features required by the data recommendation task, $p_{0i}, p_{1i}, \ldots, p_{ni}$ respectively represent weight coefficients of different dimensions, $q_{0i}, q_{1i}, \ldots, q_{ni}$ respectively represent weight coefficients of different dimensions, $p_{0i}, p_{1i}, \ldots, p_{ni}$ differ from $q_{0i}, q_{1i}, \ldots, q_{ni}$, $NN(x_i; \{w\})$ represents the neural network model, $w$ represents weight coefficients of different dimensions, $\sum$ represents the summation function, $m$ represents the number of sample features required by the data recommendation task, and $n$ is an integer greater than 2.
Preferably, the acquiring sample features required by the data recommendation task includes:
acquiring sample features required by a data recommendation task, wherein each sample feature comprises at least one sub-feature;
selecting, from the sample features required by the data recommendation task, sample features whose total number of sub-features is not larger than a set number threshold as thin sample features;
selecting, from the sample features required by the data recommendation task, sample features whose total number of sub-features is larger than the set number threshold as thick sample features;
the processing the sample features required by the data recommendation task by using the Pade approximation model to obtain a first predicted value, and processing the sample features required by the data recommendation task by using a neural network learning model to obtain a second predicted value, includes:
processing each thin sample feature by using the Pade approximation model to obtain a first predicted value, and processing each thin sample feature by using the neural network learning model to obtain a second predicted value;
the obtaining the data to be recommended based on the third predicted value includes:
inputting the thick sample features into a pre-trained XGBoost model to obtain a fourth predicted value output by the XGBoost model;
and obtaining the data to be recommended based on the third predicted value and the fourth predicted value.
Preferably, the acquiring sample features required by the data recommendation task includes:
acquiring sample features required by a data recommendation task;
inputting the sample features required by the data recommendation task into a pre-trained XGBoost model to obtain the importance of each sample feature output by the XGBoost model;
based on the importance of each sample feature, selecting sample features that meet a set condition from the sample features required by the data recommendation task, and taking the selected sample features as target sample features;
the selecting sample features with the total number of sub-features not larger than a set number threshold value from the sample features required by the data recommendation task as thin sample features comprises the following steps:
selecting sample features with the total number of sub-features not larger than a set number threshold value from the target sample features as thin sample features;
the selecting sample features with the total number of sub-features larger than the set number threshold from the sample features required by the data recommendation task as thick sample features comprises the following steps:
and selecting sample characteristics with the total number of sub-characteristics larger than the set number threshold value from the target sample characteristics as thick sample characteristics.
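The importance-based selection of target sample features can be sketched as follows. This is only a minimal illustration: it assumes the importance scores have already been produced by the pre-trained XGBoost model, and the function name `select_target_features`, the threshold-style set condition, and the example scores are hypothetical rather than taken from the application.

```python
def select_target_features(importance, min_importance):
    """Return the names of features whose importance meets the set condition.

    importance: dict mapping feature name -> importance score
        (assumed to have been output by a pre-trained XGBoost model).
    min_importance: hypothetical set condition, expressed as a minimum score.
    """
    return [name for name, score in importance.items() if score >= min_importance]


# Hypothetical importance scores, for illustration only.
importance = {"user_portrait": 0.45, "app_list": 0.30, "search": 0.20, "noise": 0.05}
target_features = select_target_features(importance, min_importance=0.10)
```

Other set conditions, such as keeping a fixed number of top-ranked features, would fit the same interface.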
A data recommendation device, comprising:
an acquisition module, configured to acquire sample features required by a data recommendation task;
a prediction module, configured to process the sample features required by the data recommendation task by using a Pade approximation model to obtain a first predicted value, process the sample features required by the data recommendation task by using a neural network learning model to obtain a second predicted value, and obtain a third predicted value based on the first predicted value and the second predicted value;
a recommendation module, configured to obtain data to be recommended based on the third predicted value;
the Pade approximation model is a model obtained by training a Pade approximation algorithm on the basis of a mapping relation between a first input feature and a first target value in advance, the neural network learning model is a model obtained by training a neural network learning algorithm on the basis of a mapping relation between a second input feature and a second target value in advance, the first input feature is an input feature with a dimension higher than second order, and the second input feature is an input feature with a dimension higher than the first input feature.
Preferably, the Pade approximation model includes:
Pade approximation relation one
where $y_{pade}$ represents the first predicted value, $x_i$ represents one of the sample features required by the data recommendation task, $\sum$ represents the summation function, $m$ represents the number of sample features required by the data recommendation task, $p_{0i}, p_{1i}, \ldots, p_{ni}$ respectively represent weight coefficients of different dimensions, $q_{0i}, q_{1i}, \ldots, q_{ni}$ respectively represent weight coefficients of different dimensions, $p_{0i}, p_{1i}, \ldots, p_{ni}$ differ from $q_{0i}, q_{1i}, \ldots, q_{ni}$, and $n$ is an integer greater than 2.
Preferably, the Pade approximation model includes:
Pade approximation relation two
where $y_{pade}$ represents the first predicted value, $x_i$ represents one of the sample features required by the data recommendation task, $p_{0i}, p_{1i}, \ldots, p_{ni}$ respectively represent weight coefficients of different dimensions, $q_{0i}, q_{1i}, \ldots, q_{ni}$ respectively represent weight coefficients of different dimensions, $p_{0i}, p_{1i}, \ldots, p_{ni}$ differ from $q_{0i}, q_{1i}, \ldots, q_{ni}$, $NN(x_i; \{w\})$ represents the neural network model, $w$ represents weight coefficients of different dimensions, $\sum$ represents the summation function, $m$ represents the number of sample features required by the data recommendation task, and $n$ is an integer greater than 2.
Preferably, the acquisition module is specifically configured to:
acquiring sample characteristics required by a data recommendation task, wherein each sample characteristic comprises at least one sub-characteristic;
selecting sample features with the total number of sub-features not larger than a set number threshold value from sample features required by the data recommendation task as thin sample features;
selecting sample characteristics with the total number of sub-characteristics larger than the set number threshold value from sample characteristics required by the data recommendation task as thick sample characteristics;
the prediction module is specifically configured to:
processing each thin sample feature by using the Pade approximation model to obtain a first predicted value, and processing each thin sample feature by using the neural network learning model to obtain a second predicted value;
the recommendation module is specifically configured to:
inputting the thick sample features into a pre-trained XGBoost model to obtain a fourth predicted value output by the XGBoost model;
and obtaining data to be recommended based on the third predicted value and the fourth predicted value.
Preferably, the acquisition module is specifically configured to:
acquiring sample features required by a data recommendation task;
inputting the sample features required by the data recommendation task into a pre-trained XGBoost model to obtain the importance of each sample feature output by the XGBoost model;
based on the importance of each sample feature, selecting sample features meeting set conditions from sample features required by the data recommendation task, and taking the selected sample features as target sample features;
selecting sample features with the total number of sub-features not larger than a set number threshold value from the target sample features as thin sample features;
and selecting sample features whose total number of sub-features is larger than the set number threshold from the target sample features as thick sample features.
Compared with the prior art, the present application has the following beneficial effects:
In the present application, the sample features required by a data recommendation task are acquired, the sample features are processed by a Pade approximation model to obtain a first predicted value and by a neural network learning model to obtain a second predicted value, a third predicted value is obtained based on the first predicted value and the second predicted value, and the data to be recommended is obtained based on the third predicted value.
Because the Pade approximation algorithm supports more effective learning of the association relationships among multi-order features, the Pade approximation algorithm is trained in advance based on the mapping relationship between first input features higher than second order and the first target value. This ensures that the trained Pade approximation model can process sample features higher than second order, which improves the accuracy of the first predicted value and, on that basis, the accuracy of the obtained data to be recommended.
Detailed Description
A recommendation system generally uses the FM algorithm within the DeepFM algorithm to process low-order features, such as first-order features and the second-order features formed by combining first-order features in pairs, to obtain corresponding predicted values, and uses the DNN algorithm within the DeepFM algorithm to process high-order features to obtain corresponding predicted values. The inventor found that this structure is limited by the FM algorithm: the FM algorithm has difficulty learning the association relationships among low-order features with a dimension higher than second order, so the recommendation system processes such features with low accuracy, and the data it recommends is correspondingly less accurate. To address this problem and improve the accuracy of the recommended data obtained by the recommendation system, the present application proposes a data recommendation method, which is described in detail below.
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those skilled in the art based on the embodiments of the present application without inventive effort fall within the scope of protection of the present application.
To make the above objects, features, and advantages of the present application easier to understand, the present application is described in further detail below with reference to the accompanying drawings and specific embodiments.
Fig. 1 is a flowchart of embodiment 1 of the data recommendation method provided by the present application. As shown in fig. 1, the method may include the following steps:
Step S11, acquiring sample features required by a data recommendation task.
For different data recommendation tasks, the recommendation system may require different types of sample features. Therefore, once the data recommendation task is determined, only the sample features required by that task need to be acquired, which avoids acquiring redundant sample features and reduces the workload.
For example, if the data recommendation task is to recommend users who can apply for a cash loan, user features such as the user portrait, the user app list, user searches, or product activity need to be acquired.
Step S12, processing the sample features required by the data recommendation task by using a Pade approximation model to obtain a first predicted value, processing the sample features required by the data recommendation task by using a neural network learning model to obtain a second predicted value, and obtaining a third predicted value based on the first predicted value and the second predicted value.
In this embodiment, the pade approximation model is a model obtained by training a pade approximation algorithm in advance based on a mapping relationship between a first input feature and a first target value, where the first input feature is an input feature with a dimension higher than second order.
The neural network learning model is a model obtained by training a neural network learning algorithm in advance based on a mapping relation between a second input feature and a second target value, wherein the second input feature is an input feature with a dimension higher than that of the first input feature.
Wherein the first target value is different from the second target value.
Because the Pade approximation algorithm supports more effective learning of the association relation between the multi-order features, the Pade approximation algorithm can be trained in advance based on the mapping relation between the first input features higher than the second order and the first target value, the trained Pade approximation model can be ensured to process the sample features higher than the second order, and the accuracy of the first predicted value is improved.
In addition, the Pade approximation algorithm is trained in advance based on the mapping relationship between the first input feature (an input feature with a dimension higher than second order) and the first target value, while the neural network learning algorithm is trained in advance based on the mapping relationship between the second input feature (an input feature with a dimension higher than that of the first input feature) and the second target value. Compared with the neural network learning algorithm, the Pade approximation algorithm is therefore trained on lower-order input features that span more dimensions. This ensures that the trained Pade approximation model can process low-order features of more dimensions while the trained neural network learning model processes higher-order features, so that the recommendation system as a whole can process low-order features of more dimensions and the accuracy of the recommendation data it obtains is improved.
The determining of the first input feature may include:
Step S121, selecting a first positive sample feature and a first negative sample feature from the training sample features.
Step S122, when the ratio of the first positive sample feature to the first negative sample feature does not meet a first set ratio threshold, updating the first positive sample feature and the first negative sample feature until the ratio of the first positive sample feature to the first negative sample feature meets the first set ratio threshold.
The first positive sample feature and the first negative sample feature may be updated by weighted sampling in the time domain, which can be understood as collecting sample features from different time periods at different sampling ratios and updating the first positive sample feature and the first negative sample feature accordingly. For example, sample features from the most recent week are collected at a sampling ratio of 100% of the ratio of positive sample features to negative sample features, or sample features from the most recent two weeks are collected at a sampling ratio of 80% of that ratio.
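Weighted sampling in the time domain can be illustrated with the sketch below. It is a simplified reading of the description, not the application's implementation: the tuple layout of a sample, the day-based age windows, the helper names `sample_by_recency` and `positive_ratio`, and the example rates are all assumptions. Each sample is kept with a probability that depends on how recent it is, after which the positive ratio can be checked against the set ratio threshold.

```python
import random


def sample_by_recency(samples, now, windows, seed=0):
    """Keep each sample with a probability chosen by its age.

    samples: list of (timestamp_in_days, label, features) tuples (assumed layout).
    windows: list of (max_age_days, keep_probability), ordered by increasing age.
        Samples older than every window get probability 0.0 and are dropped.
    """
    rng = random.Random(seed)
    kept = []
    for ts, label, feats in samples:
        age = now - ts
        # Use the rate of the first window that covers this sample's age.
        rate = next((r for max_age, r in windows if age <= max_age), 0.0)
        if rng.random() < rate:
            kept.append((ts, label, feats))
    return kept


def positive_ratio(samples):
    """Fraction of samples whose label is 1 (positive)."""
    if not samples:
        return 0.0
    return sum(1 for _, label, _ in samples if label == 1) / len(samples)
```

In practice the sampling would be repeated with adjusted rates until the ratio returned by `positive_ratio` meets the set ratio threshold.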
The determining of the second input feature may include:
Step S123, selecting a second positive sample feature and a second negative sample feature from the training sample features.
Step S124, when the ratio of the second positive sample feature to the second negative sample feature does not meet a second set ratio threshold, updating the second positive sample feature and the second negative sample feature until the ratio of the second positive sample feature to the second negative sample feature meets the second set ratio threshold.
The second positive sample feature and the second negative sample feature may be updated by weighted sampling in the time domain, which can be understood as collecting sample features from different time periods at different sampling ratios and updating the second positive sample feature and the second negative sample feature accordingly.
The process of obtaining the third predicted value based on the first predicted value and the second predicted value may be, but is not limited to:
inputting the first predicted value and the second predicted value into a preset sigmoid function to obtain a value output by the sigmoid function, and taking the value output by the sigmoid function as a third predicted value.
Specifically, the third predicted value may be obtained based on the first predicted value and the second predicted value using the following functional relation:
$y(x) = \mathrm{sigmoid}\left(y_{pade}(x) + y_{NN}(x; \{w\})\right)$
where $y_{pade}(x)$ represents the first predicted value, $y_{NN}(x; \{w\})$ represents the second predicted value, $y(x)$ represents the third predicted value, sigmoid represents the sigmoid function, and $x$ represents the sample features required by the data recommendation task.
The preset sigmoid function may be set as needed, and is not limited in this embodiment.
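The combination of the first and second predicted values can be sketched as below. The function names are hypothetical, and the standard logistic function is assumed as the preset sigmoid; the two-branch form only avoids floating-point overflow for large negative inputs.

```python
import math


def sigmoid(z):
    """Standard logistic function, written to avoid overflow in math.exp."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def third_predicted_value(y_pade, y_nn):
    """Combine the first (Pade) and second (neural network) predicted values."""
    return sigmoid(y_pade + y_nn)
```

Because the output always lies between 0 and 1, it can be read as a score for ranking candidates in step S13.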
Step S13, obtaining data to be recommended based on the third predicted value.
For the detailed process of obtaining the data to be recommended based on the third predicted value, reference may be made to the process by which a prior-art recommendation system obtains data to be recommended based on a predicted value (for example, the predicted value output by a DeepFM model), which is not described in detail in this embodiment. However, because the third predicted value is obtained based on the first predicted value and the second predicted value, and the accuracy of the first predicted value obtained with the Pade approximation model is improved, the data to be recommended obtained based on the third predicted value is more accurate than that obtained based on the predicted value output by a DeepFM model.
In the present application, the sample features required by a data recommendation task are acquired, the sample features are processed by a Pade approximation model to obtain a first predicted value and by a neural network learning model to obtain a second predicted value, a third predicted value is obtained based on the first predicted value and the second predicted value, and the data to be recommended is obtained based on the third predicted value.
Because the Pade approximation algorithm supports more effective learning of the association relationships among multi-order features, the Pade approximation algorithm is trained in advance based on the mapping relationship between first input features higher than second order and the first target value. This ensures that the trained Pade approximation model can process sample features higher than second order, which improves the accuracy of the first predicted value and, on that basis, the accuracy of the obtained data to be recommended.
In addition, training the Pade approximation algorithm in advance based on the mapping relationship between the first input feature and the first target value allows the Pade approximation model to converge faster, which improves training efficiency and therefore overall efficiency.
Embodiment 2, another optional embodiment of the present application, is mainly a refinement of the data recommendation method described in embodiment 1 above. The method may include, but is not limited to, the following steps:
Step S21, acquiring sample features required by a data recommendation task.
For the detailed process of step S21, reference may be made to the description of step S11 in embodiment 1, which is not repeated here.
Step S22, processing the sample features required by the data recommendation task by using Pade approximation relation one to obtain a first predicted value, processing the sample features required by the data recommendation task by using a neural network learning model to obtain a second predicted value, and obtaining a third predicted value based on the first predicted value and the second predicted value.
Pade approximation relation one is obtained in advance by training based on a mapping relationship between a first input feature and a first target value, the neural network learning model is a model obtained in advance by training a neural network learning algorithm based on a mapping relationship between a second input feature and a second target value, the first input feature is an input feature with a dimension higher than second order, and the second input feature is an input feature with a dimension higher than the first input feature;
where $y_{pade}$ represents the first predicted value, $x_i$ represents one of the sample features required by the data recommendation task, $\sum$ represents the summation function, $m$ represents the number of sample features required by the data recommendation task, $p_{0i}, p_{1i}, \ldots, p_{ni}$ respectively represent weight coefficients of different dimensions, $q_{0i}, q_{1i}, \ldots, q_{ni}$ respectively represent weight coefficients of different dimensions, $p_{0i}, p_{1i}, \ldots, p_{ni}$ differ from $q_{0i}, q_{1i}, \ldots, q_{ni}$, and $n$ is an integer greater than 2.
Pade approximation relation one is one implementation of the Pade approximation model in step S12 of embodiment 1.
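The formula of Pade approximation relation one is not reproduced in this text, so the sketch below rests on an assumption drawn only from the symbol definitions: each of the m sample features contributes a ratio of two degree-n polynomials, with numerator coefficients p and denominator coefficients q, and the contributions are summed. The function name `pade_predict` and the example coefficients are hypothetical.

```python
def pade_predict(x, p, q):
    """Sum of per-feature rational functions (assumed form, see lead-in).

    x: list of m feature values x_i.
    p: list of m coefficient lists [p_0i, p_1i, ..., p_ni] (numerators).
    q: list of m coefficient lists [q_0i, q_1i, ..., q_ni] (denominators).
    A zero denominator raises ZeroDivisionError; a trained model would
    need to keep denominators away from zero.
    """
    total = 0.0
    for x_i, p_i, q_i in zip(x, p, q):
        numerator = sum(c * x_i ** k for k, c in enumerate(p_i))
        denominator = sum(c * x_i ** k for k, c in enumerate(q_i))
        total += numerator / denominator
    return total


# Hypothetical coefficients with n = 3 (an integer greater than 2).
y_pade = pade_predict(
    x=[2.0, 1.0],
    p=[[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 1.0]],
    q=[[1.0, 1.0, 0.0, 0.0], [1.0, 0.0, 0.0, 2.0]],
)
```

In this reading, the terms with powers above two are what would let the model capture feature relationships higher than second order.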
The detailed process of processing the sample features required by the data recommendation task by using the neural network learning model to obtain the second predicted value and obtaining the third predicted value based on the first predicted value and the second predicted value can be referred to the related description of step S12 in embodiment 1, which is not repeated here.
Step S23, obtaining data to be recommended based on the third predicted value.
For the detailed process of step S23, reference may be made to the description of step S13 in embodiment 1, which is not repeated here.
In this embodiment, Pade approximation relation one can process sample features higher than second order, which improves the accuracy of the first predicted value and, on that basis, the accuracy of the obtained data to be recommended.
Embodiment 3, another optional embodiment of the present application, is mainly a refinement of the data recommendation method described in embodiment 1 above. The method may include, but is not limited to, the following steps:
Step S31, acquiring sample features required by a data recommendation task.
For the detailed process of step S31, reference may be made to the description of step S11 in embodiment 1, which is not repeated here.
Step S32, processing the sample features required by the data recommendation task by using Pade approximation relation two to obtain a first predicted value, processing the sample features required by the data recommendation task by using a neural network learning model to obtain a second predicted value, and obtaining a third predicted value based on the first predicted value and the second predicted value.
Pade approximation relation two is obtained in advance by training based on a mapping relationship between a first input feature and a first target value, the neural network learning model is a model obtained in advance by training a neural network learning algorithm based on a mapping relationship between a second input feature and a second target value, the first input feature is an input feature with a dimension higher than second order, and the second input feature is an input feature with a dimension higher than the first input feature;
where $y_{pade}$ represents the first predicted value, $x_i$ represents one of the sample features required by the data recommendation task, $p_{0i}, p_{1i}, \ldots, p_{ni}$ respectively represent weight coefficients of different dimensions, $q_{0i}, q_{1i}, \ldots, q_{ni}$ respectively represent weight coefficients of different dimensions, $p_{0i}, p_{1i}, \ldots, p_{ni}$ differ from $q_{0i}, q_{1i}, \ldots, q_{ni}$, $NN(x_i; \{w\})$ represents a neural network model, $w$ represents weight coefficients of different dimensions, $\sum$ represents the summation function, $m$ represents the number of sample features required by the data recommendation task, and $n$ is an integer greater than 2.
Note that the neural network model represented by $NN(x_i; \{w\})$ may be the same as or different from the aforementioned neural network learning model.
Pade approximation relation two is one implementation of the Pade approximation model in step S12 of embodiment 1.
The detailed process of processing the sample features required by the data recommendation task by using the neural network learning model to obtain the second predicted value and obtaining the third predicted value based on the first predicted value and the second predicted value can be referred to the related description of step S12 in embodiment 1, which is not repeated here.
In this embodiment, the third predicted value may be obtained from the first predicted value and the second predicted value using the following relation:
$y(x) = \mathrm{sigmoid}\left(y_{pade}(x) + y_{NN}(x; \{w\})\right)$
where $y_{pade}(x)$ represents the first predicted value, $y_{NN}(x; \{w\})$ represents the second predicted value, and $y(x)$ represents the third predicted value.
Step S33, obtaining data to be recommended based on the third predicted value.
For the detailed process of step S33, reference may be made to the description of step S13 in embodiment 1, which is not repeated here.
In this embodiment, Pade approximation relation two combines Taylor expansion terms with the neural network model, so it can process more sample features higher than second order, which further improves the accuracy of the first predicted value and, on that basis, the accuracy of the obtained data to be recommended. In addition, when the demand for processing higher-order features is low, $NN(x_i; \{w\})$ can take over part of the function of the neural network learning model, so the neural network learning model does not need to be built deep and wide, which reduces the amount of computation and improves recommendation efficiency.
Fig. 2 is a flowchart of embodiment 4 of the data recommendation method provided by the present application, another optional embodiment that is mainly a refinement of the data recommendation method described in embodiment 1 above. As shown in fig. 2, the method may include, but is not limited to, the following steps:
Step S41, acquiring sample features required by a data recommendation task, wherein each sample feature comprises at least one sub-feature.
Step S42, selecting, from the sample features required by the data recommendation task, sample features whose total number of sub-features is not larger than a set number threshold as thin sample features.
The set number threshold may be set as needed, and is not limited in this embodiment.
Step S43, selecting, from the sample features required by the data recommendation task, sample features whose total number of sub-features is larger than the set number threshold as thick sample features.
Steps S41 to S43 are a specific embodiment of step S11 in example 1.
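Steps S41 to S43 can be sketched as a simple partition by sub-feature count. The dictionary layout (feature name mapped to its list of sub-features), the function name `split_thin_thick`, and the example data are assumptions made for illustration.

```python
def split_thin_thick(sample_features, threshold):
    """Partition sample features by their number of sub-features.

    sample_features: dict mapping feature name -> list of sub-features
        (each feature is assumed to have at least one sub-feature).
    threshold: the set number threshold.
    Returns (thin, thick): thin has at most `threshold` sub-features,
    thick has more than `threshold`.
    """
    thin = {k: v for k, v in sample_features.items() if len(v) <= threshold}
    thick = {k: v for k, v in sample_features.items() if len(v) > threshold}
    return thin, thick


# Hypothetical features for illustration only.
features = {
    "age": ["age"],
    "app_list": ["app_1", "app_2", "app_3", "app_4"],
    "search": ["query_1", "query_2"],
}
thin, thick = split_thin_thick(features, threshold=2)
```

Every feature lands in exactly one of the two groups, so the thin features can go to the Pade approximation and neural network models while the thick features go to the XGBoost model.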
Step S44, processing each thin sample feature by using a Pade approximation model to obtain a first predicted value, processing each thin sample feature by using a neural network learning model to obtain a second predicted value, and obtaining a third predicted value based on the first predicted value and the second predicted value.
The Pade approximation model is a model obtained by training a Pade approximation algorithm on the basis of a mapping relation between a first input feature and a first target value in advance, the neural network learning model is a model obtained by training a neural network learning algorithm on the basis of a mapping relation between a second input feature and a second target value in advance, the first input feature is an input feature with a dimension higher than second order, and the second input feature is an input feature with a dimension higher than the first input feature.
Processing each thin sample feature by using the Pade approximation model to obtain the first predicted value, and processing each thin sample feature by using the neural network learning model to obtain the second predicted value, is a specific implementation of the part of step S12 in embodiment 1 in which the sample features required by the data recommendation task are processed by using the Pade approximation model to obtain the first predicted value and by using the neural network learning model to obtain the second predicted value.
Step S45, inputting the thick sample features into a pre-trained XGBoost model to obtain a fourth predicted value output by the XGBoost model.
Step S46, obtaining data to be recommended based on the third predicted value and the fourth predicted value.
Steps S45-S46 are a specific embodiment of step S13 in example 1.
For the process of obtaining the data to be recommended based on the third predicted value and the fourth predicted value, reference may be made to the process of obtaining the data to be recommended based on the third predicted value, which is not repeated here. Obtaining the data to be recommended based on both the third predicted value and the fourth predicted value can improve the accuracy of the obtained data to be recommended, compared with obtaining it based on the third predicted value alone.
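The flow of steps S44 to S46 can be sketched as follows. The description states only that the third predicted value is obtained "based on" the first and second predicted values, and that the data to be recommended is obtained "based on" the third and fourth predicted values. The weighted average and the top-k selection used here are therefore placeholder assumptions, and the names `combine` and `recommend` are hypothetical.

```python
from typing import Dict, List


def combine(a: float, b: float, weight: float = 0.5) -> float:
    """Placeholder fusion rule (an assumption): weighted average of two predictions."""
    return weight * a + (1.0 - weight) * b


def recommend(
    third: Dict[str, float], fourth: Dict[str, float], top_k: int
) -> List[str]:
    """Rank candidate items by a fused score and return the top_k item ids.

    `third` and `fourth` map a candidate id to its third and fourth predicted
    values (step S46). Fusing by averaging and selecting the top_k items are
    both illustrative assumptions.
    """
    fused = {item: combine(third[item], fourth[item]) for item in third}
    ranked = sorted(fused, key=lambda item: fused[item], reverse=True)
    return ranked[:top_k]


# Hypothetical predicted values for three candidate items.
first = {"item_a": 0.9, "item_b": 0.2, "item_c": 0.6}   # Pade approximation model
second = {"item_a": 0.7, "item_b": 0.4, "item_c": 0.8}  # neural network learning model
fourth = {"item_a": 0.5, "item_b": 0.9, "item_c": 0.7}  # XGBoost model

# Step S44: third predicted value from the first and second predicted values.
third = {item: combine(first[item], second[item]) for item in first}
print(recommend(third, fourth, top_k=2))
```

Any other fusion rule can be substituted for `combine` without changing the surrounding flow.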
As another alternative embodiment of the present application, fig. 3 shows a flow chart of embodiment 5 of the data recommendation method provided by the present application. This embodiment mainly refines the data recommendation method described in embodiment 4 above. As shown in fig. 3, the method may include, but is not limited to, the following steps:
Step S51, acquiring sample features required by a data recommendation task, wherein each sample feature comprises at least one sub-feature.
Step S52, inputting the sample features required by the data recommendation task into a pre-trained XGBoost model to obtain the importance of each sample feature output by the XGBoost model.
Step S53, selecting, based on the importance of each sample feature, sample features meeting a set condition from the sample features required by the data recommendation task, and taking the selected sample features as target sample features.
The set condition may be, but is not limited to: the rank of the sample feature, obtained by sorting the sample features by importance, is ahead of a set rank; or the importance of the sample feature is greater than a set importance threshold.
Steps S51-S53 are a specific embodiment of step S41 in example 4.
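The selection in step S53 can be sketched as follows. The sketch assumes the importance of each sample feature is already available as a mapping from feature name to score, and how the scores are read from the trained XGBoost model is outside its scope. The function name and parameters are hypothetical, and "ahead of a set rank" is interpreted here as keeping the first `max_rank` features.

```python
from typing import Dict, List, Optional


def select_target_features(
    importance: Dict[str, float],
    max_rank: Optional[int] = None,
    min_importance: Optional[float] = None,
) -> List[str]:
    """Select target sample features from per-feature importance scores.

    Exactly one condition must be given:
    - max_rank: keep the first `max_rank` features when sorted by descending
      importance (an interpretation of "ahead of a set rank");
    - min_importance: keep features whose importance is greater than the threshold.
    """
    if (max_rank is None) == (min_importance is None):
        raise ValueError("specify exactly one of max_rank or min_importance")
    ranked = sorted(importance, key=lambda name: importance[name], reverse=True)
    if max_rank is not None:
        return ranked[:max_rank]
    return [name for name in ranked if importance[name] > min_importance]


# Hypothetical importance scores.
importance = {"age": 0.10, "city": 0.45, "history": 0.30, "device": 0.15}
by_rank = select_target_features(importance, max_rank=2)
by_threshold = select_target_features(importance, min_importance=0.12)
print(by_rank)
print(by_threshold)
```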
Step S54, selecting, from the target sample features, sample features whose total number of sub-features is not larger than a set number threshold as thin sample features.
Step S54 is a specific implementation of step S42 in example 4.
Step S55, selecting, from the target sample features, sample features whose total number of sub-features is larger than the set number threshold as thick sample features.
Step S55 is a specific implementation of step S43 in example 4.
Step S56, processing each thin sample feature by using a Pade approximation model to obtain a first predicted value, processing each thin sample feature by using a neural network learning model to obtain a second predicted value, and obtaining a third predicted value based on the first predicted value and the second predicted value.
The Pade approximation model is a model obtained by training a Pade approximation algorithm on the basis of a mapping relation between a first input feature and a first target value in advance, the neural network learning model is a model obtained by training a neural network learning algorithm on the basis of a mapping relation between a second input feature and a second target value in advance, the first input feature is an input feature with a dimension higher than second order, and the second input feature is an input feature with a dimension higher than the first input feature.
Processing each thin sample feature by using the Pade approximation model to obtain the first predicted value, and processing each thin sample feature by using the neural network learning model to obtain the second predicted value, is a specific implementation of the part of step S12 in embodiment 1 in which the sample features required by the data recommendation task are processed by using the Pade approximation model to obtain the first predicted value and by using the neural network learning model to obtain the second predicted value.
Step S57, inputting the thick sample features into a pre-trained XGBoost model to obtain a fourth predicted value output by the XGBoost model.
Step S58, obtaining data to be recommended based on the third predicted value and the fourth predicted value.
Steps S57-S58 are a specific embodiment of step S13 in example 1.
For the process of obtaining the data to be recommended based on the third predicted value and the fourth predicted value, reference may be made to the process of obtaining the data to be recommended based on the third predicted value, which is not repeated here. Obtaining the data to be recommended based on both the third predicted value and the fourth predicted value can improve the accuracy of the obtained data to be recommended, compared with obtaining it based on the third predicted value alone.
In this embodiment, the sample features required by the data recommendation task are input into the pre-trained XGBoost model to obtain the importance of each sample feature output by the XGBoost model. Based on the importance of each sample feature, the sample features meeting the set condition are selected from the sample features required by the data recommendation task and taken as the target sample features. This reduces the number of sample features while preserving their quality, which in turn reduces the amount of computation and improves recommendation efficiency.
Next, the data recommendation device provided by the present application is described. The data recommendation device described below and the data recommendation method described above may be referred to in correspondence with each other.
Referring to fig. 4, the data recommendation device includes: an acquisition module 11, a prediction module 12 and a recommendation module 13.
The acquiring module 11 is configured to acquire sample features required by the data recommending task.
The prediction module 12 is configured to process the sample features required by the data recommendation task by using a Pade approximation model to obtain a first predicted value, process the sample features required by the data recommendation task by using a neural network learning model to obtain a second predicted value, and obtain a third predicted value based on the first predicted value and the second predicted value.
And the recommending module 13 is configured to obtain data to be recommended based on the third predicted value.
The Pade approximation model is a model obtained by training a Pade approximation algorithm on the basis of a mapping relation between a first input feature and a first target value in advance, the neural network learning model is a model obtained by training a neural network learning algorithm on the basis of a mapping relation between a second input feature and a second target value in advance, the first input feature is an input feature with a dimension higher than second order, and the second input feature is an input feature with a dimension higher than the first input feature.
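The division of the device into three modules can be sketched as a simple composition. This is only an illustration: the class name, the callable signatures, and the stub modules are hypothetical, and real modules would wrap the trained models described above.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class DataRecommendationDevice:
    """Composition of the three modules; the callables are hypothetical stand-ins."""

    acquire: Callable[[], Sequence[float]]       # acquisition module 11
    predict: Callable[[Sequence[float]], float]  # prediction module 12, returns the third predicted value
    recommend: Callable[[float], List[str]]      # recommendation module 13

    def run(self) -> List[str]:
        # Acquire features, predict, then obtain the data to be recommended.
        features = self.acquire()
        third_predicted_value = self.predict(features)
        return self.recommend(third_predicted_value)


# Stub modules for illustration only.
device = DataRecommendationDevice(
    acquire=lambda: [0.2, 0.4],
    predict=lambda feats: sum(feats) / len(feats),
    recommend=lambda score: ["item_a"] if score > 0.25 else ["item_b"],
)
result = device.run()
print(result)
```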
In this embodiment, the Pade approximation model may include:
Pade approximation relation one
where $y_{pade}$ denotes the first predicted value; $x_i$ denotes one of the sample features required by the data recommendation task; $\sum$ denotes the summation function; $m$ denotes the number of sample features required by the data recommendation task; $p_{0i}, p_{1i}, \dots, p_{ni}$ denote weight coefficients of different dimensions; $q_{0i}, q_{1i}, \dots, q_{ni}$ denote weight coefficients of different dimensions; $p_{0i}, p_{1i}, \dots, p_{ni}$ differ from $q_{0i}, q_{1i}, \dots, q_{ni}$; and $n$ is an integer greater than 2.
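Assuming that relation one takes the form of a sum, over the m sample features, of a ratio of two degree-n polynomials in x_i, with numerator coefficients p_0i to p_ni and denominator coefficients q_0i to q_ni, the first predicted value could be evaluated as in the following sketch. This form is an assumption inferred from the symbol definitions above, not a statement of the formula itself, and the function names are hypothetical.

```python
from typing import Sequence


def polyval_ascending(coeffs: Sequence[float], x: float) -> float:
    """Evaluate c0 + c1*x + ... + cn*x**n using Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def pade_predict(
    x: Sequence[float],
    p: Sequence[Sequence[float]],
    q: Sequence[Sequence[float]],
) -> float:
    """Sum over features of a per-feature rational function (assumed form).

    x[i] is the i-th sample feature; p[i] and q[i] hold the numerator and
    denominator coefficients (p_0i..p_ni and q_0i..q_ni) for that feature.
    """
    total = 0.0
    for x_i, p_i, q_i in zip(x, p, q):
        denominator = polyval_ascending(q_i, x_i)
        if denominator == 0.0:
            raise ZeroDivisionError("denominator polynomial vanishes at x_i")
        total += polyval_ascending(p_i, x_i) / denominator
    return total


# Hypothetical coefficients with m = 2 features and n = 3.
x = [1.0, 2.0]
p = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 0.0, 0.0]]  # 1 + x^3 and x
q = [[1.0, 1.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0]]  # 1 + x and 1
y_pade = pade_predict(x, p, q)
print(y_pade)
```

In training, the coefficients in `p` and `q` would be the quantities fitted from the mapping between the first input feature and the first target value.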
In this embodiment, the Pade approximation model may include:
Pade approximation relation two
where $y_{pade}$ denotes the first predicted value; $x_i$ denotes one of the sample features required by the data recommendation task; $p_{0i}, p_{1i}, \dots, p_{ni}$ denote weight coefficients of different dimensions; $q_{0i}, q_{1i}, \dots, q_{ni}$ denote weight coefficients of different dimensions; $p_{0i}, p_{1i}, \dots, p_{ni}$ differ from $q_{0i}, q_{1i}, \dots, q_{ni}$; $NN(x_i; \{w\})$ denotes the neural network model; $w$ denotes weight coefficients of different dimensions; $\sum$ denotes the summation function; $m$ denotes the number of sample features required by the data recommendation task; and $n$ is an integer greater than 2.
In this embodiment, the obtaining module 11 may specifically be configured to:
acquiring sample characteristics required by a data recommendation task, wherein each sample characteristic comprises at least one sub-characteristic;
Selecting sample features with the total number of sub-features not larger than a set number threshold value from sample features required by the data recommendation task as thin sample features;
and selecting sample characteristics with the total number of sub-characteristics larger than the set number threshold value from sample characteristics required by the data recommendation task as thick sample characteristics.
Accordingly, the prediction module 12 may be specifically configured to:
and processing each thin sample characteristic by using a Pade approximate model to obtain a first predicted value, and processing each thin sample characteristic by using a neural network learning model to obtain a second predicted value.
Accordingly, the recommendation module 13 may specifically be configured to:
inputting the thick sample features into a pre-trained XGBoost model to obtain a fourth predicted value output by the XGBoost model;
and obtaining data to be recommended based on the third predicted value and the fourth predicted value.
In this embodiment, the obtaining module 11 may specifically be configured to:
acquiring sample characteristics required by a data recommendation task;
inputting the sample features required by the data recommendation task into a pre-trained XGBoost model to obtain the importance of each sample feature output by the XGBoost model;
Based on the importance of each sample feature, selecting sample features meeting set conditions from sample features required by the data recommendation task, and taking the selected sample features as target sample features;
selecting sample features with the total number of sub-features not larger than a set number threshold value from the target sample features as thin sample features;
and selecting sample characteristics with the total number of sub-characteristics larger than the set number threshold value from the target sample characteristics as thick sample characteristics.
It should be noted that each embodiment emphasizes its differences from the other embodiments; for identical or similar parts, the embodiments may be referred to one another. Because the device embodiments are substantially similar to the method embodiments, they are described relatively briefly; for relevant details, refer to the description of the method embodiments.
Finally, it is further noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
For convenience of description, the above device is described as being divided by function into various units. Of course, when implementing the present application, the functions of the units may be implemented in one or more pieces of software and/or hardware.
From the above description of embodiments, it will be apparent to those skilled in the art that the present application may be implemented in software plus a necessary general hardware platform. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the embodiments or some parts of the embodiments of the present application.
The data recommendation method and apparatus provided by the present application have been described in detail above. Specific examples are used herein to illustrate the principles and embodiments of the present application, and the above description of the embodiments is only intended to help in understanding the method of the present application and its core idea. Meanwhile, those skilled in the art may, in accordance with the ideas of the present application, make changes to the specific embodiments and the scope of application. In view of the above, the content of this description should not be construed as limiting the present application.