CN111143685A - Recommendation system construction method and device - Google Patents

Publication number
CN111143685A
Authority
CN
China
Prior art keywords
optimization
model
optimized
sequencing
individuals
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911394281.2A
Other languages
Chinese (zh)
Other versions
CN111143685B (en
Inventor
刘正夫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
4Paradigm Beijing Technology Co Ltd
Original Assignee
4Paradigm Beijing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 4Paradigm Beijing Technology Co Ltd filed Critical 4Paradigm Beijing Technology Co Ltd
Priority to CN201911394281.2A priority Critical patent/CN111143685B/en
Publication of CN111143685A publication Critical patent/CN111143685A/en
Application granted granted Critical
Publication of CN111143685B publication Critical patent/CN111143685B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/06Buying, selling or leasing transactions
    • G06Q30/0601Electronic shopping [e-shopping]
    • G06Q30/0631Item recommendations

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Economics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a construction method and a construction device of a recommendation system, relates to the technical field of data mining, and mainly aims to improve the optimization effect of the recommendation system and simultaneously avoid the model structure from becoming complex by carrying out multi-objective optimization on the model structure and the effect. The main technical scheme of the invention is as follows: acquiring a sample data set according to transaction data; determining a plurality of targets to be optimized and a plurality of objects to be optimized of a sequencing model in a recommendation system, wherein the targets to be optimized at least comprise a first optimization target and a second optimization target, the first optimization target is a target representing the effect of the model, the second optimization target is a target representing the structure of the model, and the objects to be optimized are hyper-parameters of the sequencing model; optimizing a plurality of objects to be optimized by utilizing a multi-objective optimization algorithm based on the sample data set to obtain a plurality of optimization models; integrating a plurality of optimization models into an integration model according to a preset strategy, and applying the integration model to the recommendation system.

Description

Recommendation system construction method and device
Technical Field
The invention relates to the technical field of data mining, in particular to a method and a device for constructing a recommendation system.
Background
In the big data age, the recommendation system can provide personalized recommendation results for different customers, so that the customers are better served. Especially in the field of electronic commerce, how to accurately recommend different products to different customers is significant, and with the explosive growth of internet data volume and the breakthrough of machine learning technology, the construction of an automatic recommendation system becomes practical.
At present, when a recommendation system is constructed, a machine learning algorithm is generally used to construct a model, and the model is then trained with training samples to obtain a usable recommendation system. The accuracy with which the recommendation system recommends products to a user mainly depends on the training effect of the model, so optimizing the recommendation system usually means optimizing the model effect, and the model becomes more and more complex in the process. A complex model tends to increase the time consumed by training and prediction and requires more computing resources; it is also prone to overfitting, that is, it performs well when validated offline but poorly online. It follows that the way existing recommendation systems are optimized needs further improvement.
Disclosure of Invention
In view of the above problems, the present invention provides a method and an apparatus for constructing a recommendation system, and a main object of the present invention is to improve an optimization effect of the recommendation system and avoid a complex model structure by performing multi-objective optimization on the model structure and the effect.
In order to achieve the purpose, the invention mainly provides the following technical scheme:
in one aspect, the present invention provides a method for constructing a recommendation system, including:
acquiring a sample data set according to transaction data;
determining a plurality of targets to be optimized and a plurality of objects to be optimized of a sequencing model in a recommendation system, wherein the targets to be optimized at least comprise a first optimization target and a second optimization target, the first optimization target is a target representing a model effect, the second optimization target is a target representing a model structure, and the objects to be optimized are hyper-parameters of the sequencing model;
optimizing the plurality of objects to be optimized by utilizing a multi-objective optimization algorithm based on the sample data set to obtain a plurality of optimization models;
and integrating the optimization models into an integrated model according to a preset strategy, and applying the integrated model to the recommendation system.
On the other hand, the invention provides a construction device of a recommendation system, which specifically comprises:
the acquisition unit is used for acquiring a sample data set according to the transaction data;
the determining unit is used for determining a plurality of targets to be optimized and a plurality of objects to be optimized of a sequencing model in a recommendation system, wherein the targets to be optimized at least comprise a first optimization target and a second optimization target, the first optimization target is a target representing the effect of the model, the second optimization target is a target representing the structure of the model, and the objects to be optimized are hyper-parameters of the sequencing model;
the optimization unit is used for optimizing the plurality of objects to be optimized determined by the determination unit by utilizing a multi-objective optimization algorithm based on the sample data set obtained by the obtaining unit to obtain a plurality of optimization models;
and the synthesis unit is used for integrating the optimization models into an integrated model according to a preset strategy and applying the integrated model to the recommendation system.
In another aspect, the present invention provides a storage medium for storing a computer program, where the computer program controls, when running, a device on which the storage medium is located to execute the method for constructing the recommendation system.
In another aspect, the present invention provides a processor, configured to execute a program, where the program executes the method for constructing a recommendation system described above.
By means of the technical scheme, the method and the device for constructing the recommendation system mainly perform multi-objective optimization on the ranking model in the recommendation system, so that the recommendation system with a simple structure and high efficiency is constructed, recommendation accuracy is guaranteed, and the situations of time consumption, resource consumption, overfitting and the like caused by the complex model structure are reduced. In the invention, when a plurality of optimization targets in the sequencing model are determined, division and selection are carried out at least according to the classification mode of the model effect and the model structure, so that multi-target optimization is carried out, and a plurality of optimized models obtained after optimization are integrated to adapt to different input data to obtain the optimal recommendation result.
The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
FIG. 1 is a flowchart illustrating a method for constructing a recommendation system according to an embodiment of the present invention;
FIG. 2 is a flow chart illustrating the optimization of a plurality of objects to be optimized according to an embodiment of the present invention;
FIG. 3 is a flow chart illustrating the integration of multiple optimization models into an integrated model according to an embodiment of the present invention;
FIG. 4 is a block diagram showing a construction apparatus of a recommendation system according to an embodiment of the present invention;
FIG. 5 is a block diagram showing a construction apparatus of another recommendation system according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention can be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
The embodiment of the invention provides a method for constructing a recommendation system, which is mainly applied to optimizing a ranking model in the recommendation system, so that a recommended result is better and more accurate, and has higher response speed and lower resource occupation. The method comprises the following specific steps as shown in figure 1:
101. Acquiring a sample data set according to the transaction data.
The sample data set generally labels both positive and negative samples. Typically, each transaction record, i.e. a product purchased by a user, serves as a positive sample, and the negative samples are generated from the positive ones, for example by pairing a user with a product for which no transaction record exists.
The sample generation in the sample data set of this step is illustrated in the following table as an example:
User number  Product number
u1           p1
u2           p2
u3           p3
TABLE 1
User number  Product number  f1  f2   f3  f4  Identification
u1           p1              2   1    3   11  1
u1           p2              4   2    5   12  0
u1           p3              20  10   15  10  0
u2           p1              25  130  12  23  1
u2           p2              34  32   13  22  0
u2           p3              52  17   15  27  0
u3           p1              29  83   32  23  1
u3           p2              96  27   36  27  0
u3           p3              25  32   35  22  0
TABLE 2
Table 1 records transaction data, that is, user u1 purchases product p1, user u2 purchases product p2, and user u3 purchases product p3. Table 2 is generated from this transaction data; each row in Table 2 represents one sample, and Table 2 as a whole constitutes the sample data set. In Table 2, the Identification column marks whether a sample is positive (1) or negative (0), and f1, f2, f3, etc. record attributes of the sample, such as the number of times the user viewed the product, the browsing duration, and so on.
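The labeling rule described for step 101 can be sketched as follows. This is a minimal illustration only: the function name `build_samples` is hypothetical, the feature columns f1 to f4 are omitted, and it assumes that every user-product pair without a transaction record becomes a negative sample, which is just one of the possible ways to generate negatives.

```python
from itertools import product


def build_samples(transactions):
    """Label every (user, product) pair: 1 if it was purchased, otherwise 0."""
    purchased = set(transactions)
    users = sorted({user for user, _ in transactions})
    products = sorted({item for _, item in transactions})
    return [
        (user, item, 1 if (user, item) in purchased else 0)
        for user, item in product(users, products)
    ]


# Transaction data as recorded in Table 1.
transactions = [("u1", "p1"), ("u2", "p2"), ("u3", "p3")]
samples = build_samples(transactions)
for row in samples:
    print(row)
```

In practice the full cross product can be very large, so negatives are often sampled rather than enumerated; that choice is outside what the text specifies.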
102. A plurality of targets to be optimized and a plurality of objects to be optimized of a ranking model in a recommendation system are determined.
The ranking model in the recommendation system is generally constructed based on ranking objectives of multiple dimensions, and therefore, the optimization of the ranking model is to optimize the ranking objective of constructing the ranking model. In this step, when optimizing the ranking model, a plurality of targets are optimized simultaneously, that is, a plurality of targets to be optimized are determined, and the targets to be optimized are generally preset. Moreover, the optimization of one target is generally related to one or more hyper-parameters in the ranking model, so that the hyper-parameters to be optimized in the ranking model, namely a plurality of objects to be optimized, are determined while the target to be optimized is determined.
It should be particularly noted that, in the present invention, the determined targets to be optimized at least include a first optimization target and a second optimization target, where the first optimization target is a target representing the model effect, the second optimization target is a target representing the model structure, and the objects to be optimized are hyper-parameters of the ranking model. That is, at least one target is selected from each of the two dimensions of model effect and model structure as a target to be optimized. Optimizing in this way gives the ranking model both higher accuracy and a simpler structure, so that its response time and resource consumption are lower.
The first optimization target includes, but is not limited to, one or more of the recall, accuracy and AUC of the ranking model. AUC is a common model evaluation index that objectively reflects the overall ability to predict positive and negative samples and is insensitive to sample imbalance. The second optimization target is a measure of model structure determined by the algorithm that the ranking model uses; for example, when the ranking model is a gradient-boosted decision tree (GBDT) model, the second optimization target may be the product of the maximum tree depth and the number of trees, or the total number of nodes in all trees.
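As an illustration of how the two kinds of targets might be evaluated together, the sketch below computes AUC by pairwise comparison and uses the product of maximum depth and number of trees as the structure measure. The function names and the convention that both objectives are minimized (AUC is negated) are assumptions made for this example, not something the text prescribes.

```python
def auc(labels, scores):
    """Share of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def structure_cost(max_depth, n_trees):
    """Structure measure for a GBDT: maximum tree depth times number of trees."""
    return max_depth * n_trees


def objectives(labels, scores, max_depth, n_trees):
    """Objective vector in which both entries are to be minimized."""
    return (-auc(labels, scores), structure_cost(max_depth, n_trees))


print(objectives([1, 0, 0], [0.9, 0.2, 0.4], max_depth=3, n_trees=10))
```

The pairwise AUC is quadratic in the number of samples, which is acceptable for a sketch but not for large data sets.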
103. Optimizing the plurality of objects to be optimized by utilizing a multi-objective optimization algorithm based on the sample data set to obtain a plurality of optimization models.
Common multi-objective optimization algorithms include NSGA, PAES, SPEA, NSGA-II and clonal selection algorithms. The choice of optimization algorithm depends on the specific application scenario and is therefore not specifically limited in the embodiments of the present invention.
104. Integrating the optimization models into an integration model according to a preset strategy, and applying the integration model to a recommendation system.
The purpose of the step is to integrate a plurality of optimization models, and as a result, the recommendation prediction can be carried out on different input samples through the combination of different optimization models, so that a more accurate recommendation result is obtained. Thus, the integrated model may also be regarded as a combined model of at least one optimization model selected from a plurality of optimization models for processing for different input data. The specific combination mode needs to be determined according to a preset strategy, such as a combination mode of weighting, averaging, and the like.
Based on the embodiment shown in FIG. 1, the method for constructing a recommendation system provided by the present invention mainly optimizes and constructs the ranking model in the recommendation system: according to the obtained sample data set, it simultaneously optimizes the determined optimization targets in the effect and structure dimensions together with the corresponding objects to be optimized, and integrates the resulting optimization models, so that the constructed ranking model can select different combinations of optimization models for different inputs and output an optimal ranking result. Because the ranking model is optimized in the two dimensions of model effect and model structure at the same time, the optimized ranking model balances effect and structure, so that a ranking model with a simpler structure achieves a better ranking effect and the overall processing efficiency of the recommendation system improves.
Further, regarding step 103 in the embodiment shown in fig. 1, a preferred embodiment of the present invention is a process for optimizing a plurality of objects to be optimized based on the NSGA-II algorithm, and specific steps thereof are shown in fig. 2, and include:
201. Carrying out N groups of random assignments on the plurality of objects to be optimized to generate N individuals that form an initial population.
The objects to be optimized are the ranking model hyper-parameters determined based on the plurality of targets to be optimized. For example, when the ranking model is a gradient-boosted decision tree (GBDT) model, the hyper-parameters to be optimized include the learning rate, the sampling rate, the maximum number of trees, the maximum depth of the trees, and so on.
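Step 201 can be sketched as below. The hyper-parameter names and value ranges in `SEARCH_SPACE` are hypothetical choices for illustration; the text names the hyper-parameters but does not give ranges.

```python
import random

# Hypothetical search ranges for the GBDT hyper-parameters named in the text.
SEARCH_SPACE = {
    "learning_rate": (0.01, 0.3),
    "subsample": (0.5, 1.0),
    "n_trees": (10, 500),
    "max_depth": (2, 10),
}
INTEGER_PARAMS = {"n_trees", "max_depth"}


def random_individual(rng):
    """One individual = one random assignment of every hyper-parameter."""
    individual = {}
    for name, (low, high) in SEARCH_SPACE.items():
        if name in INTEGER_PARAMS:
            individual[name] = rng.randint(low, high)  # inclusive integer range
        else:
            individual[name] = rng.uniform(low, high)
    return individual


def initial_population(n, seed=0):
    """N random assignments form the initial population."""
    rng = random.Random(seed)
    return [random_individual(rng) for _ in range(n)]


population = initial_population(20)
print(population[0])
```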
202. Iterating the initial population to obtain iteration individuals.
Wherein the individual number of iteration individuals may be greater than N. The iterative process includes conventional selection, crossover, and mutation operations.
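A simplified sketch of one iteration is given below. It is an illustration under stated assumptions: parents are drawn uniformly at random (NSGA-II implementations commonly use a binary tournament on rank and crowding distance instead), crossover is uniform, mutation resamples a value within hypothetical bounds, and the names `BOUNDS`, `crossover`, `mutate` and `make_offspring` are invented for this example. Returning parents together with children is one way in which the number of iteration individuals can exceed N.

```python
import random

# Hypothetical bounds; only two hyper-parameters are used to keep the sketch short.
BOUNDS = {"learning_rate": (0.01, 0.3), "max_depth": (2, 10)}
INTEGER_PARAMS = {"max_depth"}


def crossover(parent_a, parent_b, rng):
    """Uniform crossover: each hyper-parameter is copied from one parent at random."""
    return {k: (parent_a[k] if rng.random() < 0.5 else parent_b[k]) for k in parent_a}


def mutate(individual, rng, rate=0.2):
    """With probability `rate`, resample a hyper-parameter within its bounds."""
    child = dict(individual)
    for name, (low, high) in BOUNDS.items():
        if rng.random() < rate:
            if name in INTEGER_PARAMS:
                child[name] = rng.randint(low, high)
            else:
                child[name] = rng.uniform(low, high)
    return child


def make_offspring(population, n_children, seed=0):
    """Return parents plus children, so the result is larger than the input."""
    rng = random.Random(seed)
    children = []
    for _ in range(n_children):
        parent_a, parent_b = rng.sample(population, 2)
        children.append(mutate(crossover(parent_a, parent_b, rng), rng))
    return population + children


parents = [
    {"learning_rate": 0.1, "max_depth": 3},
    {"learning_rate": 0.2, "max_depth": 6},
]
iterated = make_offspring(parents, n_children=4)
print(len(iterated))
```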
203. Triggering screening of the iteration individuals according to a preset probability, and selecting individuals whose optimization results for the plurality of targets to be optimized differ by more than a threshold.
The specific screening method may include:
First, a random probability value is obtained.
Then, when the random probability value is less than or equal to the preset probability value, the iteration individuals are sorted from high to low by their optimization results, separately for each target to be optimized.
Finally, the M iteration individuals ranked last are deleted, where M is the ratio of the preset total number of deletions to the total number of targets to be optimized.
In these steps, the optimization results of the iteration individuals are ranked for each target to be optimized so that the individuals that are worst with respect to each target can be selected, their number being limited to M. The screening process can be illustrated by the following example steps:
1) Generate a random integer n in [1, 100]; when n ≤ λ (assuming λ is empirically set to 5), perform the following step.
2) Assume that there are m targets to be optimized, t_1, t_2, …, t_m. For each target t_i, sort the individuals by the effect on t_i obtained after each individual is applied to the model, and reject the individuals ranked last at target t_i. Of course, the number deleted per target is not limited to the ratio of the preset total number of deletions to the total number of targets to be optimized; it can also be weighted by the importance of the targets. For example, when m is 2, the weight of the first target is 0.8 and the weight of the second target is 0.2, the last 16 individuals in the ordering for the first target are deleted and only the last 4 individuals in the ordering for the second target are deleted.
The step is to eliminate individuals with obvious defects and accelerate the convergence speed of the iterative process.
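The screening described above can be sketched as follows. The function name `screen`, its parameters, and the convention that a higher score is better for every target are assumptions for this illustration; the per-target quota is the preset total multiplied by the target weight, which reduces to the equal split M when no weights are given.

```python
import random


def screen(individuals, scores, total_deletions, lam=5, weights=None, rng=random):
    """Probabilistically drop the worst individuals for each target.

    scores[i][j] is the result of individual i on target j (higher is better).
    """
    # Screening is only triggered when the random integer in [1, 100] is <= lam.
    if rng.randint(1, 100) > lam:
        return list(individuals)
    n_targets = len(scores[0])
    weights = weights or [1.0 / n_targets] * n_targets
    doomed = set()
    for j in range(n_targets):
        quota = int(round(total_deletions * weights[j]))
        # Sort indices from best to worst on target j.
        order = sorted(range(len(individuals)), key=lambda i: scores[i][j], reverse=True)
        if quota > 0:
            doomed.update(order[-quota:])
    return [ind for i, ind in enumerate(individuals) if i not in doomed]


# With weights 0.8 and 0.2 and 20 deletions in total, the quotas are 16 and 4,
# matching the weighted example in the text.
print(int(round(20 * 0.8)), int(round(20 * 0.2)))
```

An individual that is among the worst for several targets is only removed once, so fewer than the preset total may actually be deleted.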
204. Sorting the screened iteration individuals according to the optimization results of the ranking model.
Specifically, the sorting process in the NSGA-II algorithm is as follows:
First, fast non-dominated sorting is performed on the screened iteration individuals, dividing them into a plurality of layers according to the optimization results of the ranking model. The basic principle of fast non-dominated sorting is as follows: all non-dominated individuals in the population are selected and assigned to the same layer with rank 1, i.e. the first layer; the individuals of the first layer are then removed, and the non-dominated individuals among the remaining ones are assigned rank 2; and so on, until all individuals in the population have been ranked.
Second, the crowding distance of the iteration individuals in a specified layer is calculated.
Finally, the iteration individuals in the specified layer are sorted by crowding distance.
205. Selecting, according to the sorting, the N iteration individuals with the best optimization results to form a secondary population.
As can be seen from the steps shown in fig. 2, the optimization process is an iterative process of the population, while fig. 2 only exemplarily illustrates a process of iterating from the initial generation population to the secondary generation population, and in the actual application process, iteration needs to be repeated for multiple times until there are individuals meeting the optimization condition in the individuals, or after a specified number of iterations, at least one of the individuals is determined as the optimization result and applied to the ranking model, so as to obtain the optimization model. Generally, through the above iterative process, a plurality of available individuals are obtained, and applying the individuals to the ranking model results in a plurality of optimization models.
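Steps 204 and 205 can be sketched as below. The sketch assumes that every objective is to be minimized and builds the layers by repeatedly peeling off the non-dominated individuals, exactly as the basic principle is described above; the bookkeeping that makes the NSGA-II variant "fast" is omitted, although the resulting layers are the same. All function names are chosen for this example.

```python
def dominates(a, b):
    """a dominates b: no worse on every objective and strictly better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated_layers(objs):
    """Peel off non-dominated individuals layer by layer (rank 1, rank 2, ...)."""
    remaining = set(range(len(objs)))
    layers = []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(objs[j], objs[i]) for j in remaining if j != i)]
        layers.append(sorted(front))
        remaining -= set(front)
    return layers


def crowding_distance(objs, layer):
    """Larger distance = less crowded; boundary individuals get infinity."""
    dist = {i: 0.0 for i in layer}
    for k in range(len(objs[layer[0]])):
        ordered = sorted(layer, key=lambda i: objs[i][k])
        low, high = objs[ordered[0]][k], objs[ordered[-1]][k]
        dist[ordered[0]] = dist[ordered[-1]] = float("inf")
        if high == low:
            continue
        for prev, cur, nxt in zip(ordered, ordered[1:], ordered[2:]):
            dist[cur] += (objs[nxt][k] - objs[prev][k]) / (high - low)
    return dist


def select_best(objs, n):
    """Fill the next population layer by layer; split the last layer by crowding."""
    chosen = []
    for layer in non_dominated_layers(objs):
        if len(chosen) + len(layer) <= n:
            chosen.extend(layer)
        else:
            dist = crowding_distance(objs, layer)
            ranked = sorted(layer, key=lambda i: dist[i], reverse=True)
            chosen.extend(ranked[: n - len(chosen)])
            break
    return chosen


# Hypothetical objective vectors (both minimized) for five individuals.
objs = [(1, 5), (2, 3), (4, 1), (3, 4), (5, 5)]
print(non_dominated_layers(objs))
print(select_best(objs, 4))
```

Preferring the less crowded individuals when a layer has to be split keeps the retained solutions spread along the trade-off between effect and structure.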
Further, based on the above embodiments shown in fig. 1 and fig. 2, a preferred embodiment of the present invention is a feasible manner described with respect to step 104, that is, integrating multiple optimization models to obtain an integrated model, and the specific steps of the embodiment are shown in fig. 3, and include:
301. Determining the identifiable interval of each optimization model over all sample data.
The identifiable interval is used to distinguish whether an optimization model can obtain a valid ranking result for a piece of sample data.
The identifiable interval is determined as follows:
First, the ranking results that the optimization model produces for all sample data are sorted from high to low.
Then, the identifiable interval is determined: its two endpoints are the k-th largest ranking result and the k-th smallest ranking result in that ordering, where k is a preset value.
302. Using the identifiable intervals to judge, for the same sample data, whether any of the multiple optimization models can identify it.
For a given piece of sample data, if its ranking result in an optimization model lies outside the identifiable interval, that is, it ranks above the k-th largest ranking result or below the k-th smallest ranking result, the sample data is considered identifiable by that optimization model.
Because each optimization model has a corresponding recognizable interval, for the same sample data, whether the optimization model capable of recognizing the sample data exists can be judged through the recognizable intervals corresponding to different optimization models.
303. If identifiable optimization models exist, the sequencing result of the integrated model is the ratio of the sum of the sequencing results of all identifiable optimization models to the number of identifiable optimization models.
304. And if the recognizable optimization model does not exist, the sequencing result of the integrated model is the average value of the sequencing results of all the optimization models.
For the above description of the steps, reference may be made to the data in the following table:
assume that the samples that need to be predicted are shown in table 3:
[Table 3 is an image in the original publication and is not reproduced in this text.]
TABLE 3
Assuming that the number of the obtained optimization models is 3, the predicted values of the samples in table 3 are shown in table 4:
Sample number  f1  f2  f3   f4  Model 1  Model 2  Model 3
1              21  5   3    66  0.1      0.1      0.2
2              2   6   2    72  0.3      0.3      0.5
3              3   72  4    20  0.2      0.4      0.4
4              52  11  7    21  0.4      0.3      0.6
5              6   23  87   4   0.5      0.5      0.7
6              27  52  91   8   0.6      0.6      0.3
7              14  8   3    4   0.4      0.7      0.8
8              9   4   456  2   0.7      0.2      0.6
9              8   2   39   7   0.2      0.2      0.3
10             12  22  32   7   0.3      0.4      0.5
TABLE 4
Assuming k is 10, the recognizable interval of model 1 is 0.1-0.7, that of model 2 is 0.1-0.7, and that of model 3 is 0.2-0.8. Accordingly, model 1 can recognize samples 1 and 8, and models 2 and 3 can each recognize samples 1 and 7. On this basis, the integrated model combines model 1, model 2 and model 3 according to the preset strategy to obtain the predicted value, i.e. the ranking result, of each sample; following steps 303 and 304, the results are shown in Table 5:
Sample number  f1  f2  f3   f4  Model 1  Model 2  Model 3  Final predicted result
1              21  5   3    66  0.1      0.1      0.2      (0.1+0.1+0.2)/3
2              2   6   2    72  0.3      0.3      0.5      (0.3+0.3+0.5)/3
3              3   72  4    20  0.2      0.4      0.4      (0.2+0.4+0.4)/3
4              52  11  7    21  0.4      0.3      0.6      (0.4+0.3+0.6)/3
5              6   23  87   4   0.5      0.5      0.7      (0.5+0.5+0.7)/3
6              27  52  91   8   0.6      0.6      0.3      (0.6+0.6+0.3)/3
7              14  8   3    4   0.4      0.7      0.8      (0.7+0.8)/2
8              9   4   456  2   0.7      0.2      0.6      0.7/1
9              8   2   39   7   0.2      0.2      0.3      (0.2+0.2+0.3)/3
10             12  22  32   7   0.3      0.4      0.5      (0.3+0.4+0.5)/3
TABLE 5
It should be noted that the above embodiments of steps 303 and 304 are only exemplary, and different preset strategies may be integrated into different integration models, for example, besides the averaging method, a weighting method may be used to calculate the ranking result.
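The averaging strategy of steps 301 to 304 can be sketched as follows, using the model scores of Table 4. Two points are assumptions made so that the sketch reproduces Table 5: a score that lies exactly on an interval endpoint is treated as identifiable, and the two endpoints are ordered so that the smaller one is the lower bound. The function names are chosen for this example.

```python
def identifiable_interval(scores, k):
    """Endpoints are the k-th largest and the k-th smallest score."""
    ordered = sorted(scores, reverse=True)
    kth_largest, kth_smallest = ordered[k - 1], ordered[-k]
    return min(kth_largest, kth_smallest), max(kth_largest, kth_smallest)


def integrate(model_scores, k):
    """model_scores[m][s] is the score that model m gives to sample s."""
    intervals = [identifiable_interval(scores, k) for scores in model_scores]
    results = []
    for s in range(len(model_scores[0])):
        column = [scores[s] for scores in model_scores]
        # Assumption: scores at or beyond an endpoint count as identifiable.
        hits = [v for v, (low, high) in zip(column, intervals) if v <= low or v >= high]
        chosen = hits or column  # step 303 if any model identifies, else step 304
        results.append(sum(chosen) / len(chosen))
    return results


# Model scores taken from Table 4 (samples 1 to 10).
model_1 = [0.1, 0.3, 0.2, 0.4, 0.5, 0.6, 0.4, 0.7, 0.2, 0.3]
model_2 = [0.1, 0.3, 0.4, 0.3, 0.5, 0.6, 0.7, 0.2, 0.2, 0.4]
model_3 = [0.2, 0.5, 0.4, 0.6, 0.7, 0.3, 0.8, 0.6, 0.3, 0.5]
final = integrate([model_1, model_2, model_3], k=10)
for number, value in enumerate(final, start=1):
    print(number, round(value, 4))
```

A weighted strategy would replace the plain mean over `chosen` with a weighted mean; the text leaves the weights unspecified.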
As can be seen from the above description of fig. 1 to 3, the construction method of the recommendation system provided by the present invention optimizes the ranking model in the recommendation system, mainly optimizes the effect and structure of the model by using a multi-objective optimization algorithm, and integrates the obtained multiple optimization models to be suitable for different input data, optimizes the model structure while improving the model identification accuracy, improves the response speed of the model, reduces the occupation of computing resources, and integrally improves the recommendation result of the recommendation system.
Further, as an implementation of the method for constructing the recommendation system, an embodiment of the present invention provides a device for constructing a recommendation system, which is mainly used for performing multi-objective optimization on a model structure and an effect, so as to improve an optimization effect of the recommendation system and avoid the model structure from becoming complicated. For convenience of reading, details in the foregoing method embodiments are not described in detail again in this apparatus embodiment, but it should be clear that the apparatus in this embodiment can correspondingly implement all the contents in the foregoing method embodiments. As shown in fig. 4, the apparatus specifically includes:
an obtaining unit 41, configured to obtain a sample data set according to transaction data;
a determining unit 42, configured to determine a plurality of targets to be optimized and a plurality of objects to be optimized of a ranking model in a recommendation system, where the targets to be optimized at least include a first optimization target and a second optimization target, the first optimization target is a target representing a model effect, the second optimization target is a target representing a model structure, and the objects to be optimized are hyper-parameters of the ranking model;
an optimizing unit 43, configured to optimize, by using a multi-objective optimization algorithm, the multiple objects to be optimized determined by the determining unit 42 based on the sample data set obtained by the obtaining unit 41, so as to obtain multiple optimization models;
and a synthesizing unit 44, configured to integrate the multiple optimization models obtained by the optimizing unit 43 into an integrated model according to a preset policy, and apply the integrated model to the recommendation system.
Further, as shown in fig. 5, the optimization unit 43 includes:
a generating module 431, configured to perform N groups of random assignments on the multiple objects to be optimized, generate N individuals, and form an initial population;
an iteration module 432, configured to iterate the initial population obtained by the generation module 431 to obtain iteration individuals, where the number of the iteration individuals is greater than N;
a screening module 433, configured to trigger, according to a preset probability, screening of the iteration individuals obtained by the iteration module 432, and to select the individuals whose optimization results for the multiple targets to be optimized differ by more than a threshold;
a sorting module 434, configured to sort the iteration individuals screened by the screening module 433 according to the optimization result of the ranking model;
the generating module 431 is further configured to select, according to the sorting obtained by the sorting module 434, the N iteration individuals with the best optimization results to form a second-generation population.
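The population update carried out by modules 431 to 434 can be sketched as follows. This is a minimal illustration only, under stated assumptions: the search space, the mutation-based variation step, and the names `init_population`, `iterate` and `next_generation` are hypothetical and are not taken from the embodiment, which does not specify how the iteration individuals are produced. The ordering passed to `next_generation` is a simple placeholder standing in for the screening and sorting performed by modules 433 and 434.

```python
import random

def init_population(space, n, rng):
    """Draw n random hyper-parameter assignments (one dict per individual)."""
    return [{name: rng.choice(values) for name, values in space.items()}
            for _ in range(n)]

def iterate(population, space, rng):
    """Return the parents plus one mutated child per parent (2N > N individuals).

    Mutation is an assumed variation operator, used here only for illustration.
    """
    children = []
    for parent in population:
        child = dict(parent)
        name = rng.choice(sorted(space))       # pick one hyper-parameter to change
        child[name] = rng.choice(space[name])  # re-draw its value at random
        children.append(child)
    return population + children

def next_generation(candidates, n, sort_key):
    """Keep the n best candidates under sort_key (smaller key is better)."""
    return sorted(candidates, key=sort_key)[:n]

rng = random.Random(0)
# Hypothetical search space for a tree-based ranking model.
space = {"max_depth": [2, 4, 6, 8], "n_trees": [50, 100, 200]}
population = init_population(space, n=4, rng=rng)
candidates = iterate(population, space, rng)
# Placeholder ordering by structural size only.
secondary = next_generation(
    candidates, 4, lambda ind: ind["max_depth"] * ind["n_trees"])
```

In a fuller implementation the placeholder key would be replaced by an ordering derived from all targets to be optimized.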
Further, the screening module 433 is further configured to:
acquiring a random probability value;
when the random probability value is less than or equal to a preset probability value, sorting the iteration individuals from high to low separately for each target to be optimized, according to the optimization result corresponding to that target;
and deleting the last M iteration individuals in each ordering, where M is the ratio of the preset total number of deletions to the total number of targets to be optimized.
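The probabilistic screening step can be illustrated with the short sketch below. It assumes that every objective value has been oriented so that higher is better (a structural objective would be negated first, for example), and it uses integer division for M; both choices, as well as the function name `screen`, are assumptions made for illustration rather than details stated in the embodiment.

```python
import random

def screen(individuals, objectives, total_deletions, preset_prob, rng):
    """Probabilistically drop the worst individuals under each objective.

    individuals: list of hashable identifiers
    objectives:  dict mapping identifier -> tuple of objective values,
                 oriented so that higher is better (assumption)
    """
    if rng.random() > preset_prob:        # screening is not triggered this round
        return list(individuals)
    n_obj = len(next(iter(objectives.values())))
    m = total_deletions // n_obj           # M = total deletions / number of targets
    removed = set()
    for j in range(n_obj):
        ranked = sorted(individuals, key=lambda i: objectives[i][j], reverse=True)
        removed.update(ranked[len(ranked) - m:])  # last M under objective j
    return [i for i in individuals if i not in removed]

ids = ["a", "b", "c", "d"]
objs = {"a": (0.9, 0.1), "b": (0.8, 0.8), "c": (0.1, 0.9), "d": (0.5, 0.5)}
kept = screen(ids, objs, total_deletions=2, preset_prob=1.0, rng=random.Random(0))
# kept == ["b", "d"]: "c" is last on the first objective, "a" on the second
```

Because the same individual may be last under more than one objective, fewer than the preset total number of individuals may actually be removed. Individuals that do well on one objective only, at the expense of the others, are the ones this step tends to eliminate.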
Further, the sorting module 434 is further configured to:
performing fast non-dominated sorting on the screened iteration individuals, and dividing the iteration individuals into a plurality of layers according to the optimization result of the ranking model;
calculating the crowding degree of the iteration individuals in a specified layer;
and sorting the iteration individuals in the specified layer according to the crowding degree.
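Fast non-dominated sorting and the crowding degree are the terms used in the NSGA-II family of multi-objective algorithms, so the sorting step can be sketched with the standard definitions. The code below is an illustrative sketch that assumes all objectives are to be minimized and that the crowding degree is the usual crowding distance; the embodiment does not confirm these conventions, and the function names are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and better on one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def fast_non_dominated_sort(points):
    """Split point indices into layers; layer 0 is the non-dominated front."""
    n = len(points)
    dominated_by = [[] for _ in range(n)]  # indices that point i dominates
    counts = [0] * n                        # number of points dominating i
    for i in range(n):
        for j in range(n):
            if dominates(points[i], points[j]):
                dominated_by[i].append(j)
            elif dominates(points[j], points[i]):
                counts[i] += 1
    fronts = [[i for i in range(n) if counts[i] == 0]]
    while fronts[-1]:
        nxt = []
        for i in fronts[-1]:
            for j in dominated_by[i]:
                counts[j] -= 1
                if counts[j] == 0:          # all dominators already layered
                    nxt.append(j)
        fronts.append(nxt)
    return fronts[:-1]                      # drop the trailing empty layer

def crowding_distance(points, front):
    """Crowding distance of each index in one layer (larger = less crowded)."""
    dist = {i: 0.0 for i in front}
    for m in range(len(points[front[0]])):
        order = sorted(front, key=lambda i: points[i][m])
        lo, hi = points[order[0]][m], points[order[-1]][m]
        dist[order[0]] = dist[order[-1]] = float("inf")  # keep boundary points
        if hi == lo:
            continue
        for k in range(1, len(order) - 1):
            gap = points[order[k + 1]][m] - points[order[k - 1]][m]
            dist[order[k]] += gap / (hi - lo)
    return dist

points = [(1, 5), (2, 3), (4, 1), (3, 4), (5, 5)]
layers = fast_non_dominated_sort(points)
dist = crowding_distance(points, layers[0])
ordered = sorted(layers[0], key=lambda i: dist[i], reverse=True)
```

Sorting a layer by decreasing crowding distance favors individuals in sparsely populated regions of the objective space, which helps keep the retained N individuals diverse.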
Further, as shown in fig. 5, the synthesizing unit 44 includes:
a determining module 441, configured to determine an identifiable interval of each optimization model for all sample data, where the identifiable interval is used to distinguish whether the optimization model can obtain a valid ranking result for the sample data;
a judging module 442, configured to judge whether multiple optimization models are identifiable for the same sample data by using the identifiable interval determined by the determining module 441;
a synthesizing module 443, configured to, if there are identifiable optimization models, determine that the ranking result of the integrated model is a ratio of a sum of ranking results of all identifiable optimization models to the number of identifiable optimization models;
the synthesizing module 443 is further configured to, if there is no identifiable optimization model, determine that the ranking result of the integrated model is an average of the ranking results of all the optimization models.
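The combination rule applied by the synthesizing module 443 to a single sample can be written compactly. In the sketch below, the per-model identifiability flags are taken as given inputs, and the function name `ensemble_score` is hypothetical.

```python
def ensemble_score(scores, identifiable):
    """Combine per-model ranking results for one sample.

    scores:       ranking result of each optimization model for the sample
    identifiable: parallel booleans, True where that model is identifiable
                  for the sample
    """
    chosen = [s for s, ok in zip(scores, identifiable) if ok]
    if chosen:
        # Average over the identifiable models only.
        return sum(chosen) / len(chosen)
    # No model is identifiable: fall back to the plain average of all models.
    return sum(scores) / len(scores)

partial = ensemble_score([0.9, 0.5, 0.1], [True, False, True])
fallback = ensemble_score([0.2, 0.4, 0.6], [False, False, False])
```

Both branches are averages; the difference lies only in which optimization models contribute to the integrated result.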
Further, the determining module 441 is further configured to:
sorting, from high to low, the ranking results produced by the optimization model for all sample data;
and determining the identifiable interval, where the two endpoints of the identifiable interval are the k-th largest ranking result and the k-th smallest ranking result in that ordering, and k is a preset value.
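The endpoints of the identifiable interval for one optimization model can be computed as in the following sketch. The function name `identifiable_interval` and the validity check on k are illustrative assumptions. The passage above does not state whether a ranking result inside or outside the interval counts as identifiable, so the sketch returns only the two endpoints.

```python
def identifiable_interval(scores, k):
    """Return (k-th smallest, k-th largest) of one model's ranking results."""
    ranked = sorted(scores, reverse=True)   # high to low
    if not 1 <= k <= len(ranked):
        raise ValueError("k must be between 1 and the number of samples")
    kth_largest = ranked[k - 1]
    kth_smallest = ranked[-k]
    return kth_smallest, kth_largest

low, high = identifiable_interval([0.1, 0.9, 0.4, 0.7, 0.3], k=2)
# low == 0.3, high == 0.7
```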
Further, the first optimization target determined by the determining unit 42 includes one or more of the recall, accuracy, and AUC of the ranking model;
the second optimization target determined by the determining unit 42 is a measure of model structure determined according to the algorithm adopted by the ranking model; when the ranking model is a gradient boosting tree model, the second optimization target is the sum of (i) the product of the maximum tree depth and the number of trees and (ii) the number of all nodes of the trees.
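For a gradient boosting tree model, the structural target can be computed as in the sketch below. It reads the definition as maximum depth multiplied by the number of trees, plus the total node count over all trees; this reading, the function name `gbdt_structure_objective`, and the per-tree node counts used as input are assumptions made for illustration.

```python
def gbdt_structure_objective(max_depth, node_counts):
    """Structural size of a tree ensemble (smaller means a simpler model).

    max_depth:   maximum depth allowed for each tree
    node_counts: number of nodes in each tree of the ensemble
    """
    n_trees = len(node_counts)
    return max_depth * n_trees + sum(node_counts)

# Three trees of maximum depth 3 with 7, 5 and 7 nodes: 3 * 3 + 19
size = gbdt_structure_objective(3, [7, 5, 7])
```

Minimizing this value alongside an effect metric such as AUC discourages the search from improving the effect solely by growing deeper or more numerous trees.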
Further, an embodiment of the present invention further provides a storage medium, where the storage medium is used for storing a computer program, where the computer program controls, when running, a device on which the storage medium is located to execute the above-mentioned method for constructing a recommendation system.
In addition, the embodiment of the invention also provides a processor, wherein the processor is used for running the program, and the program executes the construction method of the recommendation system when running.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
It will be appreciated that the relevant features of the method and apparatus described above are referred to one another. In addition, "first", "second", and the like in the above embodiments are for distinguishing the embodiments, and do not represent merits of the embodiments.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The algorithms and displays presented herein are not inherently related to any particular computer, virtual machine, or other apparatus. Various general purpose systems may also be used with the teachings herein. The required structure for constructing such a system will be apparent from the description above. Moreover, the present invention is not directed to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any descriptions of specific languages are provided above to disclose the best mode of the invention.
In addition, the memory may include volatile memory in a computer readable medium, Random Access Memory (RAM) and/or nonvolatile memory such as Read Only Memory (ROM) or flash memory (flash RAM), and the memory includes at least one memory chip.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include forms of volatile memory in a computer readable medium, Random Access Memory (RAM) and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). The memory is an example of a computer-readable medium.
Computer-readable media include both permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The above are merely examples of the present application and are not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (10)

1. A method of constructing a recommendation system, the method comprising:
acquiring a sample data set according to transaction data;
determining a plurality of targets to be optimized and a plurality of objects to be optimized of a ranking model in a recommendation system, wherein the targets to be optimized comprise at least a first optimization target and a second optimization target, the first optimization target is a target representing a model effect, the second optimization target is a target representing a model structure, and the objects to be optimized are hyper-parameters of the ranking model;
optimizing the plurality of objects to be optimized by utilizing a multi-objective optimization algorithm based on the sample data set to obtain a plurality of optimization models;
and integrating the optimization models into an integrated model according to a preset strategy, and applying the integrated model to the recommendation system.
2. The method of claim 1, wherein said optimizing said plurality of objects to be optimized using a multi-objective optimization algorithm based on said set of sample data, resulting in a plurality of optimization models, comprises:
carrying out N groups of random assignments on the plurality of objects to be optimized to generate N individuals to form an initial population;
iterating the initial population to obtain iteration individuals, wherein the number of the iteration individuals is larger than N;
triggering, according to a preset probability, screening of the iteration individuals, and selecting the individuals whose optimization results for the multiple targets to be optimized differ by more than a threshold value;
sorting the screened iteration individuals according to the optimization result of the ranking model;
and selecting, according to the sorting, the N iteration individuals with the best optimization results to form a second-generation population.
3. The method of claim 2, wherein triggering the screening of the iteration individuals according to a preset probability comprises:
acquiring a random probability value;
when the random probability value is less than or equal to a preset probability value, sorting the iteration individuals from high to low separately for each target to be optimized, according to the optimization result corresponding to that target;
and deleting the last M iteration individuals in each ordering, wherein M is the ratio of the preset total number of deletions to the total number of targets to be optimized.
4. The method of claim 2, wherein sorting the screened iteration individuals according to the optimization result of the ranking model comprises:
performing fast non-dominated sorting on the screened iteration individuals, and dividing the iteration individuals into a plurality of layers according to the optimization result of the ranking model;
calculating the crowding degree of the iteration individuals in a specified layer;
and sorting the iteration individuals in the specified layer according to the crowding degree.
5. The method of claim 1, wherein integrating the plurality of optimization models into one integrated model according to a predetermined strategy comprises:
determining an identifiable interval of each optimization model for all sample data, wherein the identifiable interval is used for distinguishing whether the optimization model can obtain a valid ranking result for the sample data;
judging, by using the identifiable interval, whether a plurality of optimization models are identifiable for the same sample data;
if identifiable optimization models exist, the ranking result of the integrated model is the ratio of the sum of the ranking results of all identifiable optimization models to the number of identifiable optimization models;
and if no identifiable optimization model exists, the ranking result of the integrated model is the average value of the ranking results of all the optimization models.
6. The method of claim 5, wherein said determining an identifiable interval for each optimization model for all sample data comprises:
sorting, from high to low, the ranking results produced by the optimization model for all sample data;
and determining the identifiable interval, wherein the two endpoints of the identifiable interval are the k-th largest ranking result and the k-th smallest ranking result in that ordering, and k is a preset value.
7. The method of any of claims 1-6, wherein the first optimization target includes one or more of the recall, accuracy, and AUC of the ranking model;
and the second optimization target is a measure of model structure determined according to the algorithm adopted by the ranking model, and when the ranking model is a gradient boosting tree model, the second optimization target is the sum of (i) the product of the maximum tree depth and the number of trees and (ii) the number of all nodes of the trees.
8. An apparatus for constructing a recommendation system, the apparatus comprising:
the acquisition unit is used for acquiring a sample data set according to the transaction data;
the determining unit is used for determining a plurality of targets to be optimized and a plurality of objects to be optimized of a ranking model in a recommendation system, wherein the targets to be optimized comprise at least a first optimization target and a second optimization target, the first optimization target is a target representing a model effect, the second optimization target is a target representing a model structure, and the objects to be optimized are hyper-parameters of the ranking model;
the optimization unit is used for optimizing the plurality of objects to be optimized determined by the determination unit by utilizing a multi-objective optimization algorithm based on the sample data set obtained by the obtaining unit to obtain a plurality of optimization models;
and the synthesis unit is used for integrating the optimization models into an integrated model according to a preset strategy and applying the integrated model to the recommendation system.
9. A storage medium for storing a computer program, wherein the computer program controls a device on which the storage medium is installed to execute the method for constructing a recommendation system according to any one of claims 1-7 when running.
10. A processor for executing a computer program, wherein the computer program executes the method of constructing a recommendation system according to any one of claims 1-7.
CN201911394281.2A 2019-12-30 2019-12-30 Commodity recommendation method and device Active CN111143685B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911394281.2A CN111143685B (en) 2019-12-30 2019-12-30 Commodity recommendation method and device

Publications (2)

Publication Number Publication Date
CN111143685A true CN111143685A (en) 2020-05-12
CN111143685B CN111143685B (en) 2024-01-26

Family

ID=70521787

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911394281.2A Active CN111143685B (en) 2019-12-30 2019-12-30 Commodity recommendation method and device

Country Status (1)

Country Link
CN (1) CN111143685B (en)

Also Published As

Publication number Publication date
CN111143685B (en) 2024-01-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant