CN109344328A

CN109344328A - Method and device for obtaining the optimal parameter combination of a recommender system

Info

Publication number: CN109344328A
Application number: CN201811110220.4A
Authority: CN
Inventors: 刘峰; 金慈航
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Baidu Online Network Technology Beijing Co Ltd; Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2018-09-21
Filing date: 2018-09-21
Publication date: 2019-02-15
Anticipated expiration: 2038-09-21
Also published as: CN109344328B

Abstract

The present invention provides a method, apparatus, device, and storage medium for obtaining the optimal parameter combination of a recommender system. The method includes: screening multiple experiment users; carrying out parameter experiments at the granularity of the experiment users' sessions to obtain learning samples, where the learning samples include a subset of the parameter combination values and their corresponding system effect assessment values; training a machine learning model with the learning samples to obtain a mapping between the parameter combination space and the system effect space, where the parameter combination space includes all values of the parameter combination; and using the mapping to obtain the optimal system effect in the system effect space and its corresponding parameter combination value, and taking the obtained parameter combination value as the optimal parameter combination of the recommender system. Embodiments of the present invention can obtain the optimal parameter combination of a recommender system efficiently and accurately.

Description

Method and device for obtaining the optimal parameter combination of a recommender system

Technical field

The present invention relates to the technical field of recommender systems, and in particular to a method, apparatus, device, and computer-readable storage medium for obtaining the optimal parameter combination of a recommender system.

Background

The emergence and spread of the Internet has brought users a large amount of information and met their demand for information in the information age. However, because the volume of information is so large, users faced with it cannot pick out the part that is actually useful to them, which lowers the efficiency with which information is used. This is the so-called information overload problem.

One way to address information overload is a recommender system: a personalized information recommendation system that recommends information, products, and so on that may interest a user, based on the user's information needs, interests, and the like.

When ranking articles, a recommender system estimates the user's click-through rate and can also introduce multiple parameters that adjust the ranking weights (for example, different boost coefficients for different resource types), so as to produce a recommendation list that better matches the user's interests.

At present, the optimal parameter combination of a recommender system is usually determined as follows:

A batch of experiment users is selected, and the value of the parameter combination is adjusted repeatedly for this batch. The overall behavior of these experiment users in the recommender system is observed to determine the system effect corresponding to each parameter combination value, and the parameter combination value with the best system effect is taken as the optimal parameter combination of the recommender system. For example, the parameter combination value is replaced at regular intervals; metrics such as the distribution volume and per-capita duration of this group of experiment users under the current value are measured during each interval; the best parameter combination value is then selected and rolled out to all users.

The drawback of this approach is that it cannot determine the optimal parameter combination of a recommender system efficiently and accurately, for the following reasons. If the interval between replacements of the parameter combination value is short, the users' behavior data are not statistically reliable enough to evaluate the system effect well. If the interval is long, the evaluation is more reliable, but because the parameter combination contains many parameters, the number of parameter combination values grows exponentially with the number of parameters and becomes very large, so it is difficult to traverse all parameter combination values in a limited time.

Summary of the invention

Embodiments of the present invention provide a method, apparatus, device, and computer-readable storage medium for obtaining the optimal parameter combination of a recommender system, so as to solve at least the above technical problems in the prior art.

In a first aspect, an embodiment of the present invention provides a method for obtaining the optimal parameter combination of a recommender system, comprising:

screening multiple experiment users;

carrying out parameter experiments at the granularity of the experiment users' sessions to obtain learning samples, the learning samples including a subset of the parameter combination values and their corresponding system effect assessment values;

training a machine learning model with the learning samples to obtain a mapping between the parameter combination space and the system effect space, the parameter combination space including all values of the parameter combination;

using the mapping to obtain the optimal system effect in the system effect space and its corresponding parameter combination value, and taking the obtained parameter combination value as the optimal parameter combination of the recommender system.

With reference to the first aspect, in a first implementation of the first aspect, carrying out parameter experiments at the granularity of the experiment users' sessions to obtain learning samples comprises a test phase and a statistics stage;

wherein, in the test phase, the following steps are executed for each experiment user:

when a session of the experiment user starts, randomly selecting one value of the parameter combination, the recommender system making recommendations using the selected parameter combination value;

recording the behavior data of the experiment user and, when the session of the experiment user ends, taking the behavior data as the actual system effect of the selected parameter combination value in that session of the experiment user;

in the statistics stage, the following steps are executed for each of the randomly selected parameter combination values:

obtaining the actual system effects of the parameter combination value in the sessions of different experiment users;

calculating the average of the actual system effects, and taking that average as the system effect assessment value corresponding to the parameter combination value.

With reference to the first implementation of the first aspect, in a second implementation of the first aspect, when the interval between the start time of the experiment user's current refresh and the last active time of the previous refresh exceeds a preset threshold, the start time of the current refresh is taken as the start time of the experiment user's current session, and the last active time of the previous refresh is taken as the end time of the experiment user's previous session.

With reference to the first aspect, the first implementation of the first aspect, or the second implementation of the first aspect, in a third implementation of the first aspect, using the mapping to obtain the optimal system effect in the system effect space and its corresponding parameter combination value comprises:

randomly selecting multiple parameter combination values in the parameter combination space as seed samples;

for each seed sample, using hill climbing to find, in the mapping, the system effect extreme point that can be reached starting from that seed sample;

determining the optimal value among the system effect extreme points reachable from the seed samples, taking the optimal value as the optimal system effect in the system effect space, and obtaining the parameter combination value corresponding to the optimal system effect.

In a second aspect, an embodiment of the present invention provides an apparatus for obtaining the optimal parameter combination of a recommender system, comprising:

an experiment user screening module, configured to screen multiple experiment users;

a learning sample generation module, configured to carry out parameter experiments at the granularity of the experiment users' sessions to obtain learning samples, the learning samples including a subset of the parameter combination values and their corresponding system effect assessment values;

a mapping generation module, configured to train a machine learning model with the learning samples to obtain a mapping between the parameter combination space and the system effect space, the parameter combination space including all values of the parameter combination;

an optimal parameter obtaining module, configured to use the mapping to obtain the optimal system effect in the system effect space and its corresponding parameter combination value, and to take the obtained parameter combination value as the optimal parameter combination of the recommender system.

With reference to the second aspect, in a first implementation of the second aspect, the learning sample generation module comprises a test submodule and a statistics submodule;

wherein the test submodule executes the following operations for each experiment user:

when a session of the experiment user starts, randomly selecting one value of the parameter combination and instructing the recommender system to make recommendations using the selected parameter combination value; recording the behavior data of the experiment user and, when the session of the experiment user ends, taking the behavior data as the actual system effect of the selected parameter combination value in that session of the experiment user;

the statistics submodule executes the following operations for each of the randomly selected parameter combination values:

obtaining the actual system effects of the parameter combination value in the sessions of different experiment users; calculating the average of the actual system effects, and taking that average as the system effect assessment value corresponding to the parameter combination value.

With reference to the first implementation of the second aspect, in a second implementation of the second aspect, the test submodule is configured to, when the interval between the start time of the experiment user's current refresh and the last active time of the previous refresh exceeds a preset threshold, take the start time of the current refresh as the start time of the experiment user's current session, and take the last active time of the previous refresh as the end time of the experiment user's previous session.

With reference to the second aspect, the first implementation of the second aspect, or the second implementation of the second aspect, in a third implementation of the second aspect, the optimal parameter obtaining module comprises:

a seed sample selection submodule, configured to randomly select multiple parameter combination values in the parameter combination space as seed samples;

an extreme point obtaining submodule, configured to, for each seed sample, use hill climbing to find, in the mapping, the system effect extreme point that can be reached starting from that seed sample;

an optimum obtaining submodule, configured to obtain the optimal value among the system effect extreme points reachable from the seed samples, take the optimal value as the optimal system effect in the system effect space, obtain the parameter combination value corresponding to the optimal system effect, and take the obtained parameter combination value as the optimal parameter combination of the recommender system.

The functions may be implemented in hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the functions described above.

In one possible design, the structure of the apparatus for obtaining the optimal parameter combination of a recommender system includes a processor and a memory. The memory is used to store a program that supports the apparatus in executing the method of the first aspect for obtaining the optimal parameter combination of a recommender system, and the processor is configured to execute the program stored in the memory. The apparatus may further include a communication interface, through which the apparatus communicates with other devices or with a communication network.

In a third aspect, an embodiment of the present invention provides a computer-readable storage medium for storing the computer software instructions used by the apparatus for obtaining the optimal parameter combination of a recommender system, including the program involved in executing the method of the first aspect for obtaining the optimal parameter combination of a recommender system.

One of the above technical solutions has the following advantages or beneficial effects:

Embodiments of the present invention carry out parameter experiments at the granularity of the experiment users' sessions and obtain a subset of the parameter combination values together with their corresponding system effect assessment values. This is more efficient than replacing the parameter combination value uniformly for all experiment users. In addition, using online parameter learning, the tested parameter combination values and their corresponding system effect assessment values are used as samples to build and train a machine learning model, which yields a mapping from the parameter combination space (containing all values of the parameter combination) to the system effect space. This avoids traversing the very large number of parameter combination values, so the optimal parameter combination of the recommender system can be determined efficiently and accurately.

The above summary is provided for the purposes of the specification only and is not intended to be limiting in any way. In addition to the illustrative aspects, implementations, and features described above, further aspects, implementations, and features of the present invention will be readily apparent from the drawings and the following detailed description.

Brief description of the drawings

In the drawings, unless otherwise specified, the same reference numerals denote the same or similar components or elements throughout the several drawings. The drawings are not necessarily drawn to scale. It should be understood that the drawings depict only some embodiments disclosed according to the present invention and should not be taken as limiting the scope of the present invention.

Fig. 1 is a flowchart of the method for obtaining the optimal parameter combination of a recommender system according to Embodiment one of the present invention;

Fig. 2 is an implementation flowchart of Embodiment two of the present invention;

Fig. 3 is a schematic diagram of how the sessions of one experiment user are divided;

Fig. 4 is an implementation flowchart of Embodiment three of the present invention;

Fig. 5 is a schematic diagram of one process of obtaining the system effect extreme point E_{i_max} using hill climbing in Embodiment three of the present invention;

Fig. 6 is a schematic structural diagram of the apparatus for obtaining the optimal parameter combination of a recommender system according to Embodiment four of the present invention;

Fig. 7 is a schematic structural diagram of the apparatus for obtaining the optimal parameter combination of a recommender system according to Embodiment five of the present invention;

Fig. 8 is a schematic structural diagram of the device for obtaining the optimal parameter combination of a recommender system according to Embodiment six of the present invention.

Detailed description

In the following, only certain exemplary embodiments are briefly described. As those skilled in the art will recognize, the described embodiments can be modified in various ways without departing from the spirit or scope of the present invention. The drawings and description are therefore to be regarded as illustrative in nature and not restrictive.

Embodiments of the present invention mainly provide a method and an apparatus for obtaining the optimal parameter combination of a recommender system. The technical solution is described below through the following embodiments.

Embodiment one

Referring to Fig. 1, Fig. 1 is a flowchart of the method for obtaining the optimal parameter combination of a recommender system according to Embodiment one of the present invention. The method includes the following steps:

S110: Screen multiple experiment users.

Depending on actual needs, a certain proportion of users can be randomly sampled from the full user population as experiment users. For example, 1% of all users can be randomly sampled as experiment users. The sampling ratio can be set as needed: the higher the ratio, the greater the processing load of this embodiment, but the better the resulting optimal parameter combination.
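The sampling in S110 can be sketched as follows. This is only an illustrative sketch: the function name `sample_experiment_users`, the user ID format, and the fixed random seed are assumptions made for the example and are not part of the described method.

```python
import random


def sample_experiment_users(all_user_ids, ratio=0.01, seed=None):
    """Randomly sample a fixed proportion of users as experiment users."""
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    user_ids = list(all_user_ids)
    if not user_ids:
        return []
    rng = random.Random(seed)
    # Keep at least one experiment user even for very small populations.
    k = max(1, round(len(user_ids) * ratio))
    return rng.sample(user_ids, k)


# Hypothetical full user population of 10,000 users; 1% become experiment users.
users = [f"user_{n}" for n in range(10_000)]
experiment_users = sample_experiment_users(users, ratio=0.01, seed=42)
```

Sampling without replacement keeps each experiment user unique, which matters later because sessions are tracked per user.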

S120: Carry out parameter experiments at the granularity of the experiment users' sessions to obtain learning samples, the learning samples including a subset of the parameter combination values and their corresponding system effect assessment values.

Unlike the prior art, which replaces the parameter combination value at the granularity of experiment users, this step carries out parameter experiments at the granularity of user sessions. More parameter experiments are performed in the same amount of time, so the system effect assessment value corresponding to a parameter combination value can be determined more efficiently.

Because the number of parameter combination values is very large, it is difficult to traverse all of them in a limited time. This step therefore determines, through parameter experiments, only the system effect assessment values corresponding to a subset of the parameter combination values. The system effects corresponding to the remaining parameter combination values are obtained with the machine learning model in the subsequent step S130.

S130: Train a machine learning model with the learning samples to obtain a mapping between the parameter combination space and the system effect space, the parameter combination space including all values of the parameter combination.

This step uses the generalization ability of the machine learning model to obtain the system effect corresponding to every value of the parameter combination. It avoids traversing all values of the parameter combination, so the optimal parameter combination of the recommender system can be determined efficiently and accurately.

S140: Use the mapping to obtain the optimal system effect in the system effect space and its corresponding parameter combination value, and take that parameter combination value as the optimal parameter combination of the recommender system.

Some of the steps in Embodiment one are described in detail below through specific embodiments.

Embodiment two

This embodiment describes one specific implementation of step S120 in Embodiment one.

For ease of understanding, two related concepts are introduced before the specific steps.

First, the user session:

In a recommender system, users typically refresh and browse in units of sessions. That is, starting from some moment, a user continuously refreshes and browses or reads articles for a period of time and then stops; after a while the user starts refreshing and browsing again for another period of time, and so on.

Let the user be u, and denote the i-th session of user u as s^u(i). One user session can contain many refreshes, i.e. s^u(i) = { r^u(i, j) | 1 ≤ j ≤ n^u(i) }, where r^u(i, j) is the j-th refresh in the i-th session of user u and n^u(i) is the total number of refreshes in the i-th session of user u.

Let st^u(i, j) be the start time of the j-th refresh in the i-th session of user u, and let et^u(i, j) be the last active time of the j-th refresh in the i-th session of user u (the time of the last user action recorded by the client). Within one user session, all refreshes are ordered in time, i.e. st^u(i, j) < et^u(i, j) < st^u(i, j+1) < et^u(i, j+1).

User sessions can be divided according to formula (1):

st^u(i+1, 1) − et^u(i, n^u(i)) ≥ thre  &&  st^u(i, j+1) − et^u(i, j) < thre    (1)

where 1 ≤ j < n^u(i);

thre is a preset threshold.

Formula (1) means the following: the interval between the last active time of the last refresh in the i-th session and the start time of the first refresh in the (i+1)-th session is greater than or equal to the preset threshold thre; and for two adjacent refreshes within one session, the interval between the last active time of the earlier refresh and the start time of the later refresh is less than the threshold thre.

The specific value of the time interval threshold (thre) can be set according to the actual situation, for example 30 minutes.
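The session division rule of formula (1) can be sketched as below. The `Refresh` record and the function name `split_sessions` are hypothetical names introduced for illustration; times are assumed to be in seconds, and a gap exactly equal to thre starts a new session, following formula (1).

```python
from dataclasses import dataclass


@dataclass
class Refresh:
    start: float        # st: start time of the refresh, in seconds
    last_active: float  # et: time of the last user action recorded in this refresh


def split_sessions(refreshes, thre=30 * 60):
    """Group one user's refreshes into sessions according to formula (1)."""
    sessions = []
    for r in sorted(refreshes, key=lambda r: r.start):
        if sessions and r.start - sessions[-1][-1].last_active < thre:
            sessions[-1].append(r)  # gap below the threshold: same session
        else:
            sessions.append([r])    # first refresh, or gap >= thre: new session
    return sessions


# Hypothetical refresh log of one user: the third refresh starts exactly
# 30 minutes after the last activity of the second one, so it opens a new session.
refreshes = [
    Refresh(0, 60),
    Refresh(120, 300),
    Refresh(300 + 1800, 2200),
    Refresh(2300, 2400),
]
sessions = split_sessions(refreshes)
```

In this sketch the end time of a session is simply the `last_active` value of its final refresh, and the start time is the `start` value of its first refresh.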

Second, the parameter combination of the recommender system:

Suppose the parameter combination of the recommender system contains m adjustable parameters. One value of the parameter combination is defined as P = {p₁ = v₁, p₂ = v₂, …, p_m = v_m}, where v_i denotes a particular value of the i-th parameter in the parameter combination.

For any single parameter, assume its value range is x to y. Cutting the interval from x to y evenly into c parts gives c+1 discrete values of the parameter: x, x+(y−x)/c, x+2(y−x)/c, …, x+(c−1)(y−x)/c, y.

Thus, when the parameter combination contains m parameters and each parameter p_i has c+1 possible values, the parameter combination has (c+1)^m possible values.
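To make the size of the parameter combination space concrete, the sketch below enumerates the grid for a small case. The helper names, the three parameter ranges, and c = 4 are assumptions chosen for the example; the step (y − x)/c is used so that the last grid point lands exactly on y.

```python
from itertools import product


def discretize(x, y, c):
    """Cut [x, y] into c equal parts and return the c + 1 grid values."""
    step = (y - x) / c
    return [x + k * step for k in range(c)] + [y]


def parameter_space(ranges, c):
    """Return an iterator over every parameter combination value on the grid."""
    axes = [discretize(x, y, c) for x, y in ranges]
    return product(*axes)


# m = 3 hypothetical parameters, each cut into c = 4 parts (5 values each).
ranges = [(0.0, 1.0), (1.0, 3.0), (0.0, 10.0)]
c = 4
combos = list(parameter_space(ranges, c))
print(len(combos))  # (c + 1) ** m = 5 ** 3 = 125
```

Even this toy case has 125 combinations; with, say, 10 parameters and c = 10 the count is 11^10, which is why exhaustive online testing is impractical.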

With these two concepts introduced, the specific implementation of Embodiment two is described with reference to Fig. 2. Fig. 2 is an implementation flowchart of Embodiment two of the present invention. In this embodiment, parameter experiments are carried out at the granularity of the experiment users' sessions to obtain learning samples, the learning samples including a subset of the parameter combination values and their corresponding system effect assessment values.

As shown in Fig. 2, the whole flow includes two stages: a test phase and a statistics stage.

In the test phase, the following steps are executed for each experiment user:

S121: When a session of the experiment user starts, randomly select one value of the parameter combination; the recommender system makes recommendations using the selected parameter combination value.

S122: Record the behavior data of the experiment user and, when the session of the experiment user ends, take the behavior data as the actual system effect of the parameter combination value in that session of the experiment user.

The sessions of an experiment user can be divided as follows: when the interval between the start time of the experiment user's current refresh and the last active time of the previous refresh exceeds the preset threshold, the start time of the current refresh is taken as the start time of the experiment user's current session, and the last active time of the previous refresh is taken as the end time of the experiment user's previous session.

Fig. 3 is a schematic diagram of how the sessions of one experiment user are divided. In Fig. 3, the straight line with an arrow represents the time axis, and each rectangle filled with oblique lines represents one refresh by the experiment user. Along the time axis, the behavior of an experiment user consists of multiple refreshes. When dividing sessions, if the interval between the start time of a refresh and the last active time of the previous refresh is greater than the preset threshold, that refresh is assigned to the next session; if the interval is less than or equal to the preset threshold, no division is made and the current session continues.

When a new session of an experiment user starts, one value of the parameter combination is randomly selected; that is, the parameter combination value is replaced at the granularity of sessions. During this session, the recommender system ranks articles using the selected parameter combination value and recommends articles to the experiment user in the resulting order.

Over the multiple refreshes of a session, the experiment user may click on and read multiple articles. Step S122 records this click behavior and, when the current session ends, takes the number of clicks as the actual system effect of the parameter combination value in that session. The actual system effect can be the experiment user's total number of clicks in the session, or the experiment user's total reading duration in the session. Which metric is used depends on the tuning objective of the recommender system.

The duration of the test phase can be set in advance according to a chosen criterion. The test phase yields the actual system effects corresponding to multiple parameter combination values. Because one parameter combination value may be tested in multiple sessions (including different sessions of the same experiment user and sessions of different experiment users), the same parameter combination value may correspond to multiple actual system effects. To obtain a system effect assessment value for each parameter combination value, the statistics stage is then executed, as follows.

In the statistics stage, the following steps are executed for each of the randomly selected parameter combination values:

S123: Obtain the actual system effects of the parameter combination value in the sessions of different experiment users.

Suppose N tests were performed for a given parameter combination value in the test phase. This step then obtains the N actual system effects of that parameter combination value in the N sessions, where N is a positive integer greater than or equal to 1.

S124: Calculate the average of the actual system effects, and take that average as the system effect assessment value corresponding to the parameter combination value.

In this way, the test phase and the statistics stage yield a subset of the parameter combination values and their corresponding system effect assessment values: (P₁, E₁), (P₂, E₂), …, where P_i denotes one parameter combination value and E_i denotes the system effect assessment value corresponding to that parameter combination value.
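The two stages can be sketched as follows. This is a simplified offline simulation, not the online system itself: `observe_effect` is a hypothetical callback standing in for "serve the session with this parameter combination value and count the clicks", and the candidate values and session IDs are made up for the example.

```python
import random
from collections import defaultdict
from statistics import mean


def run_test_phase(sessions, candidate_values, observe_effect, rng):
    """S121/S122: pick a random value per session and record its actual effect."""
    records = []
    for session in sessions:
        p = rng.choice(candidate_values)  # S121: random value at session start
        effect = observe_effect(session, p)  # S122: e.g. total clicks at session end
        records.append((p, effect))
    return records


def run_statistics_stage(records):
    """S123/S124: average the actual effects of each parameter combination value."""
    by_value = defaultdict(list)
    for p, effect in records:
        by_value[p].append(effect)
    return {p: mean(effects) for p, effects in by_value.items()}


# Hypothetical setup: three candidate values, 300 sessions, simulated click counts.
rng = random.Random(0)
candidates = [(0.0, 1.0), (0.5, 1.0), (1.0, 1.0)]
sessions = [f"session_{n}" for n in range(300)]


def fake_effect(session, p):
    # Simulated clicks; the value with p[0] == 0.5 is made slightly better.
    return rng.randint(0, 5) + (2 if p[0] == 0.5 else 0)


records = run_test_phase(sessions, candidates, fake_effect, rng)
learning_samples = run_statistics_stage(records)  # {P_i: E_i}
```

Because each session draws its own value, one user with several sessions contributes several independent measurements, which is the efficiency gain over per-user assignment described above.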

After that, step S130 described above can be executed: the subset of parameter combination values and their corresponding system effect assessment values are used as learning samples to train a machine learning model, which yields the mapping f: P -> E between the parameter combination space and the system effect space, where the parameter combination space includes all values of the parameter combination.

The advantage of establishing this mapping is as follows. When step S121 randomly selects parameter combination values, the number of possible values grows geometrically, so some parameter combination values may never be selected, and the system effect assessment values corresponding to them cannot be obtained directly. By establishing the mapping with a machine learning model and relying on the model's generalization ability, the untested parameter combination values can be filled in, giving the system effect corresponding to every value of the parameter combination.
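The text does not name a particular machine learning model, so the sketch below uses a small distance-weighted k-nearest-neighbour regressor purely as a stand-in to show how a mapping f: P -> E can generalize from tested to untested parameter combination values. The class name, the choice of k, and the four learning samples are all assumptions for illustration.

```python
import math


class KNNRegressor:
    """Tiny k-nearest-neighbour regressor, a stand-in for the unspecified model."""

    def __init__(self, k=3):
        self.k = k
        self.samples = []

    def fit(self, samples):
        # samples: iterable of (P_i, E_i), with P_i a tuple of parameter values
        self.samples = list(samples)
        return self

    def predict(self, p):
        # Take the k tested values closest to p and average their effects,
        # weighting each by the inverse of its distance to p.
        nearest = sorted(self.samples, key=lambda s: math.dist(s[0], p))[: self.k]
        weights = [1.0 / (math.dist(q, p) + 1e-9) for q, _ in nearest]
        return sum(w * e for w, (_, e) in zip(weights, nearest)) / sum(weights)


# Hypothetical learning samples (P_i, E_i) for a two-parameter combination.
learning_samples = [
    ((0.0, 0.0), 1.0),
    ((1.0, 0.0), 2.0),
    ((0.0, 1.0), 2.0),
    ((1.0, 1.0), 3.0),
]
f = KNNRegressor(k=4).fit(learning_samples)
estimate = f.predict((0.5, 0.5))  # an untested parameter combination value
```

In practice any regressor that accepts a parameter vector and returns a scalar could fill this role; the only requirement from the text is that it provides an estimate for every point in the parameter combination space.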

Embodiment three

This embodiment describes one specific implementation of step S140 in Embodiment one. Referring to Fig. 4, Fig. 4 is an implementation flowchart of Embodiment three of the present invention.

In this embodiment, the optimal system effect and its corresponding parameter combination value need to be found in the system effect space described above; the corresponding parameter combination value is the optimal parameter combination of the recommender system. Because the amount of data in the system effect space is very large, the optimal system effect cannot be found by direct comparison, so this embodiment uses hill climbing to obtain the optimal system effect in the system effect space. The details are described with reference to Fig. 4. Embodiment three of the present invention includes the following steps:

S141: Randomly select multiple parameter combination values P_i in the parameter combination space as seed samples.

S142: For each seed sample P_i, use hill climbing to find, in the mapping f: P -> E, the system effect extreme point E_{i_max} that can be reached starting from the seed sample P_i.

Hill climbing is a method for solving multivariable unconstrained optimization problems, also known as direct search. It moves a point directly to generate points whose objective value improves, and through such moves gradually reaches the point that optimizes the objective function. If the graph of the objective function is pictured as a mountain, moving the point is like a person climbing it: choosing a direction and moving step by step toward the summit. Hill climbing explores according to the following principle: the state reached by a step is, among all states reachable by the permissible steps, the one closest to the final goal. Such a step is called an optimal step. Steps are selected one after another according to this principle and explored in order until the final goal is reached. If the final goal cannot be reached, the search can backtrack: starting again from some intermediate state already passed, it takes a suboptimal step whose immediate effect is slightly worse and explores along another branch. It can also return directly to the initial state of the whole problem and explore along an entirely new path.

Fig. 5 shows one process of obtaining the system effect extreme point E_{i_max} using hill climbing in Embodiment three of the present invention. Fig. 5 contains three coordinate axes: the x-axis, the y-axis, and the z-axis. The x-axis and y-axis correspond to the two parameters in the seed sample (for ease of description, in this embodiment the parameter combination used as the seed sample contains two parameters), and the z-axis corresponds to the system effect.

The polyline in Fig. 5 shows the trajectory along which a system effect extreme point is determined from one seed sample. The point at the lower-left end of the polyline is the starting point: its x and y values are determined by the values of the two parameters in the seed sample, and its z value is the system effect corresponding to the seed sample. By exploring step by step with hill climbing, the search finally reaches the end point, which is the point at the upper-right end of the polyline. The z value of the end point is the system effect at that point, and it can be seen that this system effect is the maximum that can be reached starting from the seed sample.

S143: Obtain the optimal value from the system effect extreme points E_{i_max} found from the multiple seed samples, take the optimal value as the optimal system effect in the system effect space, and obtain the parameter combination value P_best corresponding to the optimal system effect.

The P_best determined in the above process is the optimal parameter combination of the recommender system, and it is put into effect for all users of the recommender system.

In this embodiment, the hill-climbing method is applied to multiple seed samples to obtain the system effect extreme point reachable from each seed sample, and the optimal value is then sought among these extreme points, because the mapping function f: P -> E is often multi-peaked. If only one seed sample were used to find the optimal system effect, the result might be only a local optimum rather than the global optimum.
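As a non-limiting illustration of steps S141 to S143, the following Python sketch runs a plain steepest-ascent hill climb from several seed samples and keeps the best extreme point. The function names, the integer grid with unit step, the bounds 0 to 20 and the two-peak function toy_effect are assumptions made for illustration and do not appear in the embodiment; the sketch also omits the backtracking to suboptimal steps mentioned in the description of the hill-climbing method.

```python
import random


def hill_climb(f, start, low=0, high=20):
    """Steepest ascent on an integer grid; returns (point, effect) at a local extreme."""
    current = tuple(start)
    while True:
        # Neighbours differ from the current point by +/-1 in exactly one parameter.
        neighbours = [
            current[:i] + (current[i] + delta,) + current[i + 1:]
            for i in range(len(current))
            for delta in (-1, 1)
            if low <= current[i] + delta <= high
        ]
        best = max(neighbours, key=f)
        if f(best) <= f(current):
            return current, f(current)  # no neighbour improves: E_{i_max} reached
        current = best


def random_seeds(count, dims, low=0, high=20, rng=None):
    """S141: randomly choose seed samples P_i in the parameter combination space."""
    rng = rng or random.Random()
    return [tuple(rng.randint(low, high) for _ in range(dims)) for _ in range(count)]


def best_over_seeds(f, seeds, low=0, high=20):
    """S142 and S143: climb from every seed, then keep the best extreme point."""
    extremes = [hill_climb(f, seed, low, high) for seed in seeds]
    return max(extremes, key=lambda pair: pair[1])  # (P_best, optimal effect)


def toy_effect(p):
    """Hypothetical two-peak mapping f: P -> E (peaks at (4, 4) and (15, 15))."""
    x, y = p
    low_peak = 5 - abs(x - 4) - abs(y - 4)
    high_peak = 10 - abs(x - 15) - abs(y - 15)
    return max(low_peak, high_peak)


if __name__ == "__main__":
    print(hill_climb(toy_effect, (0, 0)))
    print(best_over_seeds(toy_effect, [(0, 0), (20, 20)]))
```

In this toy example, a climb that starts only from the seed (0, 0) stops at the lower peak (4, 4), whereas adding the seed (20, 20) lets the search reach the higher peak (15, 15). This mirrors the reason given in this embodiment for using multiple seed samples.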

Embodiment four

This embodiment introduces a device for obtaining the optimal parameter combination of a recommender system. Fig. 6 is a schematic structural diagram of the device for obtaining the optimal parameter combination of a recommender system according to embodiment four of the present invention. The device includes:

An experiment user screening module 610, configured to screen multiple experiment users.

A learning sample generation module 620, configured to carry out parameter experiments at the granularity of an experiment user's session and obtain learning samples, the learning samples including some of the values of the parameter combination and their corresponding system effect evaluation values.

A mapping relationship generation module 630, configured to train a machine learning model with the learning samples and obtain the mapping relationship between the parameter combination space and the system effect space, the parameter combination space including all values of the parameter combination.

An optimal parameter acquisition module 640, configured to use the mapping relationship to obtain the optimal system effect in the system effect space and its corresponding parameter combination value, and to take the obtained parameter combination value as the optimal parameter combination of the recommender system.

The experiment user screening module 610 may, according to actual needs, randomly sample a certain proportion of users from all users as experiment users. For example, 1% of users are randomly sampled from all users as experiment users. The sampling proportion can be set according to actual needs: the higher the sampling proportion, the larger the processing load of the device of this embodiment, but the better the effect of determining the optimal parameter combination.
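The random sampling performed by the experiment user screening module 610 might be sketched as follows. The function name sample_experiment_users and the rule of keeping at least one user from a non-empty population are assumptions made for illustration, not part of the embodiment.

```python
import random


def sample_experiment_users(user_ids, ratio, rng=None):
    """Randomly draw a proportion `ratio` of all users, without replacement."""
    rng = rng or random.Random()
    users = list(user_ids)
    # Assumption: keep at least one user when the population is non-empty.
    count = min(len(users), max(1, round(len(users) * ratio)))
    return rng.sample(users, count)


if __name__ == "__main__":
    chosen = sample_experiment_users(range(1000), 0.01, random.Random(0))
    print(len(chosen))
```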

Because the number of parameter combination values is enormous, not all of them can be traversed within a limited time. The learning sample generation module 620 therefore obtains, through parameter experiments, only the system effect evaluation values corresponding to some of the values of the parameter combination; the system effects corresponding to the remaining parameter combination values are obtained by the mapping relationship generation module 630 using the machine learning model.

Embodiment five

This embodiment introduces another device for obtaining the optimal parameter combination of a recommender system. Fig. 7 is a schematic structural diagram of the device for obtaining the optimal parameter combination of a recommender system according to embodiment five of the present invention. The device includes:

An experiment user screening module 610, configured to screen multiple experiment users.

A learning sample generation module 620, which may include a test submodule 621 and a statistics submodule 622.

The test submodule 621 performs the following operations for each experiment user:

When a session of the experiment user starts, randomly select one value of the parameter combination and instruct the recommender system to make recommendations using the selected value; record the behavior data of the experiment user; and, when the session of the experiment user ends, take the behavior data as the actual system effect of the selected parameter combination value in that session of the experiment user.

The statistics submodule 622 performs the following operations for each randomly selected parameter combination value: obtain the actual system effects of the parameter combination value in the sessions of different experiment users; calculate the average of these actual system effects; and take the average as the system effect evaluation value corresponding to the parameter combination value.

Unlike the prior-art approach of changing the parameter combination value at the granularity of an experiment user, the test submodule 621 carries out parameter experiments at the granularity of a user's session. More parameter experiments are therefore carried out in the same period of time, and the system effect evaluation values corresponding to the parameter combination values can be obtained more efficiently.
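The operations of the test submodule 621 and the statistics submodule 622 can be sketched in Python as follows, with the averaging following the statistics stage recited in the claims. The function names, the representation of a parameter combination value as a tuple, and the use of a single number per session as the actual system effect are assumptions made for illustration.

```python
import random
from collections import defaultdict
from statistics import mean


def choose_parameters(candidate_values, rng):
    """Test stage: at session start, randomly pick one parameter combination value."""
    return rng.choice(candidate_values)


def evaluate_effects(session_records):
    """Statistics stage: average the actual effects observed for each value.

    `session_records` is an iterable of (parameter_value, actual_effect) pairs,
    one pair per finished session, possibly from different experiment users.
    """
    effects = defaultdict(list)
    for params, effect in session_records:
        effects[params].append(effect)
    return {params: mean(values) for params, values in effects.items()}


if __name__ == "__main__":
    rng = random.Random(0)
    candidates = [(0.1, 3), (0.5, 1)]
    print(choose_parameters(candidates, rng))
    print(evaluate_effects([((0.1, 3), 2.0), ((0.1, 3), 4.0), ((0.5, 1), 1.0)]))
```

The resulting dictionary, which maps each tested parameter combination value to its system effect evaluation value, corresponds to the learning samples.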

Regarding the division of an experiment user's sessions, the test submodule 621 may be configured to: when the interval between the start time of the experiment user's current refresh and the last activity time of the previous refresh exceeds a preset threshold, take the start time of the current refresh as the start time of the experiment user's current session, and take the last activity time of the previous refresh as the end time of the experiment user's previous session.
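The session division rule can be illustrated with the following Python sketch. The Refresh record, the function name split_sessions, the use of seconds as the time unit and the closing of the last session at the end of the input are assumptions made for illustration; the embodiment only specifies the threshold comparison.

```python
from dataclasses import dataclass


@dataclass
class Refresh:
    start: float          # start time of the refresh, in seconds
    last_activity: float  # time of the last user activity within the refresh


def split_sessions(refreshes, gap_threshold):
    """Group one user's time-ordered refreshes into (start, end) sessions."""
    sessions = []
    session_start = None
    previous = None
    for refresh in refreshes:
        if previous is None:
            session_start = refresh.start
        elif refresh.start - previous.last_activity > gap_threshold:
            # Gap exceeds the preset threshold: the previous session ends at the
            # last activity of the previous refresh, and a new session starts here.
            sessions.append((session_start, previous.last_activity))
            session_start = refresh.start
        previous = refresh
    if previous is not None:
        # Assumption: the final session is closed at the end of the input.
        sessions.append((session_start, previous.last_activity))
    return sessions


if __name__ == "__main__":
    log = [Refresh(0, 50), Refresh(60, 100), Refresh(2000, 2100)]
    print(split_sessions(log, gap_threshold=1800))
```

With a hypothetical threshold of 1800 seconds, the first two refreshes fall into one session and the third refresh opens a new one.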

The device for obtaining the optimal parameter combination of a recommender system proposed in this embodiment further includes:

A mapping relationship generation module 630, which uses the generalization ability of the machine learning model to obtain the system effects corresponding to all values of the parameter combination, thereby avoiding a traversal of all values of the parameter combination.
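To illustrate how a model fitted on some of the parameter combination values can estimate the system effect for values that were never tested, the following Python sketch uses a k-nearest-neighbour regressor. This regressor is only a stand-in chosen to keep the example self-contained; this passage does not prescribe that model, and the function name fit_knn and the sample data are hypothetical.

```python
import math


def fit_knn(samples, k=2):
    """Build a predictor f: P -> E from (parameter_value, effect) learning samples.

    Stand-in model: the predicted effect of a parameter combination value is the
    mean effect of its k nearest learning samples (Euclidean distance).
    """
    samples = list(samples)

    def predict(params):
        nearest = sorted(samples, key=lambda s: math.dist(s[0], params))[:k]
        return sum(effect for _, effect in nearest) / len(nearest)

    return predict


if __name__ == "__main__":
    learning_samples = [((0, 0), 1.0), ((1, 0), 3.0), ((10, 10), 9.0)]
    f = fit_knn(learning_samples, k=2)
    print(f((0.4, 0)))  # estimate for an untested parameter combination value
```

The returned function plays the role of the mapping relationship between the parameter combination space and the system effect space, and can be queried for any parameter combination value.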

The mapping relationship generation module 630 generates the mapping relationship between the parameter combination space and the system effect space. Because the amount of data in the system effect space is very large, the optimal system effect cannot be found by direct comparison. In this embodiment, the optimal parameter acquisition module 640 may therefore use the hill-climbing method to obtain the optimal system effect in the system effect space and its corresponding parameter combination value. Specifically, the optimal parameter acquisition module 640 includes:

A seed sample selection submodule 641, configured to randomly select multiple parameter combination values in the parameter combination space as seed samples.

An extreme point acquisition submodule 642, configured to, for each seed sample, use the hill-climbing method to find in the mapping relationship the system effect extreme point that can be reached starting from the seed sample.

An optimum acquisition submodule 643, configured to obtain the optimal value from the system effect extreme points reachable from the respective seed samples, take the optimal value as the optimal system effect in the system effect space, obtain the parameter combination value corresponding to the optimal system effect, and take the obtained parameter combination value as the optimal parameter combination of the recommender system.

As can be seen from the above description, the device for obtaining the optimal parameter combination of a recommender system proposed in embodiment five of the present invention carries out parameter tests at the granularity of a user's session and obtains learning samples that include some of the values of the parameter combination and their corresponding system effect evaluation values. It then trains a machine learning model with the learning samples to obtain the mapping relationship between the parameter combination space, which contains all values of the parameter combination, and the system effect space, so that the optimal parameter combination of the recommender system is obtained accurately and efficiently.

Embodiment six

Embodiment six of the present invention provides equipment for obtaining the optimal parameter combination of a recommender system. Fig. 8 is a schematic structural diagram of the equipment for obtaining the optimal parameter combination of a recommender system according to embodiment six of the present invention. The equipment includes a memory 810 and a processor 820, the memory 810 storing a computer program that can run on the processor 820. When executing the computer program, the processor 820 implements the method for obtaining the optimal parameter combination of a recommender system in the above embodiments. There may be one or more memories 810 and one or more processors 820.

The equipment can also include:

A communication interface 830, configured to communicate with external devices and exchange data.

The memory 810 may include high-speed RAM, and may further include non-volatile memory, for example at least one disk storage device.

If the memory 810, the processor 820 and the communication interface 830 are implemented independently, they can be connected to one another through a bus and communicate with one another. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus can be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, the bus is represented by only one thick line in Fig. 8, which does not mean that there is only one bus or only one type of bus.

Optionally, in a specific implementation, if the memory 810, the processor 820 and the communication interface 830 are integrated on a single chip, they can communicate with one another through an internal interface.

In the description of this specification, descriptions referring to the terms "one embodiment", "some embodiments", "example", "specific example" or "some examples" mean that the specific features, structures, materials or characteristics described in connection with that embodiment or example are included in at least one embodiment or example of the present invention. Moreover, the specific features, structures, materials or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, provided they do not conflict with one another, those skilled in the art may combine the different embodiments or examples described in this specification and the features of the different embodiments or examples.

In addition, the terms "first" and "second" are used for descriptive purposes only and shall not be understood as indicating or implying relative importance or implicitly indicating the number of technical features referred to. A feature defined with "first" or "second" may therefore explicitly or implicitly include at least one such feature. In the description of the present invention, "multiple" means two or more, unless otherwise clearly and specifically limited.

Any process or method description in a flowchart or otherwise described herein may be understood as representing a module, segment or portion of code that includes one or more executable instructions for implementing specific logical functions or steps of the process. The scope of the preferred embodiments of the present invention includes other implementations in which functions may be performed out of the order shown or discussed, including in a substantially simultaneous manner or in the reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present invention belong.

The logic and/or steps represented in a flowchart or otherwise described herein, which may for example be regarded as an ordered list of executable instructions for implementing logical functions, may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction execution system, apparatus or device and execute them). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate or transmit a program for use by, or in connection with, an instruction execution system, apparatus or device. More specific examples (a non-exhaustive list) of computer-readable media include: an electrical connection (electronic device) having one or more wires, a portable computer disk cartridge (magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), a fiber-optic device, and a portable compact disc read-only memory (CD-ROM). The computer-readable medium may even be paper or another suitable medium on which the program can be printed, since the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting or otherwise processing it in a suitable manner if necessary, and then stored in a computer memory.

It should be understood that each part of the present invention can be implemented in hardware, software, firmware or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software or firmware that is stored in a memory and executed by a suitable instruction execution system. If implemented in hardware, for example, as in another embodiment, they may be implemented by any one or a combination of the following technologies well known in the art: a discrete logic circuit having logic gates for implementing logical functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and the like.

Those of ordinary skill in the art will understand that all or some of the steps carried by the methods of the above embodiments can be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, includes one of the steps of the method embodiments or a combination thereof.

In addition, the functional units in the embodiments of the present invention may be integrated in one processing module, each unit may exist physically on its own, or two or more units may be integrated in one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. The storage medium may be a read-only memory, a magnetic disk, an optical disc, or the like.

In summary, the method, device, equipment and storage medium for obtaining the optimal parameter combination of a recommender system proposed in the embodiments of the present invention change the value of the parameter combination at the granularity of a user session and obtain the correspondence between some of the values of the parameter combination and the system effect evaluation values, so that parameter experiments can be carried out efficiently. The correspondence is further used as learning samples to train a machine learning model and obtain the system effects corresponding to all values of the parameter combination, which avoids actually testing every parameter combination and allows the optimal parameter combination to be obtained accurately and efficiently.

The above are only specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any variation or replacement that a person skilled in the art can readily conceive of within the technical scope disclosed by the present invention shall be covered by the protection scope of the present invention. The protection scope of the present invention shall therefore be subject to the protection scope of the claims.

Claims

1. A method for obtaining the optimal parameter combination of a recommender system, characterized in that the method comprises:

screening multiple experiment users;

carrying out parameter experiments at the granularity of an experiment user's session to obtain learning samples, the learning samples including some of the values of the parameter combination and their corresponding system effect evaluation values;

training a machine learning model with the learning samples to obtain the mapping relationship between the parameter combination space and the system effect space, the parameter combination space including all values of the parameter combination;

using the mapping relationship to obtain the optimal system effect in the system effect space and its corresponding parameter combination value, and taking the obtained parameter combination value as the optimal parameter combination of the recommender system.

2. The method according to claim 1, characterized in that carrying out parameter experiments at the granularity of an experiment user's session to obtain learning samples comprises a test stage and a statistics stage;

wherein the test stage performs the following steps for each experiment user: when a session of the experiment user starts, randomly selecting one value of the parameter combination, the recommender system making recommendations using the selected value of the parameter combination; recording the behavior data of the experiment user; and, when the session of the experiment user ends, taking the behavior data as the actual system effect of the selected parameter combination value in the session of the experiment user;

the statistics stage performs the following steps for each randomly selected parameter combination value: obtaining the actual system effects of the parameter combination value in the sessions of different experiment users; calculating the average of the actual system effects; and taking the average of the actual system effects as the system effect evaluation value corresponding to the parameter combination value.

3. The method according to claim 2, characterized by further comprising:

when the interval between the start time of the experiment user's current refresh and the last activity time of the previous refresh exceeds a preset threshold, taking the start time of the experiment user's current refresh as the start time of the experiment user's current session, and taking the last activity time of the experiment user's previous refresh as the end time of the experiment user's previous session.

4. The method according to any one of claims 1 to 3, characterized in that using the mapping relationship to obtain the optimal system effect in the system effect space and its corresponding parameter combination value comprises:

for each seed sample, using the hill-climbing method to find in the mapping relationship the system effect extreme point that can be reached starting from the seed sample;

obtaining the optimal value from the system effect extreme points reachable from the respective seed samples, taking the optimal value as the optimal system effect in the system effect space, and obtaining the parameter combination value corresponding to the optimal system effect.

5. A device for obtaining the optimal parameter combination of a recommender system, characterized in that the device comprises:

an experiment user screening module, configured to screen multiple experiment users;

a learning sample generation module, configured to carry out parameter experiments at the granularity of an experiment user's session to obtain learning samples, the learning samples including some of the values of the parameter combination and their corresponding system effect evaluation values;

a mapping relationship generation module, configured to train a machine learning model with the learning samples to obtain the mapping relationship between the parameter combination space and the system effect space, the parameter combination space including all values of the parameter combination;

an optimal parameter acquisition module, configured to use the mapping relationship to obtain the optimal system effect in the system effect space and its corresponding parameter combination value, and to take the obtained parameter combination value as the optimal parameter combination of the recommender system.

6. The device according to claim 5, characterized in that the learning sample generation module comprises a test submodule and a statistics submodule;

wherein the test submodule performs the following operations for each experiment user: when a session of the experiment user starts, randomly selecting one value of the parameter combination and instructing the recommender system to make recommendations using the selected value of the parameter combination; recording the behavior data of the experiment user; and, when the session of the experiment user ends, taking the behavior data as the actual system effect of the selected parameter combination value in the session of the experiment user;

the statistics submodule performs the following operations for each randomly selected parameter combination value: obtaining the actual system effects of the parameter combination value in the sessions of different experiment users; calculating the average of the actual system effects; and taking the average of the actual system effects as the system effect evaluation value corresponding to the parameter combination value.

7. The device according to claim 6, characterized in that the test submodule is configured to, when the interval between the start time of the experiment user's current refresh and the last activity time of the previous refresh exceeds a preset threshold, take the start time of the experiment user's current refresh as the start time of the experiment user's current session, and take the last activity time of the experiment user's previous refresh as the end time of the experiment user's previous session.

8. The device according to any one of claims 5 to 7, characterized in that the optimal parameter acquisition module comprises:

a seed sample selection submodule, configured to randomly select multiple parameter combination values in the parameter combination space as seed samples;

an extreme point acquisition submodule, configured to, for each seed sample, use the hill-climbing method to find in the mapping relationship the system effect extreme point that can be reached starting from the seed sample;

an optimum acquisition submodule, configured to obtain the optimal value from the system effect extreme points reachable from the respective seed samples, take the optimal value as the optimal system effect in the system effect space, obtain the parameter combination value corresponding to the optimal system effect, and take the obtained parameter combination value as the optimal parameter combination of the recommender system.

9. Equipment for obtaining the optimal parameter combination of a recommender system, characterized in that the equipment comprises:

one or more processors;

a storage device, configured to store one or more programs;

wherein, when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method according to any one of claims 1 to 4.

10. A computer-readable storage medium storing a computer program, characterized in that the program, when executed by a processor, implements the method according to any one of claims 1 to 4.