CN110111152A - Content recommendation method, device and server
- Publication number
- CN110111152A (application number CN201910390223.6A)
- Authority
- CN
- China
- Prior art keywords
- recommendation
- target user
- feedback information
- message
- user
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0241—Advertisements
- G06Q30/0251—Targeted advertisements
- G06Q30/0269—Targeted advertisements based on user profile or attribute
- G06Q30/0271—Personalized advertisement
Abstract
Embodiments of the invention provide a content recommendation method, device and server. The method includes: displaying a first recommendation to a target user through a browser interface of a user terminal; obtaining feedback information of the target user on the first recommendation during browsing; and adjusting the number of recommendation messages included in the first recommendation according to the feedback information, characteristic information of the target user, and characteristic information of the first recommendation. Embodiments of the invention can thus dynamically and accurately adjust the number of recommendation messages that a recommendation includes, based on factors such as user characteristics, recommendation characteristics, and real-time feedback, and can maximize platform revenue without degrading the user experience.
Description
Technical field
The present invention relates to the field of Internet technology, and in particular to a content recommendation method, device and server.
Background
When content (such as advertisements) is recommended to users, each user's tolerance is different. Taking advertisement recommendation as an example, accurately determining the clutter (i.e., how many feed items are shown to a user per inserted advertisement), so that advertising inventory is preserved while user activity does not decay significantly, is a problem in urgent need of a solution.
Current solutions mainly include the following. The first relies on manual experience: a decision maker subjectively sets a fixed clutter value or, once user profiles are available, divides users into groups and sets several group-specific values for targeted delivery, and then tunes the values through A/B testing. This approach consumes considerable manpower and time, depends too heavily on human experience, and cannot accurately balance user experience against platform revenue. The second is based on regression analysis: user characteristics are regressed against clutter to obtain a clutter value. However, regression-based methods mainly learn a prediction model from user behavior over a past period and apply it to future advertisement delivery strategies, so their real-time performance is poor and they cannot optimize long-term cumulative revenue.
Summary of the invention
Embodiments of the present invention provide a content recommendation method, device and server that can dynamically and accurately adjust the number of recommendation messages a recommendation includes, and can maximize platform revenue without degrading the user experience.
In one aspect, a content recommendation method comprises:
displaying a first recommendation to a target user through a browser interface of a user terminal, the first recommendation including at least one recommendation message;
obtaining feedback information of the target user on the first recommendation during browsing; and
adjusting the number of recommendation messages included in the first recommendation according to the feedback information, characteristic information of the target user, and characteristic information of the first recommendation.
In another aspect, an embodiment of the present invention further provides a content recommendation device, comprising:
a display module, configured to display a first recommendation to a target user through a browser interface of a user terminal, the first recommendation including at least one recommendation message;
an obtaining module, configured to obtain feedback information of the target user on the first recommendation during browsing; and
a processing module, configured to adjust the number of recommendation messages included in the first recommendation according to the feedback information, characteristic information of the target user, and characteristic information of the first recommendation.
In a further aspect, an embodiment of the present invention further provides a server including a processor, a network interface and a storage device that are connected to one another, wherein the network interface sends and receives data under the control of the processor, the storage device stores a computer program including program instructions, and the processor is configured to invoke the program instructions to execute the above content recommendation method.
In yet another aspect, an embodiment of the present invention further provides a computer storage medium storing program instructions that, when executed, implement the above content recommendation method.
In the embodiments of the present invention, a first recommendation is displayed to a target user through a browser interface of a user terminal, feedback information of the target user on the first recommendation during browsing is obtained, and the number of recommendation messages included in the first recommendation is adjusted according to the feedback information, characteristic information of the target user, and characteristic information of the first recommendation. The number of recommendation messages a recommendation includes can therefore be adjusted dynamically and accurately based on factors such as user characteristics, recommendation characteristics, and real-time feedback, so that different users receive different numbers of recommendation messages and platform revenue can be maximized without degrading the user experience.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a content recommendation method provided by an embodiment of the present invention;
Fig. 2a is a schematic structural diagram of a DQN network provided by an embodiment of the present invention;
Fig. 2b is a schematic structural diagram of another DQN network provided by an embodiment of the present invention;
Fig. 2c is a schematic diagram of optimizing a DQN network provided by an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a content recommendation device provided by an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a server provided by an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.
Reinforcement learning means that an agent (Agent) learns by trial and error: rewards obtained by interacting with the environment guide its subsequent behavior, the goal being for the agent to obtain the maximum reward, and the agent dynamically adjusts its parameters to maximize the reinforcement signal.
The server described in the embodiments of the present invention may be a server that provides information browsing, such as a social networking server or a news server, or may be a dedicated advertisement delivery server.
Referring to Fig. 1, which is a schematic flowchart of a content recommendation method provided by an embodiment of the present invention, the content recommendation method described in this embodiment includes:
101. The server displays a first recommendation to a target user through a browser interface of a user terminal, the first recommendation including at least one recommendation message.
Specifically, the server may send the recommended content (denoted the first recommendation) to the user terminal. While the corresponding user (denoted the target user) is browsing, the user terminal displays the recommendation messages included in the first recommendation to the target user, in order, through the browser interface.
Taking advertisement recommendation as an example, the first recommendation includes multiple advertisements. Assuming the target user is browsing news through the user terminal, the user terminal displays one advertisement each time the target user has browsed a certain number of news items. The more advertisements the first recommendation includes, the more frequently advertisements appear while the target user browses.
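The relationship between advertisement count and display frequency can be illustrated with a minimal sketch. The function name `build_feed` and the parameter `news_per_ad` are hypothetical and do not appear in the patent; the sketch only assumes that one advertisement is inserted after every fixed number of news items.

```python
from typing import List


def build_feed(news: List[str], ads: List[str], news_per_ad: int) -> List[str]:
    """Insert one ad after every `news_per_ad` news items until ads run out.

    `news_per_ad` is a hypothetical knob: a smaller value means ads appear
    more often, so more of the available ads are shown.
    """
    if news_per_ad < 1:
        raise ValueError("news_per_ad must be at least 1")
    feed: List[str] = []
    remaining_ads = iter(ads)
    for position, item in enumerate(news, start=1):
        feed.append(item)
        if position % news_per_ad == 0:
            ad = next(remaining_ads, None)  # None once every ad has been shown
            if ad is not None:
                feed.append(ad)
    return feed


news_items = [f"news{i}" for i in range(1, 7)]
sparse = build_feed(news_items, ["ad1", "ad2", "ad3"], news_per_ad=3)
dense = build_feed(news_items, ["ad1", "ad2", "ad3"], news_per_ad=2)
```

With six news items, the sparse setting shows two advertisements and the dense setting shows three, which is the frequency effect the paragraph above describes.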
102. The server obtains feedback information of the target user on the first recommendation during browsing.
After seeing the displayed first recommendation, the target user may take some action, and that action can be regarded as the target user's feedback information on the first recommendation: for example, clicking to view the recommended content, closing it, or choosing to reduce the frequency of content recommendation. If the target user clicks to view the recommended content, the feedback information may further include access depth, access duration, and so on. In addition, if the target user is satisfied with how often recommended content currently appears, or at least feels no dislike or aversion, the target user is likely to keep refreshing the browser interface to load new content; the refresh frequency can therefore also be regarded as feedback information on the first recommendation.
It should be noted that the feedback information here may specifically include the target user's feedback information on each recommendation message included in the first recommendation.
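One way to picture this feedback information is as a small record per recommendation message plus a session-level refresh counter. The sketch below is only illustrative: the class and field names are hypothetical, and the patent does not prescribe any data layout.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MessageFeedback:
    """Feedback on one recommendation message (all field names are hypothetical)."""
    message_id: str
    clicked: bool = False
    closed: bool = False
    reduce_frequency_requested: bool = False
    access_depth: int = 0          # e.g. pages visited after the click
    access_seconds: float = 0.0    # time spent on the recommended content


@dataclass
class BrowsingFeedback:
    """Feedback on a whole recommendation during one browsing session."""
    per_message: List[MessageFeedback] = field(default_factory=list)
    refresh_count: int = 0         # how often the user refreshed the interface

    def click_count(self) -> int:
        return sum(1 for m in self.per_message if m.clicked)

    def negative_count(self) -> int:
        # Closing a message or asking for fewer recommendations counts as negative.
        return sum(
            1 for m in self.per_message
            if m.closed or m.reduce_frequency_requested
        )


feedback = BrowsingFeedback(
    per_message=[
        MessageFeedback("ad1", clicked=True, access_depth=2, access_seconds=35.0),
        MessageFeedback("ad2", closed=True),
    ],
    refresh_count=4,
)
```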
103. The server adjusts the number of recommendation messages included in the first recommendation according to the feedback information, the characteristic information of the target user, and the characteristic information of the first recommendation.
The characteristic information of the target user may include age, gender, region, income, hobbies, and so on, and the characteristic information of the first recommendation may include content type, text length, and so on.
Specifically, the server may adjust the number of recommendation messages included in the first recommendation according to the feedback information of the target user, the characteristic information of the target user, and the characteristic information of the first recommendation. Because the feedback information reflects the target user's real-time attitude toward the recommended content, including whether the user is willing to accept it or dislikes it, using the target user's latest feedback information together with the characteristic information of the target user and of the first recommendation allows the number of recommendation messages included in the first recommendation to be adjusted promptly, dynamically, and accurately.
In the embodiments of the present invention, the server displays the first recommendation to the target user through the browser interface of the user terminal, obtains the target user's feedback information on the first recommendation during browsing, and adjusts the number of recommendation messages included in the first recommendation according to the feedback information, the characteristic information of the target user, and the characteristic information of the first recommendation. The number of recommendation messages a recommendation includes can therefore be adjusted quickly and accurately based on factors such as user characteristics, recommendation characteristics, and real-time feedback, so that different users receive different numbers of recommendation messages and platform revenue can be maximized without degrading the user experience.
In some possible embodiments, Fig. 2a is a schematic structural diagram of a deep reinforcement learning (Deep Q-Learning, DQN) network provided by the present invention. The agent (Agent) applies an action (Action) to the environment (Environment), and the environment gives the agent a corresponding reward (Reward). The reward includes the user's clicks on advertisements, i.e., the recommended content (AD click), and the user's activity (User active). The state (State) of the environment changes accordingly, and the agent, combining exploitation of experience (Exploitation) with exploration (Exploration), generates a new action from the change in the environment state and the reward received and applies it to the environment.
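The agent-environment loop can be sketched with a toy example. This is not the patent's DQN: it is a tabular, epsilon-greedy stand-in in which the action is the number of advertisements per refresh and the simulated environment (the function `simulated_reward` and its click and penalty values) is entirely hypothetical.

```python
import random
from typing import Dict

ACTIONS = [1, 2, 3, 4]  # candidate numbers of ads per refresh (hypothetical)


def simulated_reward(num_ads: int, tolerance: int) -> float:
    """Toy environment: each tolerated ad earns 1.0; each extra ad costs 2.0."""
    click_reward = min(num_ads, tolerance) * 1.0
    activity_penalty = max(0, num_ads - tolerance) * 2.0
    return click_reward - activity_penalty


def run_agent(tolerance: int, steps: int = 2000, epsilon: float = 0.1,
              learning_rate: float = 0.1, seed: int = 0) -> Dict[int, float]:
    """Learn one value estimate per action by trial and error."""
    rng = random.Random(seed)
    q = {action: 0.0 for action in ACTIONS}
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)      # exploration
        else:
            action = max(q, key=q.get)        # exploitation of experience
        reward = simulated_reward(action, tolerance)
        q[action] += learning_rate * (reward - q[action])
    return q


q_values = run_agent(tolerance=2)
best_action = max(q_values, key=q_values.get)
```

For a simulated user who tolerates two advertisements, the learned estimates favor showing two: fewer leaves click reward unclaimed, and more triggers the activity penalty.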
In some possible embodiments, before displaying the first recommendation to the target user through the browser interface of the user terminal, the server may obtain the characteristic information of the target user, determine a second recommendation using a first DQN network and the characteristic information, determine a third recommendation using a second DQN network and the characteristic information, and then determine the first recommendation for the target user from the second recommendation and the third recommendation.
In some possible embodiments, the server obtains the characteristic information of the target user and determines the first recommendation only when it detects a refresh request submitted by the target user for the browser interface. That is, each time the target user refreshes the browser interface, the server pushes the latest recommended content to the target user.
The present invention uses a double-DQN (Double-DQN) architecture. The first DQN network serves as the experience network: it evaluates the user based on experience from the user's history (Exploitation). The second DQN network serves as the prediction network: it explores (Exploration) to predict how the user may change. The first recommendation sent to the target user therefore includes both recommended content determined from the user's historical behavior and recommended content predicted for the user, so the method can both perform supervised learning on historical data and handle dynamic changes in the user.
In some possible embodiments, the server may adjust the number of recommendation messages included in the first recommendation according to the feedback information, the characteristic information of the target user, and the characteristic information of the first recommendation as follows. The server obtains contextual information corresponding to the feedback information; the contextual information may include, for example, the time and place at which the target user clicked the recommended content. The server inputs the feedback information, the characteristic information of the target user, the characteristic information of the first recommendation, and the contextual information into the first DQN network and the second DQN network respectively to obtain a predicted reward value for the first recommendation, and may then adjust the number of recommendation messages included in the first recommendation according to the predicted reward value.
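The text does not specify how the predicted reward value maps to a message count. As a minimal sketch of one plausible rule, assumed here purely for illustration, the count could be nudged up when the predicted reward exceeds a baseline and nudged down when it falls below it. The function name, the baseline, the step size, and the bounds are all hypothetical.

```python
def adjust_message_count(current_count: int, predicted_reward: float,
                         baseline: float, min_count: int = 1,
                         max_count: int = 10, step: int = 1) -> int:
    """Nudge the number of recommendation messages toward higher predicted reward.

    Assumption (not from the patent): a predicted reward above `baseline`
    suggests the user tolerates more messages; one below it suggests fewer.
    """
    if predicted_reward > baseline:
        proposed = current_count + step
    elif predicted_reward < baseline:
        proposed = current_count - step
    else:
        proposed = current_count
    # Clamp so the recommendation always keeps at least `min_count` messages.
    return max(min_count, min(max_count, proposed))


more = adjust_message_count(3, predicted_reward=0.8, baseline=0.5)
fewer = adjust_message_count(3, predicted_reward=0.2, baseline=0.5)
```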
Specifically, Fig. 2b is a schematic structural diagram of another DQN network provided by the present invention. In terms of feature design, each input to the DQN network has the following four groups of features: advertisement features (AD feature, corresponding to the characteristic information of the first recommendation above), user features (User, corresponding to the characteristic information of the target user above), advertisement-user interaction features (User*AD, corresponding to the feedback information above), and context features (Context, corresponding to the contextual information above). In terms of Fig. 2a, among these four groups the user features and context features represent the current State, while the advertisement features and the advertisement-user interaction features represent an Action taken in the current State. After processing by the double-DQN model (V(s) and A(s, a)), the network outputs the predicted Q value Q(s, a) of taking this Action in the current State (i.e., the predicted reward value). The true value of Q has two parts, the immediate reward and the discounted future reward:
$y_{s,a} = Q(s, a) = r_{immediate} + \gamma r_{future}$
Here, the immediate reward may include two parts: the user's click reward and the user-activity reward. Because the Double-DQN structure is adopted, the calculation of the true Q value becomes:
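The formula itself is not reproduced in this text. For orientation only, the standard Double-DQN target is shown below in this document's symbols; it is an assumption that the patent's own formula has this shape, and its notation may differ. Here $s'$ is the next state, $Q$ is the network that selects the next action, and $Q'$ is the second network that evaluates it.

```latex
y_{s,a} = r_{immediate} + \gamma \, Q'\bigl(s', \arg\max_{a'} Q(s', a')\bigr)
```

Separating action selection from action evaluation in this way is what distinguishes Double-DQN from a single-network target, which tends to overestimate Q values.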
The present invention uses user activity, together with the user's clicks on advertisements, as a new kind of feedback information. User activity can be understood in terms of refresh frequency and number of clicks; a good recommendation result increases how often the user uses the service, so user activity can serve as a feedback indicator. If the user makes no clicks within a certain period, activity declines, but once a click occurs, activity rises. With both clicks and activity considered, the immediate reward mentioned above becomes (click reward plus activity reward):
$r_{total} = r_{click} + \beta r_{active}$
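The two reward formulas can be checked with a short worked example. The numeric values below (click reward, activity reward, β, γ, and the future reward) are made up for illustration; only the two formulas come from the text.

```python
def total_reward(r_click: float, r_active: float, beta: float) -> float:
    """r_total = r_click + beta * r_active."""
    return r_click + beta * r_active


def q_target(r_immediate: float, r_future: float, gamma: float) -> float:
    """y = r_immediate + gamma * r_future."""
    return r_immediate + gamma * r_future


# Hypothetical numbers: one click, moderate activity, activity weight 0.2.
r_total = total_reward(r_click=1.0, r_active=0.5, beta=0.2)   # 1.0 + 0.2 * 0.5
# The combined immediate reward feeds the Q target with discount factor 0.5.
y = q_target(r_immediate=r_total, r_future=2.0, gamma=0.5)    # 1.1 + 0.5 * 2.0
```

A larger β makes the agent weigh user activity more heavily relative to clicks, which is the lever for trading short-term click revenue against keeping users active.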
In some possible embodiments, the server may obtain the target user's feedback information on the first recommendation during browsing as follows: the server obtains the target user's first feedback information on the recommendation messages in the first recommendation that come from the second recommendation, and second feedback information on the recommendation messages in the first recommendation that come from the third recommendation, and takes the first feedback information and the second feedback information as the target user's feedback information on the first recommendation during browsing. This reveals how well the target user accepts the recommended content provided by each of the two DQN network models.
In some possible embodiments, the server determines the recommendation effect of the first DQN network and of the second DQN network from the first feedback information and the second feedback information. Recommendation effect can be evaluated along dimensions such as number of clicks, access depth, and access duration. If the recommendation effect of the second DQN network, which explores and predicts, is better than that of the first DQN network, which relies on historical experience, the server adjusts the parameters of the first DQN network according to the parameters of the second DQN network. This optimizes the first DQN network so that the recommended content it subsequently provides is more accurate and more readily accepted by users, which safeguards accuracy when the number of recommendation messages included in a recommendation is adjusted.
In some possible embodiments, the server may determine the first recommendation for the target user from the second recommendation and the third recommendation as follows: the server selects a first recommendation message from the recommendation messages included in the second recommendation, selects a second recommendation message from the recommendation messages included in the third recommendation, and takes the first recommendation message and the second recommendation message as the first recommendation for the target user. The first recommendation sent to the target user therefore includes both recommended content determined from the user's historical behavior and recommended content predicted for the user, so the method can both perform supervised learning on historical data and handle dynamic changes in the user.
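Merging the two candidate lists while remembering which network contributed each message can be sketched as follows. The coin-flip selection rule is an assumption made for illustration; the patent does not state the exact procedure, and the names `probabilistic_interleave`, `exploit`, and `explore` are hypothetical.

```python
import random
from typing import List, Tuple


def probabilistic_interleave(exploit_list: List[str], explore_list: List[str],
                             length: int, rng: random.Random
                             ) -> List[Tuple[str, str]]:
    """Merge two ranked lists, tagging each chosen message with its source.

    At each step a fair coin picks which list supplies the next message;
    messages already chosen are skipped so the result has no duplicates.
    """
    remaining_exploit = list(exploit_list)
    remaining_explore = list(explore_list)
    merged: List[Tuple[str, str]] = []
    seen = set()
    while len(merged) < length and (remaining_exploit or remaining_explore):
        use_exploit = bool(remaining_exploit) and (
            not remaining_explore or rng.random() < 0.5
        )
        if use_exploit:
            item, source = remaining_exploit.pop(0), "exploit"
        else:
            item, source = remaining_explore.pop(0), "explore"
        if item not in seen:
            seen.add(item)
            merged.append((item, source))
    return merged


mixed = probabilistic_interleave(["A", "B", "C"], ["C", "D", "B"], 3,
                                 random.Random(0))
```

Keeping the source tag with each message is what later allows feedback on the merged list to be split into feedback for each of the two networks.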
Specifically, Fig. 2c is a schematic diagram of optimizing a DQN network provided by the present invention. The recommendation lists generated by the two network models, List L (containing recommendation messages A, B, C) from the experience DQN network model Current Network Q (i.e., Exploitation Network Q) and List L~ (containing recommendation messages C, D, B) from the prediction DQN network model Explore Network Q~ (also denoted Exploration Network Q~), are mixed by probabilistic interleaving (Probabilistic Interleave) to obtain the final recommendation list List L^ (containing recommendation messages A, C, D). After this list is pushed to the user (Push to user), the user's feedback is recorded (Collect feedback) and the model is optimized according to that feedback (Model choice). If Explore Network Q~ achieves the better recommendation effect (i.e., the user feedback (Feedback) shows that the user clicked to view only recommendation message D), the parameters of Current Network Q are updated in the direction of the parameters of Explore Network Q~ (Step towards Q~); otherwise, the parameters of Current Network Q remain unchanged (Keep Q). The specific calculation is as follows:
Here, the network parameters of Explore Network Q~ are generated by adding a certain amount of noise to the network parameters w of Current Network Q, calculated as follows:
$\Delta w = \alpha \cdot \mathrm{rand}(-1, 1) \cdot w$
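The noise formula and the "Step towards Q~ / Keep Q" choice can be sketched on a flat list of parameters. The perturbation follows the formula above; the update step, a linear move with a hypothetical step size `eta`, is an assumption, since the patent's own update formula is not reproduced in this text.

```python
import random
from typing import List


def explore_params(w: List[float], alpha: float, rng: random.Random) -> List[float]:
    """Return w + delta_w, where delta_w = alpha * rand(-1, 1) * w per parameter."""
    return [wi + alpha * rng.uniform(-1.0, 1.0) * wi for wi in w]


def update_current(w: List[float], w_explore: List[float],
                   explore_won: bool, eta: float) -> List[float]:
    """Step toward the explore parameters if they performed better, else keep w.

    The linear step with size `eta` is an assumed update rule.
    """
    if not explore_won:
        return list(w)                                   # Keep Q
    return [wi + eta * (wei - wi)                        # Step towards Q~
            for wi, wei in zip(w, w_explore)]


rng = random.Random(1)
current = [1.0, -2.0, 0.5]
explored = explore_params(current, alpha=0.1, rng=rng)
kept = update_current(current, explored, explore_won=False, eta=0.5)
stepped = update_current(current, explored, explore_won=True, eta=0.5)
```

Because the noise is proportional to each parameter, α bounds the relative size of the perturbation: with α = 0.1, no explored parameter differs from its original by more than 10 percent.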
In some possible embodiments, the present invention uses a DQN network to model effectively the dynamically changing attributes of content recommendation; DQN can effectively model both short-term and long-term returns. The model is divided into an online part and an offline part. The key steps of the online part, taking advertisement recommendation as an example, are as follows. At each moment, when a user sends an interface refresh request, the Agent generates k advertisements from the current State and recommends them to the user; this recommendation result combines exploitation (Exploitation) and exploration (Exploration). Feedback is obtained from the user's clicks on and browsing of the recommended advertisements. From the user's information, the recommended advertisements, and the feedback obtained, the Agent evaluates the performance of the experience network model Current Network Q and the prediction network model Explore Network Q~. If Current Network Q performs better, it remains unchanged; if Explore Network Q~ performs better, the parameters of Current Network Q move toward those of Explore Network Q~. In addition, after a period of time (for example, 5 or 10 minutes), the parameters of the Current Network Q model can be updated from the historical experience stored in the experience pool.
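The experience pool and the periodic update trigger can be sketched with a bounded buffer. The class name, the transition fields, and the 300-second default interval are assumptions for illustration; the training step that would consume a sampled batch is omitted.

```python
import random
from collections import deque
from typing import Any, List, NamedTuple


class Transition(NamedTuple):
    state: Any
    action: Any
    reward: float
    next_state: Any


class ExperiencePool:
    """Bounded store of past interactions; the oldest entries are evicted first."""

    def __init__(self, capacity: int) -> None:
        self._buffer = deque(maxlen=capacity)

    def add(self, state: Any, action: Any, reward: float, next_state: Any) -> None:
        self._buffer.append(Transition(state, action, reward, next_state))

    def sample(self, batch_size: int, rng: random.Random) -> List[Transition]:
        # Sample without replacement, capped at the number of stored entries.
        size = min(batch_size, len(self._buffer))
        return rng.sample(list(self._buffer), size)

    def __len__(self) -> int:
        return len(self._buffer)


def should_replay(now_s: float, last_replay_s: float,
                  interval_s: float = 300.0) -> bool:
    """True once `interval_s` seconds (e.g. 5 minutes) have passed since the last update."""
    return now_s - last_replay_s >= interval_s


pool = ExperiencePool(capacity=3)
for i in range(5):
    pool.add(f"s{i}", i, float(i), f"s{i + 1}")
batch = pool.sample(2, random.Random(0))
```

A bounded pool keeps the periodic update focused on recent behavior, which matches the text's emphasis on tracking how users change over time.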
In some possible embodiments, the scheme disclosed by the present invention, which uses deep reinforcement learning to adapt the number of recommendation messages dynamically to different users and different environments, can be extended to parameter optimization problems of other similar deep reinforcement learning variant models, such as DN, DDQN, DDQN+U, and DDQN+U+EG; the embodiments of the present invention impose no limitation on this.
Referring to Fig. 3, which is a schematic structural diagram of a content recommendation device provided by an embodiment of the present invention, the content recommendation device described in this embodiment includes:
a display module 301, configured to display a first recommendation to a target user through a browser interface of a user terminal, the first recommendation including at least one recommendation message;
an obtaining module 302, configured to obtain feedback information of the target user on the first recommendation during browsing; and
a processing module 303, configured to adjust the number of recommendation messages included in the first recommendation according to the feedback information, characteristic information of the target user, and characteristic information of the first recommendation.
Optionally, the device further includes a determining module 304, wherein:
the obtaining module 302 is further configured to obtain the characteristic information of the target user;
the determining module 304 is configured to determine a second recommendation using a first deep reinforcement learning DQN network and the characteristic information;
the determining module 304 is further configured to determine a third recommendation using a second DQN network and the characteristic information; and
the determining module 304 is further configured to determine the first recommendation for the target user from the second recommendation and the third recommendation.
Optionally, the processing module 303 is specifically configured to:
obtain contextual information corresponding to the feedback information;
input the feedback information, the characteristic information of the target user, the characteristic information of the first recommendation, and the contextual information into the first DQN network and the second DQN network respectively, to obtain a predicted reward value for the first recommendation; and
adjust the number of recommendation messages included in the first recommendation according to the predicted reward value.
Optionally, the obtaining module 302 is specifically configured to:
obtain, during browsing, the target user's first feedback information on the recommendation messages in the first recommendation that come from the second recommendation, and second feedback information on the recommendation messages in the first recommendation that come from the third recommendation; and
take the first feedback information and the second feedback information as the target user's feedback information on the first recommendation during browsing.
Optionally, the determining module 304 is further configured to determine the recommendation effect of the first DQN network and the recommendation effect of the second DQN network from the first feedback information and the second feedback information;
the processing module 303 is further configured to adjust the parameters of the first DQN network according to the parameters of the second DQN network when the recommendation effect of the second DQN network is better than that of the first DQN network.
Optionally, the determining module 304 is specifically configured to:
select a first recommendation message from the recommendation messages included in the second recommendation;
select a second recommendation message from the recommendation messages included in the third recommendation; and
take the first recommendation message and the second recommendation message as the first recommendation for the target user.
Optionally, the feedback information includes one or more of number of clicks, access depth, access duration, and refresh frequency.
It is understood that the functions of the functional modules of the content recommendation device of this embodiment can be implemented according to the method in the above method embodiment; for the specific implementation process, refer to the related description of the method embodiment, which is not repeated here.
In the embodiments of the present invention, the display module 301 displays the first recommendation to the target user through the browser interface of the user terminal, the obtaining module 302 obtains the target user's feedback information on the first recommendation during browsing, and the processing module 303 adjusts the number of recommendation messages included in the first recommendation according to the feedback information, the characteristic information of the target user, and the characteristic information of the first recommendation. The number of recommendation messages a recommendation includes can therefore be adjusted dynamically, accurately, and adaptively based on factors such as user characteristics, recommendation characteristics, and real-time feedback, so that different users receive different numbers of recommendation messages and platform revenue can be maximized without degrading the user experience.
Referring to Fig. 4, which is a schematic structural diagram of a server provided by an embodiment of the present invention, the server described in this embodiment includes a processor 401, a network interface 402 and a memory 403. The processor 401, the network interface 402 and the memory 403 may be connected by a bus or in other ways; the embodiments of the present invention take connection by a bus as an example.
The processor 401 (or central processing unit (Central Processing Unit, CPU)) is the computing core and control core of the server. The network interface 402 may optionally include standard wired and wireless interfaces (such as Wi-Fi and mobile communication interfaces) and sends and receives data under the control of the processor 401. The memory 403 (Memory) is the storage device of the server and stores programs and data. It is understood that the memory 403 may be a high-speed RAM memory or a non-volatile memory, for example at least one disk memory; optionally, it may also be at least one storage device located remotely from the processor 401. The memory 403 provides storage space that holds the server's operating system and executable program code; the operating system may include, but is not limited to, Windows, Linux, and so on, which the present invention does not limit.
In the embodiment of the present invention, the processor 401 performs the following operations by running the executable program code in the memory 403:
Displaying a first recommendation to a target user through the browser interface of a user terminal, the first recommendation including at least one recommendation message;
Obtaining feedback information of the target user on the first recommendation during browsing;
Adjusting the quantity of recommendation messages included in the first recommendation according to the feedback information, the characteristic information of the target user and the characteristic information of the first recommendation.
Optionally, before displaying the first recommendation to the target user through the browser interface of the user terminal, the processor 401 is further configured to:
Obtain the characteristic information of the target user;
Determine a second recommendation using a first deep reinforcement learning DQN network and the characteristic information;
Determine a third recommendation using a second DQN network and the characteristic information;
Determine the first recommendation of the target user according to the second recommendation and the third recommendation.
Optionally, the processor 401 adjusts the quantity of recommendation messages included in the first recommendation according to the feedback information, the characteristic information of the target user and the characteristic information of the first recommendation specifically by:
Obtaining the contextual information corresponding to the feedback information;
Inputting the feedback information, the characteristic information of the target user, the characteristic information of the first recommendation and the contextual information into the first DQN network and the second DQN network respectively, to obtain a prediction reward value of the first recommendation;
Adjusting the quantity of recommendation messages included in the first recommendation according to the prediction reward value.
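The adjustment step above can be illustrated with a minimal Python sketch. It is not the implementation of this embodiment: the source only states that the quantity is adjusted according to the prediction reward value. Averaging the outputs of the two networks, the thresholds `raise_above` and `lower_below`, the step of one message and the bounds `min_count` and `max_count` are all assumptions introduced here for illustration.

```python
def adjust_message_count(current_count, reward_first, reward_second,
                         raise_above=0.6, lower_below=0.3,
                         min_count=1, max_count=10):
    """Return a new message count from two predicted reward values.

    Every threshold and bound is hypothetical; the source does not specify them.
    """
    # Assumption: combine the two DQN outputs with a plain average.
    predicted_reward = (reward_first + reward_second) / 2.0

    if predicted_reward > raise_above:
        # High predicted reward: show one more recommendation message.
        new_count = current_count + 1
    elif predicted_reward < lower_below:
        # Low predicted reward: show one fewer recommendation message.
        new_count = current_count - 1
    else:
        new_count = current_count

    # Keep the count inside the allowed range.
    return max(min_count, min(max_count, new_count))
```

Clamping the result keeps at least one recommendation message in the first recommendation, which matches the requirement that it includes at least one message.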
Optionally, the processor 401 obtains the feedback information of the target user on the first recommendation during browsing specifically by:
Obtaining first feedback information of the target user, during browsing, on the recommendation messages in the first recommendation that come from the second recommendation, and second feedback information on the recommendation messages in the first recommendation that come from the third recommendation;
Taking the first feedback information and the second feedback information as the feedback information of the target user on the first recommendation during browsing.
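One way to picture the split into first and second feedback information is the following Python sketch. It assumes that each displayed message carries a tag naming the candidate recommendation it came from and that feedback is a per-message click count; the `Message` class, the `source` tag values and the `split_feedback` helper are hypothetical names that do not appear in the source.

```python
from dataclasses import dataclass


@dataclass
class Message:
    message_id: str
    source: str  # hypothetical tag: "second" or "third" recommendation


def split_feedback(shown, clicks_by_id):
    """Split per-message click counts by the recommendation each message came from."""
    first_feedback, second_feedback = {}, {}
    for msg in shown:
        # Messages without recorded clicks count as zero clicks.
        clicks = clicks_by_id.get(msg.message_id, 0)
        if msg.source == "second":
            first_feedback[msg.message_id] = clicks
        else:
            second_feedback[msg.message_id] = clicks
    return first_feedback, second_feedback
```

The two returned dictionaries together form the feedback information on the first recommendation, while keeping the contribution of each network separable.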
Optionally, the processor 401 is further configured to:
Determine the recommendation effect of the first DQN network and the recommendation effect of the second DQN network according to the first feedback information and the second feedback information;
Adjust the parameters of the first DQN network according to the parameters of the second DQN network in the case where the recommendation effect of the second DQN network is better than the recommendation effect of the first DQN network.
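The comparison and parameter adjustment described above might look like the following Python sketch. The source does not define how the recommendation effect is measured or how the parameters are adjusted; here the effect is assumed to be the average number of clicks per displayed message, and the adjustment is assumed to move each parameter of the first network a fraction `step` of the way toward the second network. Both choices, and the function names, are illustrative assumptions.

```python
def recommendation_effect(feedback):
    """Average clicks per displayed message; 0.0 when nothing was displayed."""
    return sum(feedback.values()) / len(feedback) if feedback else 0.0


def update_first_network(first_params, second_params,
                         first_feedback, second_feedback, step=0.1):
    """Return updated parameters for the first network (plain lists of floats)."""
    effect_first = recommendation_effect(first_feedback)
    effect_second = recommendation_effect(second_feedback)

    # Only adjust when the second network performed strictly better.
    if effect_second <= effect_first:
        return list(first_params)

    # Assumption: move each parameter a fraction `step` toward the second network.
    return [w + step * (w2 - w) for w, w2 in zip(first_params, second_params)]
```

With `step=1.0` the first network would simply copy the second; a smaller step makes the first network change more gradually.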
Optionally, the processor 401 determines the first recommendation of the target user according to the second recommendation and the third recommendation specifically by:
Selecting a first recommendation message from the recommendation messages included in the second recommendation;
Selecting a second recommendation message from the recommendation messages included in the third recommendation;
Taking the first recommendation message and the second recommendation message as the first recommendation of the target user.
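The selection and combination steps above can be sketched in Python as follows. The source does not say how many messages are selected from each candidate recommendation or how duplicates are handled; taking the leading `n_from_second` and `n_from_third` messages and skipping messages already selected are assumptions made here, and `merge_recommendations` is a hypothetical helper name.

```python
def merge_recommendations(second_rec, third_rec, n_from_second=2, n_from_third=1):
    """Build the first recommendation from two ranked candidate lists."""
    # Assumption: each list is ranked, so the leading messages are selected.
    picked_first = second_rec[:n_from_second]

    # Skip messages that were already selected from the second recommendation.
    picked_second = [m for m in third_rec if m not in picked_first][:n_from_third]

    return picked_first + picked_second
```

For example, with candidates `["a", "b", "c"]` and `["b", "d", "e"]` the defaults yield `["a", "b", "d"]`, since `"b"` is not selected twice.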
Optionally, the feedback information includes one or more of number of clicks, access depth, access duration and number of refreshes.
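Because the feedback information may contain any subset of these four signals, one possible representation is a record with optional fields, sketched below in Python. The class name, field names, the unit of seconds for access duration and the choice of 0.0 for absent signals are assumptions for illustration, not details given in the source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FeedbackInfo:
    clicks: Optional[int] = None                # number of clicks
    access_depth: Optional[int] = None          # access depth
    access_duration_s: Optional[float] = None   # access duration, assumed in seconds
    refreshes: Optional[int] = None             # number of refreshes

    def as_features(self):
        """Return a fixed-length feature list, using 0.0 for absent signals."""
        values = (self.clicks, self.access_depth,
                  self.access_duration_s, self.refreshes)
        return [0.0 if v is None else float(v) for v in values]
```

A fixed-length list is convenient when the feedback is passed to a network as input, since the input size does not depend on which signals were collected.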
In a specific implementation, the processor 401, the network interface 402 and the memory 403 described in the embodiment of the present invention can execute the implementation described in the process of the content recommendation method provided in the embodiment of the present invention, and can also execute the implementation described in the content recommendation device provided in the embodiment of the present invention, which is not repeated here.
In the embodiment of the present invention, the processor 401 displays the first recommendation to the target user through the browser interface of the user terminal, obtains the feedback information of the target user on the first recommendation during browsing, and can adjust the quantity of recommendation messages included in the first recommendation according to the feedback information, the characteristic information of the target user and the characteristic information of the first recommendation. The quantity of recommendation messages can thus be adjusted dynamically, accurately and adaptively according to factors such as user characteristics, recommendation characteristics and real-time feedback, so that different users receive different quantities of recommendation messages and the revenue of the platform can be maximized without affecting user experience.
Those of ordinary skill in the art will appreciate that all or part of the processes in the above embodiment methods can be completed by a computer program instructing relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. The storage medium may be a magnetic disk, an optical disc, a read-only memory (Read-Only Memory, ROM) or a random access memory (Random Access Memory, RAM), etc.
What is disclosed above is only a preferred embodiment of the present invention and certainly cannot be used to limit the scope of rights of the present invention. Those skilled in the art can understand all or part of the processes for realizing the above embodiment, and equivalent changes made according to the claims of the present invention still fall within the scope covered by the invention.
Claims (10)
1. A content recommendation method, characterized in that the method comprises:
Displaying a first recommendation to a target user through the browser interface of a user terminal, the first recommendation including at least one recommendation message;
Obtaining feedback information of the target user on the first recommendation during browsing;
Adjusting the quantity of recommendation messages included in the first recommendation according to the feedback information, the characteristic information of the target user and the characteristic information of the first recommendation.
2. The method according to claim 1, characterized in that, before the displaying of the first recommendation to the target user through the browser interface of the user terminal, the method further comprises:
Obtaining the characteristic information of the target user;
Determining a second recommendation using a first deep reinforcement learning DQN network and the characteristic information;
Determining a third recommendation using a second DQN network and the characteristic information;
Determining the first recommendation of the target user according to the second recommendation and the third recommendation.
3. The method according to claim 2, characterized in that the adjusting of the quantity of recommendation messages included in the first recommendation according to the feedback information, the characteristic information of the target user and the characteristic information of the first recommendation comprises:
Obtaining the contextual information corresponding to the feedback information;
Inputting the feedback information, the characteristic information of the target user, the characteristic information of the first recommendation and the contextual information into the first DQN network and the second DQN network respectively, to obtain a prediction reward value of the first recommendation;
Adjusting the quantity of recommendation messages included in the first recommendation according to the prediction reward value.
4. The method according to claim 2 or 3, characterized in that the obtaining of the feedback information of the target user on the first recommendation during browsing comprises:
Obtaining first feedback information of the target user, during browsing, on the recommendation messages in the first recommendation that come from the second recommendation, and second feedback information on the recommendation messages in the first recommendation that come from the third recommendation;
Taking the first feedback information and the second feedback information as the feedback information of the target user on the first recommendation during browsing.
5. The method according to claim 4, characterized in that the method further comprises:
Determining the recommendation effect of the first DQN network and the recommendation effect of the second DQN network according to the first feedback information and the second feedback information;
Adjusting the parameters of the first DQN network according to the parameters of the second DQN network in the case where the recommendation effect of the second DQN network is better than the recommendation effect of the first DQN network.
6. The method according to any one of claims 2 to 5, characterized in that the determining of the first recommendation of the target user according to the second recommendation and the third recommendation comprises:
Selecting a first recommendation message from the recommendation messages included in the second recommendation;
Selecting a second recommendation message from the recommendation messages included in the third recommendation;
Taking the first recommendation message and the second recommendation message as the first recommendation of the target user.
7. The method according to any one of claims 1 to 6, characterized in that the feedback information includes one or more of number of clicks, access depth, access duration and number of refreshes.
8. A content recommendation device, characterized by comprising:
A display module, configured to display a first recommendation to a target user through the browser interface of a user terminal, the first recommendation including at least one recommendation message;
An obtaining module, configured to obtain feedback information of the target user on the first recommendation during browsing;
A processing module, configured to adjust the quantity of recommendation messages included in the first recommendation according to the feedback information, the characteristic information of the target user and the characteristic information of the first recommendation.
9. A server, characterized by comprising a processor, a network interface and a storage device, the processor, the network interface and the storage device being connected to each other, wherein the network interface sends and receives data under the control of the processor, the storage device is used to store a computer program, the computer program includes program instructions, and the processor is configured to call the program instructions to execute the content recommendation method according to any one of claims 1 to 7.
10. A computer storage medium, characterized in that program instructions are stored in the computer storage medium, and the program instructions, when executed, are used to execute the content recommendation method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910390223.6A CN110111152A (en) | 2019-05-10 | 2019-05-10 | A kind of content recommendation method, device and server |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910390223.6A CN110111152A (en) | 2019-05-10 | 2019-05-10 | A kind of content recommendation method, device and server |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110111152A (en) | 2019-08-09 |
Family
ID=67489354
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910390223.6A Pending CN110111152A (en) | 2019-05-10 | 2019-05-10 | A kind of content recommendation method, device and server |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110111152A (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110941760A (en) * | 2019-11-27 | 2020-03-31 | 深圳市看见网络科技有限公司 | User evaluation method, system, server and storage medium based on information recommendation |
CN111292122A (en) * | 2020-01-16 | 2020-06-16 | 支付宝(杭州)信息技术有限公司 | Method and apparatus for facilitating user to perform target behavior for target object |
CN111523050A (en) * | 2020-04-16 | 2020-08-11 | 咪咕文化科技有限公司 | Content recommendation method, server and storage medium |
CN111626776A (en) * | 2020-05-26 | 2020-09-04 | 创新奇智(西安)科技有限公司 | Method for training strategy model, method and device for determining advertisement putting strategy |
CN112528131A (en) * | 2019-09-18 | 2021-03-19 | 北京达佳互联信息技术有限公司 | Aggregated page recommendation method and device, electronic equipment and storage medium |
CN112559880A (en) * | 2020-12-24 | 2021-03-26 | 百果园技术(新加坡)有限公司 | Information recommendation management method, system, equipment and storage medium |
CN114071119A (en) * | 2020-07-31 | 2022-02-18 | 北京达佳互联信息技术有限公司 | Resource testing method and device, electronic equipment and storage medium |
WO2022105780A1 (en) * | 2020-11-23 | 2022-05-27 | 中兴通讯股份有限公司 | Recommendation method and apparatus, electronic device, and storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104717616A (en) * | 2013-12-13 | 2015-06-17 | 腾讯科技(深圳)有限公司 | Push message management method and device |
CN107833077A (en) * | 2017-11-29 | 2018-03-23 | 努比亚技术有限公司 | Advertisement insertion and mobile terminal |
CN109062919A (en) * | 2018-05-31 | 2018-12-21 | 腾讯科技(深圳)有限公司 | A kind of content recommendation method and device based on deeply study |
Non-Patent Citations (1)
Title |
---|
GUANJIE ZHENG et al.: "DRN: A Deep Reinforcement Learning Framework for News Recommendation", WWW '18: Proceedings of the 2018 World Wide Web Conference * |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112528131A (en) * | 2019-09-18 | 2021-03-19 | 北京达佳互联信息技术有限公司 | Aggregated page recommendation method and device, electronic equipment and storage medium |
CN110941760A (en) * | 2019-11-27 | 2020-03-31 | 深圳市看见网络科技有限公司 | User evaluation method, system, server and storage medium based on information recommendation |
CN111292122A (en) * | 2020-01-16 | 2020-06-16 | 支付宝(杭州)信息技术有限公司 | Method and apparatus for facilitating user to perform target behavior for target object |
CN111523050A (en) * | 2020-04-16 | 2020-08-11 | 咪咕文化科技有限公司 | Content recommendation method, server and storage medium |
CN111523050B (en) * | 2020-04-16 | 2023-09-19 | 咪咕文化科技有限公司 | Content recommendation method, server and storage medium |
CN111626776A (en) * | 2020-05-26 | 2020-09-04 | 创新奇智(西安)科技有限公司 | Method for training strategy model, method and device for determining advertisement putting strategy |
CN111626776B (en) * | 2020-05-26 | 2024-03-08 | 创新奇智(西安)科技有限公司 | Method for training strategy model, method and device for determining advertisement putting strategy |
CN114071119A (en) * | 2020-07-31 | 2022-02-18 | 北京达佳互联信息技术有限公司 | Resource testing method and device, electronic equipment and storage medium |
CN114071119B (en) * | 2020-07-31 | 2024-03-19 | 北京达佳互联信息技术有限公司 | Resource testing method and device, electronic equipment and storage medium |
WO2022105780A1 (en) * | 2020-11-23 | 2022-05-27 | 中兴通讯股份有限公司 | Recommendation method and apparatus, electronic device, and storage medium |
CN112559880A (en) * | 2020-12-24 | 2021-03-26 | 百果园技术(新加坡)有限公司 | Information recommendation management method, system, equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110111152A (en) | A kind of content recommendation method, device and server | |
US11775843B2 (en) | Directed trajectories through communication decision tree using iterative artificial intelligence | |
CN110941740B (en) | Video recommendation method and computer-readable storage medium | |
US8924318B2 (en) | Online asynchronous reinforcement learning from concurrent customer histories | |
US9367820B2 (en) | Online temporal difference learning from incomplete customer interaction histories | |
US11042898B2 (en) | Clickstream purchase prediction using Hidden Markov Models | |
US10509837B2 (en) | Modeling actions for entity-centric search | |
CN102262661B (en) | Web page access forecasting method based on k-order hybrid Markov model | |
CN107590243B (en) | The personalized service recommendation method to be sorted based on random walk and diversity figure | |
KR20200058530A (en) | Methods and systems for constructing communication decision trees based on connected positionable elements on the canvas | |
TWI617927B (en) | Method and device for collecting and transmitting user behavior information | |
US20220245141A1 (en) | Interactive search experience using machine learning | |
US20230153857A1 (en) | Recommendation model training method, recommendation method, apparatus, and computer-readable medium | |
RU2720954C1 (en) | Search index construction method and system using machine learning algorithm | |
WO2014105622A2 (en) | Selecting an advertisement for a traffic source | |
US10402465B1 (en) | Content authority ranking using browsing behavior | |
JP7350590B2 (en) | Using iterative artificial intelligence to specify the direction of a path through a communication decision tree | |
CN113065882A (en) | Commodity processing method and device and electronic equipment | |
CN109075987A (en) | Optimize digital assembly analysis system | |
US10146876B2 (en) | Predicting real-time change in organic search ranking of a website | |
Dong et al. | Improving sequential recommendation with attribute-augmented graph neural networks | |
US11175807B1 (en) | Intelligent contextual video thumbnail display | |
Wan et al. | A Contextual Multi-armed Bandit Approach Based on Implicit Feedback for Online Recommendation | |
Micchi et al. | A new optimization layer for real-time bidding advertising campaigns | |
WO2013059517A1 (en) | Online temporal difference learning from incomplete customer interaction histories |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
TA01 | Transfer of patent application right |
Effective date of registration: 20221118 Address after: 1402, Floor 14, Block A, Haina Baichuan Headquarters Building, No. 6, Baoxing Road, Haibin Community, Xin'an Street, Bao'an District, Shenzhen, Guangdong 518000 Applicant after: Shenzhen Yayue Technology Co.,Ltd. Address before: 518057 Tencent Building, No. 1 High-tech Zone, Nanshan District, Shenzhen City, Guangdong Province, 35 floors Applicant before: TENCENT TECHNOLOGY (SHENZHEN) Co.,Ltd. |