CN112528131A - Aggregated page recommendation method and device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN112528131A
Authority
CN
China
Prior art keywords
target
target account
information
feedback information
page
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910881954.0A
Other languages
Chinese (zh)
Inventor
王天驹
叶璨
杨乃君
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd
Priority to CN201910881954.0A
Publication of CN112528131A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The disclosure relates to an aggregated page recommendation method and apparatus, an electronic device, and a storage medium. The method comprises: acquiring current first state information of a target account using an application program, wherein the first state information comprises behavior characteristic information of the target account, and the behavior characteristic information is determined based on historical operations performed by the target account on historically recommended aggregation pages; determining, according to the first state information, first feedback information and second feedback information, wherein the first feedback information is feedback information on operations performed by the target account on the application program after a target aggregation page is recommended to the target account, and the second feedback information is feedback information on operations performed by the target account on the application program when the target aggregation page is not recommended to the target account; and determining whether to recommend the target aggregation page to the target account based on the first feedback information and the second feedback information. This processing can improve the flexibility of aggregated page recommendation.

Description

Aggregated page recommendation method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of internet technologies, and in particular, to a method and an apparatus for recommending an aggregated page, an electronic device, and a storage medium.
Background
With the rapid development of internet technology, an account can browse information of interest through a client. To make browsing easier, a server may recommend information to the account in the form of an aggregated page, which may be a collection of information of the same type, e.g., a game aggregated page, a music aggregated page, or a short video aggregated page.
In one manner, for a game aggregation page, whether the account is interested in the game aggregation page may be determined according to historical information of account browsing, if it is determined that the account is interested in the game aggregation page, the game aggregation page may be periodically recommended to the account, and if it is determined that the account is not interested in the game aggregation page, the game aggregation page may not be recommended to the account.
It can be seen that, in the related art, once an account is determined to be interested in a certain aggregation page, that page is recommended to the account periodically, and once an account is determined not to be interested in a certain aggregation page, that page is never recommended to the account. As a result, the flexibility of aggregation page recommendation is low.
Disclosure of Invention
In order to overcome the problems in the related art, the present disclosure provides a method and an apparatus for recommending an aggregated page, an electronic device, and a storage medium, which can improve the flexibility of recommending an aggregated page.
According to a first aspect of the embodiments of the present disclosure, there is provided an aggregated page recommendation method, the method including:
acquiring current first state information of a target account using an application program, wherein the first state information comprises behavior characteristic information of the target account, the behavior characteristic information is determined based on historical operations performed by the target account on a historically recommended aggregated page, and the aggregated page is used for displaying multiple pieces of information of the same type through the same page;
determining, according to the first state information, first feedback information and second feedback information, wherein the first feedback information is feedback information on operations performed by the target account on the application program after a target aggregation page is recommended to the target account, and the second feedback information is feedback information on operations performed by the target account on the application program when the target aggregation page is not recommended to the target account;
determining whether to recommend the target aggregation page to the target account based on the first feedback information and the second feedback information.
Optionally, the determining whether to recommend the target aggregation page to the target account based on the first feedback information and the second feedback information includes:
converting the first feedback information and the second feedback information into operation feedback values, respectively;
recommending the target aggregation page to the target account if the operation feedback value corresponding to the first feedback information is larger than the operation feedback value corresponding to the second feedback information;
and if the operation feedback value corresponding to the first feedback information is not greater than the operation feedback value corresponding to the second feedback information, not recommending the target aggregation page to the target account.
Optionally, the converting the first feedback information and the second feedback information into operation feedback values respectively includes:
determining an operation feedback value corresponding to the first feedback information according to the time length of the target account browsing the target aggregation page and/or the probability of the target account browsing the target aggregation page, which are contained in the first feedback information;
and determining an operation feedback value corresponding to the second feedback information according to the time length of the target account browsing the target aggregation page and/or the probability of the target account browsing the target aggregation page, which are contained in the second feedback information.
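A minimal sketch of such a conversion, assuming a simple rule that the disclosure does not specify: when both a browsing duration and a browsing probability are available, their product (an expected browsing time) is used, and otherwise whichever quantity is present is used. The function name `operation_feedback_value` and its parameters are hypothetical.

```python
from typing import Optional


def operation_feedback_value(
    browse_seconds: Optional[float] = None,
    browse_probability: Optional[float] = None,
) -> float:
    """Collapse feedback information into one scalar (assumed rule, see lead-in)."""
    if browse_seconds is None and browse_probability is None:
        raise ValueError("feedback must contain a duration and/or a probability")
    if browse_seconds is not None and browse_seconds < 0:
        raise ValueError("browse_seconds must be non-negative")
    if browse_probability is not None and not 0.0 <= browse_probability <= 1.0:
        raise ValueError("browse_probability must lie in [0, 1]")

    if browse_seconds is None:
        return browse_probability      # probability only
    if browse_probability is None:
        return browse_seconds          # duration only
    return browse_probability * browse_seconds  # expected browsing time
```

The two operation feedback values are only comparable when both are computed from the same kind of input, so a caller would apply the same branch to the first and the second feedback information.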
Optionally, before the obtaining the current first state information of the target account using the application program, the method further includes:
counting the number of times of recommending a target aggregation page to the target account within a preset historical time period, the number of times of browsing the target aggregation page by the target account within the preset historical time period, the time of browsing the target aggregation page by the target account within the preset historical time period, and the total time of browsing the target aggregation page by the target account within the preset historical time period;
and taking the information obtained by statistics as the behavior characteristic information of the target account.
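The statistics above could be gathered along the following lines. This is a sketch under stated assumptions: the `RecommendationEvent` record, its field names, and the half-open time window are hypothetical, and the mean browsing duration is only one possible reading of the per-browse "time of browsing" item.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class RecommendationEvent:
    """One historical recommendation of the target aggregation page (hypothetical record)."""
    timestamp: float       # when the page was recommended, in seconds
    browsed: bool          # True if the account opened the page, False if it ignored it
    browse_seconds: float  # how long the account stayed on the page (0 if ignored)


def behavior_features(
    events: Iterable[RecommendationEvent],
    window_start: float,
    window_end: float,
) -> Dict[str, float]:
    """Count recommendations and browses inside [window_start, window_end)."""
    in_window = [e for e in events if window_start <= e.timestamp < window_end]
    browsed = [e for e in in_window if e.browsed]
    total_seconds = sum(e.browse_seconds for e in browsed)
    return {
        "recommend_count": float(len(in_window)),
        "browse_count": float(len(browsed)),
        "total_browse_seconds": total_seconds,
        "mean_browse_seconds": total_seconds / len(browsed) if browsed else 0.0,
    }
```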
Optionally, before the obtaining the current first state information of the target account using the application program, the method further includes:
acquiring account characteristic information of the target account;
obtaining environment information of an environment where the target account is located, wherein the environment information includes at least one of: device information of a device used by the target account, network information of a network accessed by the device, and a current time;
and taking the behavior characteristic information, the account characteristic information and the environment information of the environment of the target account as the current first state information of the target account.
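One way to combine the three groups of information into a single state is to flatten them into a numeric vector. The sketch below assumes each group has already been encoded as a mapping from feature name to number; the function name `build_state` and the example feature names are hypothetical.

```python
from typing import List, Mapping


def build_state(
    behavior: Mapping[str, float],
    account: Mapping[str, float],
    environment: Mapping[str, float],
) -> List[float]:
    """Concatenate the three feature groups in a fixed, reproducible order."""
    state: List[float] = []
    for group in (behavior, account, environment):
        # Sorting the keys keeps each feature at the same position on every call.
        for name in sorted(group):
            state.append(float(group[name]))
    return state
```

For example, `build_state({"browse_count": 2, "recommend_count": 3}, {"age_bucket": 1}, {"hour": 21, "is_wifi": 1})` places the behavior features first, then the account features, then the environment features.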
According to a second aspect of the embodiments of the present disclosure, there is provided an aggregated page recommendation apparatus, the apparatus including:
the acquisition module is configured to execute acquisition of current first state information of a target account using an application program, wherein the first state information comprises behavior characteristic information of the target account, the behavior characteristic information is determined based on historical operations performed by the target account on a historically recommended aggregated page, and the aggregated page is used for displaying multiple pieces of information of the same type through the same page;
a first determining module configured to determine, according to the first state information, first feedback information and second feedback information, wherein the first feedback information is feedback information on operations performed by the target account on the application program after a target aggregation page is recommended to the target account, and the second feedback information is feedback information on operations performed by the target account on the application program when the target aggregation page is not recommended to the target account;
a second determination module configured to perform a determination whether to recommend the target aggregated page to the target account based on the first feedback information and the second feedback information.
Optionally, the second determining module is configured to perform conversion of the first feedback information and the second feedback information into operation feedback values, respectively;
recommending the target aggregation page to the target account if the operation feedback value corresponding to the first feedback information is larger than the operation feedback value corresponding to the second feedback information;
and if the operation feedback value corresponding to the first feedback information is not greater than the operation feedback value corresponding to the second feedback information, not recommending the target aggregation page to the target account.
Optionally, the second determining module is configured to determine, according to a duration that the target account browses the target aggregation page, and/or a probability that the target account browses the target aggregation page, which is included in the first feedback information, an operation feedback value corresponding to the first feedback information;
and determining an operation feedback value corresponding to the second feedback information according to the time length of the target account browsing the target aggregation page and/or the probability of the target account browsing the target aggregation page, which are contained in the second feedback information.
Optionally, the apparatus further comprises:
the first processing module is configured to perform statistics on the number of times of recommending a target aggregation page to the target account within a preset historical time period, the number of times of browsing the target aggregation page by the target account within the preset historical time period, the time of browsing the target aggregation page by the target account within the preset historical time period, and the total time of browsing the target aggregation page by the target account within the preset historical time period;
and taking the information obtained by statistics as the behavior characteristic information of the target account.
Optionally, the apparatus further comprises:
a second processing module configured to perform obtaining account characteristic information of the target account;
obtaining environment information of an environment where the target account is located, wherein the environment information includes at least one of: device information of a device used by the target account, network information of a network accessed by the device, and a current time;
and taking the behavior characteristic information, the account characteristic information and the environment information of the environment of the target account as the current first state information of the target account.
According to a third aspect of the embodiments of the present disclosure, there is provided an electronic apparatus including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to implement the aggregated page recommendation method of the first aspect when executing the instructions stored in the memory.
According to a fourth aspect of embodiments of the present disclosure, there is provided a storage medium, wherein instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the aggregated page recommendation method of the first aspect.
According to a fifth aspect of embodiments of the present disclosure, there is provided a computer program product, wherein instructions of the computer program product, when executed by a processor of an electronic device, enable the electronic device to perform the aggregated page recommendation method of the first aspect.
The technical solutions provided by the embodiments of the present disclosure can have the following beneficial effects. Current first state information of a target account using an application program is acquired, wherein the first state information comprises behavior characteristic information of the target account, and the behavior characteristic information is determined based on historical operations performed by the target account on historically recommended aggregation pages. According to the first state information, first feedback information and second feedback information are determined, wherein the first feedback information is feedback information on operations performed by the target account on the application program after a target aggregation page is recommended to the target account, and the second feedback information is feedback information on operations performed by the target account on the application program when the target aggregation page is not recommended to the target account. Whether to recommend the target aggregation page to the target account is then determined based on the first feedback information and the second feedback information.
Based on the above processing, whether to recommend the target aggregation page to the target account can be determined according to the feedback information of the target account. Because that feedback information depends on the state information of the target account, the strategy for recommending the target aggregation page can be adjusted as the state information changes, which improves the flexibility of aggregation page recommendation.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.
FIG. 1 is a flow diagram illustrating a method of aggregated page recommendation, according to an example embodiment.
FIG. 2 is a block diagram illustrating an aggregated page recommendation apparatus according to an exemplary embodiment.
FIG. 3 is a block diagram illustrating an electronic device in accordance with an exemplary embodiment.
Detailed Description
In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
In the related art, the server periodically recommends the target aggregation page to an account that has been determined to be interested in it. However, the content an account is interested in changes over time, while the server's policy for recommending the target aggregation page does not change accordingly. That is, in the related art, the flexibility of recommending the target aggregation page is low.
In order to solve the above problem, an embodiment of the present disclosure provides an aggregated page recommendation method, which may be applied to a server.
The server may obtain current state information of a target account using the application program (i.e., the first state information in the embodiments of the present disclosure). The server may then determine, according to the first state information, feedback information on the operations the target account performs on the application program after a target aggregation page is recommended to the target account (i.e., the first feedback information in the embodiments of the present disclosure), and feedback information on the operations the target account performs on the application program when the target aggregation page is not recommended to the target account (i.e., the second feedback information in the embodiments of the present disclosure). The server may then determine whether to recommend the target aggregation page to the target account based on the first feedback information and the second feedback information.
Based on the above processing, the server can determine whether to recommend the target aggregation page to the target account according to the feedback information of the target account. Because that feedback information depends on the state information of the target account, the server can adjust the strategy for recommending the target aggregation page as the state information changes, which improves the flexibility of aggregation page recommendation.
Referring to fig. 1, fig. 1 is a flowchart illustrating an aggregated page recommendation method, which may be applied to a server, according to an exemplary embodiment, and which may include the steps of:
S101: Acquire current first state information of a target account using an application program.
The first state information may include behavior characteristic information of the target account, and an aggregated page is used for displaying multiple pieces of information of the same type through the same page. The application program may be a news application, a short video application, or a shopping application.
The behavior characteristic information of the target account can be determined based on historical operations performed by the target account on the historically recommended aggregated pages, and the historical operations can be the target account browsing the recommended aggregated pages or the target account ignoring the recommended aggregated pages. The behavior characteristic information of the target account can reflect the browsing habit of the target account for the recommended aggregated page.
In one embodiment, during the process of using the application program by the target account, the server may periodically acquire the state information of the target account (i.e., the first state information in the embodiment of the present disclosure) to determine whether the target aggregation page needs to be recommended to the target account currently according to the current first state information of the target account.
For example, if the application program is a short video application, the target aggregation page may be a game short video aggregation page, and for a target account, the server may periodically determine whether to recommend the game short video aggregation page to the target account.
S102: Determine, according to the first state information, first feedback information and second feedback information, wherein the first feedback information is feedback information on operations performed by the target account on the application program after the target aggregation page is recommended to the target account, and the second feedback information is feedback information on operations performed by the target account on the application program when the target aggregation page is not recommended to the target account.
Because the first state information includes behavior characteristic information of the target account, and the behavior characteristic information is determined based on historical operations performed by the target account on historically recommended aggregation pages, the server may predict, according to the first state information, the feedback information on the operations the target account performs on the application program after the target aggregation page is recommended to it (i.e., the first feedback information in the present disclosure) and the feedback information on the operations the target account performs on the application program when the target aggregation page is not recommended to it (i.e., the second feedback information in the present disclosure).
In one implementation, the server may predict the first feedback information and the second feedback information using a deep reinforcement learning network model, for example a Deep Q-Network (DQN).
The Markov Decision Process (MDP) is the most basic theoretical model for reinforcement learning. An MDP can be represented by a four-tuple (S, A, R, T). First, S is the state space and contains all environment states that the Agent can perceive. Second, A is the action space and contains all actions that the Agent can take in each state. Third, R is the reward function, where R(s, a, s') is the reward the Agent obtains from the environment when it executes action a in state s and transitions to state s'. Fourth, T is the state transition function of the environment, where T(s, a, s') is the probability that the Agent transitions to state s' after executing action a in state s.
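In standard notation, the four-tuple can be written as follows. The time index t is introduced here for illustration only, and the final identity simply states that the transition probabilities out of any state-action pair sum to one.

```latex
\mathcal{M} = (S, A, R, T), \qquad
R : S \times A \times S \to \mathbb{R}, \qquad
T(s, a, s') = \Pr\left(s_{t+1} = s' \mid s_t = s,\ a_t = a\right), \qquad
\sum_{s' \in S} T(s, a, s') = 1 .
```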
Based on the above principles, the process of recommending aggregated pages to an account can be abstracted into a markov decision process.
In practice, the Markov decision process can be implemented with the deep reinforcement learning network model. A DQN combines a Convolutional Neural Network (CNN) with Q-Learning: the CNN takes a state as input and outputs a value estimate for each action.
Correspondingly, in the scenario of recommending an aggregation page to an account, the server is the Agent in the deep reinforcement learning network model. An action is either that the server recommends the target aggregation page to the target account or that the server does not recommend the target aggregation page to the target account, so the action space may include two values, one indicating that the target aggregation page is recommended to the target account and the other indicating that it is not. The state space may include an identifier of the target aggregation page to be recommended, the current state information of the target account, and a value evaluation, i.e., the corresponding feedback information.
The reward function reflects whether the feedback from the target account is positive or negative (this feedback may be referred to as an immediate reward). If the target aggregation page is recommended to the target account, positive feedback includes the target account browsing the target aggregation page and the length of time for which it browses the page, and negative feedback includes the target account ignoring the target aggregation page, i.e., not browsing it.
For example, if the target account browses the target aggregation page for 2 minutes, the immediate reward may be 20; if the target account only clicks on the target aggregation page and returns immediately, the immediate reward may be 0; and if the target account ignores the target aggregation page, the immediate reward may be -5.
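The example rewards above can be reproduced with a small function. The three anchor values (20 for a 2-minute browse, 0 for an immediate return, -5 for ignoring the page) come from the example; the linear scaling between 0 and 2 minutes and the name `immediate_reward` are assumptions made only for illustration.

```python
IGNORED_REWARD = -5.0          # the account ignored the recommended page
REWARD_AT_TWO_MINUTES = 20.0   # reward for browsing for 120 seconds


def immediate_reward(browsed: bool, browse_seconds: float = 0.0) -> float:
    """Immediate reward for one recommendation of the target aggregation page."""
    if not browsed:
        return IGNORED_REWARD
    if browse_seconds < 0:
        raise ValueError("browse_seconds must be non-negative")
    # Assumed linear scaling: 0 seconds gives 0, 120 seconds gives 20.
    return REWARD_AT_TWO_MINUTES * browse_seconds / 120.0
```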
In addition, in the deep reinforcement learning network model, a value evaluation can be determined for each action executed by the Agent. The value evaluation can be the expected value of the overall return obtained by executing the action, where the overall return is the sum of the immediate rewards obtained by executing that action and the plurality of actions that follow it. The learning goal of the deep reinforcement learning network model is to maximize the value evaluation.
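Written out, the overall return and the value evaluation take the following form. The discount factor γ and the horizon K are assumptions that do not appear in the text above; setting γ = 1 recovers the plain sum of immediate rewards described there.

```latex
G_t = \sum_{k=0}^{K} \gamma^{k}\, r_{t+k}, \qquad 0 \le \gamma \le 1, \qquad
Q(s, a) = \mathbb{E}\left[\, G_t \mid s_t = s,\ a_t = a \,\right].
```

Here r_{t+k} is the immediate reward received k steps after the action a is executed in state s.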
For example, the server may obtain state information of a sample account at a historical time (which may be referred to as second state information) and third feedback information, where the third feedback information may include feedback information on the operations the sample account performed on the application program after the target aggregation page was recommended to it at the historical time, and feedback information on the operations the sample account performed on the application program when the target aggregation page was not recommended to it.
Then, the server may train the deep reinforcement learning network model by using the second state information and the identifier of the target aggregation page as input features of the deep reinforcement learning network model and using the third feedback information as output features of the deep reinforcement learning network model.
The server can train a deep reinforcement learning network model with a preset structure by means of experience replay, which mainly addresses the problems of sample correlation and non-stationary distribution. Each sample t_i that the server obtains by interacting with the environment is stored in a replay memory unit as (s_i, a_i, r_i, s_{i+1}), forming an experience pool D, where s_i represents the i-th environment state, a_i represents the action executed by the server in state s_i, r_i represents the immediate reward obtained by executing action a_i in state s_i, and s_{i+1} represents the environment state after action a_i is executed in state s_i.
When the server needs to train the deep reinforcement learning network model with the preset structure, the server can randomly select samples from the experience pool D as training samples, which reduces the correlation between the samples used in training.
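A minimal experience pool along these lines might look as follows. The class and field names are hypothetical, and the fixed capacity (oldest samples are dropped first) is an assumption; the text above does not say how large D is or how it is pruned.

```python
import random
from collections import deque
from typing import Deque, List, NamedTuple, Optional, Sequence


class Transition(NamedTuple):
    """One sample (s_i, a_i, r_i, s_{i+1}) obtained by interacting with the environment."""
    state: Sequence[float]
    action: int
    reward: float
    next_state: Sequence[float]


class ReplayBuffer:
    """Experience pool D with uniform random sampling."""

    def __init__(self, capacity: int, seed: Optional[int] = None) -> None:
        # deque(maxlen=...) silently discards the oldest sample once full.
        self._buffer: Deque[Transition] = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def push(self, transition: Transition) -> None:
        self._buffer.append(transition)

    def sample(self, batch_size: int) -> List[Transition]:
        # Sampling without replacement breaks the temporal order of the samples.
        return self._rng.sample(list(self._buffer), batch_size)

    def __len__(self) -> int:
        return len(self._buffer)
```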
The deep reinforcement learning network model may include a main network (MainNet) and a target network (TargetNet). MainNet is used to generate the current overall return, and TargetNet is used to generate the target overall return. During training, the gradient of the loss function of MainNet with respect to the model parameters can be calculated, and the model parameters of MainNet can be updated using a method such as stochastic gradient descent. After each iteration, the model parameters of TargetNet can be determined from the model parameters of MainNet. Training in this way minimizes the difference between the current overall return and the target overall return and yields the trained deep reinforcement learning network model.
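The interplay of the two networks can be sketched with a deliberately tiny stand-in: a linear value function with one weight vector per action, used here instead of a neural network so that the example stays self-contained. The squared-error loss, the discount factor `gamma`, the learning rate `lr`, and all function names are assumptions for illustration.

```python
import copy
from typing import List, Sequence

Weights = List[List[float]]  # one weight vector per action


def q_values(weights: Weights, state: Sequence[float]) -> List[float]:
    """Value estimate for every action in the given state."""
    return [sum(w * s for w, s in zip(row, state)) for row in weights]


def td_update(main: Weights, target: Weights, state: Sequence[float], action: int,
              reward: float, next_state: Sequence[float],
              gamma: float = 0.9, lr: float = 0.1) -> float:
    """One stochastic gradient step on MainNet; returns the loss before the step."""
    # Target overall return, computed with the frozen TargetNet parameters.
    target_return = reward + gamma * max(q_values(target, next_state))
    # Current overall return, computed with MainNet.
    error = target_return - q_values(main, state)[action]
    # Gradient of 0.5 * error**2 with respect to main[action] is -error * state.
    main[action] = [w + lr * error * s for w, s in zip(main[action], state)]
    return 0.5 * error ** 2


def sync_target(main: Weights) -> Weights:
    """Derive the TargetNet parameters from MainNet by copying them."""
    return copy.deepcopy(main)
```

Calling `sync_target` after every iteration matches the description above; many DQN implementations instead copy the parameters only every fixed number of steps.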
During training, when the value of the loss function of the deep reinforcement learning network model falls below a preset threshold, the difference between the current overall return and the target overall return is small. The server can then determine that a training stop condition has been reached and stop training, obtaining the trained deep reinforcement learning network model.
Alternatively, when the number of training iterations of the deep reinforcement learning network model reaches a preset number, the difference between the current overall return and the target overall return is taken to be small. The server can then determine that a training stop condition has been reached and stop training, obtaining the trained deep reinforcement learning network model.
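The two stop conditions can be checked in one place. In this sketch they are combined with a logical OR, which is an assumption, since the text presents them as alternatives; the default threshold and iteration limit are placeholder values, not values from the disclosure.

```python
def should_stop_training(loss: float, iteration: int,
                         loss_threshold: float = 1e-3,
                         max_iterations: int = 10_000) -> bool:
    """Return True once either stop condition is met."""
    loss_small_enough = loss < loss_threshold       # first condition: loss below threshold
    budget_exhausted = iteration >= max_iterations  # second condition: preset count reached
    return loss_small_enough or budget_exhausted
```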
After the training of the deep reinforcement learning network model is completed, the server may input the first state information and the identifier of the target aggregation page into the model and obtain its outputs: the feedback information on the operations the target account performs on the application program after the target aggregation page is recommended to it (i.e., the first feedback information), and the feedback information on the operations the target account performs on the application program when the target aggregation page is not recommended to it (i.e., the second feedback information).
S103: Determine whether to recommend the target aggregation page to the target account based on the first feedback information and the second feedback information.
So that the strategy for recommending the target aggregation page conforms to the operation habits of the target account, the server can determine whether to recommend the target aggregation page to the target account according to the first feedback information and the second feedback information.
Optionally, S103 may include the following steps:
Step one: converting the first feedback information and the second feedback information into respective operation feedback values.
In an embodiment, after determining the first feedback information and the second feedback information, the server may convert the first feedback information and the second feedback information into respective corresponding operation feedback values, and further, may determine whether to recommend the target aggregation page to the target account according to the respective corresponding operation feedback values.
Step two: if the operation feedback value corresponding to the first feedback information is greater than the operation feedback value corresponding to the second feedback information, recommending the target aggregation page to the target account; and if the operation feedback value corresponding to the first feedback information is not greater than the operation feedback value corresponding to the second feedback information, not recommending the target aggregation page to the target account.
In an embodiment, if the operation feedback value corresponding to the first feedback information is greater than the operation feedback value corresponding to the second feedback information, it indicates that the feedback caused by recommending the target aggregation page to the target account at this time is greater than the feedback caused by not recommending the target aggregation page to the target account, and therefore, in order to maximize the overall feedback, the server may recommend the target aggregation page to the target account at this time.
If the operation feedback value corresponding to the first feedback information is not greater than the operation feedback value corresponding to the second feedback information, it indicates that the feedback brought by recommending the target aggregation page to the target account at the moment is less than or equal to the feedback brought by not recommending the target aggregation page to the target account, so that in order to maximize the overall feedback, the server may not recommend the target aggregation page to the target account at the moment.
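The two-step decision in S103 can be sketched as follows. The callables `predict_feedback` and `to_value` are hypothetical placeholders for the trained model and for the conversion into an operation feedback value; their names and signatures are assumptions made for illustration.

```python
def decide_recommendation(state, page_id, predict_feedback, to_value):
    """Recommend only if recommending is predicted to yield strictly more feedback.

    predict_feedback(state, page_id, recommend) -> feedback information
    to_value(feedback) -> numeric operation feedback value
    Both callables are hypothetical stand-ins supplied by the caller.
    """
    first_feedback = predict_feedback(state, page_id, recommend=True)
    second_feedback = predict_feedback(state, page_id, recommend=False)
    # A tie counts as "not greater", so the page is not recommended.
    return to_value(first_feedback) > to_value(second_feedback)
```

The strict comparison matches the text: when the two operation feedback values are equal, the page is not recommended.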
Optionally, the feedback information may include a duration for the target account to browse the target aggregation page and/or a probability for the target account to browse the target aggregation page, and accordingly, the method for the server to convert the first feedback information and the second feedback information into the respective corresponding operation feedback values may include the following steps:
determining an operation feedback value corresponding to the first feedback information according to the duration of the target account browsing the target aggregation page and/or the probability of the target account browsing the target aggregation page, which are contained in the first feedback information; and determining an operation feedback value corresponding to the second feedback information according to the duration of the target account browsing the target aggregation page and/or the probability of the target account browsing the target aggregation page, which are contained in the second feedback information.
In one embodiment, the server may calculate the ratio of the number of times that the target account browses the target aggregation page within a preset time period to the total number of times that the target aggregation page is recommended to the target account within that period, and use this ratio as the probability that the target account browses the target aggregation page.
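The ratio described above can be sketched as below. The event layout (one `(timestamp, browsed)` pair per recommendation) and the choice to return 0.0 when no recommendation falls inside the period are assumptions, since the disclosure does not specify either.

```python
def browse_probability(events, now, window_seconds):
    """Share of recommendations within the preset period that were browsed.

    events: iterable of (timestamp, browsed) pairs, one per recommendation.
    """
    recent = [browsed for ts, browsed in events if now - window_seconds <= ts <= now]
    if not recent:
        return 0.0  # assumed convention when nothing was recommended
    return sum(recent) / len(recent)
```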
The duration and the probability of the target account browsing the target aggregation page can reflect the browsing habit of the target account for the target aggregation page, so that for the first feedback information and the second feedback information, the server can determine the corresponding operation feedback values according to the duration of the target account browsing the target aggregation page and/or the probability of the target account browsing the target aggregation page.
For example, the server may directly use the duration of browsing the target aggregation page by the target account included in the feedback information as the corresponding operation feedback value; or, the server may also directly use the probability that the target account browses the target aggregation page included in the feedback information as the corresponding operation feedback value; or, the server may calculate, according to the preset weight, a weighted sum of a duration of the target account browsing the target aggregation page and a probability of the target account browsing the target aggregation page, which are included in the feedback information, and use the weighted sum as a corresponding operation feedback value.
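The three conversion options in the preceding paragraph can be sketched as one function. The dictionary keys, the `mode` parameter, and the default weights are illustrative assumptions; the disclosure only states that preset weights are used for the weighted sum.

```python
def operation_feedback_value(feedback, mode="weighted",
                             w_duration=0.01, w_probability=1.0):
    """Convert feedback information into a single operation feedback value."""
    duration = feedback.get("browse_duration", 0.0)        # e.g. seconds
    probability = feedback.get("browse_probability", 0.0)  # in [0, 1]
    if mode == "duration":
        return duration
    if mode == "probability":
        return probability
    if mode == "weighted":
        return w_duration * duration + w_probability * probability
    raise ValueError(f"unknown mode: {mode}")
```

Because duration and probability are on different scales, the weights in the weighted mode also act as a normalization; the values shown are placeholders, not recommended settings.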
Optionally, to further improve the effectiveness of the aggregated page recommendation, before S101, the method may further include the following steps: counting the number of times the target aggregation page is recommended to the target account within a preset historical time period, the number of times the target account browses the target aggregation page within the preset historical time period, the times at which the target account browses the target aggregation page within the preset historical time period, and the total duration for which the target account browses the target aggregation page within the preset historical time period; and taking the information obtained through counting as the behavior characteristic information of the target account.
In an embodiment, when the server needs to determine whether to recommend a target aggregation page to a target account, the server may obtain the number of times that the target aggregation page is recommended to the target account in a recent history time period (i.e., a preset history time period in the embodiment of the present disclosure), the number of times that the target account browses the target aggregation page, the time that the target account browses the target aggregation page, and the total duration of time that the target account browses the target aggregation page, and further, the server may use the above information as behavior feature information of the target account, for predicting the first feedback information and the second feedback information.
It can be seen that the behavior feature information of the target account is continuously updated over time, which corresponds to different states in the state space of the deep reinforcement learning network model.
The preset historical time period may include a plurality of time periods, for example: the half hour before the current time, the hour before the current time, and the four hours before the current time.
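The statistics over several preset historical time periods can be sketched as follows. The `Recommendation` record, the window labels, and the simplification that each record carries a single timestamp (used both as the recommendation time and as the browsing time) are assumptions made for illustration.

```python
from dataclasses import dataclass

# Window lengths taken from the example above: 30 minutes, 1 hour, 4 hours.
WINDOWS_SECONDS = {"30min": 1800, "1h": 3600, "4h": 14400}


@dataclass
class Recommendation:
    timestamp: float       # when the page was recommended (and browsed, if at all)
    browsed: bool          # whether the account opened the page
    browse_seconds: float  # time spent on the page; 0 if not browsed


def behavior_features(history, now):
    """Per-window counts, browse times, and total browse duration."""
    features = {}
    for name, seconds in WINDOWS_SECONDS.items():
        recent = [r for r in history if now - seconds <= r.timestamp <= now]
        browsed = [r for r in recent if r.browsed]
        features[name] = {
            "recommend_count": len(recent),
            "browse_count": len(browsed),
            "browse_times": [r.timestamp for r in browsed],
            "total_browse_seconds": sum(r.browse_seconds for r in browsed),
        }
    return features
```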
Optionally, in order to further improve the effectiveness of the aggregated page recommendation, before S101, the method may further include the following steps: obtaining account characteristic information of the target account and environment information of the environment where the target account is located, and taking the behavior characteristic information, the account characteristic information, and the environment information of the environment where the target account is located as the current first state information of the target account.
The account characteristic information of the target account includes at least one of: age of the target account, gender of the target account, occupation of the target account, physical location of the target account.
The account characteristic information can reflect the account characteristics of the target account. Therefore, determining whether to recommend the target aggregation page in combination with the account characteristic information of the target account can better reflect the needs of the target account, and the effectiveness of the aggregated page recommendation can be further improved.
The context information may include at least one of: device information of a device used by the target account, network information of a network to which the device has access, and a current time.
The device information may include the model of the device. The network information of the network accessed by the device may include the type of the network, for example, a Wi-Fi network, a 3G network, or a 4G network. The current time may include the current time of day and the current date.
The environment information can reflect the characteristics of the environment where the target account is currently located. Therefore, determining whether to recommend the target aggregation page in combination with the environment information can better reflect the needs of the target account, and the effectiveness of the aggregated page recommendation can be further improved.
In one embodiment, when the server needs to determine whether to recommend a target aggregation page to a target account, the server may obtain current behavior feature information, account feature information, and environment information of an environment where the target account is located, and use the information as current first state information of the target account to predict the first feedback information and the second feedback information, that is, to use the information as a state space in the deep reinforcement learning network model.
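Combining the three kinds of information into the first state information might look like the sketch below. The field names, the choice of features, the one-hot encoding of the network type, and the scaling of the hour are all assumptions; the disclosure does not prescribe how the state is encoded for the model.

```python
NETWORK_TYPES = ("wifi", "3g", "4g")  # network types mentioned in the text


def encode_first_state(behavior, account, environment):
    """Flatten behavior, account, and environment information into a vector."""
    one_hot_network = [
        1.0 if environment["network"] == n else 0.0 for n in NETWORK_TYPES
    ]
    return [
        float(behavior["recommend_count"]),
        float(behavior["browse_count"]),
        float(behavior["total_browse_seconds"]),
        float(account["age"]),
        environment["hour"] / 24.0,  # current time of day scaled to [0, 1)
        *one_hot_network,
    ]
```

Categorical attributes such as gender, occupation, or device model would need a comparable encoding before being fed to a network; they are omitted here to keep the sketch short.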
Corresponding to the method embodiment of fig. 1, referring to fig. 2, fig. 2 is a block diagram illustrating an aggregated page recommendation apparatus according to an exemplary embodiment, which may include:
an obtaining module 201 configured to perform obtaining current first state information of a target account using an application program, wherein the first state information includes behavior feature information of the target account, the behavior feature information is determined based on historical operations performed by the target account on a historically recommended aggregated page, and the aggregated page is used for displaying multiple pieces of information of the same type through the same page;
a first determining module 202 configured to perform determining, according to the first state information, feedback information on operations performed by the target account on the application program after a target aggregation page is recommended to the target account, as first feedback information, and determining, according to the first state information, feedback information on operations performed by the target account on the application program when the target aggregation page is not recommended to the target account, as second feedback information;
a second determination module 203 configured to perform a determination whether to recommend the target aggregated page to the target account based on the first feedback information and the second feedback information.
Optionally, the second determining module 203 is configured to perform conversion of the first feedback information and the second feedback information into operation feedback values, respectively;
recommending the target aggregation page to the target account if the operation feedback value corresponding to the first feedback information is larger than the operation feedback value corresponding to the second feedback information;
and if the operation feedback value corresponding to the first feedback information is not greater than the operation feedback value corresponding to the second feedback information, not recommending the target aggregation page to the target account.
Optionally, the second determining module 203 is configured to determine, according to the duration of the target account browsing the target aggregation page included in the first feedback information and/or the probability of the target account browsing the target aggregation page, an operation feedback value corresponding to the first feedback information;
and determining an operation feedback value corresponding to the second feedback information according to the time length of the target account browsing the target aggregation page and/or the probability of the target account browsing the target aggregation page, which are contained in the second feedback information.
Optionally, the apparatus further comprises:
the first processing module is configured to perform statistics on the number of times of recommending a target aggregation page to the target account within a preset historical time period, the number of times of browsing the target aggregation page by the target account within the preset historical time period, the time of browsing the target aggregation page by the target account within the preset historical time period, and the total time of browsing the target aggregation page by the target account within the preset historical time period;
and taking the information obtained by statistics as the behavior characteristic information of the target account.
Optionally, the apparatus further comprises:
a second processing module configured to perform obtaining account characteristic information of the target account;
obtaining environment information of an environment where the target account is located, wherein the environment information includes at least one of: device information of a device used by the target account, network information of a network to which the device is accessed, and a current time;
and taking the behavior characteristic information, the account characteristic information and the environment information of the environment of the target account as the current first state information of the target account.
Fig. 3 is a block diagram illustrating an electronic device 300 for recommending aggregation pages, according to an example embodiment. For example, the electronic device 300 may be provided as a server. Referring to FIG. 3, electronic device 300 includes a processing component 322 that further includes one or more processors and memory resources, represented by memory 332, for storing instructions, such as applications, that are executable by processing component 322. The application programs stored in memory 332 may include one or more modules that each correspond to a set of instructions. Further, the processing component 322 is configured to execute instructions to perform the aggregated page recommendation method described above.
The electronic device 300 may also include a power component 326 configured to perform power management of the electronic device 300, a wired or wireless network interface 350 configured to connect the electronic device 300 to a network, and an input/output (I/O) interface 358. The electronic device 300 may operate based on an operating system stored in the memory 332, such as Windows Server, Mac OS X™, Unix™, Linux™, FreeBSD™, or a similar operating system.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (10)

1. An aggregated page recommendation method, characterized in that the method comprises:
acquiring current first state information of a target account using an application program, wherein the first state information comprises behavior characteristic information of the target account, the behavior characteristic information is determined based on historical operations performed by the target account on a historically recommended aggregated page, and the aggregated page is used for displaying multiple pieces of information of the same type through the same page;
determining, according to the first state information, feedback information on operations performed by the target account on the application program after a target aggregation page is recommended to the target account, as first feedback information, and determining, according to the first state information, feedback information on operations performed by the target account on the application program when the target aggregation page is not recommended to the target account, as second feedback information;
determining whether to recommend the target aggregation page to the target account based on the first feedback information and the second feedback information.
2. The method of claim 1, wherein the determining whether to recommend the target aggregated page to the target account based on the first feedback information and the second feedback information comprises:
converting the first feedback information and the second feedback information into operation feedback values, respectively;
recommending the target aggregation page to the target account if the operation feedback value corresponding to the first feedback information is larger than the operation feedback value corresponding to the second feedback information;
and if the operation feedback value corresponding to the first feedback information is not greater than the operation feedback value corresponding to the second feedback information, not recommending the target aggregation page to the target account.
3. The method of claim 2, wherein the converting the first feedback information and the second feedback information into operation feedback values respectively comprises:
determining an operation feedback value corresponding to the first feedback information according to the time length of the target account browsing the target aggregation page and/or the probability of the target account browsing the target aggregation page, which are contained in the first feedback information;
and determining an operation feedback value corresponding to the second feedback information according to the time length of the target account browsing the target aggregation page and/or the probability of the target account browsing the target aggregation page, which are contained in the second feedback information.
4. The method of claim 1, wherein prior to the obtaining the current first state information of the target account using the application program, the method further comprises:
counting the number of times of recommending a target aggregation page to the target account within a preset historical time period, the number of times of browsing the target aggregation page by the target account within the preset historical time period, the time of browsing the target aggregation page by the target account within the preset historical time period, and the total time of browsing the target aggregation page by the target account within the preset historical time period;
and taking the information obtained by statistics as the behavior characteristic information of the target account.
5. The method of claim 4, wherein prior to the obtaining the current first state information of the target account using the application program, the method further comprises:
acquiring account characteristic information of the target account;
obtaining environment information of an environment where the target account is located, wherein the environment information includes at least one of: device information of a device used by the target account, network information of a network to which the device is accessed, and a current time;
and taking the behavior characteristic information, the account characteristic information and the environment information of the environment of the target account as the current first state information of the target account.
6. An aggregated page recommendation apparatus, characterized in that the apparatus comprises:
the acquisition module is configured to execute acquisition of current first state information of a target account using an application program, wherein the first state information comprises behavior characteristic information of the target account, the behavior characteristic information is determined based on historical operations performed by the target account on a historically recommended aggregated page, and the aggregated page is used for displaying multiple pieces of information of the same type through the same page;
a first determining module configured to perform determining, according to the first state information, feedback information on operations performed by the target account on the application program after a target aggregation page is recommended to the target account, as first feedback information, and determining, according to the first state information, feedback information on operations performed by the target account on the application program when the target aggregation page is not recommended to the target account, as second feedback information;
a second determination module configured to perform a determination whether to recommend the target aggregated page to the target account based on the first feedback information and the second feedback information.
7. The apparatus according to claim 6, wherein the second determining module is configured to perform conversion of the first feedback information and the second feedback information into operation feedback values, respectively;
recommending the target aggregation page to the target account if the operation feedback value corresponding to the first feedback information is larger than the operation feedback value corresponding to the second feedback information;
and if the operation feedback value corresponding to the first feedback information is not greater than the operation feedback value corresponding to the second feedback information, not recommending the target aggregation page to the target account.
8. The apparatus according to claim 7, wherein the second determining module is configured to determine the operation feedback value corresponding to the first feedback information according to the duration of the target account browsing the target aggregation page and/or the probability of the target account browsing the target aggregation page, which are contained in the first feedback information;
and determining an operation feedback value corresponding to the second feedback information according to the time length of the target account browsing the target aggregation page and/or the probability of the target account browsing the target aggregation page, which are contained in the second feedback information.
9. An electronic device, comprising: a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to carry out the method steps of any one of claims 1-5 when executing the instructions stored on the memory.
10. A storage medium, characterized in that instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the method steps of any of claims 1-5.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910881954.0A CN112528131A (en) 2019-09-18 2019-09-18 Aggregated page recommendation method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112528131A true CN112528131A (en) 2021-03-19

Family

ID=74975111

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910881954.0A Pending CN112528131A (en) 2019-09-18 2019-09-18 Aggregated page recommendation method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112528131A (en)

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104008184A (en) * 2014-06-10 2014-08-27 百度在线网络技术(北京)有限公司 Method and device for pushing information
CN104715389A (en) * 2013-12-13 2015-06-17 腾讯科技(深圳)有限公司 Information processing method, device and system
CN104965890A (en) * 2015-06-17 2015-10-07 深圳市腾讯计算机系统有限公司 Advertisement recommendation method and apparatus
CN105159969A (en) * 2015-08-25 2015-12-16 小米科技有限责任公司 Social network based user recommendation method and apparatus
CN105630878A (en) * 2015-12-17 2016-06-01 小米科技有限责任公司 Method and apparatus for displaying application service information
CN105718566A (en) * 2016-01-20 2016-06-29 中山大学 Intelligent music recommendation system
CN106846053A (en) * 2017-01-19 2017-06-13 腾讯科技(深圳)有限公司 A kind of recommendation method and device of the page advertisement that is polymerized
CN108229991A (en) * 2016-12-15 2018-06-29 北京奇虎科技有限公司 Method, apparatus, browser and the terminal device of displaying polymerization promotion message
CN109451038A (en) * 2018-12-06 2019-03-08 北京达佳互联信息技术有限公司 A kind of information-pushing method, device, server and computer readable storage medium
US20190205701A1 (en) * 2017-12-29 2019-07-04 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Method for Training Model and Information Recommendation System
CN110020143A (en) * 2017-11-20 2019-07-16 北京京东尚科信息技术有限公司 A kind of landing page generation method and device
CN110020194A (en) * 2018-08-09 2019-07-16 连尚(新昌)网络科技有限公司 Resource recommendation method, device and medium
CN110111152A (en) * 2019-05-10 2019-08-09 腾讯科技(深圳)有限公司 A kind of content recommendation method, device and server

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination