CN112749332A - Data processing method, device and computer readable medium - Google Patents
- Publication number
- CN112749332A (application number CN202010662408.0A)
- Authority
- CN
- China
- Prior art keywords
- user
- interest
- content
- score
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/55—Push-based network services
Abstract
The application discloses a data processing method, a data processing device and a computer readable medium, and relates to the technical field of computers. The method comprises the following steps: obtaining a first score determined by an interest heuristic model to be trained according to user characteristic data, wherein the first score represents the possibility that sample data belongs to an unknown interest of a user; obtaining a weight value determined by a joint probability model according to the sample data, wherein the weight value represents the possibility that the sample is suitable for training the interest heuristic model; adjusting the first score according to the weight value, wherein the weight value is positively correlated with the first score; and training the interest heuristic model to be trained according to the adjusted first score, wherein the trained interest heuristic model is used for determining estimated interest information of the user according to the user characteristic data. Content to be pushed can then be determined according to the estimated interest information of the user and pushed to the client corresponding to the user, so that content can be pushed to the user according to the user's unknown interests, improving the diversity of the pushed content.
Description
Technical Field
The present application relates to the field of computer technologies, and in particular, to a data processing method, an apparatus, and a computer-readable medium.
Background
With the continuous development of internet applications, users spend more and more time on the internet. To enable users to quickly obtain the information they may need from a massive amount of information and thereby improve user experience, an information push service needs to be provided for users. At present, when content is pushed, new content is often pushed to a user according to the user's high-frequency clicks on certain historical content or according to personal interests and hobbies entered by the user in advance. However, such a pushing manner may cause the pushed content to include too much repeated content, resulting in homogeneous pushed content.
Disclosure of Invention
The present application proposes a data processing method, apparatus and computer readable medium to address the above-mentioned drawbacks.
In a first aspect, an embodiment of the present application provides a data processing method, where the method includes: obtaining a first score determined by an interest heuristic model to be trained according to user characteristic data, wherein the first score is used for representing the possibility that sample data belongs to unknown interest of a user; obtaining a weight value determined by the joint probability model according to the sample data, wherein the weight value is used for representing the possibility that the sample is suitable for training the interest heuristic model; adjusting the first score according to the weight value, wherein the weight value is positively correlated with the first score; training the interest heuristic model to be trained according to the adjusted first score, wherein the trained interest heuristic model is used for determining predicted interest information of a user according to the user characteristic data, the predicted interest information is unknown interest information of the user, and the predicted interest information is used for determining content to be pushed to a client corresponding to the user.
In a second aspect, an embodiment of the present application further provides a data processing apparatus, where the apparatus includes: the device comprises a first acquisition unit, a second acquisition unit, an adjustment unit and a training unit. The device comprises a first obtaining unit and a second obtaining unit, wherein the first obtaining unit is used for obtaining a first score which is determined by an interest heuristic model to be trained according to user characteristic data, and the first score is used for representing the possibility that sample data belongs to unknown interest of a user. And the second obtaining unit is used for obtaining a weight value determined by the joint probability model according to the sample data, and the weight value is used for representing the possibility that the sample is suitable for training the interest heuristic model. An adjusting unit, configured to adjust the first score according to the weight value, where the weight value is positively correlated with the first score. The training unit is used for training the interest heuristic model to be trained according to the adjusted first score, the trained interest heuristic model is used for determining estimated interest information of a user according to the user characteristic data, the estimated interest information is unknown interest information of the user, and the estimated interest information is used for determining content to be pushed to a client corresponding to the user.
In a third aspect, an embodiment of the present application further provides a computer-readable storage medium, where a program code executable by a processor is stored, and when executed by the processor, the program code causes the processor to execute the above method.
According to the data processing method, the data processing device and the computer readable medium, when content needs to be pushed to a user, the estimated interest information of the user is obtained. The estimated interest information is unknown interest information of the user and is obtained from a trained interest heuristic model. The training process of the interest heuristic model is as follows: a first score determined by the interest heuristic model to be trained according to user characteristic data is obtained, where the first score represents the possibility that sample data belongs to an unknown interest of the user; a weight value determined by a joint probability model according to the sample data is obtained, where the weight value represents the possibility that the sample is suitable for training the interest heuristic model; and the first score is adjusted according to the weight value, the weight value being positively correlated with the first score. The first score output by the interest heuristic model to be trained can thus be corrected so that it takes into account the applicability of the sample for training the interest heuristic model, making the training of the interest heuristic model more reasonable and the obtained estimated interest information more accurate. Then, the content to be pushed is determined according to the estimated interest information of the user and pushed to the client corresponding to the user, so that content can be pushed to the user according to both the known interests and the unknown interests of the user, improving the diversity of the pushed content.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 is a schematic diagram of an information push system provided by an embodiment of the present application;
FIG. 2 is a flow chart of a method of processing data according to an embodiment of the present application;
FIG. 3 is a flow chart of a method of processing data according to another embodiment of the present application;
FIG. 4 is a schematic diagram illustrating a structure of a joint probability model provided by an embodiment of the present application;
FIG. 5 illustrates a method flow diagram of a data processing method provided by yet another embodiment of the present application;
FIG. 6 is a schematic diagram illustrating an architecture of an interest heuristic model provided by an embodiment of the present application;
FIG. 7 is a schematic structural diagram of a joint training model provided in an embodiment of the present application;
FIG. 8 is a flow chart of a method of data processing according to yet another embodiment of the present application;
FIG. 9 is a schematic diagram illustrating a client-specific interface provided by an embodiment of the present application;
FIG. 10 is a schematic diagram illustrating a detail interface for content of a client provided by an embodiment of the present application;
FIG. 11 is a schematic diagram illustrating a client-specific interface provided by another embodiment of the present application;
FIG. 12 is a schematic diagram illustrating a client-specific interface provided by yet another embodiment of the present application;
FIG. 13 is a method flow diagram of a data processing method according to yet another embodiment of the present application;
FIG. 14 is a schematic view illustrating a scenario in which an interest heuristic model provided in an embodiment of the present application is applied to information push;
FIG. 15 is a schematic diagram illustrating a client-specific interface provided by yet another embodiment of the present application;
FIG. 16 is a block diagram of a data processing apparatus provided in an embodiment of the present application;
FIG. 17 shows a block diagram of an electronic device provided by an embodiment of the present application;
FIG. 18 shows a storage unit for storing or carrying program code for implementing the information pushing method according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.
With the continuous development of internet applications, users spend more and more time on the internet. To enable users to quickly obtain the information they may need from a massive amount of information and thereby improve user experience, an information push service needs to be provided for users.
Referring to fig. 1, fig. 1 illustrates an information push system according to an embodiment of the present disclosure. As shown in fig. 1, the information push system includes a server 100 and a user terminal 200. The server 100 and the user terminal 200 are located in a wireless network or a wired network, and data interaction can be performed between the server 100 and the user terminal 200.
In some embodiments, the user logs in through an account at the user terminal, and all information corresponding to the account can be stored in the storage space of the server 100. The server 100 may be an individual server, or a server cluster, or a local server, or a cloud server.
A plurality of applications are installed in the user terminal, and the server 100 can push content to the user terminal. Specifically, the server may push the content to a certain application in the user terminal, and the application displays the content, so that the content is pushed to the user corresponding to the user terminal.
The server 100 may be connected to a plurality of user terminals, and may push the content to be pushed to all the user terminals, or may select one of the user terminals according to some policies and push the content to be pushed to the selected user terminal. The specific policy may be determined according to the content to be pushed and the user corresponding to each user terminal. In the embodiment of the present application, the content to be pushed may be information such as articles, videos, and pictures.
However, the inventors have found that the above push manner is relatively ineffective.
In an information recommendation system, a traditional recommendation algorithm makes recommendations based on the historical behaviors and semantic features of users and content, taking the click-through rate and the number of clicks of users as its optimization targets. However, determining the content to be pushed according to the user's clicks may cause high-frequency click interests and known interests of the user to be recommended repeatedly, narrowing the user's field of view. The recommendation system then cannot satisfy all of the user's interests, which degrades user experience and may even cause the user to churn. This is detrimental to the diversity of recommendations and to the long-term development of the recommendation system.
For example, content may be pushed for users by a click-through-rate estimation method. Specifically, for each user–content pair, the probability that the user clicks the content is predicted by a deep neural network model according to the user's basic features, behavior features, and the content and context features. For example, the model may be a Factorization-machine supported Neural Network (FNN), a Product-based Neural Network (PNN), a Wide Linear Model and Deep Neural Network (Wide & Deep), a Deep Factorization Machine (DeepFM), and so forth. The network structures of these models do not differ greatly; the main differences lie in how certain features are computed: high-dimensional sparse features are embedded into low-dimensional continuous vectors, the co-occurrence relations and nonlinear transformations between features are learned through the computation layers of the model, and finally an estimated click-through-rate score in the interval [0, 1] is output.
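For illustration only (this sketch is not part of the claimed method), the following Python code outlines such a click-rate estimation network, assuming PyTorch; the field vocabulary sizes, embedding dimension, and layer widths are hypothetical.

```python
import torch
import torch.nn as nn

class SimpleCTRModel(nn.Module):
    """Sketch of a click-rate estimator: sparse fields -> embeddings -> MLP -> [0, 1] score."""
    def __init__(self, field_vocab_sizes, embed_dim=16):
        super().__init__()
        # One embedding table per sparse feature field (user, content, context, ...).
        self.embeddings = nn.ModuleList(
            [nn.Embedding(vocab, embed_dim) for vocab in field_vocab_sizes])
        self.mlp = nn.Sequential(
            nn.Linear(embed_dim * len(field_vocab_sizes), 64),
            nn.ReLU(),
            nn.Linear(64, 1))

    def forward(self, field_ids):
        # field_ids: (batch, num_fields) integer ids, one id per field.
        vecs = [emb(field_ids[:, i]) for i, emb in enumerate(self.embeddings)]
        x = torch.cat(vecs, dim=1)          # concatenate the field embeddings
        return torch.sigmoid(self.mlp(x))   # estimated click probability in [0, 1]
```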
As another example, the exploration-and-exploitation approach is widely used by recommendation systems to explore user interests. This approach regards the user's known interests as existing rewards and the user's unknown interests as rewards that can be explored, and at each decision it chooses, based on accumulated experience, either to exploit the currently known rewards or to explore some new strategy to increase future rewards. Typical algorithms include the Upper Confidence Bound (UCB), Thompson sampling, and linear UCB (LinUCB) algorithms. The main idea is to balance the degree of exploration and exploitation of interests using a multi-armed bandit formulation, and to update the decision rewards of the user through the user's click feedback, so as to reach a globally optimal reward.
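As an illustrative aid, the UCB idea can be sketched as follows; the per-arm click and impression counters and the exploration constant c are assumptions made for the example.

```python
import math

def ucb_select(clicks, impressions, total_impressions, c=1.0):
    """Pick the interest 'arm' with the highest upper confidence bound.
    clicks/impressions are per-arm counters updated from user click feedback."""
    best_arm, best_score = None, float("-inf")
    for arm in impressions:
        if impressions[arm] == 0:
            return arm                                     # explore any untried arm first
        mean_reward = clicks[arm] / impressions[arm]       # exploitation term
        bonus = c * math.sqrt(math.log(total_impressions) / impressions[arm])  # exploration term
        if mean_reward + bonus > best_score:
            best_arm, best_score = arm, mean_reward + bonus
    return best_arm

# Usage: after showing content for the chosen arm, update clicks/impressions
# with the user's click feedback and call ucb_select again for the next decision.
```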
For another example, a look-alike method may be used to push content for a user. Specifically, the method first determines users who clicked the content to be recommended as seed users, and then uses a look-alike model to compute users similar to the seed users as the recommendation targets. Common look-alike models include similarity-based models and regression-based models. Similarity-based models, such as Locality-Sensitive Hashing (LSH), cosine similarity, and mixed models, vectorize the user data and then measure the similarity between a seed user and a candidate user by computing the distance between the vectors. Regression-based models, such as Logistic Regression (LR) and Gradient Boosting Decision Trees (GBDT), train a model for each user or content, using seed users as positive samples of the corresponding features and randomly sampled non-seed users as negative samples; the score output by the model in the interval [0, 1] is used as the probability that the candidate user clicks the content.
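The similarity-based look-alike computation can be sketched as below; the seed-centroid approach and the toy feature vectors are illustrative assumptions, not the patent's specific model.

```python
import numpy as np

def lookalike_scores(seed_vectors, candidate_vectors):
    """Cosine similarity between each candidate user and the seed-user centroid.
    A higher score means the candidate is more similar to users who clicked the content."""
    seed_centroid = seed_vectors.mean(axis=0)
    seed_centroid /= (np.linalg.norm(seed_centroid) + 1e-8)
    cand = candidate_vectors / (np.linalg.norm(candidate_vectors, axis=1, keepdims=True) + 1e-8)
    return cand @ seed_centroid   # one similarity score per candidate, in [-1, 1]

# Example with made-up 4-dimensional user feature vectors.
seeds = np.array([[1.0, 0.2, 0.0, 0.5], [0.9, 0.1, 0.1, 0.4]])
candidates = np.array([[1.0, 0.3, 0.0, 0.6], [0.0, 1.0, 0.9, 0.0]])
print(lookalike_scores(seeds, candidates))  # the first candidate scores higher
```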
For the click-through-rate estimation and look-alike methods, the models rely on portraits of the user's known interests and aim only at click-through rate and the number of clicks, which may cause the recommendation system to over-exploit the user's known-interest portrait and recommend too much content of the same interest to the user. Interests that the recommendation system does not yet know about have a high recommendation cost and a low click-through rate and are therefore difficult to recommend, so the recommendation system struggles to enrich the user's interests, and the field of view of the recommended content is limited.
For the exploration-and-exploitation approach, the model directly models the long-term rewards of user interests and takes into account both the rewards of the user's known-interest portrait and the rewards of the user's unknown-interest portrait, which meets the requirement to a certain extent. However, existing exploration-and-exploitation methods still have some problems: first, the model features are simple, and it is difficult to introduce complex feature computations; in addition, the models rely on strong assumptions about probability distributions and judge those distributions through confidence intervals; finally, the models distinguish the boundary between exploitation and exploration by manually set parameters, which cannot accurately delimit the effective boundary between exploitation and exploration and can affect the model's judgment of the true globally optimal reward, so that the push results are not accurate enough.
Therefore, in order to overcome the above-mentioned drawbacks, embodiments of the present application provide a data processing method capable of determining an unknown interest of a user so as to push content for the user according to the unknown interest of the user. As shown in fig. 2, specifically, the executing subject of the method may be the server described above, and the method includes: s201 to S204.
S201: and acquiring a first score determined by the interest heuristic model to be trained according to the user characteristic data.
Wherein the first score is used to characterize a likelihood, i.e. a first likelihood, that the sample data belongs to an unknown interest of the user.
The interest heuristic model can output interest information of a user according to the similarity between the user characteristic data and preset interest labels, and as the model is continuously trained, the unknown interests output by the interest heuristic model become more accurate.
In one embodiment, the user characteristic data includes behavior data generated by a user for a specified application module of the client and for operations on other application modules of the client that are related to the specified application module. Specifically, the client includes a plurality of application modules, and each application module corresponds to one functional service of the client. For example, the client may be a game client, and each game scene in the game client corresponds to one application module; for example, the level-clearing mode and the player-versus-player mode each correspond to one application module, that is, different application modules provide different functional services. For another example, the client is a social client, for example the WeChat APP, and the corresponding application modules may include the "Top Stories" module, the friend circle module, the chat module, the official accounts module, and the mini program module.
As an implementation manner, the other application modules related to the specified application module may be application modules having the same or similar functions as the specified application module, in this embodiment of the present application, the specified application module has a content pushing function, and the other application modules related to the specified application module also have a content pushing function.
In the embodiment of the present application, the specified application module is the "Top Stories" module, and the other application modules related to the specified application module may include the official accounts module, the mini program module, and the like. The behavior data of the user can reflect the user's operation data on content pushed by the specified application module and by the other application modules related to it; for example, the operation data includes the number of clicks, the click frequency, and so on, and can reflect the user's preferences to a certain extent.
As one embodiment, the interest heuristic model to be trained determines a likelihood, i.e., a first likelihood, that the sample data belongs to an unknown interest of the user based on the user feature data, and determines the first likelihood with greater accuracy as the interest heuristic model is trained. Specifically, the interest heuristic model can obtain the estimated interest information of the user according to the user characteristic data. It should be noted that, when the interest heuristic model is not trained, that is, the accuracy of the predicted interest information output by the initial interest heuristic model is poor, the accuracy of the predicted interest information output by the interest heuristic model needs to be improved by a training method.
Specifically, the initial unknown interest information determined by the interest heuristic model to be trained according to the user feature data may be an interest tag, and then, the initial unknown interest information is matched with the sample data, for example, a keyword of the sample data is matched with the interest tag, and whether the sample belongs to the unknown interest information of the user is determined according to a matching result.
S202: and acquiring a weight value determined by the joint probability model according to the sample data.
The weight values are used to characterize the likelihood that the sample is suitable for training the interest heuristic model.
Specifically, the sample data is exposure data corresponding to the user, that is, content processed by the user within a preset time period, where the processed content may include at least one of content clicked, browsed, or commented on by the user. However, a user's click on a sample may stem from the user's known interests: for example, the server determines the content pushed to the user according to the user's known interest information, and the user clicks on that content while browsing it. A click may also occur when the user reads non-pushed content: for example, the user sees content published by friends in the circle of friends, and that content does not match the user's known interest information. If most of the samples used for training the interest heuristic model to be trained are content that the user clicked because it was pushed based on the user's known interest information, the unknown interests obtained by the trained interest heuristic model may be excessively similar to the user's known interests, so that content pushed based on the user's unknown interest information would largely duplicate content pushed based on the known interest information. Therefore, to avoid training the interest heuristic model to be trained directly with content clicked by the user based on the user's known interest information, a weight value is set, which may be a value in [0, 1], i.e., a value greater than or equal to 0 and less than or equal to 1.
As an embodiment, the joint probability model may determine the weight value of the sample data empirically, for example, when the sample data is acquired, a weight value is empirically set for each sample data.
As another embodiment, a historical push record of the user may also be obtained. Specifically, if the server pushes content for the user based on the user's known interest information, the pushed content is referred to as interest content, and an identifier is set for each piece of interest content pushed to the user; for example, the identifier may be the title or a content summary of the pushed interest content. Then, the user's clicks on each piece of content are recorded, so that it can be determined whether the content clicked by the user is interest content. If it is interest content, a lower weight value is set for it; if it is not, a higher weight value is set for it.
As another implementation, the weight value may also be determined according to social information of the user, specifically, please refer to the following embodiments.
S203: and adjusting the first score according to the weight value.
Specifically, the weight value can correct the first score determined by the interest heuristic model to be trained according to the user feature data, that is, the weight value can affect the magnitude of the first score. In one embodiment, the weight value is positively correlated with the first score. The positive correlation means that the weight value and the first score value increase synchronously, namely the weight value increases, the first score value also increases, the weight value decreases, and the first score value also decreases. That is, the larger the weight value is, the larger the first score can be made, and the smaller the weight value is, the smaller the first score can be made.
The first score is used to characterize the likelihood that the sample data belongs to an unknown interest of the user, i.e., the higher the first score output, the higher the likelihood that the sample may be of unknown interest to the user. And the weight value represents the possibility that the sample is suitable for training the interest heuristic model, which reflects the applicability of the sample for training the interest heuristic model, wherein the larger the weight value is, the more suitable the sample is for training the interest heuristic model, and the smaller the weight value is, the less suitable the sample is for training the interest heuristic model, so that if a certain sample is not suitable for training the interest heuristic model, a lower weight value can be set for the sample.
In the multi-armed bandit method, the sample data are explicitly divided into two sample sets: one sample set is used for training one model and the other sample set is used for training another model, and the two sample sets contain different samples (or, even if some samples appear in both, the value of a sample for training the interest heuristic model depends only on the interest heuristic model itself). That is, the multi-armed bandit method simply classifies each sample; for example, if a sample has an 80% probability of not having been pushed by the server based on the user's known interest tags, the sample is classified into a first class and used for training the interest heuristic model, whereas if that probability is only 20%, the sample is classified into a second class that is not used for training the interest heuristic model, i.e., the sample is discarded.
In the embodiment of the present application, every piece of obtained sample data can be used for training the interest heuristic model. Even if a sample has only, say, a 20% probability of not having been pushed by the server based on the user's known interest information, the sample can still reflect the user's interests to some extent, because it is, after all, content that the user clicked, so it should still be used for training the interest heuristic model. At the same time, because the sample may belong to the user's known interests, an overly high first score for that sample could make the determined unknown interests too similar to the known interests. Therefore, the score determined by the interest heuristic model for the sample is corrected by means of the weight. Thus, when the probability that the sample does not belong to content pushed by the server based on the user's known interest information is low, a low weight value can be output to correct the score of the interest heuristic model, i.e., to lower the first score, so that the interest heuristic model is less likely to conclude that the sample belongs to the user's unknown interests; that is, the probability that the key features of the sample can serve as unknown interest tags of the user is low. The key features may be the type, title, keywords, or entities of the sample.
S204: and training the interest heuristic model to be trained according to the adjusted first score.
The trained interest heuristic model is used for determining pre-estimated interest information of a user according to the user characteristic data, the pre-estimated interest information is unknown interest information of the user, and the pre-estimated interest information is used for determining content to be pushed to a client corresponding to the user.
The interest heuristic model can determine the possibility that the sample data is an unknown interest of the user, i.e., the first possibility, according to the currently determined unknown interest information; the output first possibility can be regarded as an estimate made by the interest heuristic model for the sample data. Specifically, the user characteristic data is used as the input of the interest heuristic model, and the interest heuristic model determines unknown interest information of the user according to the user characteristic data, where the unknown interest information may be an unknown interest label of the user. Then, the unknown interest information of the user is matched against the sample, and the matching degree between the two is calculated to obtain the possibility that the sample belongs to an unknown interest of the user, i.e., the first score. The adjusted first score is fed into the loss function of the interest heuristic model, and the loss function is continuously optimized, i.e., the interest heuristic model is continuously trained. When the training is finished, the optimal solution of the loss function is obtained, i.e., the parameters of the interest heuristic model have been trained to the optimal parameters and the interest heuristic model has completed training, so that the trained interest heuristic model can obtain more accurate unknown interests, i.e., the estimated interest information, according to the user characteristic data of the user.
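As a minimal sketch of such a weighted training step (assuming a PyTorch model whose output is the first score, and treating the clicked sample as a positive example; both assumptions are illustrative, not the claimed implementation):

```python
import torch

def weighted_training_step(model, optimizer, user_features, sample_weight):
    """One illustrative step: the first score is corrected by the per-sample weight
    (positive correlation) before it enters the loss of the interest heuristic model."""
    first_score = model(user_features)              # likelihood of unknown interest, in (0, 1)
    adjusted_score = sample_weight * first_score    # larger weight -> larger adjusted score
    # The clicked sample is treated as a positive example, so the loss pushes the
    # adjusted score towards 1 only as far as the weight allows.
    loss = torch.nn.functional.binary_cross_entropy(
        adjusted_score, torch.ones_like(adjusted_score))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```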
Referring to fig. 3, an embodiment of the present application provides a data processing method, which is capable of determining an unknown interest of a user, so as to push content for the user according to the unknown interest of the user. As shown in fig. 3, specifically, the executing subject of the method may be the server described above, and the method includes: s301 to S305.
S301: and acquiring a first score determined by the interest heuristic model to be trained according to the user characteristic data.
S302: and acquiring the intimacy degree between the user and each friend and the interest information of each friend.
The server records the friend relationship of the user, and the friend relationship of the user comprises the user identification of each user belonging to the friend relationship with the user. The intimacy degree between the user and the friend can reflect the interaction frequency between the user and the friend and the intimacy of the friend relationship, which is equivalent to the classification of the social relationship of the user.
As one implementation mode, the degree of closeness between the user and each friend can be determined through social information of the user. The social information of the user comprises interaction information of the user and each friend, the interaction information comprises forwarding times, collection times, comment times and grouping information, specifically, the forwarding times, the collection times and the comment times can be times of forwarding, collecting and commenting contents of the friend issued by the user, and the grouping information can be keywords of a plurality of groups established by the user and user identifications in the groups.
As an implementation manner, parameters may be set for the forwarding times, the collection times, the comment times, and the grouping information, that is, the parameters respectively include a first parameter, a second parameter, a third parameter, and a fourth parameter, where the first parameter corresponds to the forwarding times, the second parameter corresponds to the collection times, the third parameter corresponds to the comment times, and the fourth parameter corresponds to the grouping information. For ease of calculation, the first parameter, the second parameter, the third parameter, and the fourth parameter may all be normalized to a [0,1] range of values.
Specifically, the number of times the user forwarded content from each friend is obtained, that is, the forwarding count corresponding to each friend is determined; then the forwarding counts of all friends are added to obtain a total forwarding count, and the forwarding count of each friend is divided by the total forwarding count to obtain the first parameter of that friend. The second parameter and the third parameter can be obtained in a similar way.
The fourth parameter may be obtained as follows: the group into which the user has placed each friend is determined, the keyword of each group is determined, the category corresponding to the keyword of the group is determined, and a first value corresponding to the keyword of the group is determined according to preset scores corresponding to different categories, where different categories correspond to different scores. For some categories, for example a group of close friends, the set score is higher, while friends who are not grouped belong to a default group whose score is lower, so the first value can indicate whether the group a friend is in corresponds to friends with higher affinity to the user. The first value is normalized and used as the fourth parameter.
Then, the degree of closeness between the user and each friend is obtained according to the first parameter, the second parameter, the third parameter, and the fourth parameter, and as an implementation manner, the first parameter, the second parameter, the third parameter, and the fourth parameter may be summed, and the summed result is used as the degree of closeness.
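A minimal sketch of this closeness computation is given below; the normalization by per-friend totals and the example counts are assumptions made for illustration.

```python
def friend_affinity(forwards, favorites, comments, group_scores):
    """Per-friend closeness = normalized forward count + normalized favorite count
    + normalized comment count + normalized group score (each parameter in [0, 1])."""
    def normalize(counts):
        total = sum(counts.values())
        return {f: (v / total if total else 0.0) for f, v in counts.items()}

    p1, p2, p3, p4 = map(normalize, (forwards, favorites, comments, group_scores))
    return {f: p1[f] + p2[f] + p3[f] + p4[f] for f in forwards}

# Example with two friends: friend "A" is interacted with far more often, so A's closeness is higher.
print(friend_affinity(
    forwards={"A": 8, "B": 2}, favorites={"A": 5, "B": 0},
    comments={"A": 9, "B": 1}, group_scores={"A": 1.0, "B": 0.2}))
```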
As an embodiment, the social information of the user may be used as an input of a GraphSage model, and the individual friends of the user may be classified by the GraphSage model; for example, friends at the same workplace level have a similar degree of closeness, friends sharing the same hobbies have a similar degree of closeness, and friends with different interests have different degrees of closeness.
In addition, the interest information of the friends can be counted in advance by the server, that is, each friend corresponds to one interest information, and the interest information can be other known interest information.
S303: and acquiring the weight value determined by the joint probability model according to the intimacy degree of each friend, the interest information of each friend and the sample data.
Specifically, if sample data matches interest information of most friends in friends with a higher user affinity, the determined weight value is smaller. That is, if most of the friends with high user affinity consider a certain sample as an accurate interest, the output weight value is smaller.
As an implementation manner, friends whose degree of closeness to the user is greater than a certain threshold are selected from the user's friends as candidate friends according to the closeness of each friend. Then, the number of friends matched with the sample is determined according to the interest information of each candidate friend, where a friend is matched with the sample if the possibility, determined based on that friend's interest information, that the sample belongs to the friend's known interests is greater than a specified value. Then, the weight value is determined according to the number of friends matched with the sample; for example, the number of friends matched with the sample is recorded as a first number, the number of candidate friends is recorded as a second number, and the ratio of the first number to the second number is used as the weight value.
As another embodiment, the joint probability model is used to: determining the interest degree of each friend for the sample data according to the known interest information of each friend; determining the weight value according to the intimacy degree and the interestingness degree of the friend, wherein the greater the intimacy degree and the greater the interestingness degree, the smaller the determined weight value is.
Specifically, the interest level of each friend with respect to the sample data is determined based on the known interest information of that friend, where the interest level may be the possibility that the sample belongs to the friend's known interests; then the closeness between the friend and the user is determined, and the weight value is determined according to the possibility and the closeness. For example, the product of the possibility and the closeness is recorded as a reference result, the reference result corresponding to each friend is obtained, and the weight value is obtained from the reference results of all friends, for example by taking the average of all the reference results as the weight value.
As an implementation manner, the joint probability model may also be trained according to the known interest information of the user, specifically, please refer to the following embodiments.
As an implementation manner, the structure of the joint probability model is shown in fig. 4, where the friend relationship data may be the social information described above, and the friend division model may be a GraphSage model used to output a User embedding, which represents the degree of closeness between the user and other users, that is, the closeness of each friend. The weight value is then obtained through a three-layer fully connected computation. As an embodiment, the three fully connected layers may form a multilayer perceptron (MLP): the first fully connected layer is the input layer of the MLP, the second fully connected layer is the hidden layer of the MLP, and the third fully connected layer is the output layer of the MLP. The loss function of the hidden layer may be:
wherein x is User embedding.
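For illustration, the three fully connected layers that map the User embedding to a weight value in [0, 1] might be sketched as follows, assuming PyTorch; the layer widths are hypothetical, and the GraphSage encoder producing the embedding is assumed to exist elsewhere.

```python
import torch
import torch.nn as nn

class WeightHead(nn.Module):
    """Three fully connected layers (input, hidden, output of an MLP) that map the
    user/friend embedding produced by the graph model to a sample weight in [0, 1]."""
    def __init__(self, embed_dim=64, hidden_dim=32):
        super().__init__()
        self.fc1 = nn.Linear(embed_dim, hidden_dim)   # input layer
        self.fc2 = nn.Linear(hidden_dim, hidden_dim)  # hidden layer
        self.fc3 = nn.Linear(hidden_dim, 1)           # output layer

    def forward(self, user_embedding):
        x = torch.relu(self.fc1(user_embedding))
        x = torch.relu(self.fc2(x))
        return torch.sigmoid(self.fc3(x))             # weight value in [0, 1]
```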
S304: adjusting the first score according to the weight value, wherein the weight value is positively correlated with the first score.
S305: and training the interest heuristic model to be trained according to the adjusted first score.
Referring to fig. 5, an embodiment of the present application provides a data processing method, which is capable of determining an unknown interest of a user, so as to push content for the user according to the unknown interest of the user. As shown in fig. 5, specifically, the executing subject of the method may be the server described above, and the method includes: s501 to S507.
S501: and acquiring a first score determined by the interest heuristic model to be trained according to the user characteristic data.
Referring to FIG. 6, the principles of the interest heuristic model are described in conjunction with the structure of the interest heuristic model shown in FIG. 6.
As shown in FIG. 6, the user characteristic data input to the interest heuristic model includes attribute features, basic features, short-term behavior features, and behavior features. The attribute features of the user include basic characteristics such as the user's gender, age, region, and apps used. The basic features of the user include click-interest behavior features of the user for content displayed on a specified interface; specifically, the click-interest behavior features may be feature data of content that the user clicks at high frequency on the specified interface, where the feature data of the content may include the description information and the category of the content, and the description information may include information such as keywords of the content. As an embodiment, the specified client may be the WeChat APP, the specified interface may be the WeChat "Top Stories" interface, and the plurality of contents are all contents that can be displayed in the "Top Stories" interface. That is, the basic features are data generated by the user's operations on content displayed at the specified interface of the client.
The short-term behavior features and behavior features may be data generated by the user's operations on content displayed on other interfaces of the client, that is, interfaces of the client other than the specified interface. In one embodiment, the short-term behavior features and behavior features are generated according to the user's operations on the content displayed on the other interfaces: the behavior features represent long-term points of attention, such as what the user follows or likes on the other interfaces, while the short-term behavior features represent the user's specific operation behavior on the other interfaces. For example, if the user successively clicks content 1, content 2, content 3, content 1 and content 2 within a certain time period on other interfaces, the behavior features record that the user clicked content 1 twice, content 2 twice, and content 3 once within that time period, whereas the short-term behavior features record the entire click sequence, that is, not only the number of clicks but also their order. In one embodiment, the specified interface is the "Top Stories" interface of WeChat, the other interfaces may be mini program interfaces and official account interfaces, and the short-term behavior features and behavior features may be the user's behavior features on WeChat (such as articles read, number of active sessions, official accounts opened, mini program usage, short-term reading history, and the like).
As shown in fig. 6, each input feature vector corresponds to one domain; for example, if the user feature data includes an attribute feature, a basic feature, a short-term behavior feature, and a behavior feature, there are four corresponding domains. The discrete features S = (s1, s2, …, sn) of each domain are converted by embedding into continuous vectors W_e ∈ R^(m×D), where m is the dimension of the embedded vector, D is the size of the dictionary, and the domains do not share embeddings with each other. The embedded vector W_e of each domain is converted into a single vector expression by a mean pooling operation, i.e. E_e = Average_Pooling(W_e). Then, a feature splicing (Concat) operation performs fusion between features, i.e. C_e12 = Concat(E_e1, E_e2). Next, nonlinear changes of the feature expression are captured through multiple fully connected layers using a linear rectification (ReLU) function, i.e. R_e = Relu(E_e). The dot product with the interest tag is then calculated, i.e. Dot_e1,e2 = Dot(E_e1, E_e2). Specifically, the process of calculating the dot product can be understood as computing the similarity between the interest tag shown in fig. 6 and the user feature data, and the estimated interest information of the user is determined according to the similarity. For example, if the similarity between the user feature data and the animation label is high, the user's degree of interest in animation is high, and animation can be used as part of the estimated interest information of the user.
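A compact sketch of this forward computation (per-field embedding, mean pooling, concatenation, fully connected layers with ReLU, and a dot product with the interest-tag embedding) is given below, again assuming PyTorch; all dimensions and the sigmoid on the final score are illustrative assumptions.

```python
import torch
import torch.nn as nn

class InterestHeuristicSketch(nn.Module):
    """Per-field embedding -> mean pooling -> concat -> MLP (ReLU) -> dot with tag embedding."""
    def __init__(self, field_vocab_sizes, num_tags, embed_dim=16):
        super().__init__()
        self.field_embeddings = nn.ModuleList(
            [nn.Embedding(vocab, embed_dim) for vocab in field_vocab_sizes])
        self.tag_embeddings = nn.Embedding(num_tags, embed_dim)
        self.mlp = nn.Sequential(
            nn.Linear(embed_dim * len(field_vocab_sizes), 64),
            nn.ReLU(),
            nn.Linear(64, embed_dim))

    def forward(self, field_ids, tag_id):
        # field_ids: list of (batch, seq_len) id tensors, one per field (attribute,
        # basic, short-term behavior, behavior); tag_id: (batch,) interest-tag ids.
        pooled = [emb(ids).mean(dim=1) for emb, ids in zip(self.field_embeddings, field_ids)]
        user_vec = self.mlp(torch.cat(pooled, dim=1))   # nonlinear user representation
        tag_vec = self.tag_embeddings(tag_id)
        score = (user_vec * tag_vec).sum(dim=1)         # dot product with the interest tag
        return torch.sigmoid(score)                     # similarity, i.e. the first score
```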
In addition, the Attention mechanism shown in fig. 6 is an Attention Model (Attention Model) and is essentially derived from the human visual Attention mechanism. People generally do not see a scene from head to tail all at a time when people perceive things, but often observe a specific part according to needs. And when people find that a scene often appears in a part where people want to observe, people can learn to pay attention to the part when similar scenes reappear in the future, more attention is focused on the useful part, the nature of the attention model is weighting, namely, a weight parameter distribution mechanism, and the aim is to assist the model in capturing important information. The Attention model may calculate a weight of each input feature and then perform a weighted summation on the features, wherein the greater the weight of a feature is, the greater the contribution of the feature to the current recognition result is. For example, given a set of < key, value > and a target (query) vector query, a weight coefficient of each key is obtained by calculating the similarity between the query and each set of keys, and a final output result is obtained by weighted summation of values.
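A minimal numeric sketch of this attention weighting (dot-product similarity, softmax normalization, weighted sum) is shown below; the toy query, keys, and values are made up for illustration.

```python
import numpy as np

def attention(query, keys, values):
    """Weight each value by the softmax-normalized similarity between the query and its key."""
    scores = keys @ query                      # similarity of the query to each key
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                   # softmax: the weights sum to 1
    return weights @ values                    # weighted sum of the values

# Example: the second key is most similar to the query, so the second value dominates the output.
q = np.array([1.0, 0.0])
K = np.array([[0.1, 0.9], [1.0, 0.1], [0.2, 0.2]])
V = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
print(attention(q, K, V))
```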
Therefore, the interest heuristic model can obtain the estimated interest information of the user according to the user characteristic data. It should be noted that when the interest heuristic model has not yet been trained, i.e., when the accuracy of the estimated interest information output by the initial interest heuristic model is poor, the accuracy of the estimated interest information output by the interest heuristic model needs to be improved by training. In the embodiment of the present application, in order to improve the accuracy of the estimated interest information output by the interest heuristic model and to establish an accurate boundary between the user's known interests and unknown interests, the interest heuristic model may be trained by using the joint training model shown in fig. 7.
S502: and acquiring a weight value determined by the joint probability model according to the sample data.
S503: and adjusting the first score according to the weight value.
S504: and training the interest heuristic model to be trained according to the adjusted first score.
S505: and acquiring a second score determined by the accurate interest model according to the known interest information of the user.
And the second score is used for representing the possibility that the sample data belongs to the known interest of the user, namely a second possibility, wherein the known interest information is the determined interest and hobbies of the user.
As an embodiment, the known interest information may be interest preference determined according to click data of the user on the pushed history content. Specifically, the known interest information of the user is used to characterize the interest tag of the user that the server has currently determined. For example, the server stores the user identifier of each user and the known interest information corresponding to the user identifier. As an embodiment, the known interest information may be an interest tag, and when the server recommends content for the user according to the known interest information of the user, the server may determine the interest tag of the user and search for content matching the interest tag from among the plurality of content as recommended content.
In some embodiments, the known interest information is an interest tag determined from the user's click data on pushed historical content. In some embodiments, the server counts the user's click data for each piece of content within a preset time period, where the click data includes click operations. Then, the contents clicked by the user within the preset time period are determined, the clicked contents are analyzed, key features of the contents are extracted, and each key feature is counted and ranked to obtain the user's known interest information. For example, if the user clicks durian-related content a relatively high number of times within the preset time period, it can be determined that durian is an interest tag of the user, that is, the user's known interest information includes durian.
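A small sketch of this counting approach is shown below; the keyword field and the top-k cutoff are hypothetical details used only for illustration.

```python
from collections import Counter

def known_interest_tags(clicked_contents, top_k=3):
    """Count key features (e.g. keywords/categories) of contents the user clicked in the
    time window and keep the most frequent ones as the user's known interest tags."""
    counts = Counter()
    for content in clicked_contents:
        counts.update(content["keywords"])     # key features extracted from the content
    return [tag for tag, _ in counts.most_common(top_k)]

# Example: durian-related content is clicked most often, so "durian" becomes a known interest tag.
clicks = [{"keywords": ["durian", "fruit"]}, {"keywords": ["durian", "dessert"]},
          {"keywords": ["basketball"]}]
print(known_interest_tags(clicks))   # 'durian' ranks first
```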
As another embodiment, the known interest information may also be information entered by the user. For example, the user sends interest information to the server, and the server takes the interest information sent by the user as known interest information and binds with the user identification of the user. The method for the user to enter the interest information may be that the server provides an interest submitting interface for the user, an interest entry area exists in the interface, the user can enter the interest information in the area through a virtual keyboard, and the user can also select an interest from a plurality of interest tags in the area as the interest information entered by the user.
In some embodiments, when a user registers at a client, in a client registration process, the client displays an interest submission interface, and the user can complete registration only by inputting interest information in the interest submission interface, so that the client sends the interest information input by the user and a user account registered by the user to a server together, and the server takes the interest information submitted by the user as known interest information and correspondingly stores the user account and the known interest information of the user.
The estimated interest information is interest information that does not belong to the user's known interest information but in which the user is likely to be interested. For example, if both the estimated interest information and the known interest information are interest tags, the estimated interest information is a tag that is not among the user's known interest tags but is a tag in which the user is highly likely to be interested. For example, if the user's known interest tags include durian, the user's estimated interest information may include pineapple. The estimated interest information is used to represent information in which the user may be interested, or interest tags whose degree of interest is greater than a specified threshold.
As an embodiment, the known interest information of the user may be obtained based on a precise interest model, specifically, the precise interest model may be the FNN, RNN, or the like, which obtains the known interest information of the user according to a user profile, wherein the user profile may include user basic features, behavior features, content, context features, and the like.
Similar to the interest heuristic model, the accurate interest model may also determine a likelihood that the sample data belongs to the known interest of the user, i.e., a second likelihood, based on the known interest information of the user.
S506: and adjusting the weight value according to the first score and the second score.
S507: and training the joint probability model according to the adjusted weight value.
As an implementation manner, the joint training model trains the interest heuristic model and the joint probability model by using the accurate interest model, the interest heuristic model and the joint probability model simultaneously. Consequently, when the interest heuristic model is trained through the whole joint training model, the accuracy of the interest information output by the interest heuristic model affects the convergence rate of the whole model and, in turn, the training rate. If the interest heuristic model is too weak when the combined model is trained, learning fluctuates and the convergence of the whole model is slow; improving the accuracy of the interest heuristic model in advance provides a more suitable training direction for the combined model, yields a stable and effective model, and increases its convergence speed. For this reason, the interest heuristic model can be trained before the combined model is trained.
In the embodiment of the present application, an interest heuristic model that has not yet been trained is named the initial interest heuristic model. As shown in fig. 6, after the initial interest heuristic model is constructed, user feature data is input into the initial interest heuristic model, and the initial interest heuristic model outputs initial interest information, which represents the output result of the initial, untrained interest heuristic model.
As one embodiment, sample data is obtained, and the sample data and the output result of the initial interest heuristic model (i.e., the output of the dot-product function described above) are trained through a loss function. The loss function measures how well the model fits: minimizing the loss function corresponds to the best degree of fit, and the corresponding model parameters are the optimal parameters. As an embodiment, the initial interest information can be optimized by optimizing the parameters of the interest heuristic model with a gradient descent algorithm applied to the loss function.
As an embodiment, the sample data may be all contents pushed to the user through the designated interface for a preset time period. As another embodiment, the sample data may be content clicked by the user in the designated interface for a preset time period.
In one embodiment, a sigmoid loss is calculated and optimized by back propagation; specifically, the initial interest heuristic model is trained 5000 times, resulting in the interest heuristic model to be trained. After this pre-training, the interest heuristic model has learned a representation of the strength of the user's heuristic interests, which prepares it for the training of the combined model. It should be noted that the interest heuristic model to be trained is only a pre-trained version of the initial interest heuristic model; the interest information it outputs may be named interest information to be confirmed, meaning that its accuracy is still not ideal and the interest heuristic model to be trained still needs to be trained through the joint training model.
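As an illustration only, the pre-training step could be sketched in Python with PyTorch as follows. The two linear towers, the feature sizes, and the randomly generated click labels are assumptions made for the example; the dot-product scoring, the sigmoid loss, back propagation and the 5000 iterations follow the embodiment above.

```python
import torch
import torch.nn.functional as F

user_tower = torch.nn.Linear(32, 8)      # maps user feature data to an interest vector (assumed sizes)
item_tower = torch.nn.Linear(32, 8)      # maps sample (content) features into the same space
optimizer = torch.optim.SGD(list(user_tower.parameters()) + list(item_tower.parameters()), lr=0.01)

user_x = torch.randn(256, 32)            # placeholder user feature data
item_x = torch.randn(256, 32)            # placeholder sample data (contents pushed through the interface)
clicked = torch.randint(0, 2, (256, 1)).float()   # 1 = the user clicked the content

for step in range(5000):                 # the embodiment pre-trains the initial model 5000 times
    logits = (user_tower(user_x) * item_tower(item_x)).sum(dim=1, keepdim=True)  # dot-product function
    loss = F.binary_cross_entropy_with_logits(logits, clicked)                   # sigmoid loss
    optimizer.zero_grad()
    loss.backward()                      # back propagation optimizes the initial model's parameters
    optimizer.step()
```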
As another example, the initial interest heuristic model may not be pre-trained. Specifically, the initial interest heuristic model may be trained directly by the joint training model after it is constructed; in that case, the interest heuristic model to be trained shown in fig. 7 is the initial interest heuristic model that has not been trained in advance, rather than a pre-trained one.
As shown in fig. 7, fig. 7 shows the structure of the joint training model. The joint model training introduces the accurate interest model of the recommendation system and performs joint training on the accurate interest model and the interest heuristic model by using the joint probability model, so that the user's sample data is trained in a unified way. The accurate interest model is responsible for portraying the user's known interests in the sample, the interest heuristic model is responsible for computing the user's unknown interests, and the joint probability model is responsible for accurately computing how samples are divided between the accurate model and the heuristic model.
As shown in FIG. 7, the output of the accurate interest model is the corresponding known interest information of the user. As an implementation manner, the accurate interest model is a trained model, so that it is not necessary to train the accurate interest model through a joint training model, and due to the introduction of the accurate interest model, the interest heuristic model to be trained can be trained according to the sample data and the known interest information corresponding to the user, so as to optimize the interest information to be confirmed as the predicted interest information of the user.
The known interest information of the user output by the accurate interest model reflects the user's established interests and preferences; that is, content similar to the known interest information is content the user is interested in, and content related to such content may also interest the user. For example, if the known interest information of the user includes shooting games, the user may also like MOBA games, and the correlation between the two can serve as a basis for judging whether MOBA games are an unknown interest of the user, that is, estimated interest information. Therefore, the interest heuristic model to be trained can be trained on the sample data with reference to the known interest information corresponding to the user, so as to optimize the interest information to be confirmed.
As an embodiment, as shown in fig. 7, the user data 1 input into the interest heuristic model may include search words, the user's operation data on applets, the user's operation data on the client, and sequence behaviors, where the sequence behaviors may include the user's click behaviors on a glance interface or other interfaces; specifically, this user data may be the aforementioned user characteristic data input into the interest heuristic model. The user data 2 input into the accurate interest model may include the user's reading histories, such as read article data, clicked video data, and followed official accounts.
The sample data is input into the accurate interest model, the interest heuristic model and the joint probability model. The accurate interest model can determine, according to the known interest information corresponding to the user, a second score indicating that the sample data belongs to the known interest of the user; the interest heuristic model can determine a first score indicating that the sample data belongs to the unknown interest of the user; and the joint probability model outputs a weight value for the sample. The interest heuristic model to be trained is then trained according to the first score, the second score and the weight value, where the weight value can modify the first score determined by the interest heuristic model to be trained so as to improve the accuracy of the first score.
As an embodiment, the loss transfer value of the interest heuristic model represents the difference between the samples the interest heuristic model predicts to belong to the user's unknown interests and the samples that actually belong to them. In other words, it represents the difference between a predicted value, namely the first likelihood output by the interest heuristic model (the first score), and a true value, which represents the true likelihood that the sample is an unknown interest of the user.
In order to obtain the unknown interest information of the user by combining the known interest information of the user, the loss transfer value of the interest heuristic model is determined according to the first likelihood, the second likelihood and the weight value. Specifically, the first likelihood is the first score and the second likelihood is the second score. This may be implemented by obtaining a total score from the weight value, the first score and the second score (the weight value being a likelihood, obtained from the joint probability model, that represents that the sample belongs to the known interest of the user), and then determining the loss transfer value of the interest heuristic model according to the total score.
Specifically, the embodiment of obtaining the total score according to the weight value, the first score and the second score may be determining the total score according to the following formula (1).
score = (1 - d_s)*score_p + d_s*score_ep    (1)

where score is the total score, d_s is the weight value, score_p is the second score, and score_ep is the first score. As an embodiment, the values of score, d_s, score_p and score_ep all lie in the range [0, 1].
In the embodiment of the present application, the weight value is a probability, obtained from the joint probability model, that characterizes the sample as belonging to the known interest of the user; in other words, the weight value exerts a corrective force on the interest heuristic model's score for the sample.
As an embodiment, the joint probability model can determine the probability that the currently trained sample belongs to the accurate interest model. It should be noted that the second score output by the accurate interest model represents the likelihood that the sample data belongs to the known interest of the user when the sample is expressed by the user's known interest information: the higher the score, the more likely the sample matches the user's known interests. The weight value output by the joint probability model, by contrast, represents the probability that the sample belongs to the accurate interest model, and its meaning differs from that of the second score, because a sample belonging to the accurate interest model does not necessarily represent a known interest of the user; rather, the weight value expresses how strongly the accurate interest model's score for the sample influences the finally determined unknown interests of the user.
The training process of the joint training model will be described below with reference to fig. 7.
Specifically, bp shown in fig. 7 denotes back propagation (backward), and fw is an abbreviation for forward, representing model prediction. The joint training model is divided into two parts, namely interest heuristic model training and joint probability model training. Each model takes the corresponding user interest vector as input and, after computation and feature-space transformation, outputs its score for the current sample: the second score score_p, the first score score_ep, and the weight value d_s.
As shown in fig. 7, the data input into the joint probability model is a user vector, which may be a high-dimensional profile representation of the user; the space vector may be a spatial transformation of the user vector that serves as the input to the joint probability model. The user vector may be set according to the type of the content to be pushed or the specific pushing requirement; in the embodiment of the present application, the user vector may be the social information of the user.
A total score is then calculated according to formula (1), and back propagation is used to optimize the parameters of the joint probability model and the interest heuristic model.
For the interest heuristic model, the loss function loss is given by formula (2), a gradient descent update of the model parameters (of the standard form θ_j ← θ_j - α·∂loss/∂θ_j, with α the learning rate).

Here θ_j denotes the parameters of the interest heuristic model. Through the gradient descent algorithm of formula (2), the interest heuristic model's rating training on the samples converges to an optimal solution, so that the trained interest heuristic model can accurately output the user's estimated interest information. It can be seen that when d_s is smaller, the sample leans more toward the accurate interest model, that is, the probability that the sample is suitable for training the interest heuristic model is lower, and less loss is transferred to the interest heuristic model: the weight value makes the right-hand term of formula (1) smaller, i.e., it shrinks the contribution of the first score output by the interest heuristic model. When d_s is larger, the probability that the sample is suitable for training the interest heuristic model is larger, more loss is transferred to the interest heuristic model, and the weight value makes the right-hand term larger, i.e., it enlarges the contribution of the first score output by the interest heuristic model.
For the joint probability model, the loss function loss is given by formula (3), likewise a gradient descent update of the model parameters.

Here θ_i denotes the parameters of the joint probability model. Through the gradient descent algorithm of formula (3), the joint probability model's rating training on the samples converges to an optimal solution, so that the trained joint probability model can accurately determine the weight value, that is, the likelihood that the current sample belongs to the accurate interest model. When the difference between score_p and score_ep is small, the boundary between the accurate interest model and the interest heuristic model is blurred, and less loss is transferred to the joint probability model. In other words, since the weight value represents the likelihood that the sample is suitable for training the interest heuristic model, if both the accurate interest model and the interest heuristic model give the sample a high score and the difference between the two scores is small, the weight value becomes smaller, because the sample is then more likely to be suitable for training the accurate interest model; the small difference between score_p and score_ep drives the weight value down and further reduces the likelihood that the sample is suitable for training the interest heuristic model. That is, when the boundary between the accurate interest model and the interest heuristic model is blurred and it is difficult to decide which model the sample should train, the weight value is reduced. Similarly, when the difference between score_p and score_ep is larger, the boundary between the two models is clear, more loss is transferred to the joint probability model, and the probability that the sample is suitable for training the interest heuristic model is higher. Therefore, through formula (3), the joint probability model can be trained with the help of the accurate interest model and the interest heuristic model, so that the weight value it determines is more accurate. Specifically, the parameters in the hidden-layer output f(W1·X + b1) of the joint probability model can be trained to an optimal solution, for example with a gradient descent algorithm.
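Purely as an illustrative sketch in Python with PyTorch (the network shapes, the optimizer, and the random data below are assumptions, not part of this embodiment), one joint update step under formula (1) could look as follows. Note that the gradient of the total score with respect to the first score equals d_s, and its gradient with respect to the weight value equals score_ep - score_p, which reproduces the loss-transfer behavior described for formulas (2) and (3).

```python
import torch
import torch.nn.functional as F

heuristic = torch.nn.Linear(16, 1)                     # stands in for the interest heuristic model
joint_prob = torch.nn.Sequential(                      # stands in for the joint probability model f(W1*X + b1)
    torch.nn.Linear(8, 4), torch.nn.ReLU(), torch.nn.Linear(4, 1), torch.nn.Sigmoid())
optimizer = torch.optim.SGD(list(heuristic.parameters()) + list(joint_prob.parameters()), lr=0.01)

def joint_training_step(user_features, user_vector, score_p, label):
    """One joint update; score_p comes from the already trained accurate interest model and stays fixed."""
    score_ep = torch.sigmoid(heuristic(user_features))  # first score, in [0, 1]
    d_s = joint_prob(user_vector)                       # weight value, in [0, 1]
    total = (1 - d_s) * score_p + d_s * score_ep        # formula (1)
    loss = F.binary_cross_entropy(total, label)         # loss on the total score
    optimizer.zero_grad()
    loss.backward()                                     # loss transfer reaches both theta_j and theta_i
    optimizer.step()
    return loss.item()

# Example call with placeholder tensors.
x_user = torch.randn(64, 16)                            # user characteristic data
x_vec = torch.randn(64, 8)                              # user vector / space vector
score_p = torch.rand(64, 1)                             # second score from the accurate interest model
label = torch.randint(0, 2, (64, 1)).float()
joint_training_step(x_user, x_vec, score_p, label)
```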
Referring to fig. 8, an embodiment of the present application provides a data processing method, which is capable of determining an unknown interest of a user, so as to push content for the user according to the unknown interest of the user. As shown in fig. 8, specifically, the executing subject of the method may be the server described above, and the method includes: s801 to S807.
S801: and acquiring a first score determined by the interest heuristic model to be trained according to the user characteristic data.
S802: and acquiring a weight value determined by the joint probability model according to the sample data.
S803: and adjusting the first score according to the weight value.
S804: and training the interest heuristic model to be trained according to the adjusted first score.
S805: and acquiring the estimated interest information of the user according to the trained interest heuristic model.
Specifically, please refer to the foregoing embodiments, which are not described herein again.
S806: and determining the content to be pushed according to the estimated interest information of the user.
As an embodiment, the content to be pushed may be determined from a plurality of contents according to the predicted interest information of the user.
An information tag is determined for each content, the information tag describing the content. In some embodiments, the informational tag may be a category of content, a keyword of content, or the like.
The plurality of contents are contents to be displayed through a specified interface of a specified client. As an implementation manner, the specified client may be the client used by the user terminal to display the content to be pushed, the server may be the server corresponding to the specified client, and the user identifier of the user may be the user account logged into the specified client. As shown in fig. 9, the designated interface is an information presentation interface of the designated client in which a plurality of contents can be displayed, such as the content 301, the content 302, the content 303 and the content 304 shown in fig. 9. The user can browse the content 301, the content 302, the content 303 and the content 304 in this interface and can click each content to enter the detail interface corresponding to that content. As shown in fig. 10, the interface in fig. 10 is the detail interface corresponding to the content 302: when the user clicks the content 302 in the designated interface shown in fig. 9, the content detail interface shown in fig. 10 is displayed on the screen, and the user browses the detailed content corresponding to the content 302 in the interface shown in fig. 10.
As an implementation manner of determining the content to be pushed from the plurality of contents according to the estimated interest information of the user, the information tag of each content is obtained and matched against the estimated interest information of the user, and the content whose information tag matches the estimated interest information is taken as the content to be pushed. The matching mode may be to obtain the similarity between the information tag and the estimated interest information of the user and take the content whose similarity is greater than a specified threshold as the content to be pushed.
As an implementation manner, the matching degree of the information tag of each content and the estimated interest information of the user may be determined, the matching degrees are sorted to obtain a push sequence, and the top N contents in the push sequence are used as the contents to be pushed. Wherein N is a positive integer greater than 1.
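By way of illustration only, the matching and top-N selection described above can be sketched in Python as follows; the similarity function, the threshold and the content representation are assumptions introduced for the example.

```python
def contents_to_push(candidates, predicted_interest, similarity, threshold=0.5, top_n=10):
    """candidates: list of (content, info_tag); similarity(info_tag, interest) returns a score in [0, 1]."""
    scored = []
    for content, tag in candidates:
        degree = max((similarity(tag, interest) for interest in predicted_interest), default=0.0)
        if degree > threshold:                            # keep contents whose tag matches the predicted interest
            scored.append((degree, content))
    scored.sort(key=lambda pair: pair[0], reverse=True)   # push sequence ordered by matching degree
    return [content for _, content in scored[:top_n]]     # the top N contents become the contents to be pushed
```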
It should be noted that content matching the known interest information of the user may be named the user's precise interest content, and content matching the predicted interest information of the user may be named the user's tentative interest content. The precise interest content is usually related to content the user has clicked or clicks frequently, while the tentative interest content may be content of categories the user has not clicked or clicks infrequently but which the server predicts the user is likely to be interested in or to click with a high probability.
S807: and pushing the content to be pushed to a client corresponding to the user.
The client corresponding to the user may be the above specified client.
As an implementation, the content to be pushed is also displayed in the above-mentioned designated interface of the designated client, and as an implementation, the designated interface is an interface of a designated application program in the client.
Specifically, the server pushes the content to be pushed to the specified client of the user, and the specified client displays the content to be pushed in a specified interface.
In some embodiments, when the specified client starts the specified interface, the content to be pushed and other content are displayed together on the specified interface, where the other content may be the pushed content determined by the server according to other policies, where the other policies may include the pushed content determined according to the known interest information of the user.
In other embodiments, when the server determines the content to be pushed, the designated interface of the designated client may already be open; in the case that the user has already opened the designated interface, the content to be pushed can be pushed to the designated client and displayed in the designated interface based on certain push policies.
As an implementation manner, displaying the content to be pushed in the designated interface may be implemented as follows: after the designated client acquires the content to be pushed, it waits for the user's next refresh operation, and when the next refresh comes, it displays the content to be pushed in a designated area of the designated interface. In some embodiments, the designated area is the top area of the content display page within the designated interface, and the refresh operation is triggered by the user dragging the top edge of the page downward. Therefore, after the designated client acquires the content to be pushed, when the user drags the top edge of the page downward and holds it for a certain length of time, a page refresh operation is performed and, as shown in fig. 11, the content to be pushed is displayed in the top area of the page of the designated interface. In this way, when the user refreshes, the content to be pushed can be displayed to the user in the designated interface immediately; because the content to be pushed was acquired in advance, the user terminal can generate the content to be displayed in advance, display it immediately when the refresh request arrives, and shorten the waiting time for refreshing the designated interface.
As another embodiment, the embodiment that the content to be pushed is displayed in the designated interface may also be that after the designated client acquires the content to be pushed, the designated client waits for the next time when the user opens the designated interface to display the content to be pushed in the designated interface.
As another embodiment, the content 301, 302, 303 and 304 on the designated interface shown in fig. 9 may be displayed by the server within the designated interface based on a request of the user or another push policy. While the designated interface is displayed, the server determines the content to be pushed for the user and sends it to the designated client. When the designated client detects that the user clicks a content on the current designated interface, it enters the detail interface of the selected content, and when the user returns to the designated interface, the content to be pushed is displayed below the content the user selected.
For example, the user clicks the content 302 in the interface shown in fig. 9; the specified client then displays the detail interface of the content 302 shown in fig. 10, and when the user closes the detail interface and returns to the specified interface, the content displayed by the specified interface becomes the content shown in fig. 12, that is, the content to be pushed is displayed below the content 302 as shown in fig. 9 and fig. 12. In this way, the content to be pushed is displayed under the content the user is currently browsing. On the one hand, the content to be pushed can be pushed to the user quickly and effectively; on the other hand, this avoids the situation in which the content displayed on the designated interface in fig. 9 is all related to the known interest information of the user and the differences between the contents in fig. 9 are relatively small, in which case the user may feel aesthetic fatigue from continuously reading contents with such small differences.
For example, if the known interest information of the user includes cartoons and the content displayed in fig. 9 is content determined by the server according to the known interest information of the user, then while the user is on the designated interface of fig. 9, the content shown in fig. 9, and even content not shown in fig. 9, is related to cartoons, and the user may continuously see that many contents are all about cartoons. Therefore, after the content to be pushed is determined according to the estimated interest information of the user and displayed below the content the user selected on the designated interface, the user can come into contact with content beyond cartoons that they are likely to be interested in, which improves the user experience.
As an implementation manner, after determining the content to be pushed, the server may preset a pushing condition, and when the parameter information of the server meets the pushing condition, the server pushes the content to be pushed to the specified client of the user. This ensures that the server pushes the content to be pushed to the specified client corresponding to the target user only when the parameter information meets the pushing condition.
If the pushing condition is a time interval, the parameter information may be the system time corresponding to the server, that is, the current time. The server acquires the current system time as the current time, takes the time at which activity information was last sent to the user as the historical time, obtains the time difference between the historical time and the current time, and judges whether the time difference is greater than or equal to a specified time-interval threshold; if it is, the operation of acquiring the activity information to be pushed is executed. The specified time-interval threshold may be set in advance according to user requirements or push requirements, for example 24 hours, so as to ensure that the activity content is pushed once a day.
In addition, the pushing condition may be, besides the time interval, that the network parameter between the server and the user terminal satisfies the specified communication condition, so that it is possible to avoid resource waste caused by the fact that the activity information to be pushed is still sent to the user terminal when the network state between the server and the user terminal is poor. Specifically, the server obtains a communication parameter between the server and a user terminal corresponding to a user to be pushed, determines whether the communication parameter meets a specified communication condition, and pushes the activity information to be pushed to the user terminal corresponding to the user to be pushed if the communication parameter meets the specified communication condition.
The communication parameter may specifically be the channel quality, where the channel quality may be the error vector magnitude of the channel, the number of access points, the signal strength, and so on. The Error Vector Magnitude (EVM) is the vector difference at a given time between an ideal error-free reference signal and the actually transmitted signal; it measures the amplitude error and phase error of the modulated signal, specifically indicating how close the IQ components produced when the receiving terminal demodulates the signal are to the ideal signal components, and is therefore an index of modulation quality. The smaller the EVM, the better the channel quality of the channel. The number of access points on each channel can be determined by scanning the channels: the larger the number of access points, the worse the channel quality, and vice versa. Similarly, the signal strength can also be obtained during channel scanning: the higher the signal strength, the better the channel quality, and vice versa.
Specifically, determining whether the communication parameter satisfies the specified communication condition may be implemented by determining whether the channel quality satisfies a specified communication quality; if it does, the communication parameter is determined to satisfy the specified communication condition. For example, when the channel quality is the error vector magnitude of the channel, if the error vector magnitude of the channel is smaller than a specified value, the channel quality is determined to satisfy the specified communication quality, and the communication parameter is accordingly determined to satisfy the specified communication condition.
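As a minimal sketch (the threshold values and parameter names below are assumptions), the two pushing conditions described above, the time interval and the channel quality, could be checked as follows.

```python
from datetime import datetime, timedelta

def time_condition_met(last_push_time, interval=timedelta(hours=24)):
    """Time-interval condition: the gap since the last push reaches the specified threshold (e.g. 24 hours)."""
    return datetime.now() - last_push_time >= interval

def communication_condition_met(error_vector_magnitude, evm_threshold=0.05):
    """Channel-quality condition: a smaller EVM means better channel quality, so push only below the threshold."""
    return error_vector_magnitude < evm_threshold
```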
Thus, the unknown interests of the user are determined from the known interest information. The content to be pushed is then determined according to the estimated interest information of the user, and the content to be pushed is pushed to the client corresponding to the user, so that content can be pushed to the user according to both the known interests and the unknown interests of the user, which improves the diversity of the pushed content.
Referring to fig. 13, an embodiment of the present application provides a data processing method, which is capable of determining an unknown interest of a user, so as to push content for the user according to the unknown interest of the user. As shown in fig. 13, specifically, the executing subject of the method may be the server described above, and the method includes: s1301 to S1308.
S1301: and acquiring a first score determined by the interest heuristic model to be trained according to the user characteristic data.
S1302: and acquiring a weight value determined by the joint probability model according to the sample data.
S1303: and adjusting the first score according to the weight value.
S1304: and training the interest heuristic model to be trained according to the adjusted first score.
S1305: and acquiring the estimated interest information of the user according to the trained interest heuristic model.
Specifically, please refer to the foregoing embodiments, which are not described herein again.
S1306: and determining the content meeting the specified requirement as the alternative content.
In one embodiment, the server is provided with a content database in which a plurality of contents are stored; these contents are to be transmitted to and displayed at the clients corresponding to the respective users. The method provided by the embodiment of the application pushes content for the user from the plurality of contents in the content database.
Specifically, the plurality of contents are recorded as the contents to be selected, and the contents meeting the specified requirements are determined from the contents to be selected and serve as the alternative contents.
As an implementation manner, the estimated interest information of the user may be a plurality of estimated interest tags. Considering that the number of the user's estimated interest tags may be relatively large, if the content to be pushed were determined based on too many estimated interest tags, the types or tags of the content seen by the user could be too dispersed, making it less likely that the user clicks any particular content and reducing the pushing accuracy. A filtering condition may therefore be set, and the alternative content is determined by this filtering condition.
As an embodiment, the filtering condition may be determined according to the current geographical location of the user terminal. Specifically, the server obtains the current position information of the user terminal, determines the location area where the current position lies, determines the shops in that location area, and determines the shop information corresponding to each shop, where the shop information may be description information of the goods sold by the shop and may include the category of the goods; the server then determines, from the contents to be selected, the contents matching the commodity information as the alternative contents. Specifically, the shop information of each shop in the location area of the user terminal is matched against each content to be selected, and all matched contents are taken as the alternative contents. For example, if the shop information of shops near the current geographic location of the user terminal includes fruits, toys, movie theaters, restaurants and the like, the server takes the contents related to such shop information from among the contents to be selected as the alternative contents, and the contents to be pushed are subsequently screened out from the alternative contents.
As another embodiment, the embodiment of determining the content meeting the specified requirement as the alternative content may also be that the content matching with the known interest information of the user is determined from a plurality of contents; and taking the content except the matching content in the plurality of contents as the alternative content.
Specifically, as shown in fig. 14, the known interest information of the user is obtained, and the content matching the known interest information is determined from the content to be selected according to the known interest information of the user and taken as the accurate interest content. The content other than the accurate interest content among the content to be selected is then taken as the alternative content. In this way, if the predicted interest information determined by the interest heuristic model is too similar to the known interest information of the user, the pushed content determined from the predicted interest information is prevented from overlapping too heavily with the content determined from the known interest information, which would otherwise push too much repeated content to the user. Therefore, when the pushed content is determined based on the estimated interest information, the content belonging to the accurate interests can be filtered out directly, which also avoids inaccurate pushing caused by an unreasonable weight value setting.
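A minimal sketch of this second embodiment, assuming a hypothetical matching helper, is given below; filtering the accurate interest content out of the content to be selected leaves the alternative content.

```python
def alternative_contents(contents_to_select, known_interest, matches):
    """matches(content, known_interest) returns True when the content matches the user's known interest information."""
    return [c for c in contents_to_select if not matches(c, known_interest)]  # everything else is alternative content
```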
S1307: and determining the content to be pushed from the alternative content according to the estimated interest information of the user.
S1308: and pushing the content to be pushed to a client corresponding to the user.
In addition, as shown in fig. 14, after the content to be pushed is pushed to the client corresponding to the user, operation data of the user on the content to be pushed may also be collected, and the operation data continues to be used as user feature data, and then the interest heuristic model is trained according to the method mentioned in the above embodiment, so as to further optimize the predicted interest information output by the interest heuristic model.
As an implementation manner, the embodiment of the application may further push content determined based on the known interest information of the user, that is, accurate interest content, to the client corresponding to the user. Specifically, the content to be pushed, which is determined according to the estimated interest information of the user, is named as unknown interest content, and the accurate interest content and the unknown interest content may be simultaneously pushed to a client corresponding to the user according to a predetermined policy.
As an embodiment, the content 301, the content 302, the content 303 and the content 304 shown in fig. 9, fig. 11 and fig. 12 are all accurate interest contents. As shown in fig. 11, the unknown interest content may be displayed when the user refreshes the designated interface, for example in the top area of the page of the designated interface; alternatively, as shown in fig. 12, when the user clicks a certain accurate interest content, the unknown interest content is displayed in the designated interface, specifically in the area immediately below the clicked accurate interest content. As shown in fig. 11 and fig. 12, the content 501 and the content 502 are both unknown interest contents. In this way, when the user opens the designated interface, the server first pushes the accurate interest content to the user's client and displays it in the designated interface, so that the accurate interest content first arouses the user's reading interest; then, after the user clicks a certain accurate interest content, the unknown interest content is pushed to the user, and having been drawn in by the accurate interest content, the user is more likely to click the unknown interest content. As another embodiment, the accurate interest content and the unknown interest content may be displayed in a staggered manner, as shown in fig. 15.
As an embodiment, the numbers of precise interest contents and unknown interest contents may be set according to the total number of contents to be pushed. Specifically, assuming that the number of contents pushed to the user at a time is not more than 20, the total number of contents to be pushed is 20; if the ratio between the precise interest content and the unknown interest content is set to M1/M2, the number of precise interest contents is 20 × M1/(M1 + M2), and the number of unknown interest contents is the difference between the total number and the number of precise interest contents.
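A minimal sketch of this quota split, with an assumed ratio, is given below.

```python
def split_push_counts(total=20, m1=3, m2=1):
    """Split the push quota with a precise-to-unknown ratio of M1:M2 (assumed values)."""
    precise = total * m1 // (m1 + m2)     # number of precise interest contents: total * M1 / (M1 + M2)
    unknown = total - precise             # the remainder is the number of unknown interest contents
    return precise, unknown
```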
Therefore, by determining the content matching with the known interest information of the user from the plurality of contents, using the content other than the matching content in the plurality of contents as the candidate content, and then determining the content to be pushed from the candidate content according to the estimated interest information of the user, it is possible to avoid that too many duplicate contents exist between the content pushed based on the estimated interest information of the user and the pushed content determined based on the known interest information of the user.
Referring to fig. 16, an embodiment of the present application further provides a data processing apparatus, where the data processing apparatus 1600 includes: a first acquisition unit 1601, a second acquisition unit 1602, an adjustment unit 1603, and a training unit 1604.
The first obtaining unit 1601 is configured to obtain a first score determined by the interest heuristic model to be trained according to the user feature data, where the first score is used to represent a possibility that the sample data belongs to unknown interest of the user.
A second obtaining unit 1602, configured to obtain a weight value determined by the joint probability model according to the sample data, where the weight value is used to represent a possibility that the sample is suitable for training the interest heuristic model.
Further, the second obtaining unit 1602 is further configured to obtain the degree of closeness between the user and each friend and the interest information of each friend; and acquiring the weight value determined by the joint probability model according to the intimacy degree of each friend, the interest information of each friend and the sample data.
Further, the second obtaining unit 1602 is further configured to determine, according to the known interest information of each friend, an interest level of each friend in the sample data; determining the weight value according to the intimacy degree and the interestingness degree of the friend, wherein the greater the intimacy degree and the greater the interestingness degree, the smaller the determined weight value is.
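As a purely hypothetical sketch of this friend-based weighting (the normalization and the exact combination are assumptions; the embodiment only specifies that greater intimacy and greater interest degree yield a smaller weight value), the computation could look as follows.

```python
def friend_based_weight(friend_affinities, friend_interest_in_sample):
    """friend_affinities: friend_id -> intimacy in [0, 1]; friend_interest_in_sample: friend_id -> interest in [0, 1]."""
    if not friend_affinities:
        return 1.0
    signal = sum(a * friend_interest_in_sample.get(f, 0.0)
                 for f, a in friend_affinities.items()) / len(friend_affinities)
    return max(0.0, 1.0 - signal)         # the weight value shrinks as intimacy and interest degree grow
```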
An adjusting unit 1603, configured to adjust the first score according to the weight value, where the weight value is positively correlated with the first score.
A training unit 1604, configured to train the interest heuristic model to be trained according to the adjusted first score, where the trained interest heuristic model is configured to determine predicted interest information of the user according to the user feature data, where the predicted interest information is unknown interest information of the user, and the predicted interest information is used to determine content to be pushed to a client corresponding to the user.
Further, the data processing apparatus 1600 further includes: the joint training unit is used for acquiring a second score determined by the accurate interest model according to the known interest information of the user, wherein the second score is used for representing the possibility that the sample data belongs to the known interest of the user, and the known interest information is the determined interest and hobbies of the user; adjusting the weight value according to the first score and the second score, wherein the weight value is used as an output value of a loss function of the joint probability model; and training the joint probability model according to the adjusted weight value.
Further, the data processing apparatus 1600 further includes: and the pre-training unit is used for pre-training the initial interest heuristic model for a specified number of times according to the sample data and the user characteristic data to obtain the interest heuristic model to be trained.
Specifically, the user characteristic data includes basic characteristics and behavior data of the user, the behavior data is data generated by the user aiming at the operation of a specified application module and other application modules of the client, and the other application modules are application modules related to the specified application module and belonging to the client.
Further, the data processing apparatus 1600 further includes: the pushing unit is used for obtaining the estimated interest information of the user according to the trained interest heuristic model; determining the content to be pushed according to the estimated interest information of the user; and pushing the content to be pushed to a client corresponding to the user.
Specifically, the pushing unit is further configured to determine content meeting the specified requirement as alternative content; and determining the content to be pushed from the alternative content according to the estimated interest information of the user.
Specifically, the pushing unit is further used for determining the content matched with the known interest information of the user from the plurality of contents; and taking the content except the matching content in the plurality of contents as the alternative content.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses and modules may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
Referring to fig. 17, a block diagram of an electronic device according to an embodiment of the present application is shown. The electronic device 10 may be a smart phone, a tablet computer, an electronic book, or other electronic devices capable of running an application. Specifically, in the embodiment of the present application, the electronic device 10 may be the server 200 described above.
The electronic device 10 in the present application may include one or more of the following components: a processor 110, a memory 120, and one or more applications, wherein the one or more applications may be stored in the memory 120 and configured to be executed by the one or more processors 110, the one or more programs configured to perform a method as described in the aforementioned method embodiments.
Processor 110 may include one or more processing cores. The processor 110 connects the various components of the electronic device 10 using various interfaces and lines, and performs the various functions of the electronic device 10 and processes data by running or executing the instructions, programs, code sets, or instruction sets stored in the memory 120 and invoking the data stored in the memory 120. Alternatively, the processor 110 may be implemented in hardware using at least one of Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA), and Programmable Logic Array (PLA). The processor 110 may integrate one or more of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem, and the like. The CPU mainly handles the operating system, the user interface, application programs and the like; the GPU is used for rendering and drawing display content; the modem is used to handle wireless communications. It is understood that the modem may not be integrated into the processor 110 but may instead be implemented by a communication chip.
The Memory 120 may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). The memory 120 may be used to store instructions, programs, code sets, or instruction sets. The memory 120 may include a stored program area and a stored data area, wherein the stored program area may store instructions for implementing an operating system, instructions for implementing at least one function (such as a touch function, a sound playing function, an image playing function, etc.), instructions for implementing the various method embodiments described herein, and the like. The stored data area may also store data created during use by the electronic device 10 (e.g., phone books, audio-visual data, chat log data), and the like.
Referring to fig. 18, a block diagram of a computer-readable storage medium according to an embodiment of the present application is shown. The computer readable medium 1800 has stored therein program code that can be invoked by a processor to perform the methods described in the above-described method embodiments.
The computer-readable storage medium 1800 may be an electronic memory such as a flash memory, an EEPROM (electrically erasable programmable read only memory), an EPROM, a hard disk, or a ROM. Alternatively, the computer-readable storage medium 1800 includes a non-volatile computer-readable storage medium. Computer-readable storage medium 1800 has storage space for program code 1810 for performing any of the method steps described above. The program code can be read from or written to one or more computer program products. Program code 1810 may be compressed, for example, in a suitable form.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not necessarily depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.
Claims (11)
1. A data processing method, comprising:
obtaining a first score determined by an interest heuristic model to be trained according to user characteristic data, wherein the first score is used for representing the possibility that sample data belongs to unknown interest of a user;
obtaining a weight value determined by the joint probability model according to the sample data, wherein the weight value is used for representing the possibility that the sample is suitable for training the interest heuristic model;
adjusting the first score according to the weight value, wherein the weight value is positively correlated with the first score;
and training the interest heuristic model to be trained according to the adjusted first score, wherein the trained interest heuristic model is used for determining the predicted interest information of the user according to the user characteristic data, the predicted interest information is the unknown interest information of the user, and the predicted interest information is used for determining the content to be pushed to the client corresponding to the user.
2. The method of claim 1, wherein obtaining the weight value determined by the joint probability model according to the sample data comprises:
acquiring the intimacy degree between the user and each friend and the interest information of each friend;
and acquiring the weight value determined by the joint probability model according to the intimacy degree of each friend, the interest information of each friend and the sample data.
3. The method of claim 2, wherein the joint probability model is used to:
determining the interest degree of each friend for the sample data according to the known interest information of each friend;
determining the weight value according to the intimacy degree and the interestingness degree of the friend, wherein the greater the intimacy degree and the greater the interestingness degree, the smaller the determined weight value is.
4. The method of claim 1, further comprising:
acquiring a second score determined by the accurate interest model according to the known interest information of the user, wherein the second score is used for representing the possibility that the sample data belongs to the known interest of the user, and the known interest information is the determined interest and hobbies of the user;
adjusting the weight value according to the first score and the second score, wherein the weight value is used as an output value of a loss function of the joint probability model;
and training the joint probability model according to the adjusted weight value.
5. The method of claim 1, wherein before obtaining the first score determined by the interest heuristic model to be trained based on the user characteristic data, further comprising:
and pre-training the initial interest heuristic model for a specified number of times according to the sample data and the user characteristic data to obtain the interest heuristic model to be trained.
6. The method according to any one of claims 1 to 5, wherein the user characteristic data comprises basic characteristics and behavior data of the user, the behavior data is data generated by the user for the operation of a specific application module of the client and other application modules, and the other application modules are application modules related to the specific application module and belonging to the client.
7. The method according to any one of claims 1-5, wherein after training the interest heuristic model to be trained according to the adjusted first score, further comprising:
acquiring the estimated interest information of a user according to the trained interest heuristic model;
determining the content to be pushed according to the estimated interest information of the user;
and pushing the content to be pushed to a client corresponding to the user.
8. The method according to claim 7, wherein the determining the content to be pushed according to the predicted interest information of the user comprises:
determining contents meeting the specified requirements as alternative contents;
and determining the content to be pushed from the alternative content according to the estimated interest information of the user.
9. The method of claim 8, wherein the determining the content meeting the specified requirements, as an alternative content, comprises:
determining content matching the known interest information of the user from the plurality of contents;
and taking the content except the matching content in the plurality of contents as the alternative content.
10. A data processing apparatus, comprising:
the system comprises a first acquisition unit, a second acquisition unit and a third acquisition unit, wherein the first acquisition unit is used for acquiring a first score determined by an interest heuristic model to be trained according to user characteristic data, and the first score is used for representing the possibility that sample data belongs to unknown interest of a user;
a second obtaining unit, configured to obtain a weight value determined by the joint probability model according to the sample data, where the weight value is used to represent a possibility that the sample is suitable for training the interest heuristic model;
the adjusting unit is used for adjusting the first score according to the weight value, and the weight value is positively correlated with the first score;
the training unit is used for training the interest heuristic model to be trained according to the adjusted first score, the trained interest heuristic model is used for determining estimated interest information of a user according to the user characteristic data, the estimated interest information is unknown interest information of the user, and the estimated interest information is used for determining content to be pushed to a client corresponding to the user.
11. A computer-readable medium, characterized in that the readable medium stores program code executable by a processor, the program code causing the processor to perform the method of any one of claims 1-9 when executed by the processor.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010662408.0A CN112749332A (en) | 2020-07-10 | 2020-07-10 | Data processing method, device and computer readable medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010662408.0A CN112749332A (en) | 2020-07-10 | 2020-07-10 | Data processing method, device and computer readable medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112749332A true CN112749332A (en) | 2021-05-04 |
Family
ID=75645231
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010662408.0A Pending CN112749332A (en) | 2020-07-10 | 2020-07-10 | Data processing method, device and computer readable medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112749332A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113239199A (en) * | 2021-05-18 | 2021-08-10 | 重庆邮电大学 | Credit classification method based on multi-party data set |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
REG | Reference to a national code | Ref country code: HK; Ref legal event code: DE; Ref document number: 40043931; Country of ref document: HK |
SE01 | Entry into force of request for substantive examination | ||