CN110209941B - Method for maintaining push content pool, push method, device, medium and server - Google Patents

Publication number
CN110209941B
Authority
CN
China
Prior art keywords
content
target
push
user
pool
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910478470.1A
Other languages
Chinese (zh)
Other versions
CN110209941A (en
Inventor
陈一初
王军博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Calorie Information Technology Co ltd
Original Assignee
Beijing Calorie Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Calorie Information Technology Co ltd
Priority to CN201910478470.1A
Publication of CN110209941A
Application granted
Publication of CN110209941B
Active (current legal status)
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/953: Querying, e.g. by the use of web search engines
    • G06F16/9535: Search customisation based on user profiles and personalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the invention discloses a method for maintaining a push content pool, a push method, a device, a medium and a server. When it is detected that a maintenance event of the push content pool is triggered, at least one piece of target content meeting set conditions is acquired; the predicted vocabulary of the area to be filled in the preset document template corresponding to each piece of target content is determined; the word-vector distance between the keywords corresponding to each piece of target content and the predicted vocabulary is determined, and the target keyword corresponding to the area to be filled is determined according to the distance; and a target file is generated according to the preset document template and the target keyword, and the push content pool is updated based on the target content and the target file. With this scheme, updating the push content pool no longer depends on manual intervention, which solves the problem that maintenance of the push content pool is influenced by subjective human judgment and realizes automatic maintenance of the push content pool.

Description

Method for maintaining push content pool, push method, device, medium and server
Technical Field
The embodiment of the invention relates to data processing technology, and in particular to a method for maintaining a push content pool, a push method, a device, a medium and a server.
Background
With the rapid development of the internet, more and more APPs (applications) push messages to users in order to guide them to use the APP.
In the related art, messages may be pushed by manual screening: suitable content is manually selected from the push content pool and pushed to the corresponding user group. The push content pool includes the content to be pushed and the file (a short piece of copy) corresponding to each piece of content to be pushed. At present, maintenance of the push content pool, such as adding content to the pool and condensing content into a file, cannot be automated and can only be completed with manual intervention. This way of maintaining the push content pool is influenced to some extent by subjective human judgment and has a low degree of automation.
Disclosure of Invention
The embodiment of the invention provides a method for maintaining a push content pool, a push method, a device, a medium and a server, so as to optimize the maintenance scheme of the push content pool and improve the degree of automation.
In a first aspect, an embodiment of the present invention provides a method for maintaining a push content pool, including:
when it is detected that a maintenance event of the push content pool is triggered, acquiring at least one piece of target content meeting set conditions;
determining the predicted vocabulary of the area to be filled in the preset document template corresponding to each piece of target content;
determining the word-vector distance between the keywords corresponding to each piece of target content and the predicted vocabulary, and determining the target keyword corresponding to the area to be filled according to the distance;
and generating a target file according to the preset document template and the target keyword, and updating the push content pool based on the target content and the target file.
In a second aspect, an embodiment of the present invention further provides a push method, including:
acquiring user characteristics of a target user and text characteristics of each piece of content in a preset push content pool;
predicting the click rate of the target user on each piece of content according to the user characteristics and the text characteristics, and determining target content to be pushed based on the click rate;
and acquiring a target file corresponding to the target content in the push content pool, and pushing the target content and the target file to the target user, wherein the target file is automatically generated based on the keywords of the target content and a preset document template.
In a third aspect, an embodiment of the present invention further provides an apparatus for maintaining a pushed content pool, where the apparatus includes:
the content acquisition module is used for acquiring at least one piece of target content meeting set conditions when it is detected that a maintenance event of the push content pool is triggered;
the vocabulary acquisition module is used for determining the predicted vocabulary of the area to be filled in the preset document template corresponding to each piece of target content;
the distance determining module is used for determining the word-vector distance between the keywords corresponding to each piece of target content and the predicted vocabulary, and determining the target keyword corresponding to the area to be filled according to the distance;
and the content pool updating module is used for generating a target file according to the preset document template and the target keyword and updating the push content pool based on the target content and the target file.
In a fourth aspect, an embodiment of the present invention further provides a pushing device, where the pushing device includes:
the characteristic acquisition module is used for acquiring the user characteristics of a target user and the text characteristics of each piece of content in a preset push content pool;
the click rate prediction module is used for predicting the click rate of the target user on each piece of content according to the user characteristics and the text characteristics and determining target content to be pushed based on the click rate;
and the content pushing module is used for acquiring a target file corresponding to the target content in the push content pool and pushing the target content and the target file to the target user, wherein the target file is automatically generated based on the keywords of the target content and a preset document template.
In a fifth aspect, an embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the method for maintaining a pushed content pool according to the embodiment of the present application.
In a sixth aspect, the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the push method according to the embodiments of the present application.
In a seventh aspect, an embodiment of the present invention further provides a server, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor, when executing the computer program, implements the method for maintaining the pushed content pool according to the embodiment of the present application.
In an eighth aspect, an embodiment of the present invention further provides a server, which includes a memory, a processor, and a computer program stored on the memory and executable by the processor, where the processor executes the computer program to implement the push method according to the embodiment of the present application.
The embodiment of the invention provides a scheme for maintaining a push content pool: at least one piece of target content is acquired, and the predicted vocabulary of the area to be filled in the preset document template corresponding to each piece of target content is determined; the word-vector distance between the keywords corresponding to each piece of target content and the predicted vocabulary is calculated, and the target keyword corresponding to the area to be filled is determined according to the distance; and the target keyword is filled into the area to be filled to obtain a target file, and the push content pool is updated based on the target content and the target file. In this way, content to be pushed is added to the push content pool automatically, and the corresponding target file is generated automatically from the content itself. Updating the push content pool therefore no longer depends on manual intervention, which solves the problem that maintenance of the push content pool is influenced by subjective human judgment and realizes automatic maintenance of the push content pool.
Drawings
Fig. 1 is a flowchart of a method for maintaining a push content pool according to an embodiment of the present invention;
Fig. 2 is a flowchart of a push method according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of a push flow according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of an apparatus for maintaining a push content pool according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a pushing device according to an embodiment of the present invention;
Fig. 6 is a block diagram of a server according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Fig. 1 is a flowchart of a method for maintaining a push content pool according to an embodiment of the present invention, where the method may be performed by an apparatus for maintaining a push content pool, where the apparatus may be implemented by hardware and/or software, and is generally integrated in a server. As shown in fig. 1, the method includes:
Step 110, when it is detected that a maintenance event of the push content pool is triggered, acquiring at least one piece of target content meeting set conditions.
In the embodiment of the present invention, a maintenance event of the push content pool is an event that triggers generation or update of the push content pool. The conditions that trigger the maintenance event may vary, and the embodiment of the present application does not specifically limit them. For example, a maintenance event of the push content pool may be triggered periodically. Alternatively, a maintenance event may be triggered according to collected historical operation data of users on set pushed content, where the historical operation data includes data related to operations such as clicking, liking, favoriting or forwarding. Alternatively, a maintenance event may be triggered when the number of contents in the push content pool exceeds a set number threshold, and so on.
In the embodiment of the invention, several pieces of target content are acquired from the candidate push content set according to set conditions. The set conditions may include the timeliness of the content, the popularity of the content, and the like. Timeliness refers to the time distance between the current time and the time at which the content occurred, was released, and so on; the smaller the time distance, the higher the timeliness. The popularity of the content may be measured by the number of times users have clicked, liked or favorited it; the more frequent the user operations, the higher the popularity.
The candidate push content set may include high-quality articles, topics and the like in a relevant field within a set time period, and may further include content generated by users' own operations. For example, the content in the candidate push content set may be high-quality articles and hot topics, determined through statistical analysis and supplied by a third-party service organization, in a field related to the service provided by the APP. Alternatively, the content may be articles uploaded or shared by users of the APP, activities they attended or community updates they follow, or other content on the internet related to the above. It should be noted that a community can be regarded as a social networking platform, that is, a platform that helps users interact. Content in the candidate push content set that meets the set conditions is taken as target content.
Illustratively, when it is detected that a maintenance event of the push content pool is triggered, the candidate push content set is obtained, a weight value is determined for each piece of content in the set according to timeliness and/or popularity, and at least one piece of target content is determined according to the weight values. The higher the timeliness of a piece of content, the higher the weight it is given, and the lower the timeliness, the lower the weight. Likewise, the higher the popularity of a piece of content, the higher the weight it is given, and the lower the popularity, the lower the weight. Each piece of content can therefore be given a corresponding weight value according to timeliness and/or popularity.
For example, when it is detected that a maintenance event of the push content pool is triggered, the candidate push contents in the candidate push content set are input into a preset classification model, the classification model assigns a weight value to each candidate push content based on timeliness and popularity, and a set number of candidate push contents whose weight values exceed a set threshold are output as target content. It should be noted that the classification model may be a pre-constructed neural network model. The classification model can assign a weight value to each candidate push content separately for timeliness and for popularity; these are recorded as reference weight values. The reference weight values of the same piece of candidate push content can be summed, and the sum used as the weight value of that content. Alternatively, the larger of the reference weight values of the same piece of candidate push content is selected as its weight value.
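The weighting and selection described above can be sketched as follows. This is only an illustration under stated assumptions: the patent assigns the reference weight values with a classification model, whereas the two linear scoring functions here (`timeliness_weight`, `popularity_weight`), their parameters, and the `Candidate` fields are hypothetical stand-ins. Only the combination step (sum or maximum of the reference weights, then a threshold and a set number) follows the text.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    content_id: str
    age_hours: float    # time since release; smaller means more timely
    interactions: int   # clicks + likes + favorites + forwards


def timeliness_weight(age_hours, horizon_hours=72.0):
    """Hypothetical reference weight in [0, 1]: 1 for new content, 0 at the horizon."""
    return max(0.0, 1.0 - age_hours / horizon_hours)


def popularity_weight(interactions, saturation=1000):
    """Hypothetical reference weight in [0, 1]: grows with interactions, capped at 1."""
    return min(1.0, interactions / saturation)


def select_targets(candidates, threshold=0.8, limit=2, combine="sum"):
    """Combine the reference weights (sum or max), keep contents whose weight
    exceeds the threshold, and return at most `limit` ids, heaviest first."""
    scored = []
    for c in candidates:
        refs = (timeliness_weight(c.age_hours), popularity_weight(c.interactions))
        weight = sum(refs) if combine == "sum" else max(refs)
        if weight > threshold:
            scored.append((weight, c.content_id))
    scored.sort(reverse=True)
    return [content_id for _, content_id in scored[:limit]]


candidates = [
    Candidate("a", age_hours=6, interactions=900),
    Candidate("b", age_hours=60, interactions=50),
    Candidate("c", age_hours=36, interactions=500),
]
targets_by_sum = select_targets(candidates)
targets_by_max = select_targets(candidates, combine="max")
```

With the sum rule, content "c" qualifies because moderate timeliness and moderate popularity add up; with the maximum rule it does not, since neither reference weight alone clears the threshold.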
Step 120, determining the predicted vocabulary of the area to be filled in the preset document template corresponding to each piece of target content.
In the embodiment of the application, the preset document template can be a pre-designed general template with an area to be filled, for example "xAPP () course", "the 'little' on that () thing", or "(), what () to move with", and so on.
In the embodiment of the application, the predicted vocabulary is vocabulary, determined by prediction, that can be filled into the area to be filled (that is, the brackets) in the preset document template.
Illustratively, a neural network model in the form of a continuous bag-of-words model predicts the predicted vocabulary of the area to be filled in the preset document template corresponding to each piece of target content. The continuous bag-of-words model, also called the CBOW model, is one of the Word2vec models. Word2vec is a group of related models used to generate word vectors. These models are shallow, two-layer neural networks trained to reconstruct the linguistic contexts of words. The network represents words as vectors and is trained to guess the words at adjacent positions; under the bag-of-words assumption of Word2vec, the order of the words is unimportant. For example, the context of each piece of target content can be analyzed with Word2vec in its continuous bag-of-words form to predict the vocabulary for the area to be filled. Because the Word2vec model is used for prediction, the predicted vocabulary is expressed as a word vector, denoted the first word vector.
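To make the prediction step more concrete, the sketch below imitates CBOW-style inference in a heavily simplified form: the vectors of the context words are averaged (so word order is ignored) and the vocabulary word closest to that average is taken as the predicted word. It is not a trained Word2vec model; the toy two-dimensional vectors, the vocabulary and the function names are all assumptions made for illustration.

```python
import math

# Toy word vectors (assumed). A real system would load vectors from a trained model.
VECTORS = {
    "yoga": [0.9, 0.1],
    "stretch": [0.8, 0.2],
    "running": [0.1, 0.9],
    "jogging": [0.2, 0.8],
    "course": [0.5, 0.5],
}


def mean_vector(words):
    """Average the vectors of the known context words; order does not matter."""
    vecs = [VECTORS[w] for w in words if w in VECTORS]
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def predict_fill_word(context_words):
    """Return (word, vector) for the vocabulary word nearest the averaged context."""
    ctx = mean_vector(context_words)
    candidates = [w for w in VECTORS if w not in context_words]
    best = max(candidates, key=lambda w: cosine(ctx, VECTORS[w]))
    return best, VECTORS[best]


predicted_word, first_word_vector = predict_fill_word(["yoga", "course"])
```

The returned vector plays the role of the first word vector that is compared against keyword vectors in step 130.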
Step 130, determining the word-vector distance between the keywords corresponding to each piece of target content and the predicted vocabulary, and determining the target keyword corresponding to the area to be filled according to the distance.
It should be noted that there are many ways to extract keywords from text, and the present invention does not limit the specific way. For example, the term frequency-inverse document frequency (TF-IDF) algorithm may be used to extract the keywords of each piece of target content.
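As an illustration of TF-IDF keyword extraction, the following minimal sketch scores each term of one document by its term frequency multiplied by the logarithm of the inverse document frequency. Whitespace tokenization and the sample documents are assumptions; Chinese text would first need a word segmenter, and production systems often apply smoothing that is omitted here.

```python
import math
from collections import Counter


def tf_idf_keywords(documents, doc_index, top_k=2):
    """Return the top_k terms of documents[doc_index] ranked by TF-IDF."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    tokens = tokenized[doc_index]
    tf = Counter(tokens)
    scores = {
        term: (count / len(tokens)) * math.log(n_docs / df[term])
        for term, count in tf.items()
    }
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]


docs = [
    "yoga stretch routine for morning yoga practice",
    "running plan for morning jogging",
    "strength routine for evening practice",
]
keywords = tf_idf_keywords(docs, 0)
```

Terms that occur in every document (here "for") receive a score of zero, which is why they never surface as keywords.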
In the embodiment of the present invention, a word vector (word embedding) maps a word or phrase to a vector of real numbers, that is, it converts the word or phrase into a distributed representation. A distributed representation represents a word or phrase as a continuous, dense vector of fixed length. The distributed representation has the following advantages:
(1) Similarity relations exist between words: there is a notion of "distance" between words, which is very helpful for many natural language processing tasks.
(2) More information is contained: a word vector can carry more information, and each dimension has a specific meaning. With one-hot features, the feature vector can be pruned, whereas a word vector cannot.
Exemplarily, the keywords corresponding to each piece of target content are obtained; a second word vector corresponding to each keyword is determined by a set neural network model; the distance between the first word vector and each second word vector is calculated, and the candidate second word vectors whose distance is smaller than a set distance threshold are determined; and the keywords corresponding to the candidate second word vectors are determined as the target keywords for the area to be filled. Specifically, after the keywords of each piece of target content are determined, a Word2vec model may be used to determine the second word vectors corresponding to those keywords. The first word vector corresponding to the predicted vocabulary is then obtained, and the distance between the first word vector and each second word vector is calculated. A second word vector whose distance is smaller than the set distance threshold is called a candidate second word vector. The keyword corresponding to a candidate second word vector is determined as a target keyword of the area to be filled in the preset document template corresponding to that piece of target content.
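The threshold test on the word-vector distance might look like the sketch below. The patent does not name a distance metric, so cosine distance is an assumption here, as are the toy vectors, the threshold value and the function names.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity; 0 means the two vectors point the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def pick_target_keywords(first_vector, keyword_vectors, max_distance=0.1):
    """Keep keywords whose second word vector lies within max_distance of the
    first word vector, ordered from nearest to farthest."""
    scored = [(cosine_distance(first_vector, vec), kw)
              for kw, vec in keyword_vectors.items()]
    return [kw for dist, kw in sorted(scored) if dist < max_distance]


first_word_vector = [0.8, 0.2]   # vector of the predicted vocabulary
second_word_vectors = {          # vectors of the content's keywords
    "yoga": [0.9, 0.1],
    "morning": [0.5, 0.5],
    "jogging": [0.2, 0.8],
}
target_keywords = pick_target_keywords(first_word_vector, second_word_vectors)
```

Raising `max_distance` admits more keywords at the cost of a looser match between the keyword and the slot in the template.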
It should be noted that there are many methods for generating word vectors, such as statistics-based methods and language-model-based methods. Statistics-based methods include methods based on a co-occurrence matrix or on singular value decomposition. Language-model-based methods include the CBOW method and the like.
Step 140, generating a target file according to the preset document template and the target keyword, and updating the push content pool based on the target content and the target file.
Illustratively, the target file is obtained by filling the target keywords into the area to be filled of the preset document template. The push content pool is then updated based on the newly acquired target content and the target file corresponding to each piece of target content. For example, the newly acquired target content and the corresponding target files are added to the push content pool. Alternatively, original content in the push content pool that has the same target file is replaced with the newly acquired target content, and so on.
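A minimal sketch of the filling and updating step is given below. The "()" slot marker follows the template examples in the description; the sample template text, the dictionary-based pool keyed by a content id, and both function names are assumptions for illustration.

```python
def fill_template(template, keywords, slot="()"):
    """Fill the slots of the template, left to right, with successive keywords."""
    result = template
    for keyword in keywords:
        if slot not in result:
            break
        result = result.replace(slot, keyword, 1)
    return result


def update_pool(pool, content_id, content, target_file):
    """Add the content with its file, or replace the entry if the id is already pooled."""
    pool[content_id] = {"content": content, "file": target_file}
    return pool


pool = {}
target_file = fill_template("Try the () course today", ["yoga"])
update_pool(pool, "article-42", "Full text of the article ...", target_file)
two_slot_file = fill_template("(), what () to move with", ["A", "B"])
```

Keying the pool by content id makes "add" and "replace" the same operation; a pool that replaces by matching target file instead would need a lookup on the file text.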
According to the technical scheme of this embodiment, at least one piece of target content is acquired, and the predicted vocabulary of the area to be filled in the preset document template corresponding to each piece of target content is determined; the word-vector distance between the keywords corresponding to each piece of target content and the predicted vocabulary is calculated, and the target keyword corresponding to the area to be filled is determined according to the distance; and the target keyword is filled into the area to be filled to obtain a target file, and the push content pool is updated based on the target content and the target file. In this way, content to be pushed is added to the push content pool, and the corresponding target file is generated automatically from the content itself. Updating the push content pool therefore no longer depends on manual intervention, which solves the problem that maintenance of the push content pool is influenced by subjective human judgment and realizes automatic maintenance of the push content pool.
On the basis of the above technical solution, after at least one piece of target content meeting the set conditions is acquired, the method further includes: determining the content association degree between each piece of first target content, whose file has not yet been determined, and second target content, whose file has already been determined; and when the association degree exceeds a set threshold, determining the target file corresponding to the first target content according to the target file corresponding to the second target content. The content association degree refers to the similarity between two contents. There are various ways to determine the similarity of contents. For example, it can be determined by whether the two contents belong to the same topic set. Alternatively, it can be determined by the similarity of the characters in the two contents. Alternatively, it can be determined by the similarity of the purposes the two contents are intended to achieve. The advantage of this arrangement is that presetting some matching rules allows the target file corresponding to a piece of target content to be determined quickly.
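The character-similarity variant of the association degree can be sketched with the standard library's `difflib.SequenceMatcher`. This is one possible measure among those listed above, not necessarily the one used in practice; the threshold value, the sample texts and the function names are assumptions.

```python
from difflib import SequenceMatcher


def association_degree(text_a, text_b):
    """Character-level similarity in [0, 1]."""
    return SequenceMatcher(None, text_a, text_b).ratio()


def reuse_file(new_content, pooled, threshold=0.8):
    """Return the file of the most similar pooled content if its association
    degree exceeds the threshold, otherwise None."""
    best_file, best_score = None, 0.0
    for content, file_text in pooled:
        score = association_degree(new_content, content)
        if score > best_score:
            best_file, best_score = file_text, score
    return best_file if best_score > threshold else None


# (content, file) pairs whose file has already been determined
pooled = [
    ("10 minute morning yoga routine", "Start your day with yoga"),
    ("marathon training plan", "Run further"),
]
reused = reuse_file("10 minute morning yoga routines", pooled)
not_reused = reuse_file("healthy dinner recipes", pooled)
```

When `reuse_file` returns None, the content would go through the template-filling path of steps 120 to 140 instead.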
On the basis of the above technical solution, after at least one piece of target content meeting the set conditions is acquired, the method further includes: acquiring registration information of a new user, labeling the new user according to the registration information, matching the label against a preset file set, and determining the successfully matched file. When the target user is a new user, after the target file is generated according to the preset document template and the target keyword, a lower weight value is assigned to the generated target file and a higher weight value is assigned to the successfully matched file, so that the successfully matched file is used as the target file of the target content pushed to the new user. The advantage of this arrangement is that the file is chosen according to the group characteristics of new users, which can attract their attention and in turn improve the click rate of the pushed content.
Fig. 2 is a flowchart of a push method provided by an embodiment of the present invention, which may be executed by a push device, wherein the push device may be implemented by hardware and/or software, and is generally integrated in a server. As shown in fig. 2, the method includes:
Step 210, obtaining user characteristics of the target user and text characteristics of each piece of content in a preset push content pool.
In the embodiment of the present invention, the target user may be all users of the APP, including registered users and guest users, or may be at least one type of user among all users of the APP. For a registered user, a user portrait may be determined based on the user's registration information and historical operation information. For a guest user, the registered user most similar to the guest user can be determined according to the guest user's historical operation information, and the user portrait of that registered user is used as a reference portrait of the guest user. The user characteristics of the target user are obtained by screening, cleaning and enhancing the data in the user portrait. The user characteristics include gender, age, region, type of information of interest, and the like. The type of information of interest may be determined based on historical operation information. For example, if a target user frequently browses instrument-related articles or courses, the type of information of interest of the target user is determined to be instruments. Or, if the target user often browses articles or courses related to yoga or jogging, the type of information of interest of the target user is determined to be aerobic exercise, and so on.
In the embodiment of the invention, the preset push content pool includes the content to be pushed, the file corresponding to each piece of content to be pushed, and the like, where the content to be pushed includes information such as articles, community updates or popular activities. The file is a highly condensed summary of the content to be pushed, written to attract attention, and each file is associated with its content to be pushed. It should be noted that the preset push content pool is maintained using the method for maintaining the push content pool described in the above embodiment.
In the embodiment of the invention, word embedding is performed on each piece of content to be pushed in the preset push content pool to obtain the text characteristics of each piece of content to be pushed. Word embedding refers to embedding a high-dimensional space, whose dimension equals the number of all words, into a continuous vector space of much lower dimension, with each word or phrase mapped to a vector over the real numbers. Word embedding methods include artificial neural networks, dimensionality reduction of the word co-occurrence matrix, probabilistic models, explicit representation of the context in which a word appears, and the like.
Illustratively, user portrait data of the target user on the current day is obtained, and the user characteristics of the target user are determined from the user portrait data. Because the users who use the APP differ from day to day, the users of the APP need to be counted again each day. In addition, if the target user is not all users of the APP, the target users need to be screened out from all users using the APP on that day according to the setting requirement. For example, if the setting requirement specifies that intelligent pushing is performed for registered users, the registered users are screened out from all users using the APP on the current day and taken as the target users for that day. The user portrait data of the target users on the current day is acquired, and the user portrait data of each target user is screened, cleaned and enhanced to obtain the user characteristics of each target user. In addition, the keywords of each piece of content in the push content pool of the current day are obtained and used as the text characteristics of each piece of content. For example, each piece of content in the push content pool on the current day is obtained, and a word embedding operation is performed on it with a Word2vec model to obtain its text characteristics.
Step 220, predicting the click rate of the target user on each piece of content according to the user characteristics and the text characteristics, and determining the target content to be pushed based on the click rate.
Illustratively, synthesized features are generated by computing the Cartesian product of the user features and the text features. The Cartesian product is one way to implement feature crossing, and a synthesized feature is a feature derived by crossing a user feature with a text feature. For example, the latest user features of the current day are crossed with the text features of each piece of content in the updated push content pool to obtain the synthesized features.
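Feature crossing by Cartesian product can be illustrated as below. The sketch assumes that both feature sets are categorical strings, which is a simplification: the text features in this embodiment come from word embedding, so a real pipeline would need to discretize or otherwise adapt them before crossing. The feature names are invented for the example.

```python
from itertools import product


def cross_features(user_features, text_features):
    """Pair every user feature with every text feature; each pair is one
    synthesized feature."""
    return [f"{u}&{t}" for u, t in product(user_features, text_features)]


user_features = ["gender=f", "age=25-34", "interest=aerobic"]
text_features = ["kw=yoga", "kw=jogging"]
synthesized = cross_features(user_features, text_features)
```

Three user features crossed with two text features yield six synthesized features, so the feature count grows multiplicatively with the size of each set.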
The click rate of the target user on each piece of content is predicted based on the synthesized features through a pre-constructed push model. The push model is a neural network model constructed by taking historical user operation data of pushed content as a training set and an eXtreme Gradient Boosting (XGBoost) algorithm as a classifier. The historical user operation data may be the user operation data for pushed content within a set time length, for example, data on actions such as clicking, forwarding, bookmarking or liking performed by all users of the APP over the past two weeks. Alternatively, the historical user operation data may be obtained by selecting a set proportion of all users of the APP as sample users, randomly sending set push content to the sample users, and acquiring the user operation data of the sample users for that content. The second approach is motivated by the content push scheme of the related art, in which users are grouped, different groups are labeled, and messages are pushed based on the labels. If the historical operation data used for model training is collected after content has been pushed under such a grouping matching rule, some implicit matching relationships between users and content are never recorded. A push model trained on that data does not contain these implicit matching relationships and can therefore never reach them, which causes the long tail problem: users never receive the content they implicitly care about.
By contrast, when the historical user data is collected by randomly sending the set push content, the matching relationship between users and content is not manually specified, so the collected data reflects the users' points of interest more truly, which mitigates the long tail problem.
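The random collection of training data can be sketched as below. This is only an illustration under stated assumptions: the 5% proportion, the user and content identifiers, and the helper names `sample_users` and `assign_random_content` are hypothetical and do not come from the embodiment.

```python
import random


def sample_users(all_user_ids, proportion, seed=None):
    """Pick a set proportion of all users as sample users, without replacement."""
    rng = random.Random(seed)
    k = max(1, round(len(all_user_ids) * proportion))
    return rng.sample(list(all_user_ids), k)


def assign_random_content(sample_user_ids, content_ids, seed=None):
    """Give every sample user one randomly chosen piece of set push content."""
    rng = random.Random(seed)
    return {uid: rng.choice(content_ids) for uid in sample_user_ids}


users = [f"user_{i}" for i in range(100)]      # hypothetical user ids
sampled = sample_users(users, proportion=0.05, seed=42)
assignment = assign_random_content(sampled, ["c1", "c2", "c3"], seed=42)
```

Because no user-to-content rule is applied when the content is assigned, the operation data later collected for `assignment` is not biased by a manually specified grouping.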
The pieces of content are sorted and filtered according to the click rate to obtain the target content. For example, the pieces of content are arranged in descending order of click rate, and a set number of top-ranked pieces are taken as the target content. Optionally, after the set number of top-ranked pieces is determined, an intermediate table, which stores identification information of pushed content, is queried to obtain the identification information of the content pushed within a set time period. If a top-ranked piece of content has the same identification information as a piece of pushed content, it is discarded; otherwise, it is taken as target content.
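The sorting and filtering step can be sketched as follows, assuming the predicted click rates are held in a dictionary keyed by content identifier and the intermediate table is represented as a set of already-pushed identifiers. The names `select_target_content`, `predicted_ctr` and `pushed_ids` are illustrative only.

```python
def select_target_content(predicted_ctr, pushed_ids, top_n):
    """Take the top_n pieces by click rate, then drop those already pushed."""
    ranked = sorted(predicted_ctr.items(), key=lambda kv: kv[1], reverse=True)
    top = ranked[:top_n]
    # Discard any top-ranked piece whose id appears in the intermediate table.
    return [content_id for content_id, _ in top if content_id not in pushed_ids]


predicted_ctr = {"a": 0.12, "b": 0.30, "c": 0.05, "d": 0.22}  # hypothetical scores
pushed_ids = {"b"}                                             # hypothetical intermediate table
targets = select_target_content(predicted_ctr, pushed_ids, top_n=3)
print(targets)
```

Note that filtering happens after the top-ranked pieces are chosen, as in the text, so the result may contain fewer than `top_n` pieces.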
Step 230, obtaining a target file corresponding to the target content in the push content pool, and pushing the target content and the target file to the target user.
In the embodiment of the invention, the target file is automatically generated based on the keywords of the target content and the preset file template. The method of generating the target file is described in the above embodiments and is not repeated here.
Illustratively, the target file corresponding to the target content is selected from the preset push content pool, and the target content and the target file are pushed to the target user, so that the whole push process is automated and intelligent, and inaccurate pushes caused by manual intervention in the push process are avoided.
Fig. 3 is a schematic diagram of a push flow provided by an embodiment of the present invention. As shown in Fig. 3, a server collects high-quality articles, community updates, popular activities and the like to form an alternative push content set. The weight value of each piece of content in the alternative push content set is determined according to timeliness and/or heat, and at least one piece of target content is determined according to the weight values. In addition, a target file corresponding to the target content is automatically generated based on the keywords of the target content and a preset file template. The push content pool (also referred to as a push recall pool) is updated based on the target content and the target file. User portrait data of a target user on the current day is acquired, and the user features of the target user are determined according to the user portrait data. A word embedding operation is performed on each piece of content in the push content pool using a Word2vec model to obtain the text features of each piece of content. Feature combinations of the user features and the text features are computed as a Cartesian product and used as the synthesized features. A classifier generated based on the extreme gradient boosting algorithm is trained with the historical operation data of pushed content as a training set, so as to construct the push model. Through the push model, the click rate of the target user on each piece of content is predicted based on the synthesized features, the pieces of content are sorted and filtered according to the click rate to obtain the target content, and a push task is executed based on the target content.
According to the technical solution of this embodiment, the click rate is predicted with a neural network model based on the user features and the text features, which improves the match between users and pushed content, increases the diversity of the content pushed each day, and reduces the workload of staff. In addition, because the matching logic between content and users relies on feedback from the users' historical operation data, push accuracy is improved.
On the basis of the above embodiment, pushing the target content and the target file to the target user includes: determining a personalized push strategy for each target user according to the historical user operation data, where the historical user operation data includes the times at which pushed content was clicked, the times at which push reminders were turned off, and uninstall behavior data, and the personalized push strategy includes the push time and the number of pushes; and pushing the target content and the target file to each target user according to the personalized push strategy. The advantage of this design is that the number of pushes and the push time are adjusted for each user according to that user's click times, times of turning off push reminders, and uninstall behavior data, so that the individual needs of each target user are met.
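The embodiment does not state how the push time and the number of pushes are derived from the historical data, so the following is only one possible sketch. The heuristic (push at the hour the user clicks most often, and reduce the number of pushes for users who have turned off reminders or uninstalled the APP), the default values, and the function name `build_push_strategy` are all assumptions.

```python
from collections import Counter
from datetime import datetime


def build_push_strategy(click_times, reminder_off_times, has_uninstalled,
                        default_hour=20, default_count=3):
    """Derive a per-user push hour and push count from historical behavior."""
    if click_times:
        # Assumed heuristic: push at the hour of day the user clicks most often.
        push_hour = Counter(t.hour for t in click_times).most_common(1)[0][0]
    else:
        push_hour = default_hour
    push_count = default_count
    if reminder_off_times or has_uninstalled:
        # Assumed heuristic: push less to users who showed signs of fatigue.
        push_count = 1
    return {"push_hour": push_hour, "push_count": push_count}


clicks = [datetime(2019, 6, 1, 7, 30), datetime(2019, 6, 2, 7, 5),
          datetime(2019, 6, 2, 21, 0)]
strategy = build_push_strategy(clicks, reminder_off_times=[], has_uninstalled=False)
cautious = build_push_strategy(clicks, [datetime(2019, 6, 3, 9, 0)], False)
```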
Fig. 4 is a schematic structural diagram of an apparatus for maintaining a push content pool according to an embodiment of the present invention, where the apparatus may execute the method for maintaining a push content pool according to the embodiment of the present invention, and the apparatus may be implemented by hardware and/or software and is generally integrated in a server. As shown in fig. 4, the apparatus includes:
a content obtaining module 410, configured to obtain at least one piece of target content meeting a set condition upon detecting that a maintenance event of the push content pool is triggered;
a vocabulary obtaining module 420, configured to determine the predicted vocabulary of the area to be filled in the preset file template corresponding to each piece of target content;
a distance determining module 430, configured to determine the word-vector distance between the keywords corresponding to each piece of target content and the predicted vocabulary, and determine a target keyword corresponding to the area to be filled according to the distance;
and a content pool updating module 440, configured to generate a target file according to the preset file template and the target keyword, and update the push content pool based on the target content and the target file.
The embodiment of the invention provides an apparatus for maintaining a push content pool. The apparatus acquires at least one piece of target content and determines the predicted vocabulary of the area to be filled in the preset file template corresponding to each piece of target content; calculates the word-vector distance between the keywords corresponding to each piece of target content and the predicted vocabulary, and determines a target keyword corresponding to the area to be filled according to the distance; and fills the target keyword into the area to be filled to obtain a target file and updates the push content pool based on the target content and the target file. In this way, content to be pushed is added to the push content pool and the corresponding target file is generated automatically from the content itself. Updating the push content pool therefore no longer depends on manual intervention, the maintenance of the push content pool is no longer affected by subjective human judgment, and the push content pool is maintained automatically.
Optionally, the content obtaining module 410 is specifically configured to:
obtain an alternative push content set, determine the weight value of each piece of content in the alternative push content set according to timeliness and/or heat, and determine at least one piece of target content according to the weight values.
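The weighting formula is not given in this passage. As a rough sketch under explicit assumptions, timeliness is modeled below as an exponential decay with a hypothetical half-life, heat is normalized by the maximum heat in the set, and the two are combined linearly with a hypothetical coefficient `alpha`; the names `content_weight` and `pick_targets` are illustrative.

```python
def content_weight(age_hours, heat, max_heat, half_life_hours=24.0, alpha=0.5):
    """Combine timeliness and heat into one weight in [0, 1] (assumed formula)."""
    timeliness = 0.5 ** (age_hours / half_life_hours)   # halves every half-life
    popularity = heat / max_heat if max_heat > 0 else 0.0
    return alpha * timeliness + (1 - alpha) * popularity


def pick_targets(candidates, top_k):
    """Return the ids of the top_k candidates by weight."""
    max_heat = max((c["heat"] for c in candidates), default=0)
    ranked = sorted(
        candidates,
        key=lambda c: content_weight(c["age_hours"], c["heat"], max_heat),
        reverse=True,
    )
    return [c["id"] for c in ranked[:top_k]]


candidates = [  # hypothetical alternative push content set
    {"id": "article", "age_hours": 0, "heat": 80},
    {"id": "post", "age_hours": 24, "heat": 100},
    {"id": "event", "age_hours": 72, "heat": 10},
]
targets = pick_targets(candidates, top_k=2)
```

Setting `alpha` to 1 or 0 reduces the weight to timeliness only or heat only, which corresponds to the "and/or" in the text.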
Optionally, the vocabulary acquiring module 420 is specifically configured to:
predict, based on a neural network model in the form of a continuous bag-of-words (CBOW) model, the predicted vocabulary of the area to be filled in the preset file template corresponding to each piece of target content.
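For orientation, the forward pass of a continuous bag-of-words model averages the input vectors of the context words and scores every vocabulary word against that average. The sketch below uses tiny hand-written vectors in place of a trained model; the vocabulary, the vector values, and the function name `cbow_predict` are assumptions made for illustration.

```python
import math


def cbow_predict(context_words, input_vectors, output_vectors):
    """Predict the most likely word for a blank from its context words."""
    dim = len(next(iter(input_vectors.values())))
    # Hidden layer: element-wise mean of the context words' input vectors.
    hidden = [
        sum(input_vectors[w][i] for w in context_words) / len(context_words)
        for i in range(dim)
    ]
    # Output layer: dot product with each output vector, then softmax.
    scores = {w: sum(h * v for h, v in zip(hidden, vec))
              for w, vec in output_vectors.items()}
    total = sum(math.exp(s) for s in scores.values())
    probs = {w: math.exp(s) / total for w, s in scores.items()}
    return max(probs, key=probs.get), hidden


input_vectors = {"burn": [1.0, 0.0], "calories": [0.8, 0.2]}     # toy values
output_vectors = {"running": [2.0, 0.0], "recipe": [0.0, 2.0]}   # toy values
word, hidden = cbow_predict(["burn", "calories"], input_vectors, output_vectors)
```

In practice the vectors would come from a model trained on a corpus; only the structure of the prediction is shown here.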
Optionally, the distance determining module 430 is specifically configured to:
acquire the keywords corresponding to each piece of target content;
determine, through a set neural network model, a first word vector corresponding to the predicted vocabulary and second word vectors corresponding to the keywords;
calculate the distance between the first word vector and each second word vector, and determine the alternative second word vectors whose distance is smaller than a set distance threshold;
and determine the keywords corresponding to the alternative second word vectors as the target keywords corresponding to the area to be filled.
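The distance-threshold selection above can be sketched as follows. The passage does not name the distance measure, so cosine distance is used here as an assumption; the vectors, the threshold value, and the function names are likewise illustrative.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 1.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        return 1.0
    return 1.0 - dot / norm


def select_target_keywords(first_vector, keyword_vectors, threshold):
    """Keep the keywords whose second word vector lies within the threshold."""
    return [
        keyword
        for keyword, second_vector in keyword_vectors.items()
        if cosine_distance(first_vector, second_vector) < threshold
    ]


first_vector = [1.0, 0.0]            # hypothetical vector of the predicted vocabulary
keyword_vectors = {                  # hypothetical keyword vectors
    "running": [0.9, 0.1],
    "jogging": [0.8, 0.3],
    "recipe": [0.0, 1.0],
}
selected = select_target_keywords(first_vector, keyword_vectors, threshold=0.05)
```

A looser threshold admits more keywords for the area to be filled; a tighter one admits fewer.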
Optionally, the apparatus is further configured to:
after at least one piece of target content meeting the set condition is obtained, determine the content association degree between each piece of first target content, whose file has not yet been determined, and second target content, whose file has been determined;
and when the association degree exceeds a set threshold, determine the target file corresponding to the first target content according to the target file corresponding to the second target content.
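The association measure is not specified in this passage. The sketch below assumes Jaccard similarity over keyword sets as the association degree and simply reuses the file of the most closely associated content; both choices, and the names `jaccard` and `reuse_file`, are assumptions.

```python
def jaccard(a, b):
    """Jaccard similarity of two keyword collections (assumed association degree)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def reuse_file(first_keywords, determined, threshold):
    """Return the file of the most associated content above threshold, else None."""
    best_file, best_score = None, threshold
    for keywords, file_text in determined:
        score = jaccard(first_keywords, keywords)
        if score > best_score:
            best_file, best_score = file_text, score
    return best_file


determined = [  # hypothetical second target content with files already determined
    (["yoga", "stretch", "beginner"], "Start your day with yoga"),
    (["diet", "salad"], "Eat light tonight"),
]
reused = reuse_file(["yoga", "morning", "stretch"], determined, threshold=0.4)
```

When no association degree exceeds the threshold, `reuse_file` returns `None`, and the file would be generated from the template as usual.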
Fig. 5 is a schematic structural diagram of a pushing apparatus according to an embodiment of the present invention, where the pushing apparatus may execute the pushing method according to the embodiment of the present invention, and the apparatus may be implemented by hardware and/or software, and is generally integrated in a server. As shown in fig. 5, the apparatus includes:
a feature obtaining module 510, configured to obtain a user feature of a target user and a text feature of each content in a preset push content pool;
a click rate prediction module 520, configured to predict the click rate of the target user on each piece of content according to the user features and the text features, and determine target content to be pushed based on the click rate;
a content pushing module 530, configured to obtain a target file corresponding to the target content in the push content pool, and push the target content and the target file to the target user, where the target file is automatically generated based on the keywords of the target content and a preset file template.
The embodiment of the invention provides a pushing apparatus that predicts the click rate with a neural network model based on the user features and the text features, which improves the match between users and pushed content, increases the diversity of the content pushed each day, and reduces the workload of staff. In addition, because the matching logic between content and users relies on feedback from the users' historical operation data, push accuracy is improved.
Optionally, the feature obtaining module 510 is specifically configured to:
acquire user portrait data of the target user on the current day, and determine the user features of the target user according to the user portrait data;
and acquire the keywords of each piece of content in the push content pool on the current day as the text features of each piece of content.
Optionally, the click rate prediction module 520 is specifically configured to:
generate synthesized features by calculating the Cartesian product of the user features and the text features;
predict the click rate of the target user on each piece of content based on the synthesized features through a pre-constructed push model, where the push model is a neural network model constructed by taking the historical operation data of pushed content as a training set and the extreme gradient boosting algorithm as a classifier;
and sort and filter the pieces of content according to the click rate to obtain the target content.
Optionally, the content pushing module 530 is specifically configured to:
determine a personalized push strategy for each target user according to the historical user operation data, where the historical user operation data includes the times at which pushed content was clicked, the times at which push reminders were turned off, and uninstall behavior data, and the personalized push strategy includes the push time and the number of pushes;
and push the target content and the target file to each target user according to the personalized push strategy.
The embodiment of the invention also provides a server, and the server can be integrated with the device for maintaining the push content pool provided by the embodiment of the invention. Fig. 6 is a block diagram of a server according to an embodiment of the present invention. The server may include a memory 610, a processor 620, and a computer program stored on the memory 610 and executable on the processor 620, wherein the processor 620 implements the method for maintaining the pushed content pool according to the embodiment of the present invention when executing the computer program.
The server provided by the embodiment of the invention acquires at least one piece of target content and determines the predicted vocabulary of the area to be filled in the preset file template corresponding to each piece of target content; calculates the word-vector distance between the keywords corresponding to each piece of target content and the predicted vocabulary, and determines a target keyword corresponding to the area to be filled according to the distance; and fills the target keyword into the area to be filled to obtain a target file and updates the push content pool based on the target content and the target file. In this way, content to be pushed is added to the push content pool and the corresponding target file is generated automatically from the content itself. Updating the push content pool therefore no longer depends on manual intervention, the maintenance of the push content pool is no longer affected by subjective human judgment, and the push content pool is maintained automatically.
The embodiment of the invention also provides another server, in which the pushing apparatus provided by the embodiment of the invention can be integrated. The server may include a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the push method according to the embodiment of the invention when executing the computer program.
The server provided by the embodiment of the invention predicts the click rate with a neural network model based on the user features and the text features, which improves the match between users and pushed content, increases the diversity of the content pushed each day, and reduces the workload of staff. In addition, because the matching logic between content and users relies on feedback from the users' historical operation data, push accuracy is improved.
Embodiments of the present invention also provide a storage medium containing computer-executable instructions, which when executed by a computer processor, perform a method for maintaining a pushed content pool, the method comprising:
upon detecting that a maintenance event of a push content pool is triggered, acquiring at least one piece of target content meeting a set condition;
determining the predicted vocabulary of the area to be filled in the preset file template corresponding to each piece of target content;
determining the word-vector distance between the keywords corresponding to each piece of target content and the predicted vocabulary, and determining a target keyword corresponding to the area to be filled according to the distance;
and generating a target file according to the preset file template and the target keyword, and updating the push content pool based on the target content and the target file.
Of course, in the storage medium containing computer-executable instructions provided by the embodiment of the invention, the computer-executable instructions are not limited to the operations for maintaining the push content pool described above, and may also perform related operations in the method for maintaining a push content pool provided by any embodiment of the invention.
An embodiment of the present invention further provides another storage medium containing computer-executable instructions, which when executed by a computer processor, perform a push method, the method including:
acquiring user characteristics of a target user and text characteristics of each piece of content in a preset push content pool;
predicting the click rate of the target user to each piece of content according to the user characteristics and the text characteristics, and determining target content to be pushed based on the click rate;
and acquiring a target file corresponding to the target content in the push content pool, and pushing the target content and the target file to the target user, wherein the target file is automatically generated based on the keywords of the target content and a preset file template.
Of course, the storage medium provided by the embodiment of the present invention contains computer-executable instructions, and the computer-executable instructions are not limited to the push operation described above, and may also perform related operations in the push method provided by any embodiment of the present invention.
Storage medium: any of various types of memory devices or storage devices. The term "storage medium" is intended to include: installation media such as CD-ROM, floppy disk, or tape devices; computer system memory or random access memory such as DRAM, DDR RAM, SRAM, EDO RAM, Rambus RAM, etc.; non-volatile memory such as flash memory, magnetic media (e.g., hard disk) or optical storage; registers or other similar types of memory elements, etc. The storage medium may also include other types of memory or combinations thereof. In addition, the storage medium may be located in a first computer system in which the program is executed, or may be located in a different second computer system connected to the first computer system through a network (such as the internet). The second computer system may provide program instructions to the first computer system for execution. The term "storage medium" may include two or more storage media that may reside in different locations, such as in different computer systems that are connected by a network. The storage medium may store program instructions (e.g., embodied as a computer program) that are executable by one or more processors.
The apparatus for maintaining a push content pool, the storage medium, and the server provided in the foregoing embodiments can execute the method for maintaining a push content pool provided in any embodiment of the present invention, and have the corresponding functional modules and beneficial effects for executing the method. For technical details not described in detail in the above embodiments, reference may be made to the method for maintaining a push content pool provided in any embodiment of the present invention.
The pushing apparatus, the storage medium, and the server provided in the above embodiments can execute the push method provided in any embodiment of the present invention, and have the corresponding functional modules and beneficial effects for executing the method. For technical details not described in detail in the above embodiments, reference may be made to the push method provided in any embodiment of the present invention.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (11)

1. A method of maintaining a pool of pushed content, comprising:
upon detecting that a maintenance event of a push content pool is triggered, acquiring at least one piece of target content meeting a set condition;
analyzing the context of each piece of target content based on a neural network model in the form of a continuous bag-of-words model to predict the predicted vocabulary of the area to be filled in the corresponding preset file template, wherein the predicted vocabulary is expressed in the form of a word vector and is denoted as a first word vector;
acquiring the keywords corresponding to each piece of target content;
determining second word vectors corresponding to the keywords through a set neural network model;
calculating the distance between the first word vector and each second word vector, and determining the alternative second word vectors whose distance is smaller than a set distance threshold;
determining the keywords corresponding to the alternative second word vectors as target keywords corresponding to the area to be filled;
and generating a target file according to the preset file template and the target keyword, and updating the push content pool based on the target content and the target file.
2. The method of claim 1, wherein obtaining at least one piece of target content satisfying a set condition comprises:
acquiring an alternative push content set, determining the weight value of each piece of content in the alternative push content set according to timeliness and/or heat, and determining at least one piece of target content according to the weight values.
3. The method according to claim 1, further comprising, after obtaining at least one piece of target content satisfying a set condition:
determining the content association degree between each piece of first target content, whose file has not yet been determined, and second target content, whose file has been determined;
and when the association degree exceeds a set threshold value, determining a target file corresponding to the first target content according to a target file corresponding to the second target content.
4. A push method, comprising:
acquiring user characteristics of a target user and text characteristics of each content in a preset push content pool, wherein the preset push content pool is maintained based on the method for maintaining the push content pool in any one of claims 1-3;
predicting the click rate of the target user to each piece of content according to the user characteristics and the text characteristics, and determining target content to be pushed based on the click rate;
and acquiring a target file corresponding to the target content in the push content pool, and pushing the target content and the target file to the target user, wherein the target file is automatically generated based on the keywords of the target content and a preset file template.
5. The method of claim 4, wherein obtaining the user characteristics of the target user and the text characteristics of each piece of content in the preset push content pool comprises:
acquiring user portrait data of a target user on the current day, and determining user characteristics of the target user according to the user portrait data;
and acquiring keywords of each piece of content in the pushed content pool on the current day as the text features of each piece of content.
6. The method of claim 4, wherein predicting click-through rates of the target users for the pieces of content according to the user features and the text features, and determining the target content to be pushed based on the click-through rates comprises:
generating a synthesized feature by calculating a Cartesian product of the user feature and the text feature;
predicting the click rate of the target user on each piece of content based on the synthesized feature through a pre-constructed push model, wherein the push model is a neural network model constructed by taking the historical operation data of pushed content as a training set and the extreme gradient boosting algorithm as a classifier;
and sorting and filtering the pieces of content according to the click rate to obtain the target content.
7. The method of claim 6, wherein pushing the target content and the target file to the target user comprises:
determining a personalized push strategy for each target user according to the historical user operation data, wherein the historical user operation data comprises the times at which pushed content was clicked, the times at which push reminders were turned off, and uninstall behavior data, and the personalized push strategy comprises the push time and the number of pushes;
and pushing the target content and the target file to each target user according to the personalized push strategy.
8. An apparatus for maintaining a pool of pushed content, comprising:
the content acquisition module is used for acquiring at least one piece of target content meeting a set condition upon detecting that a maintenance event of the push content pool is triggered;
the vocabulary acquisition module is used for analyzing the context of each piece of target content based on a neural network model in the form of a continuous bag-of-words model so as to predict the predicted vocabulary of the area to be filled in the corresponding preset file template, wherein the predicted vocabulary is expressed in the form of a word vector and is denoted as a first word vector;
the distance determining module is used for acquiring the keywords corresponding to each piece of target content; determining second word vectors corresponding to the keywords through a set neural network model; calculating the distance between the first word vector and each second word vector, and determining the alternative second word vectors whose distance is smaller than a set distance threshold; and determining the keywords corresponding to the alternative second word vectors as target keywords corresponding to the area to be filled;
and the content pool updating module is used for generating a target file according to the preset file template and the target keyword and updating the push content pool based on the target content and the target file.
9. A pushing device, comprising:
a characteristic obtaining module, configured to obtain a user characteristic of a target user and a text characteristic of each piece of content in a preset push content pool, where the preset push content pool is maintained based on the method for maintaining the push content pool according to any one of claims 1 to 3;
the click rate prediction module is used for predicting the click rate of the target user on each piece of content according to the user characteristics and the text characteristics and determining target content to be pushed based on the click rate;
and the content pushing module is used for acquiring a target file corresponding to the target content in the pushing content pool and pushing the target content and the target file to the target user, wherein the target file is automatically generated based on the keywords of the target content and a preset file template.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method of maintaining a push content pool according to any one of claims 1 to 3, or which, when being executed by a processor, carries out the push method according to any one of claims 4 to 7.
11. A server comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method of maintaining a pushed content pool according to any one of claims 1 to 3 when executing the computer program; alternatively, the processor, when executing the computer program, implements the push method according to any of claims 4-7.
CN201910478470.1A 2019-06-03 2019-06-03 Method for maintaining push content pool, push method, device, medium and server Active CN110209941B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910478470.1A CN110209941B (en) 2019-06-03 2019-06-03 Method for maintaining push content pool, push method, device, medium and server

Publications (2)

Publication Number Publication Date
CN110209941A CN110209941A (en) 2019-09-06
CN110209941B true CN110209941B (en) 2021-01-15

Family

ID=67790523

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910478470.1A Active CN110209941B (en) 2019-06-03 2019-06-03 Method for maintaining push content pool, push method, device, medium and server

Country Status (1)

Country Link
CN (1) CN110209941B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113177160B (en) * 2021-05-25 2024-04-23 上海众源网络有限公司 Push text generation method and device, electronic equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101477566A (en) * 2009-01-19 2009-07-08 腾讯科技(深圳)有限公司 Method and apparatus used for putting candidate key words advertisement
CN101887415A (en) * 2010-06-24 2010-11-17 西北工业大学 Automatic extraction method for text document theme word meaning
CN107122349A (en) * 2017-04-24 2017-09-01 无锡中科富农物联科技有限公司 A kind of feature word of text extracting method based on word2vec LDA models
CN108052593A (en) * 2017-12-12 2018-05-18 山东科技大学 A kind of subject key words extracting method based on descriptor vector sum network structure
CN108446382A (en) * 2018-03-20 2018-08-24 百度在线网络技术(北京)有限公司 Method and apparatus for pushed information
CN109190017A (en) * 2018-08-02 2019-01-11 腾讯科技(北京)有限公司 Determination method, apparatus, server and the storage medium of hot information


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant