CN114722267A - Information pushing method and device and server - Google Patents

Information pushing method and device and server

Info

Publication number
CN114722267A
Authority
CN
China
Prior art keywords
target
sentence
information
importance
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110001491.1A
Other languages
Chinese (zh)
Inventor
郭叶
程印超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Communications Ltd Research Institute
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Communications Ltd Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Communications Ltd Research Institute filed Critical China Mobile Communications Group Co Ltd
Priority to CN202110001491.1A priority Critical patent/CN114722267A/en
Publication of CN114722267A publication Critical patent/CN114722267A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9536 Search customisation based on social or collaborative filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/194 Calculation of difference between files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides an information pushing method, an information pushing device and a server, wherein the method comprises the following steps: acquiring push information to be selected; extracting an event from the to-be-selected push information to obtain a key event and a first target sentence related to the key event in the to-be-selected push information; screening the key events according to historical behavior data of the user to obtain target events; generating target information according to a first target sentence corresponding to the target event; and pushing the target information to the user. Because the generated target information is determined based on the historical behavior data of the user, the target information pushed to different users differs for the same piece of push information to be selected, so that the personalized requirements of users can be met, the target information better matches user needs, and the pushing effect is improved.

Description

Information pushing method and device and server
Technical Field
The embodiment of the invention relates to the technical field of information pushing, in particular to an information pushing method, an information pushing device and a server.
Background
With the progress of society and the development of science and technology, the pace of life keeps accelerating, and people's consumption, entertainment, daily life and learning are becoming increasingly individualized, so a single universal content or mode can hardly meet everyone's needs and preferences. In particular, the volume of information of all kinds is now growing exponentially, and because each person's knowledge background, behavior habits and interest preferences differ, the content of interest and the preferred way of reading differ as well.
At present, news information products generally perform personalized recommendation at the level of information articles based on user behavior preferences. This meets the personalized requirements of users to a certain extent, but for the same information article the pushed content is identical for every user, so the pushing effect is poor.
Disclosure of Invention
The embodiment of the invention provides an information pushing method, an information pushing device and a server, and aims to solve the problem that in the prior art, the pushed information is the same, so that the pushing effect is poor.
In order to solve the above problem, the invention is realized as follows:
In a first aspect, an embodiment of the present invention provides an information pushing method, where the method includes:
acquiring push information to be selected;
extracting an event from the to-be-selected push information to obtain a key event and a first target sentence related to the key event in the to-be-selected push information;
screening the key events according to historical behavior data of the user to obtain target events;
generating target information according to a first target sentence corresponding to the target event;
and pushing the target information to the user.
In a second aspect, an embodiment of the present invention further provides an information pushing apparatus, including:
the acquisition module is used for acquiring the push information to be selected;
the event extraction module is used for extracting events from the to-be-selected push information to obtain key events and first target sentences related to the key events in the to-be-selected push information;
the screening module is used for screening the key events according to historical behavior data of the user to obtain target events;
the generating module is used for generating target information according to a first target sentence corresponding to the target event;
and the pushing module is used for pushing the target information to the user.
In a third aspect, an embodiment of the present invention further provides a server, including: a transceiver, a memory, a processor, and a program stored on the memory and executable on the processor; the processor is configured to read the program in the memory to implement the steps of the method according to the first aspect.
In a fourth aspect, the embodiment of the present invention further provides a readable storage medium for storing a program, where the program, when executed by a processor, implements the steps in the method according to the foregoing first aspect.
In the embodiment of the invention, push information to be selected is acquired; an event is extracted from the to-be-selected push information to obtain a key event and a first target sentence related to the key event in the to-be-selected push information; the key events are screened according to historical behavior data of the user to obtain target events; target information is generated according to a first target sentence corresponding to the target event; and the target information is pushed to the user. Because the generated target information is determined based on the historical behavior data of the user, the target information pushed to different users differs for the same piece of push information to be selected, so that the personalized requirements of users can be met, the target information better matches user needs, and the pushing effect is improved.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a schematic flow chart of an information pushing method according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of an information pushing apparatus according to an embodiment of the present invention;
Fig. 3 is a second schematic flowchart of an information pushing method according to an embodiment of the present invention;
Fig. 4 is another schematic structural diagram of an information pushing apparatus according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a server according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without inventive step based on the embodiments of the present invention, are within the scope of protection of the present invention.
The terms "first," "second," and the like in the embodiments of the present invention are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus. Further, as used herein, "and/or" means at least one of the connected objects; e.g., "A and/or B and/or C" covers 7 cases: A alone, B alone, C alone, both A and B, both B and C, both A and C, and all of A, B and C.
Referring to Fig. 1, Fig. 1 is a schematic flow chart of an information pushing method according to an embodiment of the present invention. The information pushing method shown in Fig. 1 may be executed by a server, for example a server for pushing information, and may specifically include the following steps:
Step 101, obtaining push information to be selected.
The push information to be selected may be push information to be selected determined according to the historical behavior data of the user, for example, when determining that the user frequently reads football information and entertainment news information according to the historical behavior data of the user, the information of the football information and the entertainment news information may be used as the push information to be selected. The push information to be selected may be text information, and may include a plurality of sentences.
Specifically, the historical behavior data of the user may include information reading behavior data, information content data, and other data (such as user basic information). The information interest preference of the user can be generated according to the information reading behavior of the user, the information content data and other multidimensional data. According to the user preference, information articles that match it are found in the information base to generate an information recommendation list, wherein each information article in the recommendation list corresponds to one piece of push information to be selected.
Step 102, extracting an event from the to-be-selected push information to obtain a key event and a first target sentence related to the key event in the to-be-selected push information.
Events are extracted from the to-be-selected push information to obtain key events, and any suitable event extraction algorithm may be used. For example, events may be extracted based on a labeled corpus: a machine learning model or a deep learning model is trained on the labeled corpus, and the trained model is used to extract events from the to-be-selected push information. Event extraction may also be performed with a rule-based template extraction method, such as event extraction based on text semantic rules or text description rules. The event extraction method actually adopted can be flexibly selected according to actual conditions and is not limited herein.
One or more key events can be obtained by extracting events from the to-be-selected push information, and according to each key event, a first target sentence related to the key event is obtained from the to-be-selected push information, namely the first target sentence is a sentence in the to-be-selected push information, and the first target sentence is associated with the key event, for example, the first target sentence can embody information expressed by the key event. The first target sentence may include one sentence or a plurality of sentences.
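The rule-based variant of step 102 can be illustrated with a minimal sketch. The trigger patterns, the event names and the helper functions `split_sentences` and `extract_events` are hypothetical and chosen only for illustration; the source does not specify any particular templates.

```python
import re

# Hypothetical trigger patterns (assumption); a real system would use richer
# templates or a model trained on a labeled corpus.
EVENT_TEMPLATES = {
    "market quotation": re.compile(r"\b(rose|fell|closed at|index)\b", re.IGNORECASE),
    "author introduction": re.compile(r"\b(graduated|joined|career)\b", re.IGNORECASE),
}

def split_sentences(text):
    """Split text into sentences at whitespace following ., ! or ?."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def extract_events(text):
    """Map each detected key event to the sentences that mention it.

    The sentences collected per event play the role of the first target sentences.
    """
    events = {}
    for sentence in split_sentences(text):
        for event, pattern in EVENT_TEMPLATES.items():
            if pattern.search(sentence):
                events.setdefault(event, []).append(sentence)
    return events

text = ("The author graduated in 2005 and joined a brokerage. "
        "The index rose two percent on Monday. "
        "Analysts expect a quiet week.")
events = extract_events(text)
```

Here each key event carries one or more related sentences, and sentences that match no template are not associated with any event.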
Step 103, screening the key events according to historical behavior data of the user to obtain target events.
The historical behavior data of the user can comprise information reading behavior data, information content data and other data (such as user basic information) of the user. The key events are screened according to the historical behavior data of the user to obtain the target events. For example, if it is determined from the historical behavior data that the user frequently reads financing market information, and the key events determined from the push information to be selected comprise an author resume introduction event and a financing market quotation introduction event, the financing market quotation introduction event can be used as the target event when the key events are screened. The target event may be one key event or a plurality of key events.
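One possible way to realize the screening in step 103 is sketched below. It assumes, purely for illustration, that every history record and every key event carries a `topic` label and that a topic counts as preferred once it has been read at least `min_reads` times; neither the field names nor the threshold come from the source.

```python
from collections import Counter

def user_topic_preferences(history):
    """Count how often each topic appears in the user's reading history."""
    return Counter(record["topic"] for record in history)

def screen_key_events(key_events, history, min_reads=2):
    """Keep the key events whose topic the user has read at least min_reads times."""
    prefs = user_topic_preferences(history)
    return [event for event in key_events if prefs[event["topic"]] >= min_reads]

# Hypothetical data mirroring the financing market example.
history = [
    {"article": "a1", "topic": "financing market"},
    {"article": "a2", "topic": "financing market"},
    {"article": "a3", "topic": "football"},
]
key_events = [
    {"name": "author resume introduction", "topic": "biography"},
    {"name": "financing market quotation introduction", "topic": "financing market"},
]
target_events = screen_key_events(key_events, history)
```

With this history only the financing market quotation introduction event survives the screening, so a different user with a different history would receive different target events for the same article.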
Step 104, generating target information according to the first target sentence corresponding to the target event.
After the target event is determined, the first target sentence corresponding to the target event may be acquired, and the target information may be generated according to the first target sentence. For example, the first target sentence may be used directly as the target information; or the target information may be obtained by performing deduplication processing on a text generated according to the first target sentence; or, after that deduplication processing, the title of the push information to be selected or a picture in the push information to be selected may be added to obtain the target information.
Step 105, pushing the target information to the user.
The user in the present application can be understood as a user of a terminal. Pushing the target information to the user means sending the target information to the terminal, where it can be displayed and read by the user. The terminal can be a mobile phone, a tablet computer, a portable computer or other terminal equipment.
In this embodiment, push information to be selected is acquired; an event is extracted from the to-be-selected push information to obtain a key event and a first target sentence related to the key event in the to-be-selected push information; the key events are screened according to historical behavior data of the user to obtain target events; target information is generated according to a first target sentence corresponding to the target event; and the target information is pushed to the user. Events are thus extracted from the push information to be selected, and the extracted key events are screened based on the historical behavior data of the user, so that the first target sentence corresponding to the target event is obtained and the target information is generated from it. The generated target information is therefore determined by the historical behavior data of the user, and for the same push information to be selected the target information pushed to different users differs, which meets the personalized requirements of users, makes the target information better match user needs, and improves the pushing effect.
In an embodiment of the present application, after acquiring, in step 101, push information to be selected, and before generating, in step 104, target information according to a first target sentence corresponding to the target event, the method further includes:
acquiring the target importance of each sentence in the to-be-selected push information;
determining a second target sentence according to the target importance of each sentence in the to-be-selected push information, wherein the sentences in the to-be-selected push information are ranked from high to low by target importance, the second target sentence is a sentence corresponding to one of the top N target importance values, and N is a positive integer;
correspondingly, step 104, generating target information according to the first target sentence corresponding to the target event, including:
and generating target information according to the second target sentence and the first target sentence corresponding to the target event.
In this embodiment, the target information is generated not only based on the first target sentence corresponding to the target event, but also based on the second target sentence determined by the target importance of each sentence in the to-be-selected push information. The following describes a process of determining the target importance of a sentence.
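The selection of second target sentences can be sketched as follows, assuming the target importance of each sentence has already been computed. The function name `top_n_sentences` is illustrative only.

```python
def top_n_sentences(sentences, importance, n):
    """Return the n sentences with the highest target importance, highest first."""
    ranked = sorted(range(len(sentences)), key=lambda i: importance[i], reverse=True)
    return [sentences[i] for i in ranked[:n]]

sentences = ["s0", "s1", "s2", "s3"]
importance = [0.2, 0.9, 0.5, 0.7]
second_target = top_n_sentences(sentences, importance, n=2)
```

With these scores the two second target sentences are `s1` and `s3`.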
In the above, the target importance of the sentence may be understood as the importance of the sentence in the to-be-selected pushed information, and may specifically be determined according to a semantic similarity between the sentence and another sentence, or determined according to a semantic similarity between the sentence and a title of the to-be-selected pushed information, or determined according to a similarity between the sentence and a keyword of the user. Specifically, calculating the target importance of each sentence in the to-be-selected push information includes:
calculating the first importance of each sentence in the to-be-selected push information based on semantic similarity between the sentences;
determining a second importance degree of each sentence according to the similarity between each sentence and the title of the to-be-selected push information;
determining a third importance of each sentence according to the similarity of each sentence and the keywords of the user, wherein the keywords of the user are the keywords of browsing content preference determined according to the historical behavior data;
determining a target importance of each sentence according to the first importance, the second importance and the third importance of each sentence.
In the above, the sentence vector of each sentence in the to-be-selected push information is obtained first, and then the semantic similarity between any two sentences is calculated; specifically, the cosine similarity can be used. A graph is established based on sentence similarity, wherein the vertices in the graph represent sentences and the edges represent the similarity between sentences, and the importance score of each sentence, namely the first importance, is then calculated based on the TextRank algorithm. That is, calculating the first importance of each sentence in the to-be-selected push information based on semantic similarity between sentences includes:
splitting the to-be-selected push information to obtain a plurality of words;
obtaining word vectors for the plurality of words;
obtaining a sentence vector of a third target sentence according to the word vector of each word in the third target sentence, wherein the third target sentence is any one sentence in the to-be-selected push information;
determining semantic similarity of any two third target sentences according to sentence vectors of any two third target sentences in the to-be-selected push information;
and determining the first importance of each third target sentence according to the semantic similarity of any two third target sentences in the to-be-selected push information.
In the above, the push information to be selected is split to obtain a plurality of words, and for each word, a word vector (word2vec) model or a BERT model may be used to obtain its word vector. For example, the word vector of the i-th word in a certain sentence is represented as $[w_{i1}, w_{i2}, \ldots, w_{in}]$. Then, based on the word vectors, a vector representation of each sentence, i.e., a sentence vector, is generated. The sentence vector may be generated by averaging the word vectors in the sentence. If the sentence vector is denoted as $[S_1, S_2, \ldots, S_n]$, then:
$$S_j = \frac{1}{m}\sum_{i=1}^{m} w_{ij}, \quad j = 1, 2, \ldots, n$$
In the above, m is the number of word vectors included in the sentence, and n is the vector dimension.
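The averaging of word vectors into a sentence vector can be sketched in a few lines. The word vectors here are made-up numbers; in practice they would come from a word2vec or BERT model as described above.

```python
def sentence_vector(word_vectors):
    """Average m word vectors of dimension n into one sentence vector of dimension n."""
    m = len(word_vectors)
    n = len(word_vectors[0])
    return [sum(w[j] for w in word_vectors) / m for j in range(n)]

# Two hypothetical 2-dimensional word vectors (m = 2, n = 2).
vec = sentence_vector([[1.0, 2.0], [3.0, 4.0]])
```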
The sentence vectors of any two third target sentences in the to-be-selected push information are obtained through the above method, and the semantic similarity between any two third target sentences is calculated based on the sentence vectors; specifically, the cosine similarity can be used. A graph is established based on sentence similarity, wherein a vertex in the graph represents a third target sentence and an edge represents the similarity between third target sentences, and the importance score of each third target sentence, namely the first importance, is then calculated based on the TextRank algorithm, wherein the calculation expression, written in the weighted TextRank form consistent with the symbols defined below, is:
$$TR(v_i) = (1 - d) + d \sum_{v_j \in In(v_i)} \frac{w_{ji}}{\sum_{v_k \in Out(v_j)} w_{jk}} \, TR(v_j)$$
wherein $w_{jk}$ is the weight of the edge between vertices $v_j$ and $v_k$, i.e., the semantic similarity between the corresponding sentences; d is a damping coefficient with a value between 0 and 1; and TR is the first importance of the third target sentence. A stable TR value is finally obtained by iterating the formula. n is the dimensionality of a sentence vector, $In(v_i)$ represents the set of vertices connected by an edge to $v_i$ in the graph established based on sentence similarity, and $Out(v_j)$ represents the set of vertices connected by an edge to $v_j$ in that graph.
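As a minimal sketch of the first importance, the code below builds the similarity graph from sentence vectors with cosine similarity and iterates a weighted TextRank update until the scores stabilize. The damping value 0.85, the tolerance, the clamping of negative similarities to zero and the example vectors are all assumptions made for illustration and are not taken from the source.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def textrank(vectors, d=0.85, tol=1e-6, max_iter=100):
    """Iterate the weighted TextRank update over a sentence similarity graph."""
    k = len(vectors)
    # Edge weights: pairwise similarity, no self-loops, negatives clamped to 0.
    w = [[max(cosine(vectors[i], vectors[j]), 0.0) if i != j else 0.0
          for j in range(k)] for i in range(k)]
    out_sum = [sum(row) for row in w]
    tr = [1.0] * k
    for _ in range(max_iter):
        new = [(1 - d) + d * sum(w[j][i] / out_sum[j] * tr[j]
                                 for j in range(k) if out_sum[j] > 0)
               for i in range(k)]
        if max(abs(a - b) for a, b in zip(new, tr)) < tol:
            return new
        tr = new
    return tr

# Three hypothetical sentence vectors; the middle one resembles both others.
scores = textrank([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
```

In this example the second sentence receives the highest score because it is connected to both other sentences, which is the behaviour expected of a graph-based centrality measure.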
When the second importance of each sentence is calculated, the similarity calculation used for the first importance can be applied to each sentence and the title of the to-be-selected push information; that is, the semantic similarity between each sentence and the title is calculated first, and the second importance of each sentence is then determined based on that similarity.
Similarly, when the third importance of each sentence is calculated, the same similarity calculation may be applied to each sentence and the keywords of the user; that is, the semantic similarity between each sentence and the keywords is calculated first, and the third importance of each sentence is then determined based on that similarity. The keywords of the user are keywords of browsing content preference determined according to the historical behavior data. For example, the information interest preference of the user is generated according to historical behavior data such as the user's information reading behavior and information content, and the keywords are determined based on that interest preference.
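The second and third importance can both be sketched as a cosine similarity between each sentence vector and a reference vector (the title vector or a user keyword vector). Using the raw similarity directly as the importance is an assumption; the source only says the importance is determined based on the similarity. All vectors below are made up.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def similarity_importance(sentence_vectors, reference_vector):
    """Score each sentence by its similarity to a reference (title or keyword) vector."""
    return [cosine(s, reference_vector) for s in sentence_vectors]

sentence_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
title_vector = [1.0, 0.0]     # hypothetical vector of the article title
keyword_vector = [0.0, 1.0]   # hypothetical vector of the user's keywords
second = similarity_importance(sentence_vectors, title_vector)
third = similarity_importance(sentence_vectors, keyword_vector)
```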
The first importance degree and the second importance degree can represent the importance degree of a sentence in the to-be-selected push information, the third importance degree can represent the association degree of the sentence and the interest preference of the user, the target importance degree obtained through the first importance degree, the second importance degree and the third importance degree can comprehensively consider the importance degree of the sentence in the to-be-selected push information and the association degree of the sentence and the interest preference of the user, and the target importance degree can more accurately represent whether the sentence is the sentence meeting the requirements of the user.
When determining the target importance of each sentence according to the first importance, the second importance and the third importance of each sentence, the first importance, the second importance and the third importance may be averaged, or weights may be set for the first importance, the second importance and the third importance, respectively, and the first importance, the second importance and the third importance are weighted and averaged to determine the target importance, which may be specifically selected according to actual situations, and is not limited herein.
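Combining the three importance values can be sketched as a weighted average. The default weights of 1 reproduce the plain average; any other weights are a design choice left open by the source.

```python
def target_importance(first, second, third, weights=(1.0, 1.0, 1.0)):
    """Weighted average of the three importance lists, one value per sentence."""
    w1, w2, w3 = weights
    total = w1 + w2 + w3
    return [(w1 * a + w2 * b + w3 * c) / total
            for a, b, c in zip(first, second, third)]

plain = target_importance([0.3], [0.6], [0.9])
weighted = target_importance([0.3], [0.6], [0.9], weights=(2.0, 1.0, 1.0))
```

Raising the third weight would favour sentences close to the user's interests, while raising the first two would favour sentences central to the article itself.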
Further, in an embodiment of the present application, the generating target information according to the second target sentence and the first target sentence corresponding to the target event includes:
generating an initial text based on the second target sentence and a first target sentence corresponding to the target event;
calculating target similarity between a first sentence and a second sentence in the initial text, wherein the first sentence and the second sentence are any two sentences in the initial text;
if the target similarity is larger than a first preset threshold value, removing the first sentence or the second sentence from the initial text to obtain a target text;
and obtaining the target information according to the target text.
After determining a second target sentence based on the target importance of the sentences in the to-be-selected push information and performing event extraction on the to-be-selected push information to obtain a first target sentence corresponding to the target event, generating an initial text based on the first target sentence and the second target sentence, for example, the order of each target sentence in the initial text may be determined according to the position order of each target sentence in the to-be-selected push information.
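Ordering the selected sentences by their position in the original article can be sketched as below. The sketch assumes sentences are compared as exact strings, so a sentence selected both as a first and as a second target sentence appears only once.

```python
def build_initial_text(article_sentences, first_target, second_target):
    """Merge both sentence sets and keep them in their original article order."""
    selected = set(first_target) | set(second_target)
    return [s for s in article_sentences if s in selected]

article = ["A.", "B.", "C.", "D."]
initial_text = build_initial_text(article, first_target=["D.", "B."], second_target=["A.", "B."])
```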
In order to further refine the initial text and improve its quality, sentence deduplication processing may be performed on the initial text. That is, the target similarity between a first sentence and a second sentence in the initial text is calculated; if the target similarity is greater than a first preset threshold, the first sentence and the second sentence are highly similar and may be repeated sentences, so the first sentence or the second sentence is removed from the initial text to obtain the target text. Only one of the two sentences is thus retained in the target text, which avoids sentence redundancy and keeps it from degrading the quality of the target information.
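The deduplication step can be sketched with a simple similarity measure. Here a bag-of-words cosine similarity stands in for the semantic similarity of the source, and the threshold 0.8 is an arbitrary illustrative value for the first preset threshold; both are assumptions.

```python
import math
import re
from collections import Counter

def bow_cosine(a, b):
    """Cosine similarity between the word-count vectors of two sentences."""
    ca = Counter(re.findall(r"\w+", a.lower()))
    cb = Counter(re.findall(r"\w+", b.lower()))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0

def deduplicate(sentences, threshold=0.8):
    """Drop any sentence whose similarity to an already kept sentence exceeds threshold."""
    kept = []
    for s in sentences:
        if all(bow_cosine(s, k) <= threshold for k in kept):
            kept.append(s)
    return kept

initial_text = [
    "The index rose two percent on Monday.",
    "The index rose two percent on Monday morning.",
    "Analysts expect a quiet week.",
]
target_text = deduplicate(initial_text)
```

This variant always keeps the earlier of two near-duplicate sentences; keeping the one with the higher target importance would be an equally valid choice.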
Further, target information is obtained based on the target text, for example, a title of the to-be-selected push information or a picture in the to-be-selected push information is added to the target text, so that the target information is obtained.
Further, in order to improve the quality of the finally obtained target information, not only sentence deduplication processing but also error correction processing can be performed on the initial text. Namely, if the target similarity is greater than a first preset threshold, removing the first sentence or the second sentence from the initial text to obtain a target text, including:
if the target similarity is larger than a first preset threshold value, removing the first sentence or the second sentence from the initial text to obtain an intermediate text;
calculating the probability of a third sentence in the intermediate text, wherein the third sentence is any one sentence in the intermediate text;
and if the probability is smaller than a second preset threshold value, correcting the error of a third sentence in the intermediate text to obtain the target text.
For example, the probability of the third sentence can be calculated using an N-gram model, and error correction processing is performed on any third sentence whose probability is lower than the second preset threshold, so that the accuracy of the sentences in the target text, and hence the quality of the target text, is improved.
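Scoring a sentence with an N-gram model can be sketched with a bigram model and add-one smoothing. The tiny training corpus, the choice N = 2 and the smoothing are assumptions for illustration; the error correction itself is not shown.

```python
import math
from collections import Counter

def train_bigram(corpus):
    """Collect context counts, bigram counts and vocabulary size from a corpus."""
    contexts, bigrams, vocab = Counter(), Counter(), {"<s>", "</s>"}
    for sentence in corpus:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        vocab.update(tokens)
        contexts.update(tokens[:-1])
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams, len(vocab)

def sentence_log_prob(sentence, model):
    """Log probability of a sentence under the add-one smoothed bigram model."""
    contexts, bigrams, v = model
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    return sum(math.log((bigrams[(a, b)] + 1) / (contexts[a] + v))
               for a, b in zip(tokens, tokens[1:]))

model = train_bigram(["the index rose", "the index fell", "the market rose"])
fluent = sentence_log_prob("the index rose", model)
garbled = sentence_log_prob("rose the index", model)
```

A sentence whose score falls below the chosen threshold, like the scrambled word order here, would be a candidate for error correction.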
Further, in order to improve the hit rate of the target information and improve the recommendation effect, the historical behavior data may be updated based on the behavior data of the user for the target information, that is, after the pushing the target information to the user, the method further includes:
recording current behavior data of the user for operating the target information;
and updating the historical behavior data by adopting the current behavior data.
The current behavior data for the target information can include, for example, whether the user browsed the target information, the browsing duration, the number of times it was browsed, and whether it was forwarded. After the historical behavior data is updated with the current behavior data, the server can determine from the current behavior data whether the push of the target information was successful and adjust the information pushed next accordingly, so that the pushed information is continuously optimized and the pushing accuracy is improved.
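The feedback loop can be sketched as below. The record fields, the success rule in `push_succeeded` and its 5-second default are hypothetical; the source only lists the kinds of behavior data that may be recorded.

```python
from dataclasses import dataclass, asdict

@dataclass
class Behavior:
    """Hypothetical record of how the user handled one pushed item."""
    item_id: str
    browsed: bool
    duration_s: float
    views: int
    forwarded: bool

def push_succeeded(b, min_duration_s=5.0):
    """Assumed rule: a push counts as successful if read long enough or forwarded."""
    return b.browsed and (b.duration_s >= min_duration_s or b.forwarded)

def update_history(history, current):
    """Return a new history list with the current behavior record appended."""
    return history + [asdict(current)]

history = []
current = Behavior("t1", browsed=True, duration_s=12.0, views=1, forwarded=False)
new_history = update_history(history, current)
```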
For ease of understanding, examples are illustrated below:
Fig. 2 is a block diagram of an information pushing apparatus provided in the present application, where the information pushing apparatus includes the following modules:
A data acquisition module: acquires information reading behavior data, information content data, and other data (such as user basic information).
A user preference and recommendation calculation module: generates the information interest preference of the user according to the information reading behavior of the user, the information content data and other multidimensional data. According to the user preference, information articles that match it are found in the information base to generate an information recommendation list.
A personalized content generation module: generates personalized information content for the user according to the interest of the user and the content of the information articles in the list. It comprises three parts: generation of information summary content combined with user interests, generation of key event content that matches the user's points of attention, and smoothing of the information content.
An application module, i.e., a personalized information content pushing and displaying module: pushes the personalized information content generated for the user to the user and displays it.
The data acquisition module acquires the following data:
user behavior data, i.e. information reading behavior: including clicks, reading time, reading position within the information article, number of visits, and the like;
information content data: information classification, labels, content, and the like;
other user data: for example, basic user information such as age, region, and gender; these data are optional.
The user preference and recommendation calculation module comprises: a user interest preference calculation module and an information recommendation list generation module.
The user interest preference calculation module is used for generating the information interest preference of the user according to data such as information reading behaviors, information contents and other information of the user.
The information recommendation list generation module is used for finding, in the information base, the information articles that match the user preference and generating a list of recommended articles for the user.
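The two sub-modules above can be sketched as follows. This is only one plausible reading: it assumes that preference is expressed as tag weights proportional to reading time, and that candidate articles are ranked by the summed weight of their tags. The function names and the data layout are hypothetical.

```python
from collections import defaultdict


def build_preference(reading_log, article_tags):
    """Accumulate reading time per tag and normalize to weights summing to 1.

    reading_log: list of (article_id, seconds_read) pairs.
    article_tags: dict mapping article_id -> list of tags.
    """
    weights = defaultdict(float)
    for article_id, seconds in reading_log:
        for tag in article_tags.get(article_id, []):
            weights[tag] += seconds
    total = sum(weights.values())
    if total == 0:
        return {}
    return {tag: w / total for tag, w in weights.items()}


def recommend(preference, candidates, top_n=3):
    """Rank candidate articles by the summed preference weight of their tags.

    candidates: dict mapping article_id -> list of tags.
    """
    scored = [
        (sum(preference.get(t, 0.0) for t in tags), article_id)
        for article_id, tags in candidates.items()
    ]
    # Highest score first; ties broken by article id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [article_id for score, article_id in scored[:top_n] if score > 0]
```

Optional data such as age or region could be added as further features; the sketch omits them.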
The personalized information content generating module is configured to generate personalized information content for each information article in the recommendation list according to the user interest preference, and an overall flow is shown in fig. 3.
The information summary content generation module combined with user interests is used for calculating the importance of each sentence in the information article. It comprehensively considers factors such as the similarity between sentences, the similarity with the title, and the position in the text, and also incorporates the user's interests to obtain an overall sentence importance; the K most important sentences are then extracted to generate information summary content that reflects the user's interests. The specific steps are as follows:
(1) Calculate the importance of each sentence based on the semantic similarity between sentences.
Step A: obtain the text contained in the information article, and split the text into individual sentences and words;
Step B: obtain a word vector for each word in the text using a word2vec model, a BERT model, or the like; for example, the word vector of the i-th word in a sentence is expressed as $[w_{i1}, w_{i2}, \ldots, w_{in}]$.
Step C: generate a vector representation of each sentence, namely a sentence vector, based on the word vectors. The sentence vector can be generated by averaging the vectors of the words in the sentence; a sentence vector is represented as $[S_1, S_2, \ldots, S_n]$, where, for a sentence containing $m$ words:
$$S_j = \frac{1}{m}\sum_{i=1}^{m} w_{ij}, \quad j = 1, 2, \ldots, n$$
Step D: calculate the similarity between every pair of sentences based on the sentence vectors; cosine similarity or another similarity measure can be used;
Step E: build a graph based on sentence similarity, in which the vertices represent sentences and the edges represent the similarity between sentences, and calculate the importance score of each sentence based on TextRank for use in the subsequent ranking of sentence importance. The formula is as follows:
$$TR(v_i) = (1 - d) + d \sum_{v_j \in In(v_i)} \frac{w_{ji}}{\sum_{v_k \in Out(v_j)} w_{jk}} \, TR(v_j)$$
where $w_{jk}$ is the weight of the edge between vertices $v_j$ and $v_k$, that is, the similarity between the corresponding sentences; $d$ is a damping coefficient with a value between 0 and 1; and $TR$ is the first importance of the third target sentence. A stable TR value is finally obtained by iterating the formula. $n$ is the dimensionality of a sentence vector, $In(v_i)$ denotes the set of vertices connected by an edge to $v_i$ in the graph built from sentence similarity, and $Out(v_j)$ denotes the set of vertices connected by an edge to $v_j$ in that graph.
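Steps C to E can be sketched in plain Python as follows. The sketch assumes the word vectors from Step B are already available as lists of floats; the default damping value of 0.85, the iteration limit, the convergence tolerance, and the clipping of negative similarities to zero are assumptions made for this example and are not taken from the application.

```python
import math


def sentence_vector(word_vectors):
    """Step C: average the word vectors of one sentence, dimension by dimension."""
    m = len(word_vectors)
    n = len(word_vectors[0])
    return [sum(w[j] for w in word_vectors) / m for j in range(n)]


def cosine(u, v):
    """Step D: cosine similarity between two sentence vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def textrank(sent_vecs, d=0.85, iters=100, tol=1e-6):
    """Step E: iterate the weighted TextRank update until the scores stabilize."""
    k = len(sent_vecs)
    # Edge weights: pairwise similarity, no self-loops, negatives clipped to 0.
    w = [[max(cosine(sent_vecs[i], sent_vecs[j]), 0.0) if i != j else 0.0
          for j in range(k)] for i in range(k)]
    out_sum = [sum(row) for row in w]
    tr = [1.0] * k
    for _ in range(iters):
        new = []
        for i in range(k):
            rank = sum(w[j][i] / out_sum[j] * tr[j]
                       for j in range(k) if out_sum[j] > 0)
            new.append((1 - d) + d * rank)
        converged = max(abs(a - b) for a, b in zip(new, tr)) < tol
        tr = new
        if converged:
            break
    return tr
```

Because the similarity graph is undirected, the in-neighbors and out-neighbors of a vertex coincide, so a single symmetric weight matrix serves both roles.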
(2) Add factors such as the similarity between the sentence and the title and the position of the sentence, and calculate the importance of the sentence. The similarity can be calculated as in Steps B to D.
(3) Add a user content preference factor: calculate the similarity between the sentence and the user's content preference keywords as an influence factor for the sentence importance.
(4) Calculate the final importance value of the sentence by combining the results of (1), (2) and (3).
(5) Select the K most important sentences to generate the information summary content combined with the user's interests.
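The application does not state how the three factors are combined in (4). The sketch below assumes a simple weighted sum with illustrative weights, and assumes the three scores have already been scaled to comparable ranges. Restoring document order after selecting the K sentences is a design choice of this example, made so that the summary reads naturally.

```python
def target_importance(first, second, third, weights=(0.4, 0.3, 0.3)):
    """Weighted sum of the three per-sentence importance scores (assumed form)."""
    a, b, c = weights
    return [a * x + b * y + c * z for x, y, z in zip(first, second, third)]


def top_k_sentences(sentences, scores, k):
    """Pick the K highest-scoring sentences, then restore document order."""
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen = sorted(ranked[:k])
    return [sentences[i] for i in chosen]
```

Here `first` would hold the TextRank-based scores from (1), `second` the title and position scores from (2), and `third` the user-preference similarity from (3).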
The key event content generation module is used for extracting, from the information content, the key events and the original text describing those events, and, in combination with the user's interests, selecting the key events that the user cares about to generate the part of the information content that matches the user's points of attention.
For the event extraction algorithm, if no labeled corpus is available, a rule-based template extraction method can be adopted, using text semantic rules, text description rules, and the like; if a labeled corpus is available, training methods based on machine learning and deep learning can be adopted, such as a Dynamic Multi-pooling Convolutional Neural Network (DMCNN).
The implementation is described below taking the text semantic rule template method as an example. The main scheme is to perform syntactic dependency analysis and semantic role labeling on the information text, and to extract events according to the results of the semantic role analysis and the dependency parsing. Specifically:
Perform word segmentation, part-of-speech tagging, semantic role labeling, and dependency analysis on the information text.
Extract trigger words according to the labeling results to form a trigger word library, which provides a basis for subsequent event extraction.
Determine whether each sentence in the information content contains a trigger word. If so, extract the event phrases whose semantic roles are A0 and A1 according to the trigger word.
For sentences from which no event was extracted above, extract, for each trigger word in the sentence, the phrases having SBV (subject-verb) and VOB (verb-object) dependencies according to the dependency parsing result.
If the sentence contains only a VOB relation together with an ATT (attribute) relation modifying the verb, the word preceding that word is extracted as the subject; if the sentence contains only SBV and CMP relations, the word modified by the CMP relation is extracted (CMP denotes the verb-complement relation);
Extract the original text describing the event, generally using the context of the sentence in which the event is located.
Calculate the similarity between the user's interests and the extracted events, and output the events with higher similarity together with their original text, generating key event content that matches the user's points of attention.
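The dependency-based part of the extraction can be sketched as follows. The sketch assumes the sentence has already been parsed by some external tool into (word, head index, relation) tuples using the SBV and VOB labels mentioned above; the input format and function names are hypothetical. Keyword overlap stands in for the similarity between events and user interests, which the application leaves unspecified. The semantic-role (A0/A1) path and the ATT and CMP special cases are not covered.

```python
def extract_events(tokens, trigger_words):
    """Extract (subject, trigger, object) triples from one parsed sentence.

    tokens: list of (word, head_index, relation) tuples, where head_index is
    the position of the token's head in the list (-1 for the root).
    """
    events = []
    for idx, (word, _head, _rel) in enumerate(tokens):
        if word not in trigger_words:
            continue
        subject = obj = None
        for dep_word, dep_head, dep_rel in tokens:
            if dep_head != idx:
                continue
            if dep_rel == "SBV":
                subject = dep_word
            elif dep_rel == "VOB":
                obj = dep_word
        # Keep only triggers that have both a subject and an object.
        if subject is not None and obj is not None:
            events.append((subject, word, obj))
    return events


def filter_events(events, interest_keywords):
    """Keep events that mention at least one user interest keyword."""
    return [e for e in events if any(part in interest_keywords for part in e)]
```

For example, a sentence parsed as `[("company", 1, "SBV"), ("released", -1, "HED"), ("phone", 1, "VOB")]` with trigger word `"released"` yields the triple `("company", "released", "phone")`.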
The information content smoothing module is used for integrating the generated information summary content combined with the user's interests and the key event content matching the user's points of attention, and smoothing the result, including removal of duplicate sentences with similar content, semantic consistency checking, text error correction, and the like.
Similar sentences can be deduplicated by calculating sentence similarity as in Steps B to D and filtering out sentences with higher similarity, which removes redundant content and produces the final personalized information content that matches the user's interests.
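A minimal sketch of the deduplication step is shown below. It assumes sentence vectors are already available, keeps the earlier sentence of any near-duplicate pair, and uses an illustrative threshold of 0.9; none of these choices is specified by the application.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def deduplicate(sentences, vectors, threshold=0.9):
    """Greedily keep a sentence only if it is not too similar to any kept one."""
    kept = []
    for i, vec in enumerate(vectors):
        if all(cosine(vec, vectors[j]) <= threshold for j in kept):
            kept.append(i)
    return [sentences[i] for i in kept]
```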
Semantic consistency and text errors can be judged using an N-gram language model. An N-gram model trained on a massive text corpus is used to calculate the generation probability of the information text, and text with a lower probability is corrected.
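The probability check can be illustrated with a bigram model, the simplest N-gram case beyond unigrams. Add-one smoothing, the per-bigram averaging of log-probabilities, and the threshold value are assumptions of this sketch; the correction step itself is not shown.

```python
import math
from collections import Counter


class BigramModel:
    """Bigram language model with add-one (Laplace) smoothing."""

    def __init__(self, corpus):
        # corpus: list of tokenized sentences (lists of words).
        self.unigrams = Counter()
        self.bigrams = Counter()
        for sent in corpus:
            padded = ["<s>"] + sent + ["</s>"]
            self.unigrams.update(padded[:-1])
            self.bigrams.update(zip(padded[:-1], padded[1:]))
        self.vocab_size = len({w for s in corpus for w in s} | {"<s>", "</s>"})

    def log_prob(self, sent):
        """Average log-probability per bigram, so length does not dominate."""
        padded = ["<s>"] + sent + ["</s>"]
        total = 0.0
        for prev, cur in zip(padded[:-1], padded[1:]):
            num = self.bigrams[(prev, cur)] + 1
            den = self.unigrams[prev] + self.vocab_size
            total += math.log(num / den)
        return total / (len(padded) - 1)


def needs_correction(model, sent, threshold):
    """Flag a sentence whose score falls below the (assumed) threshold."""
    return model.log_prob(sent) < threshold
```

In practice the model would be trained on a large corpus and the threshold tuned on held-out text.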
The personalized information content application module comprises a personalized information content pushing and displaying module, which is used for pushing the personalized information content generated for a user to the user and displaying it. The effect of the generated personalized information content is evaluated according to the user's click behavior feedback, and the algorithm for generating the personalized information content is continuously adjusted.
The method first generates the user's information content preference according to the user's information reading behavior and the information content, and generates an information recommendation list according to that preference. Next, for each piece of information in the recommendation list, it extracts summary information of the content and the key event information that the user cares about in combination with the user's preference, preliminarily forming personalized information content. Finally, it smooths the information content, generates the final personalized information content, and pushes it to the user for display and use. In this way, on the basis of pushing personalized information articles to the user, important information in the content and key event information that the user cares about are further extracted according to the user's reading behavior preference to generate personalized information content, which improves the user's information acquisition efficiency and the value of the recommended information.
Referring to fig. 4, fig. 4 is a structural diagram of an information pushing apparatus according to an embodiment of the present invention. As shown in fig. 4, the information pushing apparatus 400 includes:
a first obtaining module 401, configured to obtain push information to be selected;
an event extraction module 402, configured to perform event extraction on the to-be-selected push information to obtain a key event and a first target sentence, related to the key event, in the to-be-selected push information;
the screening module 403 is configured to screen the key event according to historical behavior data of the user to obtain a target event;
a generating module 404, configured to generate target information according to a first target sentence corresponding to the target event;
a pushing module 405, configured to push the target information to the user.
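The way modules 401 to 405 hand data to one another can be sketched as a simple pipeline. Each module is passed in as a callable; the function name, the type aliases, and the representation of an event as an (event, sentence) pair are assumptions made for this example.

```python
from typing import Callable, List, Tuple

Event = str
Sentence = str
EventList = List[Tuple[Event, Sentence]]


def push_pipeline(
    acquire: Callable[[], str],
    extract_events: Callable[[str], EventList],
    screen: Callable[[EventList], EventList],
    generate: Callable[[List[Sentence]], str],
    push: Callable[[str], None],
) -> str:
    """Run modules 401-405 in order and return the pushed target information."""
    candidate = acquire()                      # first obtaining module 401
    key_events = extract_events(candidate)     # event extraction module 402
    target_events = screen(key_events)         # screening module 403
    first_target_sentences = [s for _event, s in target_events]
    target_info = generate(first_target_sentences)  # generating module 404
    push(target_info)                          # pushing module 405
    return target_info
```

The screening callable is where the user's historical behavior data would be consulted.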
Further, the information pushing apparatus 400 further includes:
the second acquisition module is used for acquiring the target importance of each sentence in the to-be-selected push information;
a determining module, configured to determine a second target sentence according to the target importance of each sentence in the to-be-selected push information, where the second target sentences are the sentences corresponding to the top N target importance values when the sentences in the to-be-selected push information are sorted by target importance from high to low, and N is a positive integer;
the generating module 404 is configured to generate the target information according to a second target sentence and a first target sentence corresponding to the target event.
Further, the second obtaining module includes:
the first determining submodule is used for calculating the first importance of each sentence in the to-be-selected push information based on semantic similarity between sentences;
the second determining submodule is used for determining a second importance degree of each sentence according to the similarity between each sentence and the title of the to-be-selected push information;
a third determining submodule, configured to determine a third importance of each sentence according to a similarity between each sentence and a keyword of the user, where the keyword of the user is a keyword of browsing content preference determined according to the historical behavior data;
a fourth determining submodule, configured to determine a target importance of each sentence according to the first importance, the second importance, and the third importance of each sentence.
Further, the generating module 404 includes:
a first generation submodule, configured to generate an initial text based on the second target sentence and a first target sentence corresponding to the target event;
a calculation submodule, configured to calculate a target similarity between a first sentence and a second sentence in the initial text, where the first sentence and the second sentence are any two sentences in the initial text;
a duplication removing submodule, configured to remove the first sentence or the second sentence from the initial text to obtain a target text if the target similarity is greater than a first preset threshold;
and the acquisition submodule is used for acquiring the target information according to the target text.
Further, the duplication removing submodule includes:
a first obtaining unit, configured to remove the first sentence or the second sentence from the initial text to obtain an intermediate text if the target similarity is greater than a first preset threshold;
a calculating unit, configured to calculate a probability of a third sentence in the intermediate text, where the third sentence is any one sentence in the intermediate text;
and the error correction unit is used for correcting the error of the third sentence in the intermediate text to obtain the target text if the probability is smaller than a second preset threshold value.
Further, the first determining sub-module includes:
the splitting unit is used for splitting the push information to be selected to obtain a plurality of words;
a second obtaining unit, configured to obtain word vectors of the multiple words;
a third obtaining unit, configured to obtain a sentence vector of a third target sentence according to a word vector of each word in the third target sentence, where the third target sentence is any one sentence in the to-be-selected push information;
the first determining unit is used for determining semantic similarity of any two third target sentences according to sentence vectors of any two third target sentences in the to-be-selected push information;
and the second determining unit is used for determining the first importance of each third target sentence according to the semantic similarity of any two third target sentences in the to-be-selected push information.
Further, the information pushing apparatus 400 is further configured to:
record current behavior data of the user's operations on the target information;
and update the historical behavior data using the current behavior data.
The information pushing apparatus 400 can implement each process in the embodiment of the method in fig. 1 and achieve the same beneficial effects, and is not described herein again to avoid repetition.
The embodiment of the invention also provides a server. Referring to fig. 5, the server may include a processor 901, a memory 902, and a program 9021 stored in the memory 902 and capable of running on the processor 901, where when the program 9021 is executed by the processor 901, any step in the method embodiment corresponding to fig. 1 may be implemented and the same beneficial effect may be achieved, and details are not repeated here.
Those skilled in the art will appreciate that all or part of the steps of the method according to the above embodiments may be implemented by hardware associated with program instructions, and the program may be stored in a readable medium. An embodiment of the present invention further provides a readable storage medium, where a computer program is stored on the readable storage medium, and when the computer program is executed by a processor, any step in the method embodiments corresponding to fig. 1 to fig. 2 may be implemented, and the same technical effect may be achieved, and in order to avoid repetition, details are not repeated here.
The storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.
While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (10)

1. An information pushing method, characterized in that the method comprises:
acquiring push information to be selected;
extracting an event from the to-be-selected push information to obtain a key event and a first target sentence related to the key event in the to-be-selected push information;
screening the key events according to historical behavior data of the user to obtain target events;
generating target information according to a first target sentence corresponding to the target event;
and pushing the target information to the user.
2. The method according to claim 1, wherein after the obtaining of the push information to be selected and before generating the target information according to the first target sentence corresponding to the target event, further comprising:
acquiring the target importance of each sentence in the to-be-selected push information;
determining a second target sentence according to the target importance of each sentence in the to-be-selected push information, wherein the second target sentences are the sentences corresponding to the top N target importance values when the sentences in the to-be-selected push information are sorted by target importance from high to low, and N is a positive integer;
generating target information according to a first target sentence corresponding to the target event, wherein the generating of the target information comprises:
and generating the target information according to the second target sentence and the first target sentence corresponding to the target event.
3. The method according to claim 2, wherein the obtaining of the target importance of each sentence in the to-be-selected push information comprises:
calculating the first importance of each sentence in the to-be-selected push information based on semantic similarity between the sentences;
determining a second importance degree of each sentence according to the similarity between each sentence and the title of the to-be-selected push information;
determining a third importance of each sentence according to the similarity of each sentence and the keywords of the user, wherein the keywords of the user are the keywords of browsing content preference determined according to the historical behavior data;
determining a target importance of each sentence according to the first importance, the second importance and the third importance of each sentence.
4. The method according to claim 2, wherein the generating the target information according to the second target sentence and the first target sentence corresponding to the target event comprises:
generating an initial text based on the second target sentence and a first target sentence corresponding to the target event;
calculating target similarity between a first sentence and a second sentence in the initial text, wherein the first sentence and the second sentence are any two sentences in the initial text;
if the target similarity is larger than a first preset threshold value, removing the first sentence or the second sentence from the initial text to obtain a target text;
and obtaining the target information according to the target text.
5. The method according to claim 4, wherein the removing the first sentence or the second sentence from the initial text to obtain the target text if the target similarity is greater than a first preset threshold comprises:
if the target similarity is larger than a first preset threshold value, removing the first sentence or the second sentence from the initial text to obtain an intermediate text;
calculating the probability of a third sentence in the intermediate text, wherein the third sentence is any one sentence in the intermediate text;
and if the probability is smaller than a second preset threshold value, correcting the error of a third sentence in the intermediate text to obtain the target text.
6. The method according to claim 3, wherein the calculating the first importance of each sentence in the to-be-selected push information based on semantic similarity between sentences comprises:
splitting the to-be-selected push information to obtain a plurality of words;
obtaining word vectors for the plurality of words;
obtaining a sentence vector of a third target sentence according to the word vector of each word in the third target sentence, wherein the third target sentence is any one sentence in the to-be-selected push information;
determining semantic similarity of any two third target sentences according to sentence vectors of any two third target sentences in the to-be-selected push information;
and determining the first importance of each third target sentence according to the semantic similarity of any two third target sentences in the to-be-selected push information.
7. The method of claim 1, further comprising, after the pushing the target information to the user:
recording current behavior data of the user for operating the target information;
and updating the historical behavior data by adopting the current behavior data.
8. An information pushing apparatus, comprising:
the acquisition module is used for acquiring the push information to be selected;
the event extraction module is used for extracting events from the to-be-selected push information to obtain key events and first target sentences related to the key events in the to-be-selected push information;
the screening module is used for screening the key events according to historical behavior data of the user to obtain target events;
the generating module is used for generating target information according to a first target sentence corresponding to the target event;
and the pushing module is used for pushing the target information to the user.
9. A server, comprising: a transceiver, a memory, a processor, and a program stored on the memory and executable on the processor; characterized in that the processor is used for reading the program in the memory to realize the steps in the information pushing method according to any one of claims 1-7.
10. A readable storage medium storing a program, wherein the program, when executed by a processor, implements the steps in the information push method according to any one of claims 1 to 7.
CN202110001491.1A 2021-01-04 2021-01-04 Information pushing method and device and server Pending CN114722267A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110001491.1A CN114722267A (en) 2021-01-04 2021-01-04 Information pushing method and device and server

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110001491.1A CN114722267A (en) 2021-01-04 2021-01-04 Information pushing method and device and server

Publications (1)

Publication Number Publication Date
CN114722267A true CN114722267A (en) 2022-07-08

Family

ID=82234224

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110001491.1A Pending CN114722267A (en) 2021-01-04 2021-01-04 Information pushing method and device and server

Country Status (1)

Country Link
CN (1) CN114722267A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116150221A (en) * 2022-10-09 2023-05-23 浙江博观瑞思科技有限公司 Information interaction method and system for service of enterprise E-business operation management

Similar Documents

Publication Publication Date Title
CN109284357B (en) Man-machine conversation method, device, electronic equipment and computer readable medium
CN109657054B (en) Abstract generation method, device, server and storage medium
CN108287858B (en) Semantic extraction method and device for natural language
CN108536852B (en) Question-answer interaction method and device, computer equipment and computer readable storage medium
CN108829822B (en) Media content recommendation method and device, storage medium and electronic device
CN111444320B (en) Text retrieval method and device, computer equipment and storage medium
CN112800170A (en) Question matching method and device and question reply method and device
CN111797898B (en) Online comment automatic reply method based on deep semantic matching
EP3707622A1 (en) Generation of text from structured data
US20130159277A1 (en) Target based indexing of micro-blog content
CN110717038B (en) Object classification method and device
CN113569011B (en) Training method, device and equipment of text matching model and storage medium
US10922492B2 (en) Content optimization for audiences
CN111274822A (en) Semantic matching method, device, equipment and storage medium
CN110147494A (en) Information search method, device, storage medium and electronic equipment
CN112749272A (en) Intelligent new energy planning text recommendation method for unstructured data
CN113392305A (en) Keyword extraction method and device, electronic equipment and computer storage medium
CN112163560A (en) Video information processing method and device, electronic equipment and storage medium
CN112231554A (en) Search recommendation word generation method and device, storage medium and computer equipment
US20120239382A1 (en) Recommendation method and recommender computer system using dynamic language model
CN116932730B (en) Document question-answering method and related equipment based on multi-way tree and large-scale language model
CN114722267A (en) Information pushing method and device and server
CN112487151A (en) File generation method and device, storage medium and electronic equipment
CN115455152A (en) Writing material recommendation method and device, electronic equipment and storage medium
US20230042683A1 (en) Identifying and transforming text difficult to understand by user

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination