CN114218930A - Title generation method and device and title generation device

Info

Publication number
CN114218930A
Authority
CN
China
Prior art keywords
title
candidate
generation
keywords
preset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111166929.8A
Other languages
Chinese (zh)
Inventor
涂曼姝
龚能
谢冰茹
马尔胡甫·曼苏尔
祁点点
宋瑞强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sogou Technology Development Co Ltd
Original Assignee
Beijing Sogou Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sogou Technology Development Co Ltd
Priority to CN202111166929.8A
Publication of CN114218930A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/258 Heading extraction; Automatic titling; Numbering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The embodiment of the invention provides a title generation method and device and a title generation device. The method comprises the following steps: screening hot topics with hot attributes from a data source, and extracting keywords in the hot topics; generating candidate titles based on the keywords and a preset title generation rule, wherein the preset title generation rule comprises any one or more of the following items: title multiplexing rules, title migration rules, title association rules. The embodiment of the invention can generate candidate titles from multiple angles and aspects, which increases the richness and diversity of the generated candidate titles, provides the user with more high-quality candidate titles to choose from, and thereby helps the user write more efficiently and with higher quality.

Description

Title generation method and device and title generation device
Technical Field
The present invention relates to the field of network technologies, and in particular, to a title generation method, a title generation apparatus, and a device for title generation.
Background
With the development of network technology, massive amounts of information are gathered on various information flow platforms. The huge number of articles gives readers nearly unlimited choice and lowers the threshold for writing and publishing, but it also makes it harder for any single article to stand out from the mass of articles.
Disclosure of Invention
The embodiment of the invention provides a title generation method and device and a title generation device, which can help users write more efficiently and with higher quality, and can increase the probability that an article is read.
In order to solve the above problem, an embodiment of the present invention discloses a title generating method, including:
screening hot topics with hot attributes from a data source, and extracting keywords in the hot topics;
generating candidate titles based on the keywords and a preset title generation rule, wherein the preset title generation rule comprises any one or more of the following items: title multiplexing rules, title migration rules, title association rules.
Optionally, the preset title generation rule includes a title multiplexing rule, and the generating a candidate title based on the keyword and the preset title generation rule includes:
retrieving a first reference article related to the keyword based on the keyword;
and inputting the content of the first reference article into a trained first generation model, and outputting candidate titles through the first generation model.
Optionally, the method further comprises:
and generating a candidate abstract and/or a candidate outline according to the content of the first reference article.
Optionally, the preset title generation rule includes a title migration rule, and the generating a candidate title based on the keyword and the preset title generation rule includes:
identifying a first entity word and an event action word in the keyword;
determining a second entity word corresponding to the first entity word;
retrieving a second reference article related to the word sequence based on the word sequence composed of the second entity word and the event action word;
generating a reference title according to the content of the second reference article and a title multiplexing rule;
and determining a target entity word in the reference title, and replacing the target entity word in the reference title with a first entity word to obtain a candidate title.
Optionally, the preset title generation rule includes a title association rule, and the generating a candidate title based on the keyword and the preset title generation rule includes:
determining related topics of the keywords according to entity attributes of the keywords;
inputting the keywords and the associated topics into a trained second generation model, and outputting candidate titles through the second generation model.
Optionally, the method further comprises:
calculating an evaluation score for the generated candidate title based on an evaluation index, wherein the evaluation index comprises at least one of a topicality index, an operability index, a scarcity index, a timeliness index and a heat value index of the candidate title;
and sorting the generated candidate titles according to the evaluation scores.
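The scoring and sorting described above can be illustrated with a minimal sketch. The text does not specify how the indexes are combined, so the normalized weighted sum below is an assumption, as are the names `INDEXES`, `evaluation_score` and `rank_titles` and the convention that each index value lies between 0 and 1.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical keys for the five evaluation indexes named in the text.
INDEXES = ("topicality", "operability", "scarcity", "timeliness", "heat")


@dataclass
class ScoredTitle:
    title: str
    score: float


def evaluation_score(index_values: Dict[str, float], weights: Dict[str, float]) -> float:
    """Assumed combination: weighted average over the indexes that are present."""
    used = [name for name in INDEXES if name in index_values]
    if not used:
        raise ValueError("at least one evaluation index is required")
    total_weight = sum(weights[name] for name in used)
    return sum(index_values[name] * weights[name] for name in used) / total_weight


def rank_titles(candidates: Dict[str, Dict[str, float]],
                weights: Dict[str, float]) -> List[ScoredTitle]:
    """Sort candidate titles by evaluation score, highest first."""
    scored = [ScoredTitle(title, evaluation_score(values, weights))
              for title, values in candidates.items()]
    return sorted(scored, key=lambda s: s.score, reverse=True)
```

Averaging over only the indexes that are present keeps scores comparable when different candidates are evaluated on different subsets of the indexes, which matches the "at least one of" wording.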
Optionally, the screening out the hot topics with the hot attributes from the data source includes:
based on a selection category of a user, screening out a hot topic with a hot attribute related to the selection category from a data source, wherein the selection category comprises at least one of an entertainment circle category, an academic category, an international situation category and a popular science knowledge category.
Optionally, the method further comprises:
receiving a title recommendation request of a user;
determining a recommended title corresponding to the title recommendation request based on the candidate title;
and returning the recommended title.
On the other hand, the embodiment of the invention discloses a title generation device, which comprises:
the data screening module is used for screening hot topics with hot attributes from a data source and extracting keywords in the hot topics;
a title generation module, configured to generate a candidate title based on the keyword and a preset title generation rule, where the preset title generation rule includes any one or more of the following items: title multiplexing rules, title migration rules, title association rules.
Optionally, the preset title generation rule includes a title multiplexing rule, and the title generation module includes:
the first retrieval sub-module is used for retrieving a first reference article related to the keyword based on the keyword;
and the first generation submodule is used for inputting the content of the first reference article into a trained first generation model and outputting candidate titles through the first generation model.
Optionally, the apparatus further comprises:
and the abstract outline generating module is used for generating a candidate abstract and/or a candidate outline according to the content of the first reference article.
Optionally, the preset title generation rule includes a title migration rule, and the title generation module includes:
the entity identification submodule is used for identifying a first entity word and an event action word in the keyword;
the entity determining submodule is used for determining a second entity word corresponding to the first entity word;
the second retrieval submodule is used for retrieving a second reference article related to the word sequence based on the word sequence formed by the second entity word and the event action word;
the second generation submodule is used for generating a reference title according to the content of the second reference article and a title multiplexing rule;
and the entity replacing submodule is used for determining the target entity word in the reference title and replacing the target entity word in the reference title with the first entity word to obtain a candidate title.
Optionally, the preset title generation rule includes a title association rule, and the title generation module includes:
the association determining submodule is used for determining the associated topics of the keywords according to the entity attributes of the keywords;
and the third generation submodule is used for inputting the keywords and the associated topics into a trained second generation model, and outputting candidate titles through the second generation model.
Optionally, the apparatus further comprises:
the evaluation calculation module is used for calculating an evaluation score for the generated candidate title based on an evaluation index, wherein the evaluation index comprises at least one of a topicality index, an operability index, a scarcity index, a timeliness index and a heat value index of the candidate title;
and the title sorting module is used for sorting the generated candidate titles according to the evaluation scores.
Optionally, the data screening module is specifically configured to screen out, from the data source, a hot topic having a hot attribute and related to a selection category based on the selection category of the user, where the selection category includes at least one of an entertainment circle category, an academic category, an international situation category, and a popular science knowledge category.
Optionally, the apparatus further comprises:
the request receiving module is used for receiving a title recommendation request of a user;
the recommendation determining module is used for determining a recommended title corresponding to the title recommendation request based on the candidate title;
and the recommendation returning module is used for returning the recommendation title.
In yet another aspect, the present invention discloses an apparatus for title generation, comprising a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, and the one or more programs comprise instructions for performing one or more of the title generation methods described above.
In yet another aspect, embodiments of the invention disclose a machine-readable medium having instructions stored thereon, which when executed by one or more processors of an apparatus, cause the apparatus to perform one or more of the title generation methods described above.
The embodiment of the invention has the following advantages:
The embodiment of the invention generates candidate titles based on the extracted keywords and the preset title generation rule. The keywords are extracted from hot topics with hot attributes screened from a data source, so they currently receive a high degree of attention, which ensures the timeliness and attention of the generated candidate titles. In addition, when generating candidate titles, the embodiment of the invention uses the preset title generation rule: a single suitable title generation rule can be selected according to actual requirements, or several title generation rules can be combined. Candidate titles can thus be generated from multiple angles and aspects, which increases their richness and diversity, provides the user with more high-quality candidate titles to choose from, and helps the user write more efficiently and with higher quality. Moreover, whether the title of an article is attractive is an important factor in whether a reader reads the article, so the embodiment of the invention can improve the attractiveness of the generated candidate titles and thereby increase the probability that the article is read.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive labor.
FIG. 1 is a flow chart of the steps of one embodiment of a title generation method of the present invention;
FIG. 2 is a block diagram of a title generation apparatus according to an embodiment of the present invention;
FIG. 3 is a block diagram of an apparatus 800 for title generation of the present invention;
FIG. 4 is a schematic diagram of a server in some embodiments of the invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terms "first", "second" and the like in the description and in the claims of the present invention are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the terms so used may be interchanged under appropriate circumstances, such that embodiments of the invention may be practiced in orders other than those illustrated or described herein, and that the objects identified as "first", "second", etc. are generally a class of objects and do not limit the number of objects, e.g., a first object may be one or more. Furthermore, the term "and/or" in the specification and claims describes an association relationship of associated objects and means that three relationships may exist, e.g., "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship. The term "plurality" in the embodiments of the present invention means two or more, and other quantifiers are similar thereto.
Method embodiment
Referring to fig. 1, a flowchart illustrating steps of an embodiment of a title generating method according to the present invention is shown, which may specifically include the following steps:
Step 101, screening hot topics with hot attributes from a data source, and extracting keywords in the hot topics;
Step 102, generating candidate titles based on the keywords and a preset title generation rule, wherein the preset title generation rule comprises any one or more of the following items: title multiplexing rules, title migration rules, title association rules.
The title generation method provided by the embodiment of the invention can be applied to terminal equipment and also can be applied to a server. Wherein, the terminal device may include but is not limited to: smart terminals, computers, Personal Digital Assistants (PDAs), tablet computers, e-book readers, laptop portable computers, in-vehicle devices, smart televisions, wearable devices, and the like.
The server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, cloud communication, network services, middleware services, a Content Delivery Network (CDN), and big data and artificial intelligence platforms.
The embodiment of the invention can be applied to scenes for generating titles, such as scenes for generating titles of articles, news and commodities.
The data source refers to a website or an application that can autonomously generate text content and provides hot lists such as a hot search list. The information content may be uploaded by users or by website staff. For example, a mainstream media website may provide hot sections such as a hot list where users can view trending information and a real-time hot search, and website staff or other users can upload the latest trending content to these sections in real time.
It is to be understood that the above examples of data sources are listed only for better understanding of the technical solutions of the embodiments of the present invention, and are not to be taken as the only limitation on the embodiments of the present invention. For convenience of description, in the embodiment of the present invention, a hot search list and/or a hot word list of at least one platform are taken as an example of the data source.
The hot attribute refers to an attribute indicating that a hot topic is meaningful for title generation and has a high search heat, search volume, or the like in the current period of time.
In some examples, the hot attribute may also cover a search topic that has not been popular before: if its search volume suddenly increases within a fixed period (e.g., a week or a month), for example exceeding 800 or 1000 searches within a week, and the topic enters the hot search list, it may also be regarded as a hot topic with the hot attribute.
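The sudden-increase case described above can be sketched as a simple threshold check. This is only an illustration under stated assumptions: the function name `is_sudden_hot_topic`, the per-period volume list and the single shared threshold are hypothetical, and the default of 800 is taken from the example figure in the text.

```python
from typing import Sequence


def is_sudden_hot_topic(previous_volumes: Sequence[int],
                        current_volume: int,
                        threshold: int = 800) -> bool:
    """Return True if a topic that was never popular before is popular now.

    previous_volumes: search volumes of earlier fixed periods (e.g. weeks).
    current_volume:   search volume of the current period.
    threshold:        assumed popularity threshold (800 follows the text's example).
    """
    never_hot_before = all(volume < threshold for volume in previous_volumes)
    return never_hot_before and current_volume > threshold
```

A topic with weekly volumes of 10, 30 and 25 followed by 1200 would qualify, while a topic that already exceeded the threshold in an earlier week would not count as a sudden increase.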
The embodiment of the invention screens the hot topics with the hot attributes from the data source and extracts the keywords in the hot topics. The extracted keywords may be named entity words such as place names, person names, etc., or entities that can individually serve as sentence components, etc.
For example, suppose a hot topic with the hot attribute is screened out: "Live-action 'Snow White' actor confirmed". The keywords extracted from this hot topic may include: "Snow White", "live-action version", and "actor".
In an optional embodiment of the present invention, the screening out the hot topics having the hot attributes from the data source may include:
based on a selection category of a user, screening out a hot topic with a hot attribute related to the selection category from a data source, wherein the selection category comprises at least one of an entertainment circle category, an academic category, an international situation category and a popular science knowledge category.
In a specific implementation, the data source may include hot topics of different categories. Based on the ranking lists of the at least one platform, such as the hot search list and/or the hot word list, the top N hot topics in the overall ranking may be screened out. In addition, the user can input a selection category to specify the category of the hot topics to be screened. The embodiment of the invention can receive the selection category input by the user and, based on it, screen out from the data source the hot topics that have the hot attribute and are related to the selection category, so that candidate titles can be generated under the finer-grained category chosen by the user.
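As a hedged illustration of the screening described above, the sketch below merges several ranking lists, optionally filters by the user's selection category, and keeps the top N topics. The text does not say how the overall ranking is computed, so summing a per-list heat value is an assumption; the function name `top_n_hot_topics` and the entry fields `topic`, `heat` and `category` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def top_n_hot_topics(ranking_lists: List[List[Dict]],
                     n: int,
                     category: Optional[str] = None) -> List[str]:
    """Merge ranking lists and return the top-n topics, optionally in one category."""
    combined_heat = defaultdict(float)
    topic_category = {}
    for ranking in ranking_lists:
        for entry in ranking:
            # Assumed aggregation: add up the heat a topic has on each list.
            combined_heat[entry["topic"]] += entry["heat"]
            topic_category[entry["topic"]] = entry["category"]
    topics = [t for t in combined_heat
              if category is None or topic_category[t] == category]
    return sorted(topics, key=lambda t: combined_heat[t], reverse=True)[:n]
```

Calling the function without a category corresponds to screening the overall top N, while passing a category corresponds to the user-specified selection category.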
The embodiment of the present invention does not limit the specific manner of receiving the selection category input by the user. Optionally, an input interface may be provided to receive user input selecting a category. The input interface can be a text input interface, can receive text content input by a user and identifies a selection category specified by the text content; or, the input interface can be a voice input interface, and can receive voice content input by a user and recognize a selection category specified by the voice content; or, the input interface may be a preset menu, the preset menu includes candidate categories, and the selection operation of the candidate categories by the user may be received to obtain the selection categories of the user.
It should be noted that the entertainment circle category, the academic category, the international situation category, and the popular science knowledge category mentioned above are only one application example of the present invention, and the embodiment of the present invention does not limit the specific selection categories. Further, after the candidate titles are generated, they can be grouped by category to obtain candidate titles under different categories.
And generating candidate titles based on the keywords and a preset title generation rule. The preset title generation rule may include any one or more of the following items: title multiplexing rules, title migration rules, title association rules.
The title multiplexing rule is used for generating candidate titles based on articles related to keywords in the hot topics. The title migration rule is based on the title multiplexing rule, and the title generated by the title multiplexing rule is rewritten to generate a candidate title. The title association rule is to diverge the keywords in the hot topics to obtain the associated topics of the keywords, and generate candidate titles by using the associated topics.
The generated candidate titles can be recommended to the user to assist the user in writing. In practical applications, the candidate title may be generated by any one of the three title generation rules, or by a combination of a plurality of rules.
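The use of one rule or a combination of rules can be sketched as a small dispatcher. This is a minimal sketch rather than the method of the embodiment: the function name `generate_candidates`, the rule keys and the callable signature are assumptions, and each rule is represented by a placeholder callable that maps keywords to candidate titles.

```python
from typing import Callable, Dict, List, Sequence

# A rule takes the extracted keywords and returns candidate titles.
RuleFn = Callable[[List[str]], List[str]]


def generate_candidates(keywords: List[str],
                        rules: Dict[str, RuleFn],
                        selected: Sequence[str]) -> List[str]:
    """Apply the selected title generation rules and merge their outputs.

    rules:    hypothetical registry, e.g. keys "multiplexing", "migration",
              "association" mapped to rule implementations.
    selected: names of the rules to apply (any one or a combination).
    """
    candidates: List[str] = []
    for name in selected:
        for title in rules[name](keywords):
            if title not in candidates:  # keep first occurrence, drop duplicates
                candidates.append(title)
    return candidates
```

Keeping the rules behind a common callable interface makes it straightforward to select a single rule or to combine several, as the text allows.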
The embodiment of the invention generates candidate titles based on the extracted keywords and the preset title generation rule. The keywords are extracted from hot topics with hot attributes screened from a data source, so they currently receive a high degree of attention, which ensures the timeliness and attention of the generated candidate titles. In addition, when generating candidate titles, the embodiment of the invention uses the preset title generation rule: a single suitable title generation rule can be selected according to actual requirements, or several title generation rules can be combined. Candidate titles can thus be generated from multiple angles and aspects, which increases their richness and diversity, provides the user with more high-quality candidate titles to choose from, and helps the user write more efficiently and with higher quality. Moreover, whether the title of an article is attractive is an important factor in whether people read the article, so the embodiment of the invention can increase the probability that the article is read.
In an optional embodiment of the present invention, the preset title generation rule may include a title multiplexing rule, and the generating a candidate title based on the keyword and the preset title generation rule may include:
step S11, retrieving a first reference article related to the keyword based on the keyword;
and step S12, inputting the content of the first reference article into a trained first generation model, and outputting candidate titles through the first generation model.
The title multiplexing rule is used for generating candidate titles based on articles related to keywords in the hot topics.
Specifically, firstly, hot topics are screened from a hot search list and/or a hot word list of at least one platform, and keywords in the hot topics are extracted. Then, based on the extracted keyword, a first reference article related to the keyword is retrieved.
The first reference article refers to an article that is selected from the retrieved articles related to the keyword and meets a preset condition. The keywords in the title and/or body text of the first reference article match the extracted keywords. Meeting the preset condition may include, but is not limited to: consumption data of the first reference article (such as the number of likes, forwards, replies, and reads) reaching preset values, or the level of the account that published the first reference article reaching a preset level.
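The selection of first reference articles can be illustrated as a filter over retrieved articles. The sketch below rests on assumptions: the `Article` fields, the default thresholds, and the substring test used for keyword matching are all hypothetical, and only the read count is used as consumption data for brevity.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Article:
    title: str
    body: str
    reads: int          # one kind of consumption data
    account_level: int  # level of the publishing account


def select_reference_articles(articles: List[Article],
                              keywords: List[str],
                              min_reads: int = 10000,
                              min_level: int = 3) -> List[Article]:
    """Keep articles that match the keywords and meet a preset condition."""
    selected = []
    for article in articles:
        text = article.title + " " + article.body
        # Assumed matching rule: every keyword occurs in the title or body.
        if not all(keyword in text for keyword in keywords):
            continue
        # Preset condition: enough reads OR a sufficiently high account level.
        if article.reads >= min_reads or article.account_level >= min_level:
            selected.append(article)
    return selected
```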
And finally, inputting the content of the first reference article into a trained first generation model, and outputting candidate titles through the first generation model.
Embodiments of the present invention may pre-train a first generation model for generating titles. The first generation model may be obtained by supervised training of an existing neural network based on a large number of training samples and machine learning methods. It should be noted that the embodiment of the present invention does not limit the model structure or the training method of the first generation model. The first generation model may fuse a plurality of neural networks. The neural networks include, but are not limited to, at least one of the following, or a combination, superposition, or nesting of at least two of the following: CNN (Convolutional Neural Network), LSTM (Long Short-Term Memory) network, RNN (Recurrent Neural Network), attention neural network, and the like.
Specifically, a large number of articles may be collected, keywords may be extracted from the titles and texts of the articles, and the texts of the collected articles, the extracted keywords, and the titles of the articles may be used as training data to train the first generation model.
In one example, suppose the following hot topic is screened from a platform's hot search list: "Live-action 'Snow White' actor confirmed". The following keywords are extracted from the hot topic: "Snow White", "live-action version", and "actor". Based on the extracted keywords, a first reference article related to the keywords is retrieved, whose title is "New Disney princess revealed! Mixed-race actress Rachel Zegler to star in the live-action 'Snow White', sparking heated discussion". The content (e.g., the body text) of the first reference article is input into the first generation model, which may output the following candidate title: "Mixed-race actress Rachel Zegler stars in the live-action 'Snow White', sparking heated discussion".
The content of the first reference article may include the body text of the first reference article. In addition, it may include related information such as the title and keywords of the first reference article.
It should be noted that, in the embodiment of the present invention, the number of the screened hot topics is not limited, and the number of the retrieved first reference articles is also not limited. For a screened hot topic, one or more first reference articles can be retrieved based on keywords in the hot topic, and a candidate title can be generated for each first reference article. Thus, one or more candidate titles may be generated for a certain hotspot topic.
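The one-to-many relationship described above (one hot topic, one or more first reference articles, one candidate title per article) can be sketched as follows. The retrieval function and the trained first generation model are passed in as placeholder callables; the function name `multiplex_titles` and both signatures are assumptions for illustration only.

```python
from typing import Callable, Dict, List


def multiplex_titles(topic_keywords: Dict[str, List[str]],
                     retrieve: Callable[[List[str]], List[str]],
                     first_generation_model: Callable[[str], str]) -> Dict[str, List[str]]:
    """Generate one candidate title per retrieved reference article, per hot topic.

    topic_keywords:         hot topic -> keywords extracted from it.
    retrieve:               placeholder for step S11 (keywords -> article contents).
    first_generation_model: placeholder for the trained model of step S12.
    """
    result: Dict[str, List[str]] = {}
    for topic, keywords in topic_keywords.items():
        articles = retrieve(keywords)                                   # step S11
        result[topic] = [first_generation_model(a) for a in articles]  # step S12
    return result
```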
In an optional embodiment of the invention, the method may further comprise: and generating a candidate abstract and/or a candidate outline according to the content of the first reference article.
The embodiment of the invention does not limit the method for generating the candidate abstract and the candidate outline, and existing generation methods can be adopted. For example, an abstract generation model may be trained in advance, the content of the first reference article is input into the abstract generation model, and the candidate abstract is output through the abstract generation model. Likewise, an outline generation model can be trained in advance, the content of the first reference article is input into the outline generation model, and the candidate outline is output through the outline generation model. The abstract generation model may be a neural network model known in the field for abstract generation, and the outline generation model may be a neural network model known in the field for outline generation, which is not limited in the embodiment of the present invention.
Taking the above example, the content of the first reference article retrieved in the above example is input into the abstract generation model, which may output the following candidate abstract: "Disney's live-action 'Snow White and the Seven Dwarfs' will star Rachel Zegler. Disney's earlier choice of Halle Bailey to play the Little Mermaid also sparked controversy. The classic character images in our memory have been subverted once again."
Further, the contents of the first reference article retrieved in the above example are input to the outline generation model, by which the following candidate outlines can be output:
1. The live-action "Snow White and the Seven Dwarfs" settles on Rachel Zegler
2. Introduction to Rachel Zegler's background and past works
3. Introduction to the work "Snow White"
4. Halle Bailey's earlier casting in "The Little Mermaid" caused the same controversy
5. Reactions of many netizens to Disney
6. What do you think about this?
The generated candidate abstract and/or the candidate outline can also be recommended to the user to assist the user in writing, so that the writing efficiency and quality of the user are further improved.
In an optional embodiment of the present invention, the preset title generation rule may include a title migration rule, and the generating a candidate title based on the keyword and the preset title generation rule may include:
step S21, identifying a first entity word and an event action word in the keyword;
step S22, determining a second entity word corresponding to the first entity word;
step S23, retrieving a second reference article related to the word sequence based on the word sequence formed by the second entity word and the event action word;
step S24, generating a reference title according to the content of the second reference article and a title multiplexing rule;
and step S25, determining the target entity word in the reference title, and replacing the target entity word in the reference title with the first entity word to obtain a candidate title.
The title migration rule builds on the title multiplexing rule: a title generated by the title multiplexing rule is rewritten to produce a candidate title.
Specifically, hot topics are first screened from a hot search list and/or a hot word list of at least one platform, and keywords in the hot topics are extracted. Then, a first entity word and an event action word in the keywords are identified. The first entity word refers to a specific entity word with a specific referent. The event action word refers to a word that denotes an event or an action.
In one example, suppose the following hot topic is screened from a hot search list of a platform: "Deng Jiaxian passed away 35 years ago". The following keywords can be extracted from the hot topic: "Deng Jiaxian" and "passed away". The keyword "Deng Jiaxian" is identified as a first entity word, which specifically refers to the person Deng Jiaxian. "Passed away" is an event action word.
Next, a second entity word corresponding to the first entity word is determined. The second entity word refers to an abstract entity word with a general referent. For example, the first entity word "Deng Jiaxian" refers to a well-known scientist in China, and "famous Chinese scientist" is an abstract entity word with a general referent. Therefore, it can be determined that the second entity word corresponding to the first entity word "Deng Jiaxian" is "famous Chinese scientist".
In a specific implementation, the second entity word may be abstracted from the entity attributes of the first entity word. For example, the entity attributes of "Deng Jiaxian" include: academician of the Chinese Academy of Sciences, nuclear physicist, recipient of the "Two Bombs, One Satellite" Merit Medal, and so on. From these attributes, the second entity word "famous Chinese scientist" can be abstracted for the first entity word "Deng Jiaxian". For another example, the entity attributes of the first entity word "Terracotta Warriors of the Qin Dynasty in Xi'an" include: one of China's first World Heritage sites, the Eighth Wonder of the World, and so on. From these attributes, the second entity word "Chinese scenic spots and historic sites" can be abstracted.
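The abstraction from entity attributes to a second entity word can be illustrated with a minimal sketch. The lookup table `ATTRIBUTE_TO_ABSTRACT`, the function `abstract_entity`, and the majority-vote strategy are all assumptions made for illustration; the embodiment does not specify how the abstraction is implemented.

```python
from collections import Counter

# Hypothetical table mapping an entity attribute to an abstract (second) entity word.
ATTRIBUTE_TO_ABSTRACT = {
    "nuclear physicist": "famous Chinese scientist",
    "academician of the Chinese Academy of Sciences": "famous Chinese scientist",
    "World Heritage site in China": "Chinese scenic spots and historic sites",
}


def abstract_entity(attributes):
    """Return the abstract entity word supported by the most attributes, or None."""
    votes = Counter(
        ATTRIBUTE_TO_ABSTRACT[a] for a in attributes if a in ATTRIBUTE_TO_ABSTRACT
    )
    return votes.most_common(1)[0][0] if votes else None
```

In practice the attributes would come from a knowledge base rather than a hand-written table; the vote simply picks the abstraction that most attributes agree on.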
After determining a second entity word corresponding to the first entity word, retrieving a second reference article related to the word sequence based on the word sequence formed by the second entity word and the event action word.
For example, a second reference article related to the word sequence "famous Chinese scientist passed away", which consists of the second entity word "famous Chinese scientist" and the event action word "passed away", is retrieved.
The second reference article refers to an article that is screened from the retrieved articles related to the word sequence and that meets a preset condition. The keywords in the title and/or body text of the second reference article match the keywords in the word sequence. Meeting the preset condition may include, but is not limited to: the consumption data of the second reference article (such as the number of likes, forwards, replies and reads) reaches preset values, or the level of the account that published the second reference article reaches a preset level.
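As a minimal sketch of this screening step, the following filter keeps an article if its consumption data reaches preset values or its account level reaches a preset level. The `Article` fields and all threshold values are hypothetical; the embodiment leaves the concrete preset values open.

```python
from dataclasses import dataclass


@dataclass
class Article:
    title: str
    likes: int
    reads: int
    account_level: int


def meets_preset_condition(article, min_likes=1000, min_reads=10000, min_level=3):
    """True if consumption data OR account level reaches the (assumed) preset values."""
    consumption_ok = article.likes >= min_likes and article.reads >= min_reads
    return consumption_ok or article.account_level >= min_level


def select_reference_articles(articles, **thresholds):
    """Keep only the retrieved articles that meet the preset condition."""
    return [a for a in articles if meets_preset_condition(a, **thresholds)]
```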
Finally, a reference title is generated according to the content of the second reference article and the title multiplexing rule. Then a target entity word in the reference title is determined and replaced with the first entity word to obtain a candidate title.
For example, in this example, a second reference article related to the word sequence "famous Chinese scientist passed away" is retrieved, and its title is "Deep condolences! Today, farewell to Academician Yuan Longping! Why does his passing make us feel so sad?". This second reference article is an article about the death of another famous Chinese scientist, Yuan Longping. The content of the second reference article is input into the first generative model, through which the following reference title can be output: "In memory of Academician Yuan Longping! Why does his passing make us feel so sad?". The target entity word in the reference title is then replaced with the first entity word to obtain a candidate title.
The target entity word refers to the entity word to be replaced in the reference title. For example, in this example, the screened hot topic is "Deng Jiaxian passed away 35 years ago", and the extracted keywords include "Deng Jiaxian" and "passed away"; that is, candidate titles should be generated based on the keywords "Deng Jiaxian" and "passed away". Because the second reference article was retrieved after the first entity word was replaced with the second entity word, the reference title generated from it contains the entity word of another famous Chinese scientist, "Yuan Longping", and this entity word needs to be replaced with the original first entity word "Deng Jiaxian". Therefore, it may be determined that the entity word "Yuan Longping" in the reference title is the target entity word, and the target entity word is replaced with the first entity word "Deng Jiaxian", so as to obtain the following candidate title: "In memory of Academician Deng Jiaxian! Why does his passing make us feel so sad?".
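The replacement step (step S25) can be sketched as follows. This is only an illustration under stated assumptions: the dictionary `known_entities`, which maps an entity name to its abstract entity word, and the functions `find_target_entity` and `migrate_title` are hypothetical, and the placeholder names "Scientist X" and "Scientist Y" stand in for real entity words.

```python
def find_target_entity(reference_title, known_entities, second_entity_word):
    """Find an entity in the title that shares the abstract (second) entity word."""
    for name, abstract_word in known_entities.items():
        if abstract_word == second_entity_word and name in reference_title:
            return name
    return None


def migrate_title(reference_title, first_entity, known_entities, second_entity_word):
    """Replace the target entity word with the first entity word; None if not found."""
    target = find_target_entity(reference_title, known_entities, second_entity_word)
    if target is None:
        return None
    return reference_title.replace(target, first_entity)


# Usage with placeholder entities (assumed data, not from the source).
known = {"Scientist Y": "famous scientist", "Singer Z": "public figure"}
candidate = migrate_title(
    "In memory of Scientist Y! Why are we so sad?",
    "Scientist X",
    known,
    "famous scientist",
)
```

Matching the target entity by its abstract entity word is one plausible design; a named-entity recognizer could serve the same purpose.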
Optionally, the method may further include: and generating a candidate outline according to the content of the second reference article.
In an optional embodiment of the present invention, the preset title generation rule may include a title association rule, and the generating a candidate title based on the keyword and the preset title generation rule may include:
step S31, determining associated topics of the keywords according to entity attributes of the keywords;
step S32, inputting the keyword and the related topic into a trained second generative model, and outputting candidate titles through the second generative model.
The title association rule expands on the keywords in the hot topics to obtain associated topics of the keywords, and generates candidate titles by using the associated topics.
Specifically, hot topics are first screened from a hot search list and/or a hot word list of at least one platform, and keywords in the hot topics are extracted. Then, the associated topics of the keywords are determined according to the entity attributes of the keywords.
In the embodiment of the present invention, the entity attribute may be obtained from a pre-established knowledge graph. In terms of logical structure, the knowledge graph can be divided into a schema layer and a data layer, where the data layer mainly consists of a series of facts, and knowledge is stored in units of facts. For example, facts may be expressed as triples such as (entity 1, relationship, entity 2) or (entity, attribute, attribute value). Knowledge elements such as entities, relationships and attributes can be extracted from published semi-structured and unstructured data through knowledge extraction technology.
In one example, suppose the following hot topic is screened from a hot search list of a platform: "Romance between Person A and Person B exposed". The following keywords are extracted from the hot topic: "Person A", "Person B" and "romance". For the keyword "Person A", the associated topics of the keyword may be determined.
The embodiment of the invention does not limit the specific way of determining the associated topics. For example, an associated topic library may be created in advance, in which associated topics mined in advance for different categories of entity attributes are stored. For example, for the entity attribute "public figure", the following associated topics may be included: biography, deeds, works, relationship history, parents, alma mater, and so on. As another example, for the entity attribute "tourist attraction", the following associated topics may be included: food, itinerary, travel guide, hotels, and so on.
In this example, for the keyword "Person A", whose entity attribute is public figure, it may be determined that the associated topics corresponding to the keyword "Person A" include at least one of biography, deeds, works, relationship history, parents, alma mater, and the like.
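A minimal sketch of how facts stored as triples might be queried for the attributes of one entity is given below. The `Triple` type, the function `attributes_of`, and the sample entity and relation names are assumptions for illustration; the embodiment does not prescribe a storage format.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    head: str      # entity
    relation: str  # relationship or attribute name
    tail: str      # second entity or attribute value


def attributes_of(triples, entity):
    """Collect all relation/attribute values stored for one entity."""
    attrs = defaultdict(list)
    for t in triples:
        if t.head == entity:
            attrs[t.relation].append(t.tail)
    return dict(attrs)


# Assumed sample data layer.
triples = [
    Triple("Entity X", "occupation", "physicist"),
    Triple("Entity X", "category", "public figure"),
    Triple("Entity Y", "category", "tourist attraction"),
]
```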
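The associated topic library can be pictured as a simple lookup keyed by entity category. In the sketch below, `ASSOCIATED_TOPIC_LIBRARY`, `associated_topics` and `model_inputs` are hypothetical names, and the topic lists are illustrative; the pairs returned by `model_inputs` are the (keyword, associated topic) inputs that would be passed to a title generation model.

```python
# Hypothetical pre-built library: entity category -> associated topics.
ASSOCIATED_TOPIC_LIBRARY = {
    "public figure": ["biography", "deeds", "works",
                      "relationship history", "parents", "alma mater"],
    "tourist attraction": ["food", "itinerary", "travel guide", "hotels"],
}


def associated_topics(entity_category):
    """Look up the associated topics for an entity category (empty if unknown)."""
    return list(ASSOCIATED_TOPIC_LIBRARY.get(entity_category, []))


def model_inputs(keyword, entity_category):
    """Pair a keyword with each associated topic as input for a title model."""
    return [(keyword, topic) for topic in associated_topics(entity_category)]
```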
Next, the extracted keywords and the associated topics are input into a trained second generative model, and candidate titles can be output through the second generative model.
In this example, the extracted keyword "Person A" and the associated topic "works" may be input together into the second generative model, and the following candidate title may be output through the second generative model: "Which of Person A's works have you heard?". For another example, the keyword "Person A" and the associated topic "relationship history" are input together into the second generative model, and the following candidate title can be output through the second generative model: "Abandoning the wife who shared his hardships: is Person A's current behavior perhaps related to his upbringing?".
When candidate titles are generated based on the title association rule, associated topics may be determined separately for each extracted keyword, and candidate titles corresponding to the different keywords may be generated.
For example, in this example, for the extracted keyword "Person B", assuming that the entity attribute of the keyword is also public figure, it may be determined that the associated topics corresponding to the keyword "Person B" include at least one of biography, deeds, works, relationship history, parents, alma mater, and the like. The keyword "Person B" and an associated topic are input into the second generative model, and a candidate title may be output through the second generative model.
The embodiment of the invention can train in advance separate second generative models for generating titles of different topics. For example, a second generative model for generating titles of topics related to public figures may be trained. As another example, a second generative model for generating titles of topics related to tourist attractions may be trained, and so on.
The second generative model may be obtained by supervised training of an existing neural network based on a large number of training samples and machine learning methods. It should be noted that, the embodiment of the present invention does not limit the model structure and the training method of the second generative model. The second generative model may fuse a plurality of neural networks. The neural network includes, but is not limited to, at least one or a combination, superposition, nesting of at least two of the following: CNN, LSTM, RNN, attention neural network, etc.
Specifically, when training the second generative model of a topic title, a large number of related titles of the topic may be collected, keywords may be extracted from the collected related titles, and the collected related titles of the topic and the extracted keywords may be used as training data to train the second generative model of the topic title.
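The construction of training data from collected titles can be sketched as below. The keyword extractor here is a deliberately simple vocabulary match, and the `source`/`target` record layout is an assumption; neither is specified by the embodiment.

```python
def vocabulary_extractor(vocabulary):
    """Build a toy keyword extractor that returns vocabulary words found in a title."""
    def extract(title):
        return [word for word in vocabulary if word in title]
    return extract


def build_training_pairs(titles, extract_keywords):
    """Turn collected titles into (keywords -> title) training records."""
    pairs = []
    for title in titles:
        keywords = extract_keywords(title)
        if keywords:  # skip titles from which no keyword could be extracted
            pairs.append({"source": " ".join(keywords), "target": title})
    return pairs
```

Each record pairs the extracted keywords (model input) with the original title (expected output), which matches the supervised setup described above.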
In an optional embodiment of the invention, the method may further comprise:
step S41, calculating an evaluation score for the generated candidate title based on an evaluation index, wherein the evaluation index comprises at least one of a topicality index, an operability index, a scarcity index, a timeliness index and a heat value index of the candidate title;
and step S42, sorting the generated candidate titles according to the evaluation scores.
When calculating the evaluation score of a candidate title, the evaluation indexes may be combined by weighted calculation. The evaluation index may include, but is not limited to, at least one of a topicality index, an operability index, a scarcity index, a timeliness index, and a heat value index of the candidate title.
The topicality index can measure the search volume of the entity words contained in the candidate title. For example, the generated candidate title "Which of Person A's works have you heard?" includes the entity word "Person A", and the search volume of the entity word "Person A" on each platform can be obtained as the topicality index of the candidate title. The larger the search volume of the entity words included in the candidate title, the higher the topicality index of the candidate title.
The operability index can measure the richness of the material available for the entity words contained in the candidate title. For example, the generated candidate title "Which of Person A's works have you heard?" includes the entity "Person A", and material about the entity "Person A" may be retrieved from the repository and/or the network. The amount of material related to the entity "Person A" is used as the operability index of the candidate title. The larger the amount of material related to the entity words contained in the candidate title, the higher the operability index of the candidate title.
The scarcity index may measure the saturation of the candidate title. The scarcity index can be measured by the number of identical or similar titles: if a large number of titles identical or similar to a candidate title already exist, the candidate title is saturated, and an article written under that title may find it difficult to attract readers' attention. The smaller the number of titles that are the same as or similar to a candidate title, the higher the scarcity index of the candidate title.
The timeliness index can measure the timeliness of the candidate title. The timeliness index may be measured by the time of occurrence of the topic related to the candidate title. For example, for a generated candidate title about the live-action "Snow White" finalizing Rachel Zegler as its lead, if the related topic (the live-action "Snow White" finalizing Rachel Zegler) occurred one year ago, the candidate title has a lower timeliness index. If the related topic of the candidate title occurred three days ago, the candidate title has a higher timeliness index. The more recent the time of occurrence of the topic related to the candidate title, the higher the timeliness index of the candidate title.
The heat value index can measure the popularity of the candidate title. The heat value index can be measured by the search popularity, on each platform, of the keywords contained in the candidate title. The higher the search popularity of the keywords contained in the candidate title, the higher the heat value index of the candidate title.
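One way to turn "the number of identical or similar titles" into a number is sketched below. The use of `difflib.SequenceMatcher` as the similarity measure, the 0.8 threshold, and the `1 / (1 + count)` mapping are all assumptions chosen for illustration; the embodiment does not fix a similarity measure or a formula.

```python
from difflib import SequenceMatcher


def count_similar(candidate, existing_titles, threshold=0.8):
    """Count existing titles whose character-level similarity reaches the threshold."""
    return sum(
        1
        for title in existing_titles
        if SequenceMatcher(None, candidate, title).ratio() >= threshold
    )


def scarcity_index(candidate, existing_titles, threshold=0.8):
    """Map the similar-title count to (0, 1]; fewer similar titles gives a higher index."""
    return 1.0 / (1 + count_similar(candidate, existing_titles, threshold))
```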
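The relationship "the more recent the topic, the higher the index" can be sketched with an exponential decay. The decay form and the seven-day half-life are assumptions for illustration only; any monotonically decreasing function of the topic's age would fit the description.

```python
from datetime import datetime, timedelta


def timeliness_index(occurred_at, now, half_life_days=7.0):
    """Return a value in (0, 1] that halves every `half_life_days` of topic age."""
    age_days = max(0.0, (now - occurred_at).total_seconds() / 86400.0)
    return 0.5 ** (age_days / half_life_days)


# Usage: a topic from three days ago scores higher than one from a year ago.
now = datetime(2021, 10, 1)
recent = timeliness_index(now - timedelta(days=3), now)
old = timeliness_index(now - timedelta(days=365), now)
```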
Of course, the above evaluation index is only an example of the present invention, and the specific kind and number of the evaluation index are not limited in the embodiment of the present invention.
For candidate titles generated by one or more of the title generation rules, the embodiment of the present invention may calculate an evaluation score for each generated candidate title based on one or more of the evaluation indexes, and may further rank the generated candidate titles according to the evaluation scores. Through the sorting, the high-quality candidate titles with higher comprehensive indexes can be arranged in front, so that the candidate titles with higher quality can be provided for the user.
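The weighted scoring and sorting of steps S41 and S42 can be sketched as follows. The sketch assumes each index has already been normalized to the range [0, 1]; the weight values, the dictionary layout of a candidate, and the function names are hypothetical.

```python
def evaluation_score(indexes, weights):
    """Weighted average of the indexes that are present; weights are assumed values."""
    used = {name: w for name, w in weights.items() if name in indexes}
    total = sum(used.values())
    if total <= 0:
        raise ValueError("at least one positively weighted index is required")
    return sum(indexes[name] * w for name, w in used.items()) / total


def rank_candidates(candidates, weights):
    """Sort candidate titles from highest to lowest evaluation score."""
    return sorted(
        candidates,
        key=lambda c: evaluation_score(c["indexes"], weights),
        reverse=True,
    )
```

Dividing by the sum of the weights actually used keeps scores comparable when only some of the five indexes are available for a title.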
In an optional embodiment of the invention, the method may further comprise:
step S51, receiving a title recommendation request of a user;
step S52, determining a recommended title corresponding to the title recommendation request based on the candidate title;
and step S53, returning the recommended title.
In the embodiment of the present invention, the title recommendation request may include a title recommendation request without an explicit requirement or a title recommendation request with an explicit requirement.
A title recommendation request without an explicit requirement means that the user has not yet determined the subject or direction to write about. In this case, based on the candidate titles generated offline in advance, the embodiment of the present invention may return the top N (N is greater than or equal to 1) candidate titles, ranked by evaluation score, to the user as recommended titles.
A title recommendation request with an explicit requirement means that the user has determined the subject or direction to write about and may provide writing requirements. The writing requirements may include keywords provided by the user. The embodiment of the invention can search the candidate titles generated offline in advance according to the keywords provided by the user, and return the matched candidate titles to the user as recommended titles. If none of the candidate titles generated offline in advance matches the keywords provided by the user, candidate titles can be generated online in real time based on the keywords provided by the user, using the method for generating candidate titles described above, and the top N of the candidate titles generated online in real time are returned to the user as recommended titles.
The embodiment of the invention can provide two modes of obtaining the recommended title: an offline mode and an online mode.
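The two kinds of request can be sketched in one handler. This is a simplified illustration: the library is assumed to be a list of `(title, score)` pairs, matching is assumed to mean that every user keyword appears in the title, matched titles are truncated to the top N, and `generate_online` is a hypothetical callback standing in for real-time generation.

```python
def recommend_titles(library, top_n=3, keywords=None, generate_online=None):
    """Return recommended titles for a request with or without user keywords."""
    ranked = sorted(library, key=lambda item: item[1], reverse=True)
    if not keywords:
        # No explicit requirement: return the top N pre-generated titles.
        return [title for title, _ in ranked[:top_n]]
    matched = [title for title, _ in ranked if all(k in title for k in keywords)]
    if matched:
        return matched[:top_n]
    if generate_online is None:
        return []
    # Nothing matched offline: fall back to titles generated online in real time.
    fresh = sorted(generate_online(keywords), key=lambda item: item[1], reverse=True)
    return [title for title, _ in fresh[:top_n]]
```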
The off-line mode is to pre-construct a candidate title library and directly query the candidate title library to obtain a recommended title when a title recommendation request of a user is received. The candidate title library may be created in advance by generating a large number of candidate titles through the method for generating candidate titles described above. The candidate title library can be automatically updated periodically to ensure the timeliness of the candidate titles in the candidate title library.
The online mode is that when a title recommendation request of a user is received, candidate titles are generated online in real time by the method for generating the candidate titles, and a recommended title which meets the title recommendation request of the user is determined from the candidate titles generated in real time.
Optionally, the method may further include: and displaying each evaluation index of the recommended title.
According to the embodiment of the invention, after the recommended title is returned to the user, the evaluation indexes of the recommended title can be displayed when the recommended title is shown in the user interface. The user can comprehensively consider the displayed evaluation indexes of the recommended title, such as the topicality index, the operability index, the scarcity index, the timeliness index and the heat value index, and select the required target title.
The embodiment of the invention does not limit the way of displaying the evaluation indexes. Optionally, each evaluation index of the recommended title may be displayed graphically. A graphical display is more intuitive and can help the user judge at a glance whether each evaluation index of the recommended title meets the user's requirements. Of course, each evaluation index of the recommended title can also be displayed in any other manner, such as numerically or as a chart.
In summary, the embodiment of the present invention generates candidate titles based on the extracted keywords and the preset title generation rule. The keywords are extracted from hot topics with hot attributes screened from a data source and therefore receive high attention at the current time, which helps ensure the timeliness and attention of the generated candidate titles. In addition, in the process of generating candidate titles, the embodiment of the invention adopts the preset title generation rule: a suitable title generation rule may be selected according to actual requirements, or a combination of several title generation rules may be used. Candidate titles can thus be generated from multiple angles, increasing the richness and diversity of the generated candidate titles, so that more numerous and higher-quality candidate titles can be provided for the user, which assists in improving the user's writing efficiency and quality. Moreover, whether the title of an article is attractive is an important factor influencing whether a reader reads the article, so the embodiment of the invention can improve the attractiveness of the generated candidate titles and thereby increase the probability that the article is read.
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or combinations of acts, but those skilled in the art will recognize that the present invention is not limited by the described order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments of the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are preferred embodiments and that the acts involved are not necessarily required by the invention.
Device embodiment
Referring to fig. 2, a block diagram of a title generation apparatus according to an embodiment of the present invention is shown, where the apparatus may include:
the data screening module 201 is configured to screen hot topics with hot attributes from a data source, and extract keywords in the hot topics;
a title generating module 202, configured to generate a candidate title based on the keyword and a preset title generating rule, where the preset title generating rule includes any one or more of the following items: title multiplexing rules, title migration rules, title association rules.
Optionally, the preset title generation rule includes a title multiplexing rule, and the title generation module includes:
the first retrieval sub-module is used for retrieving a first reference article related to the keyword based on the keyword;
and the first generation submodule is used for inputting the content of the first reference article into a trained first generation model and outputting candidate titles through the first generation model.
Optionally, the apparatus further comprises:
and the abstract outline generating module is used for generating a candidate abstract and/or a candidate outline according to the content of the first reference article.
Optionally, the preset title generation rule includes a title migration rule, and the title generation module includes:
the entity identification submodule is used for identifying a first entity word and an event action word in the keyword;
the entity determining submodule is used for determining a second entity word corresponding to the first entity word;
the second retrieval submodule is used for retrieving a second reference article related to the word sequence based on the word sequence formed by the second entity word and the event action word;
the second generation submodule is used for generating a reference title according to the content of the second reference article and a title multiplexing rule;
and the entity replacing submodule is used for determining the target entity word in the reference title and replacing the target entity word in the reference title with the first entity word to obtain a candidate title.
Optionally, the preset title generation rule includes a title association rule, and the title generation module includes:
the association determining submodule is used for determining the associated topics of the keywords according to the entity attributes of the keywords;
and the third generation submodule is used for inputting the keywords and the associated topics into a trained second generation model, and outputting candidate titles through the second generation model.
Optionally, the apparatus further comprises:
the evaluation calculation module is used for calculating an evaluation score for the generated candidate title based on an evaluation index, wherein the evaluation index comprises at least one of a topicality index, an operability index, a scarcity index, a timeliness index and a heat value index of the candidate title;
and the title sorting module is used for sorting the generated candidate titles according to the evaluation scores.
Optionally, the data screening module is specifically configured to screen out, from the data source, a hot topic having a hot attribute and related to a selection category based on the selection category of the user, where the selection category includes at least one of an entertainment category, an academic category, an international situation category, and a popular science category.
Optionally, the apparatus further comprises:
the request receiving module is used for receiving a title recommendation request of a user;
the recommendation determining module is used for determining a recommended title corresponding to the title recommendation request based on the candidate title;
and the recommendation returning module is used for returning the recommendation title.
The embodiment of the invention generates candidate titles based on the extracted keywords and the preset title generation rule. The keywords are extracted from hot topics with hot attributes screened from a data source and therefore receive high attention at the current time, which helps ensure the timeliness and attention of the generated candidate titles. In addition, in the process of generating candidate titles, the embodiment of the invention adopts the preset title generation rule: a suitable title generation rule may be selected according to actual requirements, or a combination of several title generation rules may be used. Candidate titles can thus be generated from multiple angles, increasing the richness and diversity of the generated candidate titles, so that more numerous and higher-quality candidate titles can be provided for the user, which assists in improving the user's writing efficiency and quality. Moreover, whether the title of an article is attractive is an important factor influencing whether a reader reads the article, so the embodiment of the invention can improve the attractiveness of the generated candidate titles and thereby increase the probability that the article is read.
Since the device embodiment is basically similar to the method embodiment, it is described relatively briefly; for relevant details, refer to the corresponding description of the method embodiment.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
An embodiment of the present invention provides an apparatus for title generation, comprising a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs comprising instructions for: screening hot topics with hot attributes from a data source, and extracting keywords in the hot topics; generating candidate titles based on the keywords and a preset title generation rule, wherein the preset title generation rule comprises any one or more of the following items: title multiplexing rules, title migration rules, title association rules.
Optionally, the preset title generation rule includes a title multiplexing rule, and the generating a candidate title based on the keyword and the preset title generation rule includes:
retrieving a first reference article related to the keyword based on the keyword;
and inputting the content of the first reference article into a trained first generation model, and outputting candidate titles through the first generation model.
Optionally, the apparatus is further configured such that the one or more programs executed by the one or more processors include instructions for:
and generating a candidate abstract and/or a candidate outline according to the content of the first reference article.
Optionally, the preset title generation rule includes a title migration rule, and the generating a candidate title based on the keyword and the preset title generation rule includes:
identifying a first entity word and an event action word in the keyword;
determining a second entity word corresponding to the first entity word;
retrieving a second reference article related to the word sequence based on the word sequence composed of the second entity word and the event action word;
generating a reference title according to the content of the second reference article and a title multiplexing rule;
and determining a target entity word in the reference title, and replacing the target entity word in the reference title with a first entity word to obtain a candidate title.
Optionally, the preset title generation rule includes a title association rule, and the generating a candidate title based on the keyword and the preset title generation rule includes:
determining associated topics of the keywords according to entity attributes of the keywords;
inputting the keywords and the associated topics into a trained second generation model, and outputting candidate titles through the second generation model.
Optionally, the apparatus is further configured such that the one or more programs executed by the one or more processors include instructions for:
calculating an evaluation score for the generated candidate title based on an evaluation index, wherein the evaluation index comprises at least one of a topicality index, an operability index, a scarcity index, a timeliness index and a heat value index of the candidate title;
and sorting the generated candidate titles according to the evaluation scores.
Optionally, the screening out the hot topics with the hot attributes from the data source includes:
based on a selection category of a user, screening out a hot topic with a hot attribute related to the selection category from a data source, wherein the selection category comprises at least one of an entertainment category, an academic category, an international situation category and a popular science category.
Optionally, the apparatus is further configured such that the one or more programs executed by the one or more processors include instructions for:
receiving a title recommendation request of a user;
determining a recommended title corresponding to the title recommendation request based on the candidate title;
and returning the recommended title.
Fig. 3 is a block diagram illustrating an apparatus 800 for title generation according to an example embodiment. For example, the apparatus 800 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, an exercise device, a personal digital assistant, and the like.
Referring to fig. 3, the apparatus 800 may include one or more of the following components: processing component 802, memory 804, power component 806, multimedia component 808, audio component 810, input/output (I/O) interface 812, sensor component 814, and communication component 816.
The processing component 802 generally controls overall operation of the device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 802 may include one or more processors 820 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interaction between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operation at the device 800. Examples of such data include instructions for any application or method operating on device 800, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 804 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The power component 806 provides power to the various components of the device 800. The power component 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the apparatus 800.
The multimedia component 808 includes a screen that provides an output interface between the device 800 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front facing camera and/or a rear facing camera. The front-facing camera and/or the rear-facing camera may receive external multimedia data when the device 800 is in an operating mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the apparatus 800 is in an operational mode, such as a call mode, a recording mode, and a voice information processing mode. The received audio signals may further be stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 also includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor component 814 includes one or more sensors for providing status assessments of various aspects of the device 800. For example, the sensor component 814 may detect the on/off status of the device 800 and the relative positioning of components, such as the display and keypad of the apparatus 800. The sensor component 814 may also detect a change in position of the apparatus 800 or of a component of the apparatus 800, the presence or absence of user contact with the apparatus 800, the orientation or acceleration/deceleration of the apparatus 800, and a change in temperature of the apparatus 800. The sensor component 814 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor component 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate communications between the apparatus 800 and other devices in a wired or wireless manner. The device 800 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
In an exemplary embodiment, a non-transitory computer-readable storage medium comprising instructions is also provided, such as the memory 804 comprising instructions, the instructions being executable by the processor 820 of the device 800 to perform the above-described method. For example, the non-transitory computer-readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
Fig. 4 is a schematic diagram of a server in some embodiments of the invention. The server 1900 may vary widely in configuration or performance and may include one or more Central Processing Units (CPUs) 1922 (e.g., one or more processors), memory 1932, and one or more storage media 1930 (e.g., one or more mass storage devices) storing applications 1942 or data 1944. The memory 1932 and the storage medium 1930 may be transient or persistent storage. The program stored in the storage medium 1930 may include one or more modules (not shown), each of which may include a series of instruction operations for the server. Further, the central processing unit 1922 may be configured to communicate with the storage medium 1930 and to execute, on the server 1900, the series of instruction operations stored in the storage medium 1930.
The server 1900 may also include one or more power supplies 1926, one or more wired or wireless network interfaces 1950, one or more input-output interfaces 1958, one or more keyboards 1956, and/or one or more operating systems 1941, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.
A non-transitory computer-readable storage medium is provided, wherein instructions in the storage medium, when executed by a processor of an apparatus (a server or a terminal), enable the apparatus to perform the title generation method shown in fig. 1.
A non-transitory computer-readable storage medium is provided, wherein instructions in the storage medium, when executed by a processor of an apparatus (a server or a terminal), enable the apparatus to perform a title generation method, the method comprising: screening hot topics with hot attributes from a data source, and extracting keywords in the hot topics; generating candidate titles based on the keywords and a preset title generation rule, wherein the preset title generation rule comprises any one or more of the following items: title multiplexing rules, title migration rules, title association rules.
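As a non-limiting illustration, the overall flow described above (screened hot topics, keyword extraction, then candidate generation under one or more preset rules) might be sketched in Python as follows. The function names, the whitespace-based keyword extraction and the representation of each rule as a callable are assumptions made for this example only; the disclosure does not prescribe them.

```python
from typing import Callable, Dict, Iterable, List

# A rule maps the extracted keywords to zero or more candidate titles.
# Representing rules as plain callables is an assumption of this sketch.
Rule = Callable[[List[str]], List[str]]


def extract_keywords(topic: str) -> List[str]:
    """Placeholder extraction: split on whitespace and drop repeated words.

    A real system would use a proper keyword extraction technique.
    """
    seen = set()
    keywords = []
    for word in topic.split():
        if word not in seen:
            seen.add(word)
            keywords.append(word)
    return keywords


def generate_candidates(hot_topics: Iterable[str],
                        rules: Dict[str, Rule],
                        enabled: Iterable[str]) -> List[str]:
    """Apply every enabled rule to the keywords of every hot topic."""
    enabled = list(enabled)  # allow re-iteration for each topic
    candidates: List[str] = []
    for topic in hot_topics:
        keywords = extract_keywords(topic)
        for name in enabled:
            candidates.extend(rules[name](keywords))
    return candidates
```

Here `rules` would hold one entry per preset rule (for example "multiplexing", "migration" and "association"), and `enabled` selects any one or more of them.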
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This invention is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.
The title generation method, the title generation apparatus, and the device for title generation provided by the invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the invention, and the description of the above embodiments is intended only to help in understanding the method of the invention and its core idea. Meanwhile, a person skilled in the art may, following the idea of the invention, vary the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the invention.

Claims (15)

1. A title generation method, comprising:
screening hot topics with hot attributes from a data source, and extracting keywords in the hot topics;
generating candidate titles based on the keywords and a preset title generation rule, wherein the preset title generation rule comprises any one or more of the following items: title multiplexing rules, title migration rules, title association rules.
2. The method of claim 1, wherein the preset title generation rule comprises a title multiplexing rule, and wherein generating candidate titles based on the keywords and the preset title generation rule comprises:
retrieving a first reference article related to the keyword based on the keyword;
and inputting the content of the first reference article into a trained first generation model, and outputting candidate titles through the first generation model.
3. The method of claim 2, further comprising:
and generating a candidate abstract and/or a candidate outline according to the content of the first reference article.
4. The method of claim 1, wherein the preset title generation rule comprises a title migration rule, and wherein generating candidate titles based on the keywords and the preset title generation rule comprises:
identifying a first entity word and an event action word in the keyword;
determining a second entity word corresponding to the first entity word;
retrieving a second reference article related to the word sequence based on the word sequence composed of the second entity word and the event action word;
generating a reference title according to the content of the second reference article and a title multiplexing rule;
and determining a target entity word in the reference title, and replacing the target entity word in the reference title with a first entity word to obtain a candidate title.
5. The method of claim 1, wherein the preset title generation rule comprises a title association rule, and wherein generating candidate titles based on the keywords and the preset title generation rule comprises:
determining related topics of the keywords according to entity attributes of the keywords;
inputting the keywords and the associated topics into a trained second generation model, and outputting candidate titles through the second generation model.
6. The method of claim 1, further comprising:
calculating an evaluation score for the generated candidate title based on an evaluation index, wherein the evaluation index comprises at least one of a topicality index, an operability index, a scarcity index, a timeliness index and a popularity (heat) value index of the candidate title;
and sorting the generated candidate titles according to the evaluation scores.
7. The method of claim 1, wherein the screening of hot topics from data sources having hot attributes comprises:
based on a category selected by a user, screening out, from a data source, hot topics with hot attributes related to the selected category, wherein the selected category comprises at least one of an entertainment category, an academic category, an international affairs category and a popular science category.
8. The method of claim 1, further comprising:
receiving a title recommendation request of a user;
determining a recommended title corresponding to the title recommendation request based on the candidate title;
and returning the recommended title.
9. A title generation apparatus, characterized in that the apparatus comprises:
a data screening module, configured to screen hot topics with hot attributes from a data source and to extract keywords in the hot topics;
a title generation module, configured to generate a candidate title based on the keyword and a preset title generation rule, where the preset title generation rule includes any one or more of the following items: title multiplexing rules, title migration rules, title association rules.
10. The apparatus of claim 9, wherein the preset title generation rule comprises a title multiplexing rule, and wherein the title generation module comprises:
the first retrieval sub-module is used for retrieving a first reference article related to the keyword based on the keyword;
and the first generation submodule is used for inputting the content of the first reference article into a trained first generation model and outputting candidate titles through the first generation model.
11. The apparatus of claim 10, further comprising:
and the abstract outline generating module is used for generating a candidate abstract and/or a candidate outline according to the content of the first reference article.
12. The apparatus of claim 9, wherein the preset title generation rule comprises a title migration rule, and wherein the title generation module comprises:
the entity identification submodule is used for identifying a first entity word and an event action word in the keyword;
the entity determining submodule is used for determining a second entity word corresponding to the first entity word;
the second retrieval submodule is used for retrieving a second reference article related to the word sequence based on the word sequence formed by the second entity word and the event action word;
the second generation submodule is used for generating a reference title according to the content of the second reference article and a title multiplexing rule;
and the entity replacing submodule is used for determining the target entity word in the reference title and replacing the target entity word in the reference title with the first entity word to obtain a candidate title.
13. The apparatus of claim 9, wherein the preset title generation rule comprises a title association rule, and wherein the title generation module comprises:
the association determining submodule is used for determining the associated topics of the keywords according to the entity attributes of the keywords;
and the third generation submodule is used for inputting the keywords and the associated topics into a trained second generation model, and outputting candidate titles through the second generation model.
14. An apparatus for title generation, comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs comprising instructions for performing the title generation method of any of claims 1-8.
15. A machine-readable medium having stored thereon instructions which, when executed by one or more processors of an apparatus, cause the apparatus to perform the title generation method of any of claims 1-8.
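
As a non-limiting illustration of the steps recited in claims 4 and 6, title migration and score-based sorting might be sketched in Python as follows. All names are hypothetical. The lookup table for the second entity word, the `retrieve` and `make_title` callables, the choice of the second entity word as the target entity word, and the weighted sum used as the evaluation score are assumptions of this sketch; the claims do not fix any of them.

```python
from typing import Callable, Dict, List, Optional


def migrate_title(first_entity: str,
                  action: str,
                  related_entity: Dict[str, str],
                  retrieve: Callable[[str], Optional[str]],
                  make_title: Callable[[str], str]) -> Optional[str]:
    """Borrow a title written about a related entity and swap the entity.

    Returns None when any step cannot be completed.
    """
    # Determine a second entity word corresponding to the first one.
    second_entity = related_entity.get(first_entity)
    if second_entity is None:
        return None
    # Retrieve a reference article for the word sequence (entity + action).
    article = retrieve(f"{second_entity} {action}")
    if article is None:
        return None
    # Generate a reference title from the article content.
    reference_title = make_title(article)
    # Assumption: the target entity word is the second entity word itself.
    if second_entity not in reference_title:
        return None
    return reference_title.replace(second_entity, first_entity)


def rank_candidates(scores: Dict[str, Dict[str, float]],
                    weights: Dict[str, float]) -> List[str]:
    """Sort titles by a weighted sum of their index values, best first."""
    def total(title: str) -> float:
        return sum(weights.get(name, 0.0) * value
                   for name, value in scores[title].items())
    return sorted(scores, key=total, reverse=True)
```

In `rank_candidates`, `scores` maps each candidate title to its per-index values (for example timeliness or scarcity), and an index missing from `weights` contributes nothing to the total.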

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111166929.8A CN114218930A (en) 2021-09-30 2021-09-30 Title generation method and device and title generation device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111166929.8A CN114218930A (en) 2021-09-30 2021-09-30 Title generation method and device and title generation device

Publications (1)

Publication Number Publication Date
CN114218930A true CN114218930A (en) 2022-03-22

Family

ID=80696055

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111166929.8A Pending CN114218930A (en) 2021-09-30 2021-09-30 Title generation method and device and title generation device

Country Status (1)

Country Link
CN (1) CN114218930A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116306514A (en) * 2023-05-22 2023-06-23 北京搜狐新媒体信息技术有限公司 Text processing method and device, electronic equipment and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116306514A (en) * 2023-05-22 2023-06-23 北京搜狐新媒体信息技术有限公司 Text processing method and device, electronic equipment and storage medium
CN116306514B (en) * 2023-05-22 2023-09-08 北京搜狐新媒体信息技术有限公司 Text processing method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
US20230306052A1 (en) Method and system for entity extraction and disambiguation
CN105701254B (en) Information processing method and device for information processing
US20170097984A1 (en) Method and system for generating a knowledge representation
US11042590B2 (en) Methods, systems and techniques for personalized search query suggestions
US11080287B2 (en) Methods, systems and techniques for ranking blended content retrieved from multiple disparate content sources
KR20160010416A (en) Customizable, real time intelligence channel
WO2017205036A1 (en) Task completion using world knowledge
US20170109339A1 (en) Application program activation method, user terminal, and server
CN111708943B (en) Search result display method and device for displaying search result
CN107341162B (en) Webpage processing method and device and webpage processing device
US11232522B2 (en) Methods, systems and techniques for blending online content from multiple disparate content sources including a personal content source or a semi-personal content source
US20170098012A1 (en) Methods, systems and techniques for ranking personalized and generic search query suggestions
Crestani et al. Mobile information retrieval
US11836169B2 (en) Methods, systems and techniques for providing search query suggestions based on non-personal data and user personal data according to availability of user personal data
JP2023520483A (en) SEARCH CONTENT DISPLAY METHOD, DEVICE, ELECTRONIC DEVICE, AND STORAGE MEDIUM
CN111580921A (en) Content creation method and device
CN112445389A (en) Sharing prompt method, device, client, server and storage medium
CN109582869A (en) A kind of data processing method, device and the device for data processing
CN112784142A (en) Information recommendation method and device
CN113705210A (en) Article outline generation method and device for generating article outline
CN110929176A (en) Information recommendation method and device and electronic equipment
CN114218930A (en) Title generation method and device and title generation device
CN110399468A (en) A kind of data processing method, device and the device for data processing
CN107291259B (en) Information display method and device for information display
CN107301188B (en) Method for acquiring user interest and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination