CN107977363B - Title generation method and device and electronic equipment - Google Patents

Publication number
CN107977363B
Authority
CN
China
Prior art keywords
title
model
text
scoring
sentence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711384836.6A
Other languages
Chinese (zh)
Other versions
CN107977363A (en
Inventor
陈笑
何径舟
周古月
付志宏
袁德璋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201711384836.6A
Publication of CN107977363A
Application granted
Publication of CN107977363B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/258 Heading extraction; Automatic titling; Numbering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a title generation method, a title generation device and electronic equipment, wherein the title generation method comprises the following steps: acquiring a text of a title to be generated, and segmenting the text into a plurality of clauses; acquiring characteristic information of a plurality of clauses; inputting the characteristic information into the title support sentence model to extract at least one title support sentence; inputting at least one title support sentence into a title generation model to generate a corresponding title; and scoring the generated titles based on the title scoring model, and determining the titles corresponding to the texts according to the scores of the titles. According to the title generation method and device and the electronic equipment, the labor cost is reduced, the efficiency and the timeliness are improved, and the requirements of optimizing the title and improving the click rate can be met.

Description

Title generation method and device and electronic equipment
Technical Field
The present invention relates to the field of information processing technologies, and in particular, to a title generation method and apparatus, and an electronic device.
Background
With the rapid development of internet technology, internet content platforms generally need to increase the traffic received by high-quality content. Given massive amounts of content data, the most important factor in attracting users to click and browse is a high-quality title for the content. However, as the barrier to entry for content producers falls and the speed of content production rises sharply, title quality is difficult to guarantee. Titles therefore need to be rewritten to make them more appealing to users. At present, titles are mainly rewritten manually. Manual rewriting, however, is inefficient, costly, and slow, and the rewritten title differs little from the original, which makes it difficult to meet the demands of improving title quality and increasing the click-through rate.
Disclosure of Invention
The invention provides a title generation method, a title generation device, and electronic equipment, and aims to solve at least one of the above technical problems.
The embodiment of the invention provides a title generation method, which comprises the following steps:
acquiring a text of a title to be generated, and dividing the text into a plurality of clauses;
acquiring characteristic information of the multiple clauses;
inputting the characteristic information into a title support sentence model to extract at least one title support sentence;
inputting the at least one title support sentence into a title generation model to generate a corresponding title;
and scoring the generated title based on a title scoring model, and determining the title corresponding to the text according to the score of the title.
Optionally, the text is divided into a plurality of clauses, including:
based on the whole sentence granularity or the clause granularity, the text is divided into a plurality of clauses.
Optionally, the feature information includes at least one of length information, position information, importance information, and similarity information.
Optionally, the title support sentence model comprises a gradient boosting decision tree (GBDT) model.
Optionally, the method further includes:
and training the title scoring model.
Optionally, training the title scoring model includes:
acquiring a title sample and click data corresponding to the title sample;
and training the title scoring model according to the title samples and the click data.
Optionally, the title scoring model comprises a deep neural network DNN model.
Optionally, the method further includes:
and training the title generation model.
Optionally, the title generation model is a seq2seq model.
Optionally, determining the title corresponding to the text according to the score of the title includes:
filtering titles with scores lower than a preset score;
and ranking the filtered titles, and determining the title corresponding to the text according to the ranking result.
Another embodiment of the present invention provides a title generating apparatus, including:
the segmentation module is used for acquiring a text of a title to be generated and segmenting the text into a plurality of clauses;
the obtaining module is used for obtaining the characteristic information of the multiple clauses;
the extraction module is used for inputting the characteristic information into the title support sentence model so as to extract at least one title support sentence;
the generating module is used for inputting the at least one title supporting sentence into a title generating model so as to generate a corresponding title;
and the determining module is used for scoring the generated title based on the title scoring model and determining the title corresponding to the text according to the score of the title.
Optionally, the segmentation module is configured to:
based on the whole sentence granularity or the clause granularity, the text is divided into a plurality of clauses.
Optionally, the feature information includes at least one of length information, position information, importance information, and similarity information.
Optionally, the title support sentence model comprises a gradient boosting decision tree (GBDT) model.
Optionally, the apparatus further comprises:
and the first training module is used for training the title scoring model.
Optionally, the first training module is configured to:
acquiring a title sample and click data corresponding to the title sample;
and training the title scoring model according to the title samples and the click data.
Optionally, the title scoring model comprises a deep neural network DNN model.
Optionally, the apparatus further comprises:
and the second training module is used for training the title generation model.
Optionally, the title generation model is a seq2seq model.
Optionally, the determining module is configured to:
filtering titles with scores lower than a preset score;
and ranking the filtered titles, and determining the title corresponding to the text according to the ranking result.
Yet another embodiment of the present invention provides a non-transitory computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the title generation method according to the first embodiment of the present invention.
Yet another embodiment of the present invention provides an electronic device, which includes a processor, a memory, and a computer program stored in the memory and executable on the processor, where the processor is configured to execute the title generation method described in the first embodiment of the present invention.
The technical solution provided by the embodiments of the invention has the following beneficial effects: a text for which a title is to be generated is obtained and divided into a plurality of clauses; feature information of the clauses is obtained and input into a title support sentence model to extract at least one title support sentence; the at least one title support sentence is input into a title generation model to generate corresponding titles; the generated titles are scored by a title scoring model; and the title corresponding to the text is determined according to the scores. This reduces labor cost, improves efficiency and timeliness, and meets the need to optimize titles and increase the click-through rate.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The foregoing and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a flowchart of a title generation method according to one embodiment of the present invention;
FIG. 2 is a flowchart of a title generation method according to another embodiment of the present invention;
FIG. 3 is a flowchart of a title generation method according to yet another embodiment of the present invention;
FIG. 4 is a block diagram of a title generation apparatus according to one embodiment of the present invention;
FIG. 5 is a block diagram of a title generation apparatus according to another embodiment of the present invention;
FIG. 6 is a block diagram of a title generation apparatus according to yet another embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary, are intended to explain the invention, and are not to be construed as limiting the invention.
A title generation method, apparatus, and electronic device of an embodiment of the present invention are described below with reference to the accompanying drawings.
Internet content platforms generally need to increase the traffic received by high-quality content. Given massive amounts of content data, the most important factor in attracting users to click and browse is a high-quality title for the content. However, as the barrier to entry for content producers falls and the speed of content production rises sharply, title quality is difficult to guarantee. It is also difficult for a content producer to evaluate efficiently how attractive a title is to users. Titles therefore need to be rewritten to make them more appealing. How to help content producers generate attractive, high-quality titles for high-quality content is an important topic for internet content platforms. At present, titles are mainly rewritten manually. Manual rewriting, however, is inefficient, costly, and slow, and the rewritten title differs little from the original, which makes it difficult to meet the demands of improving title quality and increasing the click-through rate. The invention therefore provides a title rewriting method that rewrites the original title of high-quality content into a title of higher quality that is more attractive to users, thereby increasing the traffic to that content and maximizing its value. Applications of the invention include, but are not limited to, the news domain.
Fig. 1 is a flowchart of a title generation method according to an embodiment of the present invention.
As shown in fig. 1, the title generation method includes:
S101, obtaining a text of a title to be generated, and segmenting the text into a plurality of clauses.
In an embodiment of the present invention, a text for which a title is to be generated may be obtained first, and the text is then segmented into a plurality of clauses at either whole-sentence granularity or clause granularity. At whole-sentence granularity, the text is split mainly at punctuation marks that end a sentence, such as periods, question marks, exclamation marks, and ellipses. Clause granularity builds on whole-sentence granularity and further splits each sentence at marks that separate clauses within the same sentence, such as commas, colons, and spaces.
It should be noted that segmentation also follows additional rules, for example that punctuation inside book-title marks (《》) is not treated as a clause boundary and that ASCII strings are not split.
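The segmentation rules above can be sketched as follows. This is a minimal illustration, not the patented implementation: the exact punctuation sets, the function name `split_text`, and the rule that a separator flanked by ASCII letters or digits is not a boundary are all assumptions made for the example.

```python
# Assumed punctuation sets; the description names the categories, not an exact list.
SENTENCE_END = set("。？！?!…")
CLAUSE_SEP = set("，,：: ")
OPEN_MARK, CLOSE_MARK = "《", "》"


def _is_ascii_word_char(ch):
    """True for a single ASCII letter or digit; False for '' and non-ASCII."""
    return ch.isascii() and ch.isalnum()


def split_text(text, granularity="sentence"):
    """Split text at whole-sentence or clause granularity (hypothetical helper)."""
    boundaries = SENTENCE_END | CLAUSE_SEP if granularity == "clause" else SENTENCE_END
    pieces, current, depth = [], [], 0
    for i, ch in enumerate(text):
        # Track nesting inside book-title marks so punctuation there is ignored.
        if ch == OPEN_MARK:
            depth += 1
        elif ch == CLOSE_MARK:
            depth = max(0, depth - 1)
        prev_ch = text[i - 1] if i > 0 else ""
        next_ch = text[i + 1] if i + 1 < len(text) else ""
        # Assumed reading of "ASCII strings are not split".
        inside_ascii = _is_ascii_word_char(prev_ch) and _is_ascii_word_char(next_ch)
        if ch in boundaries and depth == 0 and not inside_ascii:
            if not ch.isspace():
                current.append(ch)  # keep the delimiter with its piece
            piece = "".join(current).strip()
            if piece and pieces and all(c in boundaries for c in piece):
                pieces[-1] += piece  # e.g. the second half of a doubled ellipsis
            elif piece:
                pieces.append(piece)
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        pieces.append(tail)
    return pieces


sample = "今天发布了《你好，世界》。iPhone 8降价了，你买吗？"
print(split_text(sample))
print(split_text(sample, "clause"))
```

In this sketch the comma inside 《你好，世界》 and the space inside "iPhone 8" are left intact, while the comma after 降价了 produces a clause boundary.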
S102, obtaining characteristic information of a plurality of clauses.
Wherein the feature information may include at least one of length information, location information, importance information, and similarity information.
The length information is the sentence length, which can be normalized by, for example, the maximum sentence length.
The position information is the position of the sentence in the text. It can be encoded, for example, with 0-1 (binary) flags indicating whether the sentence is in the first, a middle, or the last paragraph, and whether it is the first, a middle, or the last sentence.
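One possible reading of the length and position features is sketched below. The function names are hypothetical, and two choices are assumptions: "first/middle/last sentence" is interpreted relative to the paragraph, and a lone paragraph or lone sentence is flagged as both first and last.

```python
def length_features(sentences):
    """Sentence length divided by the maximum sentence length."""
    longest = max((len(s) for s in sentences), default=0)
    return [len(s) / longest if longest else 0.0 for s in sentences]


def _first_middle_last(index, count):
    """0-1 flags [first, middle, last] for an item at `index` among `count`."""
    first = int(index == 0)
    last = int(index == count - 1)
    middle = int(not first and not last)
    return [first, middle, last]


def position_features(paragraphs):
    """paragraphs: list of paragraphs, each a list of sentences.

    Returns one 6-dimensional 0-1 vector per sentence:
    [first_para, middle_para, last_para, first_sent, middle_sent, last_sent].
    """
    features = []
    for p_idx, sentences in enumerate(paragraphs):
        para_flags = _first_middle_last(p_idx, len(paragraphs))
        for s_idx in range(len(sentences)):
            features.append(para_flags + _first_middle_last(s_idx, len(sentences)))
    return features


doc = [["a", "bb", "ccc"], ["dddd"]]
print(length_features([s for p in doc for s in p]))
print(position_features(doc))
```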
The importance information evaluates the importance of the sentence within the text, and can be calculated using, for example, the TextRank algorithm.
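A minimal TextRank-style sketch for sentence importance follows. The description only names the algorithm, so the details here are assumptions: sentences are pre-tokenized, edge weights use word overlap normalized by log lengths, the damping factor is 0.85, and a fixed number of iterations is run instead of a convergence test.

```python
import math


def sentence_similarity(a, b):
    """Word overlap between two token sets, normalized by log lengths."""
    if not a or not b:
        return 0.0
    denom = math.log(len(a)) + math.log(len(b))
    if denom <= 0:  # both sentences have a single distinct token
        return 0.0
    return len(a & b) / denom


def textrank(tokenized_sentences, damping=0.85, iterations=50):
    """Return one importance score per sentence via power iteration."""
    sets = [set(tokens) for tokens in tokenized_sentences]
    n = len(sets)
    weights = [
        [sentence_similarity(sets[i], sets[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        # Synchronous update: each new score uses only the previous scores.
        scores = [
            (1 - damping)
            + damping
            * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


sentences = [["cat", "sat", "mat"], ["cat", "sat", "hat"], ["dog", "ran", "far"]]
print(textrank(sentences))
```

In this example the third sentence shares no words with the others, so it receives only the baseline score of 1 minus the damping factor.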
The similarity information is the similarity between the sentence and the title, for which literal similarity and semantic similarity can be used, among others. The literal similarity may include the number and proportion of co-occurring words, the edit distance, the longest common substring length, the longest common subsequence length, the number and proportion of co-occurring words weighted by inverse document frequency, and the same features computed after synonym alignment. The semantic similarity is obtained either by encoding the sentences as embeddings and computing the cosine similarity, or by using a trained model.
It should be understood that the similarity information may be computed at whole-sentence granularity or at clause granularity. At clause granularity, the similarity between each clause of a sentence in the text and each clause of the title is calculated, and the maximum and the average of these similarities are used as the clause-granularity similarity information.
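Several of the similarity features listed above can be sketched in plain Python. The function names are hypothetical, the co-occurrence proportion is assumed to be taken over the union of tokens, and the IDF-weighted and synonym-aligned variants are omitted because the description does not specify them.

```python
import math


def cooccurrence(a_tokens, b_tokens):
    """Number of shared tokens and their share of the union (assumed ratio)."""
    a, b = set(a_tokens), set(b_tokens)
    shared, union = a & b, a | b
    return len(shared), (len(shared) / len(union) if union else 0.0)


def edit_distance(a, b):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def longest_common_substring(a, b):
    """Length of the longest contiguous run shared by a and b."""
    best, prev = 0, [0] * (len(b) + 1)
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if ca == cb else 0)
            best = max(best, cur[-1])
        prev = cur
    return best


def longest_common_subsequence(a, b):
    """Length of the longest (not necessarily contiguous) shared subsequence."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if ca == cb else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def cosine(u, v):
    """Cosine similarity of two embedding vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def clause_level_similarity(sentence_clauses, title_clauses, sim_fn):
    """Maximum and mean of sim_fn over all sentence-clause/title-clause pairs."""
    sims = [sim_fn(s, t) for s in sentence_clauses for t in title_clauses]
    if not sims:
        return 0.0, 0.0
    return max(sims), sum(sims) / len(sims)


print(edit_distance("kitten", "sitting"))
print(clause_level_similarity(["ab", "cd"], ["ab", "xd"], longest_common_subsequence))
```

Any of the pairwise functions can be passed as `sim_fn`, so the same maximum-and-mean aggregation serves both literal and embedding-based similarity.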
In addition, for continuous-valued feature information, such as importance information and similarity information, the feature values may also be sorted and the resulting rank used as a discretized feature.
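Rank-based discretization can be sketched in a few lines. The function name and the choice of rank 0 for the largest value are assumptions for illustration.

```python
def rank_discretize(values, descending=True):
    """Replace each continuous value with its rank among all values."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=descending)
    ranks = [0] * len(values)
    for rank, idx in enumerate(order):
        ranks[idx] = rank
    return ranks


# Importance scores of three sentences become ranks (0 = most important).
print(rank_discretize([0.2, 0.9, 0.5]))
```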
S103, inputting the characteristic information into the title support sentence model to extract at least one title support sentence.
Wherein the title support sentence model comprises a Gradient Boosting Decision Tree (GBDT) model.
In an embodiment of the present invention, the obtained feature information of the clauses may be input into a pre-trained title support sentence model to extract at least one title support sentence.
S104, inputting at least one title support sentence into the title generation model to generate a corresponding title.
The extracted title support sentence is then input into the title generation model to generate a corresponding title.
S105, scoring the generated titles based on the title scoring model, and determining the title corresponding to the text according to the scores of the titles.
After the titles are generated, they can be scored based on the title scoring model; titles with scores lower than a preset score are filtered out, the remaining titles are ranked, and the title corresponding to the text is determined according to the ranking result. The preset score can be the score of the existing title of the text. If a newly generated title scores lower than the original title, rewriting brings no benefit, so such titles are filtered out and the best title is selected from those that score higher.
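The filter-then-rank step can be sketched as follows. Here `score_fn` stands in for the trained title scoring model, and falling back to the original title when no candidate survives is an assumption consistent with the statement that such titles need not be rewritten.

```python
def choose_title(candidates, score_fn, original_title):
    """Drop candidates scoring below the original title, then pick the best."""
    preset = score_fn(original_title)  # preset score = score of existing title
    scored = [(score_fn(title), title) for title in candidates]
    kept = [pair for pair in scored if pair[0] >= preset]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[0][1] if kept else original_title


# Toy scores standing in for model output (hypothetical values).
toy_scores = {"old": 0.3, "a": 0.2, "b": 0.5, "c": 0.4}
print(choose_title(["a", "b", "c"], toy_scores.get, "old"))
```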
According to the title generation method of the embodiment of the invention, a text for which a title is to be generated is obtained and divided into a plurality of clauses; feature information of the clauses is obtained and input into the title support sentence model to extract at least one title support sentence; the at least one title support sentence is input into the title generation model to generate corresponding titles; the generated titles are scored by the title scoring model; and the title corresponding to the text is determined according to the scores. This reduces labor cost, improves efficiency and timeliness, and meets the need to optimize titles and increase the click-through rate.
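Steps S101 to S105 compose into a simple pipeline, sketched below with each stage passed in as a callable. The function and parameter names are hypothetical, and the lambdas in the usage example are trivial stand-ins for the trained models, not the models themselves.

```python
def generate_title(text, split, featurize, extract, generate, determine):
    """Chain the five steps of the method; every stage is supplied by the caller."""
    clauses = split(text)                         # S101: segment into clauses
    features = featurize(clauses)                 # S102: per-clause feature information
    support = extract(clauses, features)          # S103: title support sentences
    candidates = generate(support)                # S104: candidate titles
    return determine(candidates)                  # S105: score and select


# Toy stand-ins for each stage, for illustration only.
result = generate_title(
    "a. bb. ccc",
    split=lambda t: [s.strip() for s in t.split(".")],
    featurize=lambda cs: [len(c) for c in cs],
    extract=lambda cs, fs: [c for c, f in zip(cs, fs) if f >= 2],
    generate=lambda ss: [" ".join(ss), ss[-1]],
    determine=lambda ts: max(ts, key=len),
)
print(result)
```

Passing the stages as callables mirrors the way the description treats the three models as separately trained components.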
To implement the foregoing embodiment, as shown in fig. 2, the title generating method according to the embodiment of the present invention may further include:
S106, training a title scoring model.
In one embodiment of the invention, a title sample and the click data corresponding to the title sample can be obtained, and the title scoring model is then trained according to the title sample and the click data.
The following description will be made in detail by taking news titles as examples:
Scoring a news headline means calculating the click-through rate (CTR) corresponding to the headline. If x is the number of clicks and y is the number of non-clicks (impressions that were not clicked), the click-through rate CTR corresponding to the news headline is defined as
[Figure BDA0001516395060000051: formula image]
where t is a scaling factor for the non-clicked part, t ∈ (0, 1], used to keep the ratio of positive (click) to negative (no-click) samples from becoming too unbalanced. The click-through rate CTR and the news headline are input into the title scoring model, which outputs the real CTR, namely
[Figure BDA0001516395060000061: formula image]
Wherein the title scoring model comprises a Deep Neural Network (DNN) model.
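The CTR formula itself survives only as an image reference in this text. The sketch below assumes one form consistent with the surrounding definitions, CTR = x / (x + t·y), where x is the click count, y the non-click count, and t the scaling factor; this form is an assumption for illustration and is not taken from the patent figure.

```python
def scaled_ctr(clicks, non_clicks, t=1.0):
    """Assumed label: clicks / (clicks + t * non_clicks), with t in (0, 1]."""
    if not 0 < t <= 1:
        raise ValueError("t must lie in (0, 1]")
    denom = clicks + t * non_clicks
    return clicks / denom if denom else 0.0


# With t = 1 this is the plain click-through rate; a smaller t down-weights
# the non-clicked impressions and raises the label.
print(scaled_ctr(10, 90, t=1.0))
print(scaled_ctr(10, 90, t=0.5))
```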
To implement the foregoing embodiment, as shown in fig. 3, the title generating method according to the embodiment of the present invention may further include:
S107, training a title generation model.
The title generation model may be a seq2seq model implemented in TensorFlow.
TensorFlow is an open-source software library for machine learning, used for a variety of perception and language-understanding tasks. The seq2seq model is a sequence-to-sequence model.
To train the seq2seq model, news data on the order of ten million items can be used, with the title support sentences of each news item as the model input and the original title of the news item as the model output.
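Preparing the training data described above can be sketched without committing to a particular framework. The record layout (`support_sentences`, `title`), the special tokens, and the helper names are assumptions for illustration; the resulting integer sequences would then be fed to whichever seq2seq implementation is used.

```python
PAD, UNK, BOS, EOS = "<pad>", "<unk>", "<s>", "</s>"


def build_pairs(articles):
    """Source = joined title support sentences, target = original title."""
    return [(" ".join(a["support_sentences"]), a["title"]) for a in articles]


def build_vocab(pairs, tokenize):
    """Assign an integer id to every token seen in sources and targets."""
    vocab = {PAD: 0, UNK: 1, BOS: 2, EOS: 3}
    for source, target in pairs:
        for token in tokenize(source) + tokenize(target):
            vocab.setdefault(token, len(vocab))
    return vocab


def encode(pair, vocab, tokenize):
    """Map one (source, target) pair to id sequences; target gets BOS/EOS."""
    source, target = pair
    unk = vocab[UNK]
    source_ids = [vocab.get(tok, unk) for tok in tokenize(source)]
    target_ids = [vocab[BOS]] + [vocab.get(tok, unk) for tok in tokenize(target)] + [vocab[EOS]]
    return source_ids, target_ids


# Toy record; whitespace tokenization is a placeholder for a real tokenizer.
articles = [{"support_sentences": ["a b", "c"], "title": "b c"}]
pairs = build_pairs(articles)
vocab = build_vocab(pairs, str.split)
print(encode(pairs[0], vocab, str.split))
```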
In order to implement the foregoing embodiment, the present invention further provides a title generating apparatus. FIG. 4 is a block diagram of a title generating apparatus according to an embodiment of the present invention. As shown in FIG. 4, the apparatus includes a segmenting module 410, an obtaining module 420, an extracting module 430, a generating module 440, and a determining module 450.
The segmenting module 410 is configured to obtain a text of a title to be generated, and segment the text into a plurality of clauses.
An obtaining module 420, configured to obtain feature information of the multiple clauses.
The extracting module 430 is configured to input the feature information into the title supporting sentence model to extract at least one title supporting sentence.
The generating module 440 is configured to input at least one title supporting sentence into the title generating model to generate a corresponding title.
And the determining module 450 is configured to score the generated title based on the title scoring model, and determine the title corresponding to the text according to the score of the title.
As shown in fig. 5, the title generating apparatus of the present invention may further include a first training module 460.
The first training module 460 is used for training the title scoring model.
As shown in fig. 6, the title generating apparatus of the present invention may further include a second training module 470.
And a second training module 470 for training the title generation model.
It should be noted that the foregoing explanation of the title generation method is also applicable to the title generation apparatus in the embodiment of the present invention, and details not disclosed in the embodiment of the present invention are not repeated herein.
With the title generation device of the embodiment of the invention, a text for which a title is to be generated is obtained and divided into a plurality of clauses; feature information of the clauses is obtained and input into the title support sentence model to extract at least one title support sentence; the at least one title support sentence is input into the title generation model to generate corresponding titles; the generated titles are scored by the title scoring model; and the title corresponding to the text is determined according to the scores. This reduces labor cost, improves efficiency and timeliness, and meets the need to optimize titles and increase the click-through rate.
In order to implement the above embodiments, the present invention further provides an electronic device.
The electronic device comprises a processor, a memory and a computer program stored on the memory and executable on the processor, the processor being configured to perform the title generation method according to the embodiments of the first aspect of the present invention.
For example, the computer program may be executed by a processor to perform a title generation method of the steps of:
S101', obtaining a text of a title to be generated, and segmenting the text into a plurality of clauses.
S102', obtaining characteristic information of a plurality of clauses.
S103', inputting the characteristic information into the title supporting sentence model to extract at least one title supporting sentence.
S104', inputting at least one title supporting sentence into the title generation model to generate a corresponding title.
S105', scoring the generated title based on the title scoring model, and determining the title corresponding to the text according to the score of the title.
With the electronic device provided by the embodiment of the invention, a text for which a title is to be generated is obtained and divided into a plurality of clauses; feature information of the clauses is obtained and input into the title support sentence model to extract at least one title support sentence; the at least one title support sentence is input into the title generation model to generate corresponding titles; the generated titles are scored by the title scoring model; and the title corresponding to the text is determined according to the scores. This reduces labor cost, improves efficiency and timeliness, and meets the need to optimize titles and increase the click-through rate.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and alternate implementations are included within the scope of the preferred embodiment of the present invention in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present invention.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware that is related to instructions of a program, and the program may be stored in a computer-readable storage medium, and when executed, the program includes one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a separate product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc. Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims (16)

1. A title generation method, comprising:
acquiring a text for which a title is to be generated, and dividing the text into a plurality of clauses;
acquiring feature information of the plurality of clauses, wherein the feature information comprises at least one of length information, position information, importance information and similarity information;
inputting the feature information into a title support sentence model to extract at least one title support sentence;
inputting the at least one title support sentence into a title generation model to generate a corresponding title;
scoring the generated title based on a title scoring model, and determining the title corresponding to the text according to the score of the title;
wherein the method further comprises training the title scoring model, the training comprising: acquiring title samples and click data corresponding to the title samples; and training the title scoring model according to the title samples and the click data; and
wherein the scoring the generated title based on the title scoring model comprises: calculating a click rate corresponding to the title, and inputting the click rate and the title into the title scoring model.
2. The method of claim 1, wherein dividing the text into a plurality of clauses comprises:
dividing the text into a plurality of clauses at whole-sentence granularity or clause granularity.
3. The method of claim 1, wherein the title support sentence model comprises a gradient boosting decision tree (GBDT) model.
4. The method of claim 1, wherein the title scoring model comprises a deep neural network (DNN) model.
5. The method of claim 1, further comprising:
training the title generation model.
6. The method of claim 5, wherein the title generation model is a seq2seq model.
7. The method of claim 1, wherein determining the title corresponding to the text according to the score of the title comprises:
filtering out titles with scores lower than a preset score; and
sorting the remaining titles, and determining the title corresponding to the text according to the sorting result.
8. A title generation apparatus, comprising:
a segmentation module, configured to acquire a text for which a title is to be generated and divide the text into a plurality of clauses;
an obtaining module, configured to obtain feature information of the plurality of clauses, wherein the feature information comprises at least one of length information, position information, importance information, and similarity information;
an extraction module, configured to input the feature information into a title support sentence model to extract at least one title support sentence;
a generating module, configured to input the at least one title support sentence into a title generation model to generate a corresponding title;
a determining module, configured to score the generated title based on a title scoring model and determine the title corresponding to the text according to the score of the title; and
a first training module, configured to train the title scoring model by: acquiring title samples and click data corresponding to the title samples; and training the title scoring model according to the title samples and the click data;
wherein the scoring the generated title based on the title scoring model comprises: calculating a click rate corresponding to the title, and inputting the click rate and the title into the title scoring model.
9. The apparatus of claim 8, wherein the segmentation module is configured to:
divide the text into a plurality of clauses at whole-sentence granularity or clause granularity.
10. The apparatus of claim 8, wherein the title support sentence model comprises a gradient boosting decision tree (GBDT) model.
11. The apparatus of claim 8, wherein the title scoring model comprises a deep neural network (DNN) model.
12. The apparatus of claim 8, further comprising:
a second training module, configured to train the title generation model.
13. The apparatus of claim 12, wherein the title generation model is a seq2seq model.
14. The apparatus of claim 8, wherein the determining module is configured to:
filter out titles with scores lower than a preset score; and
sort the remaining titles, and determine the title corresponding to the text according to the sorting result.
15. A computer-readable storage medium storing a computer program, wherein the program, when executed by a processor, implements the title generation method of any one of claims 1 to 7.
16. An electronic device, comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the title generation method of any of claims 1-7 via execution of the executable instructions.
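
As a non-limiting sketch, the filtering and sorting recited in claims 7 and 14 may be illustrated in Python as follows. The function names `rank_titles` and `select_title`, the (title, score) pair layout, and the choice to return the top-ranked title are assumptions of this sketch; the scores are presumed to have been produced by the title scoring model.

```python
def rank_titles(scored_titles, preset_score):
    """Filter out titles scoring below preset_score and sort the rest, best first."""
    kept = [(title, score) for title, score in scored_titles if score >= preset_score]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


def select_title(scored_titles, preset_score):
    """Return the top-ranked remaining title, or None if every title was filtered out."""
    ranked = rank_titles(scored_titles, preset_score)
    return ranked[0][0] if ranked else None


# Hypothetical candidate titles and scores, for illustration only.
candidates = [("Title A", 0.2), ("Title B", 0.9), ("Title C", 0.6)]
ranked = rank_titles(candidates, preset_score=0.5)
best = select_title(candidates, preset_score=0.5)
```

Returning `None` when no title reaches the preset score leaves the fallback behaviour to the caller, since the claims do not state what happens in that case.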
CN201711384836.6A 2017-12-20 2017-12-20 Title generation method and device and electronic equipment Active CN107977363B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711384836.6A CN107977363B (en) 2017-12-20 2017-12-20 Title generation method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711384836.6A CN107977363B (en) 2017-12-20 2017-12-20 Title generation method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN107977363A CN107977363A (en) 2018-05-01
CN107977363B true CN107977363B (en) 2021-12-17

Family

ID=62006947

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711384836.6A Active CN107977363B (en) 2017-12-20 2017-12-20 Title generation method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN107977363B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107832299B (en) * 2017-11-17 2021-11-23 北京百度网讯科技有限公司 Title rewriting processing method and device based on artificial intelligence and readable medium
CN109472028B (en) * 2018-10-31 2023-12-15 北京字节跳动网络技术有限公司 Method and device for generating information
CN111209725B (en) * 2018-11-19 2023-04-25 阿里巴巴集团控股有限公司 Text information generation method and device and computing equipment
CN109299477A (en) * 2018-11-30 2019-02-01 北京字节跳动网络技术有限公司 Method and apparatus for generating text header
CN110287491B (en) * 2019-06-25 2024-01-12 北京百度网讯科技有限公司 Event name generation method and device
CN110532344A (en) * 2019-08-06 2019-12-03 北京如优教育科技有限公司 Automatic Selected Topic System based on deep neural network model
CN110795930A (en) * 2019-10-24 2020-02-14 网娱互动科技(北京)股份有限公司 Article title optimization method, system, medium and equipment
CN110852801B (en) * 2019-11-08 2022-09-09 北京字节跳动网络技术有限公司 Information processing method, device and equipment
CN110851797A (en) * 2020-01-13 2020-02-28 支付宝(杭州)信息技术有限公司 Block chain-based work creation method and device and electronic equipment
CN111460801B (en) * 2020-03-30 2023-08-18 北京百度网讯科技有限公司 Title generation method and device and electronic equipment
CN112560458A (en) * 2020-12-09 2021-03-26 杭州艾耕科技有限公司 Article title generation method based on end-to-end deep learning model

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004029968A (en) * 2002-06-21 2004-01-29 Advanced Telecommunication Research Institute International Method for generating topic estimation model and topic estimation method
WO2013185856A1 (en) * 2012-06-15 2013-12-19 Qatar Foundation Joint topic model for cross-media news summarization
CN104331449A (en) * 2014-10-29 2015-02-04 百度在线网络技术(北京)有限公司 Method and device for determining similarity between inquiry sentence and webpage, terminal and server
CN106156204A (en) * 2015-04-23 2016-11-23 深圳市腾讯计算机系统有限公司 The extracting method of text label and device
CN106557554A (en) * 2016-11-04 2017-04-05 北京百度网讯科技有限公司 Display packing and device based on the Search Results of artificial intelligence
CN107193792A (en) * 2017-05-18 2017-09-22 北京百度网讯科技有限公司 The method and apparatus of generation article based on artificial intelligence
CN107463552A (en) * 2017-07-20 2017-12-12 北京奇艺世纪科技有限公司 A kind of method and apparatus for generating video subject title
CN107832299A (en) * 2017-11-17 2018-03-23 北京百度网讯科技有限公司 Rewriting processing method, device and the computer-readable recording medium of title based on artificial intelligence

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9792534B2 (en) * 2016-01-13 2017-10-17 Adobe Systems Incorporated Semantic natural language vector space

Also Published As

Publication number Publication date
CN107977363A (en) 2018-05-01

Similar Documents

Publication Publication Date Title
CN107977363B (en) Title generation method and device and electronic equipment
CN108009228B (en) Method and device for setting content label and storage medium
CN108287922B (en) Text data viewpoint abstract mining method fusing topic attributes and emotional information
CN109710947B (en) Electric power professional word bank generation method and device
CN103473263B (en) News event development process-oriented visual display method
CN111241267A (en) Abstract extraction and abstract extraction model training method, related device and storage medium
CN1617134A (en) System for identifying paraphrases using machine translation techniques
CN109446423B (en) System and method for judging sentiment of news and texts
CN110309114B (en) Method and device for processing media information, storage medium and electronic device
CN109086355B (en) Hot-spot association relation analysis method and system based on news subject term
CN112256861B (en) Rumor detection method based on search engine return result and electronic device
CN109902289A (en) A kind of news video topic division method towards fuzzy text mining
CN108052630B (en) Method for extracting expansion words based on Chinese education videos
CN111460162B (en) Text classification method and device, terminal equipment and computer readable storage medium
CN113780007A (en) Corpus screening method, intention recognition model optimization method, equipment and storage medium
CN108228612B (en) Method and device for extracting network event keywords and emotional tendency
CN115630640A (en) Intelligent writing method, device, equipment and medium
CN111552773A (en) Method and system for searching key sentence of question or not in reading and understanding task
KR102376489B1 (en) Text document cluster and topic generation apparatus and method thereof
CN112667815A (en) Text processing method and device, computer readable storage medium and processor
CN109299463B (en) Emotion score calculation method and related equipment
CN116882414B (en) Automatic comment generation method and related device based on large-scale language model
CN110738047A (en) Microblog user interest mining method and system based on image-text data and time effect
CN108694165B (en) Cross-domain dual emotion analysis method for product comments
CN113705217B (en) Literature recommendation method and device for knowledge learning in electric power field

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant