CN107977363A - Title generation method, device and electronic equipment - Google Patents

Title generation method, device and electronic equipment

Info

Publication number
CN107977363A
Authority
CN
China
Prior art keywords
title
sentence
model
text
generation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711384836.6A
Other languages
Chinese (zh)
Other versions
CN107977363B (en)
Inventor
陈笑
何径舟
周古月
付志宏
袁德璋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201711384836.6A
Publication of CN107977363A
Application granted
Publication of CN107977363B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/258 Heading extraction; Automatic titling; Numbering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a title generation method, a title generation device, and an electronic device. The title generation method includes: obtaining a text for which a title is to be generated, and splitting the text into multiple segments; obtaining feature information of the segments; inputting the feature information into a title support sentence model to extract at least one title support sentence; inputting the at least one title support sentence into a title generation model to generate corresponding titles; and scoring the generated titles with a title scoring model and determining the title corresponding to the text according to the title scores. The title generation method, device, and electronic device of the embodiments of the invention reduce labor cost, improve efficiency and timeliness, and can meet the demand for optimizing titles to raise the click-through rate.

Description

Title generation method, device and electronic equipment
Technical field
The present invention relates to the technical field of information processing, and in particular to a title generation method, device, and electronic device.
Background
With the rapid development of Internet technology, Internet content platforms commonly need to increase the traffic to premium content. Given a massive amount of content data, the most important factor in attracting users to click and browse is a high-quality title for the content. However, as the barrier to entry for content producers falls and the speed of content production rises sharply, title quality is hard to guarantee. Titles therefore need to be rewritten to make them more attractive to users. At present, titles are still rewritten mainly by hand. Manual rewriting, however, is inefficient, costly, and slow, and the revised title differs little from the original, so it is hard to meet the demand for better titles and a higher click-through rate.
Summary of the invention
The present invention provides a title generation method, device, and electronic device to solve at least one of the above technical problems.
An embodiment of the invention provides a title generation method, including:
obtaining a text for which a title is to be generated, and splitting the text into multiple segments;
obtaining feature information of the multiple segments;
inputting the feature information into a title support sentence model to extract at least one title support sentence;
inputting the at least one title support sentence into a title generation model to generate corresponding titles;
scoring the generated titles with a title scoring model, and determining the title corresponding to the text according to the title scores.
Optionally, splitting the text into multiple segments includes:
splitting the text into multiple segments at whole-sentence granularity or clause granularity.
Optionally, the feature information includes at least one of length information, position information, importance information, and similarity information.
Optionally, the title support sentence model includes a gradient boosting decision tree (GBDT) model.
Optionally, the method further includes:
training the title scoring model.
Optionally, training the title scoring model includes:
obtaining title samples and the click data corresponding to the title samples;
training the title scoring model on the title samples and the click data.
Optionally, the title scoring model includes a deep neural network (DNN) model.
Optionally, the method further includes:
training the title generation model.
Optionally, the title generation model is a seq2seq model.
Optionally, determining the title corresponding to the text according to the title scores includes:
filtering out titles whose score is below a preset score;
sorting the remaining titles, and determining the title corresponding to the text according to the sorted result.
Another embodiment of the invention provides a title generation device, including:
a splitting module, for obtaining a text for which a title is to be generated and splitting the text into multiple segments;
an acquisition module, for obtaining feature information of the multiple segments;
an extraction module, for inputting the feature information into a title support sentence model to extract at least one title support sentence;
a generation module, for inputting the at least one title support sentence into a title generation model to generate corresponding titles;
a determining module, for scoring the generated titles with a title scoring model and determining the title corresponding to the text according to the title scores.
Optionally, the splitting module is configured to:
split the text into multiple segments at whole-sentence granularity or clause granularity.
Optionally, the feature information includes at least one of length information, position information, importance information, and similarity information.
Optionally, the title support sentence model includes a gradient boosting decision tree (GBDT) model.
Optionally, the device further includes:
a first training module, for training the title scoring model.
Optionally, the first training module is configured to:
obtain title samples and the click data corresponding to the title samples;
train the title scoring model on the title samples and the click data.
Optionally, the title scoring model includes a deep neural network (DNN) model.
Optionally, the device further includes:
a second training module, for training the title generation model.
Optionally, the title generation model is a seq2seq model.
Optionally, the determining module is configured to:
filter out titles whose score is below a preset score;
sort the remaining titles, and determine the title corresponding to the text according to the sorted result.
A further embodiment of the invention provides a non-transitory computer-readable storage medium storing a computer program that, when executed by a processor, implements the title generation method of the first-aspect embodiment of the invention.
Yet another embodiment of the invention provides an electronic device including a processor, a memory, and a computer program stored in the memory and runnable on the processor, the processor being configured to perform the title generation method of the first-aspect embodiment of the invention.
The technical solutions provided by the embodiments of the invention may have the following beneficial effects: a text for which a title is to be generated is obtained and split into multiple segments; feature information of the segments is obtained and input into a title support sentence model to extract at least one title support sentence; the at least one title support sentence is input into a title generation model to generate corresponding titles; and the generated titles are scored with a title scoring model, with the title corresponding to the text determined according to the scores. This reduces labor cost, improves efficiency and timeliness, and can meet the demand for optimizing titles to raise the click-through rate.
Additional aspects and advantages of the invention will be set forth in part in the description that follows, and in part will become apparent from that description or be learned through practice of the invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the invention will become apparent and easy to understand from the following description of the embodiments taken together with the accompanying drawings, in which:
Fig. 1 is a flow chart of a title generation method according to one embodiment of the invention;
Fig. 2 is a flow chart of a title generation method according to another embodiment of the invention;
Fig. 3 is a flow chart of a title generation method according to yet another embodiment of the invention;
Fig. 4 is a structural diagram of a title generation device according to one embodiment of the invention;
Fig. 5 is a structural diagram of a title generation device according to another embodiment of the invention;
Fig. 6 is a structural diagram of a title generation device according to yet another embodiment of the invention.
Detailed description
Embodiments of the invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals denote the same or similar elements, or elements with the same or similar functions, throughout. The embodiments described below with reference to the drawings are exemplary; they are intended to explain the invention and are not to be construed as limiting it.
The title generation method, device, and electronic device of the embodiments of the invention are described below with reference to the drawings.
Internet content platforms commonly need to increase the traffic to premium content. Given a massive amount of content data, the most important factor in attracting users to click and browse is a good title for the content. However, as the barrier to entry for content producers falls and the speed of content production rises sharply, title quality is hard to guarantee. In addition, content producers find it hard to assess effectively how attractive a title is to users. Titles therefore need to be rewritten to make them more attractive to users. Helping content producers generate a good, attractive title for premium content is an important topic for Internet content platforms. At present, titles are still rewritten mainly by hand. Manual rewriting, however, is inefficient, costly, and slow, and the revised title differs little from the original, so it is hard to meet the demand for better titles and a higher click-through rate. For this reason, the invention proposes a title improvement scheme that rewrites the original title of premium content to generate a higher-quality title that is more attractive to users, increasing the traffic to premium content and maximizing its value. The application areas of the invention include, but are not limited to, the news field.
Fig. 1 is a flow chart of a title generation method according to one embodiment of the invention.
As shown in Fig. 1, the title generation method includes:
S101: obtain a text for which a title is to be generated, and split the text into multiple segments.
In one embodiment of the invention, the text for which a title is to be generated is obtained first and is then split into multiple segments at whole-sentence granularity or clause granularity. Splitting at whole-sentence granularity mainly uses punctuation marks that end a sentence, such as the full stop, question mark, exclamation mark, and ellipsis. Splitting at clause granularity builds on whole-sentence granularity and additionally uses punctuation that separates the clauses of one sentence, such as the comma, colon, and space.
It should be noted that further rules apply when splitting the text; for example, punctuation inside book title marks does not act as a separator, and ASCII strings must not be split.
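The splitting rules of S101 can be sketched as follows. This is a minimal illustration, not the patented implementation: the delimiter sets, the function names `split_text`, `_flush`, and `_inside_ascii_run`, and the reading of "ASCII strings must not be split" as "a delimiter flanked by ASCII letters or digits is not a split point" are all assumptions.

```python
SENTENCE_END = "。？！…?!"                 # assumed whole-sentence delimiters
CLAUSE_END = SENTENCE_END + "，：,: "      # assumed extra clause delimiters


def _inside_ascii_run(text, i):
    """True if the character at i sits between two ASCII letters or digits."""
    if i == 0 or i + 1 >= len(text):
        return False
    prev, nxt = text[i - 1], text[i + 1]
    return prev.isascii() and prev.isalnum() and nxt.isascii() and nxt.isalnum()


def _flush(segments, current, delimiters):
    """Append the buffered characters as a segment; glue bare punctuation back."""
    segment = "".join(current).strip()
    if not segment:
        return
    if segments and all(c in delimiters for c in segment):
        segments[-1] += segment  # e.g. the second half of a doubled ellipsis
    else:
        segments.append(segment)


def split_text(text, granularity="sentence"):
    """Split text at 'sentence' or 'clause' granularity (hypothetical helper)."""
    delimiters = SENTENCE_END if granularity == "sentence" else CLAUSE_END
    segments, current, depth = [], [], 0
    for i, ch in enumerate(text):
        if ch == "《":
            depth += 1                      # entering book title marks
        elif ch == "》":
            depth = max(depth - 1, 0)       # leaving book title marks
        current.append(ch)
        if ch in delimiters and depth == 0 and not _inside_ascii_run(text, i):
            _flush(segments, current, delimiters)
            current = []
    _flush(segments, current, delimiters)
    return segments
```

For the text `《你好，世界》上映了。票房很高，观众喜欢iPhone X！`, the comma inside the book title marks and the space inside `iPhone X` are not treated as split points at either granularity.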
S102: obtain the feature information of the multiple segments.
The feature information may include at least one of length information, position information, importance information, and similarity information.
Length information is the sentence length; for example, without limitation, the sentence length normalized by the maximum length can be used.
Position information refers to the position of the sentence in the text. The following encoding may be used, without limitation: a 0-1 encoding of first paragraph, middle paragraph, last paragraph, first sentence, middle sentence, and last sentence.
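The length and position features can be illustrated with a short sketch. The function names and the choice to flag a one-element text as both "first" and "last" are assumptions made for the example; the source only names max normalization and a 0-1 encoding over six positions.

```python
def length_feature(segment_lengths):
    """Normalize each sentence length by the longest sentence (max normalization)."""
    longest = max(segment_lengths)
    return [n / longest for n in segment_lengths]


def position_feature(paragraph_index, paragraph_count, sentence_index, sentence_count):
    """0-1 encode [first, middle, last] paragraph and [first, middle, last] sentence."""
    def one_hot(index, count):
        # With count == 1 the single item is flagged as both first and last.
        return [int(index == 0), int(0 < index < count - 1), int(index == count - 1)]

    return one_hot(paragraph_index, paragraph_count) + one_hot(sentence_index, sentence_count)
```

A sentence in the first of three paragraphs that is the middle of three sentences is encoded as `[1, 0, 0, 0, 1, 0]`.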
Importance information is used to assess the importance of a sentence within the text; it can be calculated with, but is not limited to, the TextRank algorithm.
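As a rough illustration of importance scoring, the sketch below runs a TextRank-style iteration over pre-tokenized sentences. The word-overlap edge weight, the damping factor of 0.85, the fixed iteration count, and the name `textrank_scores` are assumptions for the example; the source does not specify how TextRank is configured.

```python
import math


def textrank_scores(tokenized, damping=0.85, iterations=30):
    """Score sentences by a TextRank-style iteration on a word-overlap graph."""
    n = len(tokenized)

    def overlap(a, b):
        # Shared words, normalized by the log lengths of both sentences.
        denom = math.log(len(a)) + math.log(len(b)) if a and b else 0.0
        return len(set(a) & set(b)) / denom if denom > 0 else 0.0

    weights = [[overlap(tokenized[i], tokenized[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_weight = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] / out_weight[j] * scores[j]
                for j in range(n) if out_weight[j] > 0)
            for i in range(n)
        ]
    return scores
```

A sentence that shares no words with any other sentence receives only the baseline score `1 - damping`, so it ranks below sentences that are connected in the graph.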
Similarity information is the similarity between a sentence and the title, and may use, without limitation, literal similarity and semantic similarity. Literal similarity may include the number and proportion of co-occurring words, the edit distance, the longest common substring length, the longest common subsequence length, the number and proportion of co-occurring words weighted by inverse document frequency, and the same features computed after synonym alignment. Semantic similarity encodes the sentences as embeddings and then computes the cosine value, or is obtained from a trained model.
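Several of the literal-similarity features listed above are standard string measures, and the cosine value is a standard vector measure. The sketch below shows one plain-Python way to compute them; the function names are illustrative, the co-occurrence proportion is taken relative to the title's distinct words as an assumption, and the IDF weighting and synonym alignment are left out.

```python
import math


def cooccurrence(sentence_tokens, title_tokens):
    """Number of shared words and their share of the title's distinct words."""
    shared = set(sentence_tokens) & set(title_tokens)
    ratio = len(shared) / len(set(title_tokens)) if title_tokens else 0.0
    return len(shared), ratio


def edit_distance(a, b):
    """Levenshtein distance by row-wise dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def longest_common_substring(a, b):
    """Length of the longest contiguous run shared by a and b."""
    best, prev = 0, [0] * (len(b) + 1)
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if ca == cb else 0)
            best = max(best, cur[-1])
        prev = cur
    return best


def longest_common_subsequence(a, b):
    """Length of the longest (not necessarily contiguous) shared subsequence."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if ca == cb else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def cosine(u, v):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0
```

The embedding vectors passed to `cosine` would come from whatever sentence encoder is in use; the source does not name one.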
It should be understood that similarity information can be computed at whole-sentence granularity or at clause granularity. At clause granularity, the similarity between each clause of a sentence in the text and each clause of the title can be calculated, and the maximum and the mean of these values are used as the clause-granularity similarity information.
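The clause-granularity aggregation amounts to scoring every (sentence clause, title clause) pair and keeping the maximum and the mean. In this sketch the name `clause_similarity` is hypothetical and the pairwise `similarity` function is supplied by the caller; a character-level Jaccard measure is used below purely as a stand-in.

```python
def clause_similarity(sentence_clauses, title_clauses, similarity):
    """Return (max, mean) of the pairwise clause similarities."""
    scores = [similarity(s, t) for s in sentence_clauses for t in title_clauses]
    if not scores:
        return 0.0, 0.0
    return max(scores), sum(scores) / len(scores)


def char_jaccard(a, b):
    """Stand-in similarity: Jaccard overlap of the character sets."""
    union = set(a) | set(b)
    return len(set(a) & set(b)) / len(union) if union else 0.0
```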
In addition, for continuous-valued feature information such as importance information and similarity information, the feature values can also be sorted and the resulting ranks used as discretized features.
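Rank-based discretization can be sketched in a few lines. Ranking in descending order (rank 0 for the largest value) and breaking ties by original position are assumptions; the source says only that the sorted result is used as the discretized feature.

```python
def rank_discretize(values):
    """Replace each continuous value by its rank, with 0 for the largest value."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0] * len(values)
    for rank, index in enumerate(order):
        ranks[index] = rank
    return ranks
```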
S103: input the feature information into the title support sentence model to extract at least one title support sentence.
The title support sentence model includes a gradient boosting decision tree (GBDT) model.
In one embodiment of the invention, the feature information obtained for the segments can be used as input to a pre-trained title support sentence model, which extracts at least one title support sentence.
S104: input the at least one title support sentence into the title generation model to generate corresponding titles.
The extracted title support sentences are then used as input to the title generation model, which generates the corresponding titles.
S105: score the generated titles with the title scoring model, and determine the title corresponding to the text according to the title scores.
After the titles are generated, they can be scored with the title scoring model; titles scoring below a preset score are filtered out, the remaining titles are sorted, and the title corresponding to the text is determined from the sorted result. The preset score may be the score of the text's original title. If a newly generated title scores lower than the original title, there is no need to rewrite the title with it, so titles scoring below the preset score are filtered out and a better title is then chosen from the high-scoring ones.
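The filter-then-rank step of S105 can be sketched as below, using the original title's score as the preset score as the text suggests. The function name `choose_title`, the caller-supplied `score` function, and falling back to the original title when no candidate survives are assumptions for illustration.

```python
def choose_title(original_title, candidates, score):
    """Drop candidates scoring below the original title, then take the best one."""
    threshold = score(original_title)           # preset score = original title's score
    scored = [(score(c), c) for c in candidates]
    kept = [(s, c) for s, c in scored if s >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[0][1] if kept else original_title
```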
In the title generation method of the embodiments of the invention, a text for which a title is to be generated is obtained and split into multiple segments; feature information of the segments is obtained and input into the title support sentence model to extract at least one title support sentence; the at least one title support sentence is input into the title generation model to generate corresponding titles; and the generated titles are scored with the title scoring model, with the title corresponding to the text determined according to the scores. This reduces labor cost, improves efficiency and timeliness, and can meet the demand for optimizing titles to raise the click-through rate.
To implement the above embodiment, as shown in Fig. 2, the title generation method of the embodiments of the invention may further include:
S106: train the title scoring model.
In one embodiment of the invention, title samples and the click data corresponding to the title samples can be obtained, and the title scoring model is then trained on the title samples and the click data.
A detailed description follows, taking news headlines as an example:
Scoring a news headline means calculating the click-through rate (CTR) corresponding to the headline. Let x be the number of clicks and y the number of non-clicks; the CTR corresponding to the headline is then computed from x, y, and t (the formula is not reproduced in this text), where t is the scaling factor applied to the non-click portion, with t ∈ (0, 1], chosen so that the ratio of positive (clicked) to negative (not clicked) samples is not too lopsided. The CTR and the news headline are used as input to the title scoring model, which then outputs the real CTR (this formula is likewise not reproduced in this text).
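Because the CTR formulas did not survive in this text, the following is only one reading that is consistent with the description, offered as an assumption rather than as the patent's formula: down-weight the non-clicks by t, so that the scaled CTR is x / (x + t·y), and recover the unscaled CTR x / (x + y) by algebraically inverting that expression. The function names are hypothetical.

```python
def scaled_ctr(clicks, non_clicks, t):
    """Assumed form: CTR with the non-click count scaled by t in (0, 1]."""
    if not 0 < t <= 1:
        raise ValueError("t must lie in (0, 1]")
    denominator = clicks + t * non_clicks
    return clicks / denominator if denominator else 0.0


def unscaled_ctr(scaled, t):
    """Invert the assumed scaling to recover clicks / (clicks + non_clicks)."""
    denominator = t * scaled + 1 - scaled
    return t * scaled / denominator if denominator else 0.0
```

Under this reading, 10 clicks and 90 non-clicks with t = 0.1 give a scaled CTR of 10/19, and inverting it returns the unscaled value 0.1.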
The title scoring model includes a deep neural network (DNN) model.
To implement the above embodiment, as shown in Fig. 3, the title generation method of the embodiments of the invention may further include:
S107: train the title generation model.
The title generation model may be a seq2seq model implemented in TensorFlow.
TensorFlow is an open-source software library for machine learning across a range of perception and language-understanding tasks. A seq2seq model is a sequence-to-sequence model.
To build a seq2seq model, news data on the order of ten million items can be used: the title support sentences of each news item serve as the input and the original title of the news item as the model output, and the model is trained on these pairs.
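The construction of training pairs described above (support sentences as the source sequence, original title as the target sequence) can be sketched independently of any particular seq2seq library. The record fields `text` and `title`, the caller-supplied `extract_support_sentences` function, and joining the support sentences with a space are assumptions for illustration.

```python
def build_training_pairs(articles, extract_support_sentences):
    """Turn news records into (source, target) pairs for seq2seq training."""
    pairs = []
    for article in articles:
        support = extract_support_sentences(article["text"], article["title"])
        if support:  # skip items for which no support sentence was extracted
            pairs.append((" ".join(support), article["title"]))
    return pairs
```

The resulting pairs would then be tokenized and fed to whichever seq2seq trainer is in use.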
To implement the above embodiments, the invention also proposes a title generation device. Fig. 4 is a structural diagram of a title generation device according to one embodiment of the invention. As shown in Fig. 4, the device includes a splitting module 410, an acquisition module 420, an extraction module 430, a generation module 440, and a determining module 450.
The splitting module 410 obtains a text for which a title is to be generated and splits the text into multiple segments.
The acquisition module 420 obtains the feature information of the multiple segments.
The extraction module 430 inputs the feature information into the title support sentence model to extract at least one title support sentence.
The generation module 440 inputs the at least one title support sentence into the title generation model to generate corresponding titles.
The determining module 450 scores the generated titles with the title scoring model and determines the title corresponding to the text according to the title scores.
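The way the five modules chain together can be sketched as a small class whose fields stand in for modules 410 to 450. Every name here is hypothetical, and the models are passed in as plain callables, so the sketch shows only the data flow and not any model internals.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class TitleGenerator:
    split: Callable[[str], List[str]]                                    # module 410
    featurize: Callable[[List[str], str], List[Sequence[float]]]         # module 420
    select_support: Callable[[List[str], List[Sequence[float]]], List[str]]  # module 430
    generate: Callable[[List[str]], List[str]]                           # module 440
    score: Callable[[str], float]                                        # used by module 450

    def run(self, text: str, original_title: str) -> str:
        segments = self.split(text)
        features = self.featurize(segments, original_title)
        support = self.select_support(segments, features)
        candidates = self.generate(support)
        # Assumed policy: keep candidates scoring at least as high as the original.
        threshold = self.score(original_title)
        kept = [c for c in candidates if self.score(c) >= threshold]
        return max(kept, key=self.score) if kept else original_title
```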
As shown in Fig. 5, the title generation device of the invention may further include a first training module 460.
The first training module 460 trains the title scoring model.
As shown in Fig. 6, the title generation device of the invention may further include a second training module 470.
The second training module 470 trains the title generation model.
It should be noted that the foregoing explanation of the title generation method also applies to the title generation device of the embodiments of the invention; details not disclosed in the device embodiments are not repeated here.
In the title generation device of the embodiments of the invention, a text for which a title is to be generated is obtained and split into multiple segments; feature information of the segments is obtained and input into the title support sentence model to extract at least one title support sentence; the at least one title support sentence is input into the title generation model to generate corresponding titles; and the generated titles are scored with the title scoring model, with the title corresponding to the text determined according to the scores. This reduces labor cost, improves efficiency and timeliness, and can meet the demand for optimizing titles to raise the click-through rate.
To implement the above embodiments, the invention also proposes an electronic device.
The electronic device includes a processor, a memory, and a computer program stored in the memory and runnable on the processor; the processor is configured to perform the title generation method of the first-aspect embodiment of the invention.
For example, the computer program can be executed by the processor to carry out a title generation method with the following steps:
S101': obtain a text for which a title is to be generated, and split the text into multiple segments.
S102': obtain the feature information of the multiple segments.
S103': input the feature information into the title support sentence model to extract at least one title support sentence.
S104': input the at least one title support sentence into the title generation model to generate corresponding titles.
S105': score the generated titles with the title scoring model, and determine the title corresponding to the text according to the title scores.
In the electronic device of the embodiments of the invention, a text for which a title is to be generated is obtained and split into multiple segments; feature information of the segments is obtained and input into the title support sentence model to extract at least one title support sentence; the at least one title support sentence is input into the title generation model to generate corresponding titles; and the generated titles are scored with the title scoring model, with the title corresponding to the text determined according to the scores. This reduces labor cost, improves efficiency and timeliness, and can meet the demand for optimizing titles to raise the click-through rate.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the invention. In this specification, such terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not conflict, those skilled in the art may combine the different embodiments or examples described in this specification and the features of those embodiments or examples.
In addition, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. A feature defined as "first" or "second" may therefore explicitly or implicitly include at least one such feature. In the description of the invention, "multiple" means at least two, for example two or three, unless specifically defined otherwise.
Any process or method description in a flow chart, or otherwise described herein, may be understood as representing a module, fragment, or portion of code that includes one or more executable instructions for implementing the steps of a specific logical function or process. The scope of the preferred embodiments of the invention includes other implementations in which functions may be performed out of the order shown or discussed, including substantially simultaneously or in reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the invention belong.
The logic and/or steps represented in a flow chart or otherwise described herein, for example an ordered list of executable instructions for implementing logical functions, may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction execution system, apparatus, or device and execute them). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate, or transmit a program for use by, or in connection with, an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of computer-readable media include: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). The computer-readable medium may even be paper or another suitable medium on which the program is printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that each part of the invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software or firmware that is stored in memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, they may be implemented by any one or a combination of the following techniques known in the art: a discrete logic circuit having logic gates for implementing logical functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.
Those of ordinary skill in the art will understand that all or part of the steps of the above method embodiments can be carried out by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one of the steps of the method embodiment or a combination thereof.
In addition, the functional units in the embodiments of the invention may be integrated into one processing module, each unit may exist physically on its own, or two or more units may be integrated into one module. The integrated module may be implemented in the form of hardware or as a software function module. If the integrated module is implemented as a software function module and sold or used as a standalone product, it may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like. Although embodiments of the invention have been shown and described above, it is to be understood that these embodiments are exemplary and are not to be construed as limiting the invention; those of ordinary skill in the art may change, modify, replace, and vary the above embodiments within the scope of the invention.

Claims (22)

  1. A title generation method, characterized by comprising:
    obtaining a text for which a title is to be generated, and segmenting the text into a plurality of sentence segments;
    obtaining feature information of the plurality of sentence segments;
    inputting the feature information into a title support sentence model to extract at least one title support sentence;
    inputting the at least one title support sentence into a title generation model to generate corresponding titles;
    scoring the generated titles based on a title scoring model, and determining the title corresponding to the text according to the title scores.
  2. The method of claim 1, characterized in that segmenting the text into a plurality of sentence segments comprises:
    segmenting the text into the plurality of sentence segments at whole-sentence granularity or clause granularity.
  3. The method of claim 1, characterized in that the feature information comprises at least one of length information, position information, importance information, and similarity information.
  4. The method of claim 1, characterized in that the title support sentence model comprises a gradient boosting decision tree (GBDT) model.
  5. The method of claim 1, characterized by further comprising:
    training the title scoring model.
  6. The method of claim 5, characterized in that training the title scoring model comprises:
    obtaining title samples and click data corresponding to the title samples;
    training the title scoring model according to the title samples and the click data.
  7. The method of claim 5 or 6, characterized in that the title scoring model comprises a deep neural network (DNN) model.
  8. The method of claim 1, characterized by further comprising:
    training the title generation model.
  9. The method of claim 8, characterized in that the title generation model is a seq2seq model.
  10. The method of claim 1, characterized in that determining the title corresponding to the text according to the title scores comprises:
    filtering out titles whose score is lower than a preset score;
    sorting the remaining titles, and determining the title corresponding to the text according to the sorting result.
  11. A title generation device, characterized by comprising:
    a segmentation module configured to obtain a text for which a title is to be generated, and to segment the text into a plurality of sentence segments;
    an acquisition module configured to obtain feature information of the plurality of sentence segments;
    an extraction module configured to input the feature information into a title support sentence model to extract at least one title support sentence;
    a generation module configured to input the at least one title support sentence into a title generation model to generate corresponding titles;
    a determination module configured to score the generated titles based on a title scoring model, and to determine the title corresponding to the text according to the title scores.
  12. The device of claim 11, characterized in that the segmentation module is configured to:
    segment the text into the plurality of sentence segments at whole-sentence granularity or clause granularity.
  13. The device of claim 11, characterized in that the feature information comprises at least one of length information, position information, importance information, and similarity information.
  14. The device of claim 11, characterized in that the title support sentence model comprises a gradient boosting decision tree (GBDT) model.
  15. The device of claim 11, characterized by further comprising:
    a first training module configured to train the title scoring model.
  16. The device of claim 15, characterized in that the first training module is configured to:
    obtain title samples and click data corresponding to the title samples;
    train the title scoring model according to the title samples and the click data.
  17. The device of claim 15 or 16, characterized in that the title scoring model comprises a deep neural network (DNN) model.
  18. The device of claim 11, characterized by further comprising:
    a second training module configured to train the title generation model.
  19. The device of claim 18, characterized in that the title generation model is a seq2seq model.
  20. The device of claim 11, characterized in that the determination module is configured to:
    filter out titles whose score is lower than a preset score;
    sort the remaining titles, and determine the title corresponding to the text according to the sorting result.
  21. A computer-readable storage medium having a computer program stored thereon, characterized in that the program, when executed by a processor, implements the title generation method of any one of claims 1 to 10.
  22. An electronic device, characterized by comprising:
    a processor; and
    a memory for storing instructions executable by the processor;
    wherein the processor is configured to perform the title generation method of any one of claims 1 to 10 by executing the executable instructions.
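
For illustration only, the flow recited in claims 1, 2, 3, and 10 can be sketched in Python as follows. This is a minimal sketch and not the patented implementation: the function names (`split_text`, `extract_features`, `generate_title`), the punctuation sets used for segmentation, and the use of plain callables in place of the GBDT, seq2seq, and DNN models are all assumptions made for this example. Only the length and position features are computed; importance and similarity features are omitted.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class SegmentFeatures:
    """Hypothetical feature record; only two of the claimed feature types."""
    length: int    # number of characters in the segment
    position: int  # index of the segment within the text


def split_text(text: str, granularity: str = "sentence") -> List[str]:
    """Split text at whole-sentence or clause granularity (assumed punctuation)."""
    if granularity == "sentence":
        pattern = r"[.!?。！？]"
    else:
        pattern = r"[.!?。！？,;，；]"
    return [part.strip() for part in re.split(pattern, text) if part.strip()]


def extract_features(segments: Sequence[str]) -> List[SegmentFeatures]:
    """Compute length and position features for each segment."""
    return [SegmentFeatures(len(seg), idx) for idx, seg in enumerate(segments)]


def generate_title(
    text: str,
    support_model: Callable[[SegmentFeatures], bool],
    generation_model: Callable[[List[str]], List[str]],
    scoring_model: Callable[[str], float],
    min_score: float,
    granularity: str = "sentence",
) -> Optional[str]:
    """Run the claimed steps with caller-supplied stand-in models."""
    segments = split_text(text, granularity)
    features = extract_features(segments)
    # Stand-in for the title support sentence model (a GBDT in claim 4).
    support = [seg for seg, feat in zip(segments, features) if support_model(feat)]
    # Stand-in for the title generation model (seq2seq in claim 9).
    candidates = generation_model(support)
    # Stand-in for the title scoring model (a DNN in claim 7).
    scored = [(scoring_model(title), title) for title in candidates]
    # Drop titles below the preset score, then rank the rest by score.
    kept = [pair for pair in scored if pair[0] >= min_score]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[0][1] if kept else None
```

In this sketch the three models are passed in as ordinary functions so that trained models could be substituted later without changing the control flow. The function returns `None` when every candidate falls below the preset score.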
CN201711384836.6A 2017-12-20 2017-12-20 Title generation method and device and electronic equipment Active CN107977363B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711384836.6A CN107977363B (en) 2017-12-20 2017-12-20 Title generation method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN107977363A CN107977363A (en) 2018-05-01
CN107977363B CN107977363B (en) 2021-12-17

Family

ID=62006947

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711384836.6A Active CN107977363B (en) 2017-12-20 2017-12-20 Title generation method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN107977363B (en)

Also Published As

Publication number Publication date
CN107977363B (en) 2021-12-17

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant