CN109597886A - Extractive-generative hybrid summary generation method - Google Patents

Extractive-generative hybrid summary generation method

Info

Publication number
CN109597886A
Authority
CN
China
Prior art keywords
sentence
key
document
key sentence
extraction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811238086.6A
Other languages
Chinese (zh)
Other versions
CN109597886B (en)
Inventor
周玉
朱军楠
张家俊
宗成庆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Automation of Chinese Academy of Science
Original Assignee
Institute of Automation of Chinese Academy of Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Automation of Chinese Academy of Science
Priority to CN201811238086.6A
Publication of CN109597886A
Application granted
Publication of CN109597886B
Legal status: Active (current)
Anticipated expiration

Abstract

The invention belongs to the field of natural language processing and provides an extractive-generative hybrid summary generation method, which aims to overcome the shortcomings of existing extractive and generative automatic summarization methods. The method comprises: identifying the entities and numbers in a document and replacing them with preset labels; extracting a plurality of first key sentences from the label-replaced document using an extractive document summarization method; compressing each first key sentence to obtain a corresponding second key sentence; comparing the length of each first key sentence with a preset length threshold and, depending on the result, selecting either the first key sentence or its second key sentence as a first key sentence to be synthesized; and generating the summary of the document from all first key sentences to be synthesized. The method provided by the invention can generate a summary that matches the semantics of the document while also ensuring readability.

Description

Extractive-generative hybrid summary generation method
Technical field
The invention belongs to the technical field of natural language processing, and in particular relates to an extractive-generative hybrid summary generation method.
Background
Automatic summarization is a technology that uses a computer system to analyze text, condense its content, and generate a summary automatically, so that the main content of the original text is expressed concisely according to the needs of the reader (or user). It can effectively help readers (or users) find content of interest among retrieved articles, improving reading speed and quality. The technology compresses a document into a more concise form while ensuring that the valuable topics of the original document are covered.
Existing automatic summarization technology mainly comprises two approaches: extractive automatic summarization and generative (abstractive) automatic summarization. Extractive methods form a summary from fragments extracted from the document; they are simple to implement and produce readable output, but the resulting summaries are not very accurate. Generative methods produce the summary directly from a representation of the document's meaning; this is more difficult, but closer to the essence of summarization.
Therefore, how to provide a scheme that filters out unimportant text content in a document, preserves the fluency of the summary, and also improves the accuracy of the summary is a problem that those skilled in the art currently need to solve.
Summary of the invention
To solve the above problems in the prior art, namely the shortcomings of existing extractive and generative automatic summarization methods, the present invention provides an extractive-generative hybrid summary generation method, comprising:
identifying the entities and numbers in a document and replacing the entities and numbers in the document with preset labels;
extracting a plurality of first key sentences from the label-replaced document using an extractive document summarization method;
compressing each of the plurality of first key sentences to obtain a second key sentence corresponding to each first key sentence;
judging whether the length of the first key sentence is greater than or equal to a preset length threshold: if so, taking the second key sentence corresponding to the first key sentence as a first key sentence to be synthesized; if not, directly taking the first key sentence as the first key sentence to be synthesized;
generating the summary of the document from all first key sentences to be synthesized.
In a preferred technical solution of the above scheme, the step of "extracting a plurality of first key sentences from the label-replaced document using an extractive document summarization method" comprises:
extracting a plurality of first key sentences from the label-replaced document using an extractive document summarization method based on a submodular function;
obtaining, in the document before label replacement, the original key sentence corresponding to each first key sentence;
sorting the corresponding first key sentences according to the order in which the original key sentences appear in the document before label replacement.
In a preferred technical solution of the above scheme, the step of "compressing each of the plurality of first key sentences to obtain a second key sentence corresponding to each first key sentence" comprises:
compressing the first key sentence with a pre-built sentence summarization model to obtain the corresponding second key sentence;
wherein the sentence summarization model is a model built on an attention mechanism.
In a preferred technical solution of the above scheme, the step of "compressing the first key sentence with a pre-built sentence summarization model to obtain the corresponding second key sentence" comprises:
obtaining the out-of-vocabulary (OOV) words generated when compressing the first key sentence;
obtaining the word with the highest attention value at the moment the OOV word was generated, and replacing the OOV word with that word.
In a preferred technical solution of the above scheme, before the step of "compressing each of the plurality of first key sentences to obtain a second key sentence corresponding to each first key sentence", the method further comprises:
identifying the entities and numbers in a preset text data set;
replacing the entities and numbers in the text data set with preset labels;
training the sentence summarization model on the label-replaced text data set.
In a preferred technical solution of the above scheme, the step of "generating the summary of the document from all first key sentences to be synthesized" comprises:
restoring the labels in the first key sentences to be synthesized to the corresponding entities and numbers to obtain the corresponding second key sentences to be synthesized;
generating the summary of the document from the second key sentences to be synthesized.
Compared with the closest prior art, the above technical solution has at least the following beneficial effects:
1. The extractive-generative hybrid summary generation method provided by the invention extracts first key sentences with an extractive document summarization method, compresses them to obtain second key sentences, and, based on a comparison between the length of each first key sentence and a preset length threshold, selects either the first key sentence or the second key sentence as the first key sentence to be synthesized. The document summary is generated from the first key sentences to be synthesized. This combines the advantages of extractive and generative document summarization methods: the summary matches the semantics of the document and remains readable.
2. The method judges whether the length of a first key sentence is greater than or equal to a preset length threshold. If so, the corresponding second key sentence is used as the first key sentence to be synthesized; if not, the first key sentence is used directly. This yields a more robust summary, that is, one that stays reasonably faithful to the facts while keeping readability as high as possible.
3. The method first extracts the first key sentences from the document with an extractive document summarization method, which filters out less important text content, so that the generative automatic summarization step can then quickly produce the summary of the document and obtain a high-accuracy document summary.
Brief description of the drawings
Fig. 1 is a schematic diagram of the main steps of the extractive-generative hybrid summary generation method according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of the main framework of the extractive-generative hybrid summary generation method according to an embodiment of the present invention.
Detailed description of the embodiments
In order to make the objectives, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
Preferred embodiments of the present invention are described below with reference to the accompanying drawings. Those skilled in the art will understand that these embodiments serve only to explain the technical principles of the invention and are not intended to limit its protection scope.
Referring to Fig. 1, which shows by way of example the main steps of the extractive-generative hybrid summary generation method in this embodiment. As shown in Fig. 1, the method in this embodiment includes the following steps:
Step S101: identify the entities and numbers in the document and replace them with preset labels.
Inspired by the way people write summaries by hand (first extracting some important sentences from the original text, then condensing and rewriting them), the present invention generates text summaries of long texts with an extractive-generative hybrid summary generation method. The method uses an extractive document summarization method to filter out less important text content, while retaining the fluency of summaries produced by a generative document summarization method. The hybrid method mainly consists of two parts: extracting important sentences from the document, and compressing and rewriting the extracted sentences.
Specifically, the entities and numbers in the document can be identified and replaced with preset labels. Suppose the following input document is given:
It’s just an example for illustration. There are 56 nationalities in China.
After the entities and numbers in the document are replaced with preset labels, the document reads:
It’s just an example for illustration. There are number-1 nationalities in entity-1.
Here, the index n in the labels entity-n and number-n denotes the position of the entity or number in the entity set and the number set of the original document, respectively. Named entities may be person names, organization names, place names, and all other entities identified by a name; in a broader sense, entities may also include numbers, dates, currencies, addresses, and so on. The entities and numbers in the document can be identified with the named entity recognition tool spaCy, and replaced with the preset labels using Python regular expressions.
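The label replacement in step S101 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function name `replace_with_tags` is hypothetical, the entity list is assumed to be supplied by the caller (in practice it would come from a named entity recognizer such as spaCy), and numbers are matched with a simple regular expression.

```python
import re


def replace_with_tags(text, entities):
    """Replace entities and numbers with indexed labels (entity-n, number-n).

    `entities` is a caller-supplied list of entity strings (an assumption of
    this sketch); entities are assumed to start and end with word characters.
    Returns the tagged text and a mapping from label back to original string.
    """
    number_pattern = r"(?P<num>\d+(?:\.\d+)?)"
    if entities:
        # Longest entities first so that longer names win over their prefixes.
        alternatives = "|".join(
            re.escape(e) for e in sorted(entities, key=len, reverse=True)
        )
        pattern = re.compile(rf"(?P<ent>\b(?:{alternatives})\b)|{number_pattern}")
    else:
        pattern = re.compile(number_pattern)

    tag_of = {}  # original surface string -> label
    counts = {"entity": 0, "number": 0}

    def substitute(match):
        kind = "entity" if match.lastgroup == "ent" else "number"
        surface = match.group(0)
        if surface not in tag_of:
            # Index labels by order of first appearance in the document.
            counts[kind] += 1
            tag_of[surface] = f"{kind}-{counts[kind]}"
        return tag_of[surface]

    tagged = pattern.sub(substitute, text)
    return tagged, {tag: surface for surface, tag in tag_of.items()}
```

The returned mapping is what a later step would need in order to restore the original entities and numbers in the final summary.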
Step S102: extract a plurality of first key sentences from the label-replaced document using an extractive document summarization method.
An extractive document summarization method extracts representative text fragments from the original document to form a summary; these fragments may be sentences, paragraphs or subsections of the document. Specifically, an extractive document summarization method based on a submodular function can be used to extract a plurality of first key sentences from the label-replaced document; the original key sentence corresponding to each first key sentence is then obtained from the document before label replacement, and the first key sentences are sorted according to the order in which their original key sentences appear in the document before label replacement. The total number of words in the plurality of first key sentences is less than a preset word-count threshold, which may be 200.
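As a rough illustration of submodular extraction under a word budget, the sketch below greedily selects sentences by marginal gain per word. The source does not specify which submodular function is used, so the objective here (number of distinct words covered) is purely an assumption, as are the function name and the whitespace tokenization.

```python
def greedy_submodular_extract(sentences, budget=200):
    """Greedily pick sentence indices that maximise word coverage.

    Assumed objective (not from the source): f(S) = number of distinct
    lower-cased words covered by the selected sentences, which is monotone
    and submodular. The total word count of the selection stays below
    `budget`. Indices are returned in original document order.
    """
    tokens = [s.lower().split() for s in sentences]
    selected, covered, used = [], set(), 0
    remaining = set(range(len(sentences)))
    while remaining:
        best, best_ratio = None, 0.0
        for i in sorted(remaining):
            cost = len(tokens[i])
            if cost == 0 or used + cost >= budget:
                continue  # empty sentence, or it would reach the budget
            gain = len(set(tokens[i]) - covered)  # marginal coverage gain
            ratio = gain / cost
            if ratio > best_ratio:
                best, best_ratio = i, ratio
        if best is None:
            break  # nothing left that adds coverage within the budget
        selected.append(best)
        covered |= set(tokens[best])
        used += len(tokens[best])
        remaining.discard(best)
    return sorted(selected)
```

Returning the indices in ascending order mirrors the requirement that the extracted sentences follow the order of their originals in the document.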
Step S103: compress each of the plurality of first key sentences to obtain the second key sentence corresponding to each first key sentence.
Although extracting the first key sentences with an extractive document summarization method filters out less important text content, the resulting summary is not very accurate. To make the generated summary better reflect the meaning of the document and bring it closer to a manually written summary, the plurality of first key sentences can be compressed. Specifically, each first key sentence can be compressed with a pre-built sentence summarization model to obtain the corresponding second key sentence, where the sentence summarization model is a model built on an attention mechanism.
The step of "compressing the first key sentence with a pre-built sentence summarization model to obtain the corresponding second key sentence" comprises:
obtaining the out-of-vocabulary (OOV) words generated when compressing the first key sentence;
obtaining the word with the highest attention value at the moment the OOV word was generated, and replacing the OOV word with that word.
The sentence summarization model is built on an attention mechanism, which can be attached to the Encoder-Decoder framework. This framework can be regarded as a general modeling paradigm in deep learning. The Encoder, that is, the encoding side, encodes the input sentence and converts it into an intermediate semantic representation through a nonlinear transformation. The Decoder, that is, the decoding side, generates the word for a given time step from the intermediate semantic representation of the sentence and the history of words generated so far. When an OOV word appears in the generated sentence, the word with the highest attention value at the moment the OOV word was generated can be obtained and used to replace the OOV word, which improves the readability of the summary.
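The OOV replacement described above can be sketched as a post-processing pass over the decoder output. The function name, the `<unk>` placeholder token and the layout of the attention matrix (one row of source-token weights per generated token) are assumptions of this sketch; the source does not fix any of them.

```python
def replace_unk_with_attention(source_tokens, output_tokens, attention, unk="<unk>"):
    """Replace each OOV placeholder with the most-attended source token.

    attention[t][j] is assumed to hold the attention weight on source token j
    at the time step when output token t was generated.
    """
    result = []
    for t, token in enumerate(output_tokens):
        if token == unk:
            weights = attention[t]
            # Index of the source token with the highest attention value.
            j = max(range(len(source_tokens)), key=lambda k: weights[k])
            result.append(source_tokens[j])
        else:
            result.append(token)
    return result
```

Because the replacement copies a word from the input sentence, the compressed sentence never contains the placeholder itself, which is what makes the output more readable.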
Before the plurality of first key sentences are compressed to obtain the second key sentences, the sentence summarization model can also be trained. The specific steps are:
identifying the entities and numbers in a preset text data set;
replacing the entities and numbers in the text data set with preset labels;
training the sentence summarization model on the label-replaced text data set until the model converges, where the text data set may be the Gigaword data set.
Step S104: judge whether the length of the first key sentence is greater than or equal to the preset length threshold. If so, execute step S105; if not, execute step S106.
To obtain a more robust summary, one that stays reasonably faithful to the facts while keeping readability as high as possible, it can be judged whether the length of the first key sentence is greater than or equal to the preset length threshold, and the corresponding operation is performed according to the result.
Step S105: take the second key sentence corresponding to the first key sentence as the first key sentence to be synthesized.
If the length of the first key sentence is greater than or equal to the preset length threshold, the corresponding second key sentence can be used as the first key sentence to be synthesized, so that the word count of the final summary stays within a reasonable length and readability improves.
Step S106: directly take the first key sentence as the first key sentence to be synthesized.
If the length of the first key sentence is less than the preset length threshold, the first key sentence extracted from the document can be considered to meet the word-count requirement of the final summary, and it is used directly as the first key sentence to be synthesized.
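Steps S104 to S106 amount to a per-sentence choice between the extracted sentence and its compressed version. A minimal sketch follows; measuring length in whitespace-separated tokens and the function name `choose_sentences` are assumptions, since the source does not state how length is measured.

```python
def choose_sentences(first_sentences, second_sentences, length_threshold):
    """Pick, per sentence, the compressed version only for long originals.

    first_sentences[i] is an extracted sentence and second_sentences[i] its
    compressed counterpart. Length is counted in whitespace tokens (assumed).
    """
    chosen = []
    for original, compressed in zip(first_sentences, second_sentences):
        if len(original.split()) >= length_threshold:
            chosen.append(compressed)  # S105: long sentence, use compression
        else:
            chosen.append(original)    # S106: short sentence, keep as is
    return chosen
```

Short sentences bypass the generative model entirely, so they cannot pick up factual errors from compression, which fits the robustness argument given for this step.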
Step S107: generate the summary of the document from all first key sentences to be synthesized.
Specifically, the labels in the first key sentences to be synthesized can be restored to the corresponding entities and numbers to obtain the corresponding second key sentences to be synthesized; the second key sentences to be synthesized are then arranged according to the order of their corresponding original sentences in the document to generate the summary of the document.
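Label restoration and ordering in step S107 can be sketched as below. The label format (`entity-n`, `number-n`) follows the example in the description; the function name, the `positions` argument (index of each sentence's original in the document) and joining sentences with a single space are assumptions of this sketch.

```python
import re


def restore_and_assemble(sentences, positions, mapping):
    """Restore labels to their original strings and order the sentences.

    `mapping` maps labels such as "entity-1" to the original text;
    `positions[i]` is the position of sentence i's original in the document.
    Labels missing from the mapping are left unchanged.
    """
    label_pattern = re.compile(r"\b(?:entity|number)-\d+\b")
    restored = [
        label_pattern.sub(lambda m: mapping.get(m.group(0), m.group(0)), s)
        for s in sentences
    ]
    # Arrange the restored sentences in original document order.
    ordered = [s for _, s in sorted(zip(positions, restored), key=lambda p: p[0])]
    return " ".join(ordered)
```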
Referring to Table 1, which shows by way of example the ROUGE values of the extractive-generative hybrid summary generation method of this embodiment and of the sequence-to-sequence model with attention (S2S+attn) on the CNN/Daily Mail data set (100 documents randomly sampled as test data). The sentence-headline training data set contains 3,803,957 data pairs, the validation data set contains 189,651 data pairs, and the test data set contains 1951 data pairs. As Table 1 shows, the hybrid method of this embodiment clearly improves both ROUGE-1 and ROUGE-L. In addition, the sentence summarization model of this embodiment is trained on the Gigaword data set, drawing on the idea of transfer learning, whereas the existing S2S+attn model is trained on the CNN/Daily Mail data set; the model of this embodiment of the present invention has better transferability.
Table 1: Comparison of ROUGE values between the present invention and the sequence-to-sequence model (S2S+attn)
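For readers unfamiliar with the two metrics named above, the following simplified sketch computes recall-only variants: ROUGE-1 as clipped unigram overlap divided by the reference length, and ROUGE-L from the longest common subsequence. It assumes whitespace tokenization and omits stemming and the F-measure, so it is not the evaluation script used for Table 1 and will not reproduce official ROUGE scores.

```python
from collections import Counter


def rouge_1_recall(candidate, reference):
    """Clipped unigram overlap divided by the number of reference unigrams."""
    cand, ref = Counter(candidate.split()), Counter(reference.split())
    overlap = sum((cand & ref).values())
    return overlap / max(sum(ref.values()), 1)


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def rouge_l_recall(candidate, reference):
    """LCS length divided by the number of reference tokens."""
    cand, ref = candidate.split(), reference.split()
    return lcs_length(cand, ref) / max(len(ref), 1)
```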
Referring to Fig. 2, which shows by way of example the main framework of the extractive-generative hybrid summary generation method in this embodiment. As shown in Fig. 2, the main framework of the method in this embodiment is as follows:
First, a plurality of first key sentences are obtained by extraction from the original document. The first key sentences are then compressed with the pre-built sentence summarization model to obtain the corresponding second key sentences. Finally, based on a comparison between the length of each first key sentence and the preset length threshold, either the first key sentence or the second key sentence is selected as the first key sentence to be synthesized, and the summary of the document is generated from all first key sentences to be synthesized.
The extractive-generative hybrid summary generation method provided by the invention combines the advantages of extractive and generative document summarization methods: it can generate a summary that matches the semantics of the document while also ensuring readability. It first extracts the first key sentences from the document with an extractive document summarization method, which filters out less important text content, so that the generative automatic summarization step can then quickly produce the summary of the document and obtain a high-accuracy document summary.
Although the steps in the above embodiment are described in the above order, those skilled in the art will understand that, to achieve the effect of this embodiment, different steps need not be executed in this order; they may be executed simultaneously (in parallel) or in reverse order, and such simple variations all fall within the protection scope of the present invention.
Those skilled in the art should recognize that the method steps described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate clearly the interchangeability of electronic hardware and software, the composition and steps of each example have been described above in general terms of their function. Whether these functions are executed in electronic hardware or in software depends on the specific application and the design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present invention.
It should be noted that the terms "first", "second" and the like in the description, the claims and the above drawings are used to distinguish similar objects, and not to describe or indicate a specific order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of the present invention described herein can be implemented in orders other than those illustrated or described herein.
The technical solution of the present invention has thus been described with reference to the preferred embodiments shown in the drawings. However, those skilled in the art will readily understand that the protection scope of the present invention is obviously not limited to these specific embodiments. Without departing from the principles of the invention, those skilled in the art may make equivalent changes or substitutions to the relevant technical features, and the technical solutions after such changes or substitutions will fall within the protection scope of the present invention.

Claims (6)

1. An extractive-generative hybrid summary generation method, characterized by comprising:
identifying the entities and numbers in a document and replacing the entities and numbers in the document with preset labels;
extracting a plurality of first key sentences from the label-replaced document using an extractive document summarization method;
compressing each of the plurality of first key sentences to obtain a second key sentence corresponding to each first key sentence;
judging whether the length of the first key sentence is greater than or equal to a preset length threshold: if so, taking the second key sentence corresponding to the first key sentence as a first key sentence to be synthesized; if not, directly taking the first key sentence as the first key sentence to be synthesized;
generating the summary of the document from all first key sentences to be synthesized.
2. The extractive-generative hybrid summary generation method according to claim 1, characterized in that the step of "extracting a plurality of first key sentences from the label-replaced document using an extractive document summarization method" comprises:
extracting a plurality of first key sentences from the label-replaced document using an extractive document summarization method based on a submodular function;
obtaining, in the document before label replacement, the original key sentence corresponding to each first key sentence;
sorting the corresponding first key sentences according to the order in which the original key sentences appear in the document before label replacement.
3. The extractive-generative hybrid summary generation method according to claim 1, characterized in that the step of "compressing each of the plurality of first key sentences to obtain a second key sentence corresponding to each first key sentence" comprises:
compressing the first key sentence with a pre-built sentence summarization model to obtain the corresponding second key sentence;
wherein the sentence summarization model is a model built on an attention mechanism.
4. The extractive-generative hybrid summary generation method according to claim 3, characterized in that the step of "compressing the first key sentence with a pre-built sentence summarization model to obtain the corresponding second key sentence" comprises:
obtaining the out-of-vocabulary (OOV) words generated when compressing the first key sentence;
obtaining the word with the highest attention value at the moment the OOV word was generated, and replacing the OOV word with that word.
5. The extractive-generative hybrid summary generation method according to claim 4, characterized in that, before the step of "compressing each of the plurality of first key sentences to obtain a second key sentence corresponding to each first key sentence", the method further comprises:
identifying the entities and numbers in a preset text data set;
replacing the entities and numbers in the text data set with preset labels;
training the sentence summarization model on the label-replaced text data set.
6. The extractive-generative hybrid summary generation method according to any one of claims 1 to 5, characterized in that the step of "generating the summary of the document from all first key sentences to be synthesized" comprises:
restoring the labels in the first key sentences to be synthesized to the corresponding entities and numbers to obtain the corresponding second key sentences to be synthesized;
generating the summary of the document from the second key sentences to be synthesized.
CN201811238086.6A 2018-10-23 2018-10-23 Extraction generation mixed abstract generation method Active CN109597886B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811238086.6A CN109597886B (en) 2018-10-23 2018-10-23 Extraction generation mixed abstract generation method

Publications (2)

Publication Number Publication Date
CN109597886A true CN109597886A (en) 2019-04-09
CN109597886B CN109597886B (en) 2021-07-06

Family

ID=65957961

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811238086.6A Active CN109597886B (en) 2018-10-23 2018-10-23 Extraction generation mixed abstract generation method

Country Status (1)

Country Link
CN (1) CN109597886B (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110119444A (en) * 2019-04-23 2019-08-13 中电科大数据研究院有限公司 A kind of official document summarization generation model that extraction-type is combined with production
CN111026861A (en) * 2019-12-10 2020-04-17 腾讯科技(深圳)有限公司 Text abstract generation method, text abstract training method, text abstract generation device, text abstract training device, text abstract equipment and text abstract training medium
CN111581358A (en) * 2020-04-08 2020-08-25 北京百度网讯科技有限公司 Information extraction method and device and electronic equipment
CN111858913A (en) * 2020-07-08 2020-10-30 北京嘀嘀无限科技发展有限公司 Method and system for automatically generating text abstract
CN112732901A (en) * 2021-01-15 2021-04-30 联想(北京)有限公司 Abstract generation method and device, computer readable storage medium and electronic equipment
CN113011160A (en) * 2019-12-19 2021-06-22 中国移动通信有限公司研究院 Text abstract generation method, device, equipment and storage medium
CN113032552A (en) * 2021-05-25 2021-06-25 南京鸿程信息科技有限公司 Text abstract-based policy key point extraction method and system
CN113836892A (en) * 2021-09-08 2021-12-24 灵犀量子(北京)医疗科技有限公司 Sample size data extraction method and device, electronic equipment and storage medium
CN116205234A (en) * 2023-04-24 2023-06-02 中国电子科技集团公司第二十八研究所 Text recognition and generation algorithm based on deep learning

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1609845A (en) * 2003-10-22 2005-04-27 国际商业机器公司 Method and apparatus for improving readability of automatic generated abstract by machine
US20090210381A1 (en) * 2008-02-15 2009-08-20 Yahoo! Inc. Search result abstract quality using community metadata
CN104503958A (en) * 2014-11-19 2015-04-08 百度在线网络技术(北京)有限公司 Method and device for generating document summarization
CN108228541A (en) * 2016-12-22 2018-06-29 深圳市北科瑞声科技股份有限公司 Method and apparatus for generating a document summary

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
CAGLAR GULCEHRE et al.: "Pointing the Unknown Words", 《PROCEEDINGS OF THE 54TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS》 *
仲夏199603: "Extractive Document Summarization Methods (Part 1)", 《HTTPS://WWW.PIANSHEN.COM/ARTICLE/52201321841/IT610》 *
Yin Cunyan et al.: "Automatic Summarization Techniques for Text on the Internet", 《Computer Engineering》 *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110119444B (en) * 2019-04-23 2023-06-30 中电科大数据研究院有限公司 Extractive and generative combined document summary generation model
CN110119444A (en) * 2019-04-23 2019-08-13 中电科大数据研究院有限公司 An official document summary generation model combining extractive and generative approaches
CN111026861A (en) * 2019-12-10 2020-04-17 腾讯科技(深圳)有限公司 Text abstract generation method, text abstract training method, text abstract generation device, text abstract training device, text abstract equipment and text abstract training medium
CN111026861B (en) * 2019-12-10 2023-07-04 腾讯科技(深圳)有限公司 Text abstract generation method, training device, training equipment and medium
CN113011160A (en) * 2019-12-19 2021-06-22 中国移动通信有限公司研究院 Text abstract generation method, device, equipment and storage medium
CN111581358A (en) * 2020-04-08 2020-08-25 北京百度网讯科技有限公司 Information extraction method and device and electronic equipment
CN111581358B (en) * 2020-04-08 2023-08-18 北京百度网讯科技有限公司 Information extraction method and device and electronic equipment
CN111858913A (en) * 2020-07-08 2020-10-30 北京嘀嘀无限科技发展有限公司 Method and system for automatically generating text abstract
CN112732901A (en) * 2021-01-15 2021-04-30 联想(北京)有限公司 Abstract generation method and device, computer readable storage medium and electronic equipment
CN113032552A (en) * 2021-05-25 2021-06-25 南京鸿程信息科技有限公司 Text abstract-based policy key point extraction method and system
CN113032552B (en) * 2021-05-25 2021-08-27 南京鸿程信息科技有限公司 Text abstract-based policy key point extraction method and system
CN113836892A (en) * 2021-09-08 2021-12-24 灵犀量子(北京)医疗科技有限公司 Sample size data extraction method and device, electronic equipment and storage medium
CN113836892B (en) * 2021-09-08 2023-08-08 灵犀量子(北京)医疗科技有限公司 Sample size data extraction method and device, electronic equipment and storage medium
CN116205234A (en) * 2023-04-24 2023-06-02 中国电子科技集团公司第二十八研究所 Text recognition and generation algorithm based on deep learning

Also Published As

Publication number Publication date
CN109597886B (en) 2021-07-06

Similar Documents

Publication Publication Date Title
CN109597886A (en) Extraction generation mixed abstract generation method
CN106919673B (en) Text mood analysis system based on deep learning
CN110134953B (en) Traditional Chinese medicine named entity recognition method and recognition system based on traditional Chinese medicine ancient book literature
CN108763483A (en) A text information extraction method for judgment documents
CN109684648A (en) An automatic translation method between ancient and modern Chinese with multi-feature fusion
CN105243129A (en) Commodity property characteristic word clustering method
CN106407235B (en) A semantic dictionary construction method based on comment data
CN104199871A (en) High-speed test question inputting method for intelligent teaching
CN107247739B (en) A factor-graph-based knowledge extraction method for financial bulletin text
CN111695346B (en) Method for improving public opinion entity recognition rate in financial risk prevention and control field
CN109933796A (en) A method and device for extracting key information from bulletin text
CN107368474A (en) An automatic and efficient Chinese-to-braille translation method
CN110046356A (en) Application study of label embedding in multi-label emotion classification of microblog text
CN110110087A (en) A feature engineering method for legal text classification based on binary classifiers
CN107436931B (en) Webpage text extraction method and device
CN106570133A (en) Method and device for constructing visual webpage information extracting rule
CN111178047B (en) Ancient medical record prescription extraction method based on hierarchical sequence labeling
CN116737924A (en) Medical text data processing method and device
CN113268714B (en) Automatic extraction method for license terms of open source software
Al-Sultany et al. Enriching tweets for topic modeling via linking to the wikipedia
CN110516069B (en) Fasttext-CRF-based quotation metadata extraction method
CN109857746A (en) Automatic update method and device for bilingual word bank, and electronic equipment
CN114722829A (en) Automatic generation method of ancient poems based on language model
CN113990421A (en) Electronic medical record named entity identification method based on data enhancement
CN109918622A (en) Java-based method and system for converting Word documents to LaTeX documents

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant