CN108804495A - Automatic text summarization method based on enhanced semantics - Google Patents
Automatic text summarization method based on enhanced semantics
- Publication number
- CN108804495A CN108804495A CN201810281684.5A CN201810281684A CN108804495A CN 108804495 A CN108804495 A CN 108804495A CN 201810281684 A CN201810281684 A CN 201810281684A CN 108804495 A CN108804495 A CN 108804495A
- Authority
- CN
- China
- Prior art keywords
- text
- abstract
- semantic
- summarization
- word
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
Abstract
The invention discloses an automatic text summarization method based on enhanced semantics, with the following steps: preprocess the text, sort words from high to low by frequency, and convert words to ids; encode the input sequence with a single-layer bidirectional LSTM to extract textual features; decode the semantic vector produced by the encoder with a single-layer unidirectional LSTM to obtain the hidden states; compute a context vector that extracts the information in the input sequence most useful to the current output; after decoding, obtain a probability distribution over the vocabulary and select summary words with a certain strategy; and during training, fuse the semantic similarity between the generated summary and the source text into the loss computation, improving their semantic similarity. The present invention represents text with an LSTM deep learning model, incorporates contextual semantic relations, and strengthens the semantic relation between the summary and the source text, so that the generated summary better fits the theme of the text; the method has broad application prospects.
Description
Technical field
The present invention relates to the field of natural language processing, and in particular to an automatic text summarization method based on enhanced semantics.
Background art
With the rapid development of science and technology and the Internet and the arrival of the big data era, the amount of information on the network grows daily and is everywhere. In particular, textual information such as news, blogs, chats, reports and microblogs has grown explosively, causing information overload; the sheer volume forces people to spend a great deal of time browsing and reading. How to quickly extract the key content from massive text and solve the information overload problem has therefore become an urgent demand, and automatic text summarization technology has emerged in response.
According to how the summary is produced, automatic text summarization can be divided into extractive summarization and abstractive summarization. The former ranks the sentences of the original text by importance according to some method and takes the top n most important sentences as the summary; the latter mines deeper semantic information to restate and condense the central idea of the original text. Extractive summarization has been studied extensively, but it stays at the level of surface lexical information, whereas abstractive summarization better matches the way people actually write summaries.
In recent years, with the rise of deep learning, considerable results have been achieved in many fields, and deep learning has also been introduced into automatic summarization. Abstractive summarization can be realized with sequence-to-sequence (seq2seq) models; drawing on the successful application of machine translation, automatic summarization based on seq2seq models has become a research hotspot in natural language processing, although problems of coherence and readability remain. Traditional extractive summarization usually causes large information loss, especially on long texts, so in-depth study of abstractive automatic summarization is of great significance for truly solving information overload.
Summary of the invention
The purpose of the present invention is to solve the above drawbacks in the prior art by providing an automatic text summarization method based on enhanced semantics. The method is based on a seq2seq model and introduces an attention mechanism; it trains with the semantic similarity between the generated summary and the source text, improving the semantic relevance between them and improving summary quality.
The purpose of the present invention can be achieved by the following technical scheme:
An automatic text summarization method based on enhanced semantics, the method comprising:
a text preprocessing step: segmenting the text and performing lemmatization and coreference resolution, sorting words from high to low by frequency, and converting words to ids;
an encoding step: encoding the input sequence with a neural network to obtain hidden-state vectors carrying the information of the text sequence;
a decoding step: initializing with the last hidden state produced by the encoder and decoding to obtain the hidden state s_t of each step;
an attention distribution calculation step: combining the hidden states of the input sequence with the hidden state s_t obtained by decoding at the current time to compute the context vector, yielding the context vector u_t of the current time t;
a summary generation step: mapping the output of the decoding step through two linear layers to a vector of vocabulary-size dimension, each dimension representing the probability of a word in the vocabulary, selecting candidate words with a certain selection strategy, and generating the summary.
Further, in the text preprocessing step, the text data are a corpus crawled by a web crawler or an open-source corpus, consisting of article-summary pairs.
Further, in the text preprocessing step, the 200k most frequent words are taken as the basic vocabulary, the special markers [PAD], [UNK], [START] and [STOP] are added to the vocabulary, and the words of the text are converted to ids, each text corresponding to one sequence.
Further, the input sequence is the word vectors corresponding to the id sequence obtained after conversion; the word-vector dimension is 128 and the maximum sequence length is taken as 700.
Further, the neural network is a single-layer bidirectional LSTM with 256 hidden units; the forward and backward hidden states h are concatenated to obtain the final hidden state.
Further, the decoding step proceeds as follows: the input word vector and the hidden state of the previous time step are received, and the hidden state s_t of the current time step is obtained through a single-layer unidirectional LSTM with 256 hidden units.
Further, the context vector u_t is calculated as follows:

e_t^i = v^T tanh(W_h h_i + W_s s_t + b_att)
a_t = softmax(e_t)
u_t = Σ_{i=1}^{N} a_t^i h_i

wherein v, W_h, W_s and b_att are parameters to be learned, h_i is the hidden state of the encoder, and N is the length of the input sequence.
Further, the selection strategy is: at test time, a beam search algorithm keeps the 4 highest-probability results at each step until the highest-probability summary sequence is finally obtained; during training, only the highest-probability word is selected at each step, and after the summary is fully generated it is compared against the reference summary for evaluation.
Further, in the summary generation step, each step generates only one word, and the maximum length of the generated summary is 100; that is, the maximum number of cycles from the encoding step to the summary generation step is 100, stopping when the end-of-output marker is produced or the maximum length is reached. The probability is calculated as follows:

p_v = softmax(V_1(V_2[s_t, u_t] + b_2) + b_1)

wherein V_1, V_2, b_1 and b_2 are all parameters to be learned, and p_v provides the basis for predicting the next word.
Further, the summary generation step also includes: computing the semantic similarity Rel between the finally obtained predicted summary and the source text sequence, so that the training process penalizes summaries of low semantic relevance. The calculation is as follows:

G_t = [h→_t ; h←_t]
Rel = cos(s_M, G_N) = (s_M · G_N) / (‖s_M‖ ‖G_N‖)
loss = (1/M) Σ_{t=1}^{M} loss_t − λ · Rel

wherein h→ and h← are the forward and backward hidden states respectively, G_t is the encoder hidden state, λ is an adjustable factor, M is the length of the generated summary sequence, and loss_t is the loss of each step; combining the per-step losses with the semantic similarity Rel gives the total loss.
Compared with the prior art, the present invention has the following advantages and effects: based on a seq2seq model, an automatic text summarization model built on LSTMs is constructed; an attention mechanism is introduced in the decoder to obtain the context vector at each moment; and a semantic similarity term is introduced to strengthen the semantic relevance between the generated summary and the source text. This similarity is fused into the loss function during training, preventing the model from drifting off topic and improving the quality of the summary.
Description of the drawings
Fig. 1 is a flow chart of the steps of the automatic text summarization method based on enhanced semantics of the present invention;
Fig. 2 is a structure chart of the semantic similarity calculation in the present invention;
Fig. 3 is an algorithm flow chart of each step when decoding generates a summary word in the present invention.
Detailed description of the embodiments
In order to make the objectives, technical schemes and advantages of the embodiments of the present invention clearer, the technical schemes in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of the present invention.
Embodiment
As shown in Fig. 1, the automatic text summarization method based on enhanced semantics includes a text preprocessing step, an encoding step, a decoding step, an attention step and a summary generation step, wherein:
In the text preprocessing step, the text data here may be a corpus crawled by a web crawler or an open-source corpus. Taking CNN/Daily Mail as an example, the corpus consists of article-summary pairs; each article averages 780 words and each summary averages 56 words. After the source text is segmented and lemmatization and coreference resolution are applied, the 200k most frequent words are taken as the basic vocabulary, together with an extended vocabulary composed of the words of each individual text; the special markers [PAD], [UNK], [START] and [STOP] are added to the vocabulary, and the words of the text are converted to ids, each text corresponding to one sequence (summaries are handled likewise). The training set contains 287,226 samples, the validation set 13,368 samples, and the test set 11,490 samples.
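A minimal sketch of this preprocessing step in python (the function and variable names are illustrative assumptions; segmentation, lemmatization and coreference resolution are assumed to have been done upstream):

```python
from collections import Counter

# Special markers added to the vocabulary, as described above.
SPECIAL_TOKENS = ["[PAD]", "[UNK]", "[START]", "[STOP]"]
VOCAB_SIZE = 200_000  # keep the 200k most frequent words

def build_vocab(tokenized_articles):
    """Count word frequencies over the corpus, sort from high to low,
    and keep the 200k most frequent words plus the special markers."""
    counts = Counter(w for article in tokenized_articles for w in article)
    words = [w for w, _ in counts.most_common(VOCAB_SIZE)]
    return {w: i for i, w in enumerate(SPECIAL_TOKENS + words)}

def words_to_ids(words, vocab):
    """Convert each word to its id; out-of-vocabulary words map to [UNK]."""
    unk = vocab["[UNK]"]
    return [vocab.get(w, unk) for w in words]
```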
In the encoding step, word embedding is applied to the input sequence to obtain 128-dimensional vectors, and a neural network then produces a text representation vector carrying the information of the text sequence.
The input sequence is the id sequence obtained from the article after conversion; the maximum length is taken as 700 and the minimum length is 30.
The neural network in the encoding step is a single-layer bidirectional LSTM with 256 hidden units; the forward and backward hidden states h are concatenated to obtain the final hidden state.
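A sketch of such an encoder using tensorflow (consistent with the platform named in the experiments below); the vocabulary size of 200k words plus 4 special markers and the layer names are assumptions, not the patent's implementation:

```python
import tensorflow as tf

embedding = tf.keras.layers.Embedding(input_dim=200_004, output_dim=128)  # 128-dim word vectors
encoder = tf.keras.layers.Bidirectional(
    tf.keras.layers.LSTM(256, return_sequences=True, return_state=True))  # 256 hidden units

def encode(input_ids):
    """Return the per-position hidden states h_i and the concatenated final state."""
    x = embedding(input_ids)                            # (batch, N, 128)
    h, fwd_h, fwd_c, bwd_h, bwd_c = encoder(x)          # h: (batch, N, 512)
    final_state = tf.concat([fwd_h, bwd_h], axis=-1)    # forward/backward states joined
    return h, final_state
```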
In the decoding step, the input word vector is received and, through a single-layer unidirectional LSTM with 256 hidden units, the hidden state s_t is obtained.
In the attention calculation step, the decoded state s_t obtained by the decoding step at the current time is combined with the hidden states of the input sequence from the encoding step to obtain the context vector u_t of the current time.
The context vector at time t is calculated as follows:

e_t^i = v^T tanh(W_h h_i + W_s s_t + b_att)
a_t = softmax(e_t)
u_t = Σ_{i=1}^{N} a_t^i h_i

wherein v, W_h, W_s and b_att are parameters to be learned, h_i is the hidden state of the encoder, and N is the length of the input sequence.
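The formulas above transcribe directly into code; a sketch (tensor shapes and the layer decomposition are assumptions):

```python
import tensorflow as tf

class AdditiveAttention(tf.keras.layers.Layer):
    """Computes e_t^i = v^T tanh(W_h h_i + W_s s_t + b_att), then a_t and u_t."""
    def __init__(self, units=512):
        super().__init__()
        self.W_h = tf.keras.layers.Dense(units, use_bias=False)
        self.W_s = tf.keras.layers.Dense(units)   # its bias plays the role of b_att
        self.v = tf.keras.layers.Dense(1, use_bias=False)

    def call(self, h, s_t):
        # h: (batch, N, enc_dim) encoder states; s_t: (batch, dec_dim) decoder state
        e_t = self.v(tf.tanh(self.W_h(h) + self.W_s(s_t)[:, None, :]))  # (batch, N, 1)
        a_t = tf.nn.softmax(e_t, axis=1)       # attention distribution over input positions
        u_t = tf.reduce_sum(a_t * h, axis=1)   # context vector u_t, (batch, enc_dim)
        return u_t, a_t
```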
In the summary generation step, the output of the decoding step is mapped through two linear layers to a vector of vocabulary-size dimension, each dimension representing the probability of a word in the vocabulary, and candidate words are selected with a certain selection strategy.
The selection strategy is: at test time, a beam search algorithm keeps the 4 highest-probability results at each step until the highest-probability summary sequence is finally obtained; during training, only the highest-probability word is taken at each step, and after the summary is fully generated it is compared against the reference summary for evaluation.
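A sketch of this beam search (the step_fn interface, which returns log-probabilities over the vocabulary given a prefix, is an assumed abstraction over the decoder):

```python
import heapq

def beam_search(step_fn, start_id, stop_id, beam_width=4, max_len=100):
    """Keep the 4 highest-probability hypotheses at each step; scores are
    cumulative log-probabilities, so the best finished sequence wins."""
    beams = [(0.0, [start_id])]
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            if seq[-1] == stop_id:            # finished hypotheses are carried over
                candidates.append((score, seq))
                continue
            log_probs = step_fn(seq)          # log P(next word | prefix), one per vocab word
            for word_id, lp in enumerate(log_probs):
                candidates.append((score + lp, seq + [word_id]))
        beams = heapq.nlargest(beam_width, candidates, key=lambda c: c[0])
        if all(seq[-1] == stop_id for _, seq in beams):
            break
    return max(beams, key=lambda c: c[0])[1]  # highest-probability summary sequence
```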
The maximum length of the generated summary is 100, and the probability is calculated as follows:

p_v = softmax(V_1(V_2[s_t, u_t] + b_2) + b_1)

wherein V_1, V_2, b_1 and b_2 are all parameters to be learned, and p_v provides the basis for predicting the next word.
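The two-linear-layer mapping corresponds to the following sketch (the inner layer width of 512 is an assumption):

```python
import tensorflow as tf

V2 = tf.keras.layers.Dense(512)                            # inner linear layer (V_2, b_2)
V1 = tf.keras.layers.Dense(200_004, activation="softmax")  # vocabulary-sized layer (V_1, b_1)

def vocab_distribution(s_t, u_t):
    """p_v = softmax(V_1(V_2[s_t, u_t] + b_2) + b_1)."""
    return V1(V2(tf.concat([s_t, u_t], axis=-1)))
```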
The summary generation step also includes computing the semantic similarity Rel between the finally obtained predicted summary and the source text sequence, so that the training process penalizes summaries of low semantic relevance. The calculation is as follows:

G_t = [h→_t ; h←_t]
Rel = cos(s_M, G_N) = (s_M · G_N) / (‖s_M‖ ‖G_N‖)
loss = (1/M) Σ_{t=1}^{M} loss_t − λ · Rel

wherein h→ and h← are the forward and backward hidden states respectively, G_t is the encoder hidden state, λ is an adjustable factor defaulting to 1, M is the length of the generated summary sequence, and loss_t is the loss of each step; combining the per-step losses with the similarity gives the total loss.
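Under the reconstruction above, the combined loss can be sketched as follows (pairing the final decoder state with the final encoder state as the two representations is an assumption):

```python
import tensorflow as tf

def total_loss(step_losses, summary_repr, source_repr, lam=1.0):
    """Average the per-step losses and subtract lambda times the cosine
    similarity Rel, penalizing summaries of low semantic relevance."""
    rel = tf.reduce_sum(summary_repr * source_repr, axis=-1) / (
        tf.norm(summary_repr, axis=-1) * tf.norm(source_repr, axis=-1) + 1e-8)
    return tf.reduce_mean(step_losses) - lam * tf.reduce_mean(rel)
```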
During training, the back-propagation algorithm is used with the Adagrad optimizer, a learning rate of 0.15 and an initial accumulator value of 0.1.
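In tensorflow this training configuration corresponds to (a sketch):

```python
import tensorflow as tf

# Adagrad with learning rate 0.15 and initial accumulator value 0.1, as stated above.
optimizer = tf.keras.optimizers.Adagrad(
    learning_rate=0.15, initial_accumulator_value=0.1)
```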
The decoding step is divided into a training phase and a test phase: in the training phase the reference summary is fed as input, while in the test phase the output of the previous time step is fed as the input of the current time step.
The reference summaries and predicted summaries are evaluated with the ROUGE metrics (a computation sketch follows Table 1 below). The experiments use the Linux operating system and run the program on a GPU; the programming language is python and the platform is tensorflow. The model with semantic similarity runs for about 4 days and about 380,000 iterations, and the experimental results are shown in the following table.
Table 1. Comparison of the results of the three models
Experimental model | ROUGE-1 | ROUGE-2 | ROUGE-L |
Basic LSTM models | 0.2896 | 0.1028 | 0.2613 |
LSTM+Attention | 0.3116 | 0.1127 | 0.2920 |
LSTM+Attention+Rel | 0.3493 | 0.1390 | 0.3342 |
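For reference, ROUGE-1/2/L F-scores of the kind reported in Table 1 can be computed with the open-source rouge-score package (the use of this particular package, and the example strings, are assumptions; the patent does not name its evaluation tool):

```python
# pip install rouge-score
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
reference = "police arrest suspect after downtown robbery"
prediction = "police arrested a suspect following a robbery downtown"
scores = scorer.score(reference, prediction)
print(scores["rouge1"].fmeasure, scores["rouge2"].fmeasure, scores["rougeL"].fmeasure)
```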
By fusing the attention mechanism, the present invention gives full play to the ability of seq2seq models to mine deep text semantic information, allowing decoding to focus on the information in the input sequence useful to the current output when generating the summary, and incorporates semantic similarity into the loss computation, so that the model attends to the semantic similarity with the source text when generating the summary and produces sentences that better match the semantics of the original text. Compared with traditional statistics-based automatic summarization, deep-learning-based models have stronger representational ability and a clear advantage in automatic text summarization tasks.
The above embodiment is a preferred embodiment of the present invention, but the embodiments of the present invention are not limited by the above embodiment; any other changes, modifications, substitutions, combinations and simplifications made without departing from the spirit and principles of the present invention shall be equivalent replacements and are included within the protection scope of the present invention.
Claims (10)
1. An automatic text summarization method based on enhanced semantics, characterized in that the method comprises:
a text preprocessing step: segmenting the text and performing lemmatization and coreference resolution, sorting words from high to low by frequency, and converting words to ids;
an encoding step: encoding the input sequence with a neural network to obtain hidden-state vectors carrying the information of the text sequence;
a decoding step: initializing with the last hidden state produced by the encoder and decoding to obtain the hidden state s_t of each step;
an attention distribution calculation step: combining the hidden states of the input sequence with the hidden state s_t obtained by decoding at the current time to compute the context vector, yielding the context vector u_t of the current time t;
a summary generation step: mapping the output of the decoding step through two linear layers to a vector of vocabulary-size dimension, each dimension representing the probability of a word in the vocabulary, selecting candidate words with a certain selection strategy, and generating the summary.
2. The automatic text summarization method based on enhanced semantics according to claim 1, characterized in that in the text preprocessing step, the text data are a corpus crawled by a web crawler or an open-source corpus, consisting of article-summary pairs.
3. The automatic text summarization method based on enhanced semantics according to claim 1, characterized in that in the text preprocessing step, the 200k most frequent words are taken as the basic vocabulary, the special markers [PAD], [UNK], [START] and [STOP] are added to the vocabulary, and the words of the text are converted to ids, each text corresponding to one sequence.
4. The automatic text summarization method based on enhanced semantics according to claim 1, characterized in that the input sequence is the word vectors corresponding to the id sequence obtained after conversion; the word-vector dimension is 128 and the maximum sequence length is taken as 700.
5. The automatic text summarization method based on enhanced semantics according to claim 1, characterized in that the neural network is a single-layer bidirectional LSTM with 256 hidden units, and the forward and backward hidden states h are concatenated to obtain the final hidden state.
6. The automatic text summarization method based on enhanced semantics according to claim 1, characterized in that the decoding step proceeds as follows: the input word vector and the hidden state of the previous time step are received, and the hidden state s_t of the current time step is obtained through a single-layer unidirectional LSTM with 256 hidden units.
7. The automatic text summarization method based on enhanced semantics according to claim 1, characterized in that the context vector u_t is calculated as follows:

e_t^i = v^T tanh(W_h h_i + W_s s_t + b_att)
a_t = softmax(e_t)
u_t = Σ_{i=1}^{N} a_t^i h_i

wherein v, W_h, W_s and b_att are parameters to be learned, h_i is the hidden state of the encoder, and N is the length of the input sequence.
8. The automatic text summarization method based on enhanced semantics according to claim 1, characterized in that the selection strategy is: at test time, a beam search algorithm keeps the 4 highest-probability results at each step until the highest-probability summary sequence is finally obtained; during training, only the highest-probability word is selected at each step, and after the summary is fully generated it is compared against the reference summary for evaluation.
9. The automatic text summarization method based on enhanced semantics according to claim 1, characterized in that in the summary generation step, each step generates only one word, and the maximum length of the generated summary is 100; that is, the maximum number of cycles from the encoding step to the summary generation step is 100, stopping when the end-of-output marker is produced or the maximum length is reached. The probability is calculated as follows:

p_v = softmax(V_1(V_2[s_t, u_t] + b_2) + b_1)

wherein V_1, V_2, b_1 and b_2 are all parameters to be learned, and p_v provides the basis for predicting the next word.
10. The automatic text summarization method based on enhanced semantics according to claim 1, characterized in that the summary generation step further comprises: computing the semantic similarity Rel between the finally obtained predicted summary and the source text sequence, so that the training process penalizes summaries of low semantic relevance, calculated as follows:

G_t = [h→_t ; h←_t]
Rel = cos(s_M, G_N) = (s_M · G_N) / (‖s_M‖ ‖G_N‖)
loss = (1/M) Σ_{t=1}^{M} loss_t − λ · Rel

wherein h→ and h← are the forward and backward hidden states respectively, G_t is the encoder hidden state, λ is an adjustable factor, M is the length of the generated summary sequence, and loss_t is the loss of each step; combining the per-step losses with the semantic similarity Rel gives the total loss.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810281684.5A CN108804495B (en) | 2018-04-02 | 2018-04-02 | Automatic text summarization method based on enhanced semantics |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810281684.5A CN108804495B (en) | 2018-04-02 | 2018-04-02 | Automatic text summarization method based on enhanced semantics |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108804495A true CN108804495A (en) | 2018-11-13 |
CN108804495B CN108804495B (en) | 2021-10-22 |
Family
ID=64095279
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810281684.5A Expired - Fee Related CN108804495B (en) | 2018-04-02 | 2018-04-02 | Automatic text summarization method based on enhanced semantics |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108804495B (en) |
Cited By (28)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109620205A (en) * | 2018-12-26 | 2019-04-16 | 上海联影智能医疗科技有限公司 | Electrocardiogram (ECG) data classification method, device, computer equipment and storage medium |
CN109800390A (en) * | 2018-12-21 | 2019-05-24 | 北京石油化工学院 | A kind of calculation method and device of individualized emotion abstract |
CN109829161A (en) * | 2019-01-30 | 2019-05-31 | 延边大学 | A kind of method of multilingual autoabstract |
CN109885673A (en) * | 2019-02-13 | 2019-06-14 | 北京航空航天大学 | A kind of Method for Automatic Text Summarization based on pre-training language model |
CN109947931A (en) * | 2019-03-20 | 2019-06-28 | 华南理工大学 | Text automatic abstracting method, system, equipment and medium based on unsupervised learning |
CN110119444A (en) * | 2019-04-23 | 2019-08-13 | 中电科大数据研究院有限公司 | A kind of official document summarization generation model that extraction-type is combined with production |
CN110134782A (en) * | 2019-05-14 | 2019-08-16 | 南京大学 | A kind of text snippet model and Method for Automatic Text Summarization based on improved selection mechanism and LSTM variant |
CN110209802A (en) * | 2019-06-05 | 2019-09-06 | 北京金山数字娱乐科技有限公司 | A kind of method and device for extracting summary texts |
CN110209801A (en) * | 2019-05-15 | 2019-09-06 | 华南理工大学 | A kind of text snippet automatic generation method based on from attention network |
CN110222840A (en) * | 2019-05-17 | 2019-09-10 | 中山大学 | A kind of cluster resource prediction technique and device based on attention mechanism |
CN110334362A (en) * | 2019-07-12 | 2019-10-15 | 北京百奥知信息科技有限公司 | A method of the solution based on medical nerve machine translation generates untranslated word |
CN110390103A (en) * | 2019-07-23 | 2019-10-29 | 中国民航大学 | Short text auto-abstracting method and system based on Dual-encoder |
CN110532554A (en) * | 2019-08-26 | 2019-12-03 | 南京信息职业技术学院 | A kind of Chinese abstraction generating method, system and storage medium |
CN110688479A (en) * | 2019-08-19 | 2020-01-14 | 中国科学院信息工程研究所 | Evaluation method and sequencing network for generating abstract |
CN110765264A (en) * | 2019-10-16 | 2020-02-07 | 北京工业大学 | Text abstract generation method for enhancing semantic relevance |
CN110795556A (en) * | 2019-11-01 | 2020-02-14 | 中山大学 | Abstract generation method based on fine-grained plug-in decoding |
CN111078866A (en) * | 2019-12-30 | 2020-04-28 | 华南理工大学 | Chinese text abstract generation method based on sequence-to-sequence model |
CN111339763A (en) * | 2020-02-26 | 2020-06-26 | 四川大学 | English mail subject generation method based on multi-level neural network |
CN111414505A (en) * | 2020-03-11 | 2020-07-14 | 上海爱数信息技术股份有限公司 | Rapid image abstract generation method based on sequence generation model |
CN111460109A (en) * | 2019-01-22 | 2020-07-28 | 阿里巴巴集团控股有限公司 | Abstract and dialogue abstract generation method and device |
CN111563160A (en) * | 2020-04-15 | 2020-08-21 | 华南理工大学 | Text automatic summarization method, device, medium and equipment based on global semantics |
CN111639174A (en) * | 2020-05-15 | 2020-09-08 | 民生科技有限责任公司 | Text abstract generation system, method and device and computer readable storage medium |
CN111708877A (en) * | 2020-04-20 | 2020-09-25 | 中山大学 | Text abstract generation method based on key information selection and variation latent variable modeling |
CN111797196A (en) * | 2020-06-01 | 2020-10-20 | 武汉大学 | Service discovery method combining attention mechanism LSTM and neural topic model |
CN112364157A (en) * | 2020-11-02 | 2021-02-12 | 北京中科凡语科技有限公司 | Multi-language automatic abstract generation method, device, equipment and storage medium |
CN113157855A (en) * | 2021-02-22 | 2021-07-23 | 福州大学 | Text summarization method and system fusing semantic and context information |
CN113221577A (en) * | 2021-04-28 | 2021-08-06 | 西安交通大学 | Education text knowledge induction method, system, equipment and readable storage medium |
CN113407711A (en) * | 2021-06-17 | 2021-09-17 | 成都崇瑚信息技术有限公司 | Gibbs limited text abstract generation method by using pre-training model |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107291699A (en) * | 2017-07-04 | 2017-10-24 | 湖南星汉数智科技有限公司 | A kind of sentence semantic similarity computational methods |
CN107484017A (en) * | 2017-07-25 | 2017-12-15 | 天津大学 | Supervision video abstraction generating method is had based on attention model |
CN107832300A (en) * | 2017-11-17 | 2018-03-23 | 合肥工业大学 | Towards minimally invasive medical field text snippet generation method and device |
CN107844469A (en) * | 2017-10-26 | 2018-03-27 | 北京大学 | The text method for simplifying of word-based vector query model |
2018
- 2018-04-02 CN CN201810281684.5A patent/CN108804495B/en not_active Expired - Fee Related
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107291699A (en) * | 2017-07-04 | 2017-10-24 | 湖南星汉数智科技有限公司 | A kind of sentence semantic similarity computational methods |
CN107484017A (en) * | 2017-07-25 | 2017-12-15 | 天津大学 | Supervision video abstraction generating method is had based on attention model |
CN107844469A (en) * | 2017-10-26 | 2018-03-27 | 北京大学 | The text method for simplifying of word-based vector query model |
CN107832300A (en) * | 2017-11-17 | 2018-03-23 | 合肥工业大学 | Towards minimally invasive medical field text snippet generation method and device |
Non-Patent Citations (2)
Title |
---|
NHI-THAO TRAN ET AL: "Effective Attention-based Neural Architectures for Sentence Compression with Bidirectional Long Short-Term Memory", 《PROCEEDINGS OF THE SEVENTH SYMPOSIUM ON INFORMATION AND COMMUNICATION TECHNOLOGY》 * |
SHUMING MA ET AL: "Improving Semantic Relevance for Sequence-to-Sequence Learning of Chinese Social Media Text Summarization", 《ACL》 * |
Cited By (48)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109800390A (en) * | 2018-12-21 | 2019-05-24 | 北京石油化工学院 | A kind of calculation method and device of individualized emotion abstract |
CN109620205A (en) * | 2018-12-26 | 2019-04-16 | 上海联影智能医疗科技有限公司 | Electrocardiogram (ECG) data classification method, device, computer equipment and storage medium |
CN111460109A (en) * | 2019-01-22 | 2020-07-28 | 阿里巴巴集团控股有限公司 | Abstract and dialogue abstract generation method and device |
CN111460109B (en) * | 2019-01-22 | 2023-12-26 | 阿里巴巴集团控股有限公司 | Method and device for generating abstract and dialogue abstract |
CN109829161B (en) * | 2019-01-30 | 2023-08-04 | 延边大学 | Method for automatically abstracting multiple languages |
CN109829161A (en) * | 2019-01-30 | 2019-05-31 | 延边大学 | A kind of method of multilingual autoabstract |
CN109885673A (en) * | 2019-02-13 | 2019-06-14 | 北京航空航天大学 | A kind of Method for Automatic Text Summarization based on pre-training language model |
CN109947931B (en) * | 2019-03-20 | 2021-05-14 | 华南理工大学 | Method, system, device and medium for automatically abstracting text based on unsupervised learning |
CN109947931A (en) * | 2019-03-20 | 2019-06-28 | 华南理工大学 | Text automatic abstracting method, system, equipment and medium based on unsupervised learning |
CN110119444B (en) * | 2019-04-23 | 2023-06-30 | 中电科大数据研究院有限公司 | Drawing type and generating type combined document abstract generating model |
CN110119444A (en) * | 2019-04-23 | 2019-08-13 | 中电科大数据研究院有限公司 | A kind of official document summarization generation model that extraction-type is combined with production |
CN110134782A (en) * | 2019-05-14 | 2019-08-16 | 南京大学 | A kind of text snippet model and Method for Automatic Text Summarization based on improved selection mechanism and LSTM variant |
CN110134782B (en) * | 2019-05-14 | 2021-05-18 | 南京大学 | Text summarization model based on improved selection mechanism and LSTM variant and automatic text summarization method |
CN110209801B (en) * | 2019-05-15 | 2021-05-14 | 华南理工大学 | Text abstract automatic generation method based on self-attention network |
CN110209801A (en) * | 2019-05-15 | 2019-09-06 | 华南理工大学 | A kind of text snippet automatic generation method based on from attention network |
CN110222840A (en) * | 2019-05-17 | 2019-09-10 | 中山大学 | A kind of cluster resource prediction technique and device based on attention mechanism |
CN110209802B (en) * | 2019-06-05 | 2021-12-28 | 北京金山数字娱乐科技有限公司 | Method and device for extracting abstract text |
CN110209802A (en) * | 2019-06-05 | 2019-09-06 | 北京金山数字娱乐科技有限公司 | A kind of method and device for extracting summary texts |
CN110334362B (en) * | 2019-07-12 | 2023-04-07 | 北京百奥知信息科技有限公司 | Method for solving and generating untranslated words based on medical neural machine translation |
CN110334362A (en) * | 2019-07-12 | 2019-10-15 | 北京百奥知信息科技有限公司 | A method of the solution based on medical nerve machine translation generates untranslated word |
CN110390103B (en) * | 2019-07-23 | 2022-12-27 | 中国民航大学 | Automatic short text summarization method and system based on double encoders |
CN110390103A (en) * | 2019-07-23 | 2019-10-29 | 中国民航大学 | Short text auto-abstracting method and system based on Dual-encoder |
CN110688479A (en) * | 2019-08-19 | 2020-01-14 | 中国科学院信息工程研究所 | Evaluation method and sequencing network for generating abstract |
CN110688479B (en) * | 2019-08-19 | 2022-06-17 | 中国科学院信息工程研究所 | Evaluation method and sequencing network for generating abstract |
CN110532554B (en) * | 2019-08-26 | 2023-05-05 | 南京信息职业技术学院 | Chinese abstract generation method, system and storage medium |
CN110532554A (en) * | 2019-08-26 | 2019-12-03 | 南京信息职业技术学院 | A kind of Chinese abstraction generating method, system and storage medium |
CN110765264A (en) * | 2019-10-16 | 2020-02-07 | 北京工业大学 | Text abstract generation method for enhancing semantic relevance |
CN110795556B (en) * | 2019-11-01 | 2023-04-18 | 中山大学 | Abstract generation method based on fine-grained plug-in decoding |
CN110795556A (en) * | 2019-11-01 | 2020-02-14 | 中山大学 | Abstract generation method based on fine-grained plug-in decoding |
CN111078866B (en) * | 2019-12-30 | 2023-04-28 | 华南理工大学 | Chinese text abstract generation method based on sequence-to-sequence model |
CN111078866A (en) * | 2019-12-30 | 2020-04-28 | 华南理工大学 | Chinese text abstract generation method based on sequence-to-sequence model |
CN111339763A (en) * | 2020-02-26 | 2020-06-26 | 四川大学 | English mail subject generation method based on multi-level neural network |
CN111339763B (en) * | 2020-02-26 | 2022-06-28 | 四川大学 | English mail subject generation method based on multi-level neural network |
CN111414505B (en) * | 2020-03-11 | 2023-10-20 | 上海爱数信息技术股份有限公司 | Quick image abstract generation method based on sequence generation model |
CN111414505A (en) * | 2020-03-11 | 2020-07-14 | 上海爱数信息技术股份有限公司 | Rapid image abstract generation method based on sequence generation model |
CN111563160B (en) * | 2020-04-15 | 2023-03-31 | 华南理工大学 | Text automatic summarization method, device, medium and equipment based on global semantics |
CN111563160A (en) * | 2020-04-15 | 2020-08-21 | 华南理工大学 | Text automatic summarization method, device, medium and equipment based on global semantics |
CN111708877B (en) * | 2020-04-20 | 2023-05-09 | 中山大学 | Text abstract generation method based on key information selection and variational potential variable modeling |
CN111708877A (en) * | 2020-04-20 | 2020-09-25 | 中山大学 | Text abstract generation method based on key information selection and variation latent variable modeling |
CN111639174A (en) * | 2020-05-15 | 2020-09-08 | 民生科技有限责任公司 | Text abstract generation system, method and device and computer readable storage medium |
CN111639174B (en) * | 2020-05-15 | 2023-12-22 | 民生科技有限责任公司 | Text abstract generation system, method, device and computer readable storage medium |
CN111797196B (en) * | 2020-06-01 | 2021-11-02 | 武汉大学 | Service discovery method combining attention mechanism LSTM and neural topic model |
CN111797196A (en) * | 2020-06-01 | 2020-10-20 | 武汉大学 | Service discovery method combining attention mechanism LSTM and neural topic model |
CN112364157A (en) * | 2020-11-02 | 2021-02-12 | 北京中科凡语科技有限公司 | Multi-language automatic abstract generation method, device, equipment and storage medium |
CN113157855A (en) * | 2021-02-22 | 2021-07-23 | 福州大学 | Text summarization method and system fusing semantic and context information |
CN113221577A (en) * | 2021-04-28 | 2021-08-06 | 西安交通大学 | Education text knowledge induction method, system, equipment and readable storage medium |
CN113407711B (en) * | 2021-06-17 | 2023-04-07 | 成都崇瑚信息技术有限公司 | Gibbs limited text abstract generation method by using pre-training model |
CN113407711A (en) * | 2021-06-17 | 2021-09-17 | 成都崇瑚信息技术有限公司 | Gibbs limited text abstract generation method by using pre-training model |
Also Published As
Publication number | Publication date |
---|---|
CN108804495B (en) | 2021-10-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108804495A (en) | A kind of Method for Automatic Text Summarization semantic based on enhancing | |
CN111897949B (en) | Guided text abstract generation method based on Transformer | |
CN109508462B (en) | Neural network Mongolian Chinese machine translation method based on encoder-decoder | |
CN107291836B (en) | Chinese text abstract obtaining method based on semantic relevancy model | |
CN110210016B (en) | Method and system for detecting false news of bilinear neural network based on style guidance | |
CN110147451B (en) | Dialogue command understanding method based on knowledge graph | |
CN110795556A (en) | Abstract generation method based on fine-grained plug-in decoding | |
CN110119444B (en) | Drawing type and generating type combined document abstract generating model | |
CN111061861B (en) | Text abstract automatic generation method based on XLNet | |
CN111858932A (en) | Multiple-feature Chinese and English emotion classification method and system based on Transformer | |
CN107357899B (en) | Short text sentiment analysis method based on sum-product network depth automatic encoder | |
CN110807324A (en) | Video entity identification method based on IDCNN-crf and knowledge graph | |
CN111178053B (en) | Text generation method for generating abstract extraction by combining semantics and text structure | |
CN111209749A (en) | Method for applying deep learning to Chinese word segmentation | |
CN109325109A (en) | Attention encoder-based extraction type news abstract generating device | |
CN111814477B (en) | Dispute focus discovery method and device based on dispute focus entity and terminal | |
CN108710672B (en) | Theme crawler method based on incremental Bayesian algorithm | |
CN113111663A (en) | Abstract generation method fusing key information | |
CN111984782A (en) | Method and system for generating text abstract of Tibetan language | |
CN110992943B (en) | Semantic understanding method and system based on word confusion network | |
CN114298055B (en) | Retrieval method and device based on multilevel semantic matching, computer equipment and storage medium | |
CN114972848A (en) | Image semantic understanding and text generation based on fine-grained visual information control network | |
CN111008277B (en) | Automatic text summarization method | |
CN116483991A (en) | Dialogue abstract generation method and system | |
CN114548090B (en) | Fast relation extraction method based on convolutional neural network and improved cascade labeling |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 20211022 |