CN108804495B - Automatic text summarization method based on enhanced semantics - Google Patents

Automatic text summarization method based on enhanced semantics

Info

Publication number
CN108804495B
Authority
CN
China
Prior art keywords
text
hidden layer
abstract
sequence
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201810281684.5A
Other languages
Chinese (zh)
Other versions
CN108804495A (en)
Inventor
史景伦
洪冬梅
宁培阳
王桂鸿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT filed Critical South China University of Technology SCUT
Priority to CN201810281684.5A priority Critical patent/CN108804495B/en
Publication of CN108804495A publication Critical patent/CN108804495A/en
Application granted granted Critical
Publication of CN108804495B publication Critical patent/CN108804495B/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an automatic text summarization method based on enhanced semantics, which comprises the following steps: preprocessing a text, sorting words from high to low according to word frequency, and converting the words into ids; encoding the input sequence with a single-layer bidirectional LSTM to extract text information features; decoding the text semantic vector obtained by encoding with a single-layer unidirectional LSTM to obtain hidden layer states; calculating a context vector that extracts the information in the input sequence most useful for the current output; and, in the training stage, fusing the semantic similarity between the generated abstract and the source text into the loss calculation, so that the semantic similarity between the abstract and the source text is improved. The invention uses an LSTM deep learning model to represent the text, integrates the semantic relations of the context, enhances the semantic relation between the abstract and the source text, generates abstracts that better fit the theme of the text, and has broad application prospects.

Description

Automatic text summarization method based on enhanced semantics
Technical Field
The invention relates to the technical field of natural language processing, in particular to an automatic text summarization method based on enhanced semantics.
Background
With the rapid development of science and technology and the internet, the era of big data has arrived, and the amount of information on the network grows day by day. In particular, the explosive growth of text information such as news, blogs, chats, reports and microblogs creates a heavy information burden, and people spend a great deal of time browsing and reading it. How to quickly extract key content from large amounts of text and relieve information overload has therefore become an urgent need, and automatic text summarization technology has emerged to meet it.
Automatic text summarization techniques are classified into extractive summarization and abstractive (generative) summarization according to how the summary is produced. The former ranks the sentences of the original text by some method and takes the n most important sentences as the summary; the latter mines deeper semantic information to describe and summarize the central idea of the original text. Extractive summarization has been studied extensively, but it relies only on surface-level lexical information, whereas abstractive summarization is closer to the process by which humans actually write summaries.
In recent years, deep learning has achieved strong results in many fields and has also been introduced to automatic summarization. Abstractive summarization can be realized with the sequence-to-sequence (seq2seq) model, and drawing on its successful application to machine translation, seq2seq-based automatic summarization has become a research hotspot in natural language processing, although problems with coherence and readability remain. Traditional extractive summarization generally causes considerable information loss, especially for long texts, so in-depth research on abstractive automatic summarization is of great significance for truly solving the problem of information overload.
Disclosure of Invention
The invention aims to solve the defects in the prior art and provides an automatic text summarization method based on enhanced semantics, which is based on a seq2seq model, introduces an attention mechanism and trains by using semantic similarity between a generated summary and a source text, thereby improving semantic relevancy between the generated summary and the source text and improving summary quality.
The purpose of the invention can be achieved by adopting the following technical scheme:
an automatic text summarization method based on enhanced semantics, the automatic text summarization method comprising:
a text preprocessing step, namely performing word segmentation, morphological restoration and reference resolution on the text, sorting the words from high to low according to word frequency, and converting the words into ids;
coding, namely coding an input sequence, and obtaining a hidden layer state vector carrying text sequence information through a neural network;
a decoding step, namely initializing the decoder with the last hidden layer state obtained by the encoder and decoding to obtain the hidden layer state s_t of each step;
an attention distribution calculation step, namely combining the hidden layer states of the input sequence with the hidden layer state s_t obtained by decoding at the current moment to calculate the context vector u_t at the current time t;
and an abstract generation step, namely mapping the output obtained in the decoding step through two linear layers into a vector whose dimension equals the size of the word list, where each dimension represents the probability of the corresponding word in the word list, and selecting candidate words with a selection strategy to generate the abstract.
Further, the data of the text in the text preprocessing step is a corpus crawled by a crawler or an open-source corpus, and consists of article-abstract pairs.
Further, in the step of preprocessing the text, the first 200k words are taken as the basic word list, the special marks [PAD], [UNK], [START] and [STOP] are added to the word list, and the words of the text are converted into ids, so that each text corresponds to an id sequence.
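For illustration, this preprocessing can be sketched in Python as follows; the function names and the assumption that the input texts are already tokenized are illustrative and not part of the patented method.

```python
from collections import Counter

PAD, UNK, START, STOP = "[PAD]", "[UNK]", "[START]", "[STOP]"

def build_vocab(tokenized_texts, max_size=200_000):
    """Count word frequencies and keep the max_size most frequent words,
    preceded by the four special marks."""
    counter = Counter(w for text in tokenized_texts for w in text)
    words = [w for w, _ in counter.most_common(max_size)]
    vocab = [PAD, UNK, START, STOP] + words
    return {w: i for i, w in enumerate(vocab)}

def text_to_ids(tokens, word2id):
    """Convert one tokenized text into its id sequence, mapping unknown words to [UNK]."""
    unk = word2id[UNK]
    return [word2id.get(w, unk) for w in tokens]
```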
Further, the input sequence is a word vector corresponding to an id sequence obtained by converting the text, the dimension of the word vector is 128, and the maximum length of the sequence is 700.
Further, the neural network is a single-layer bidirectional LSTM, the number of hidden layer units is 256, and the forward and reverse hidden layer states h are concatenated to obtain the final hidden layer state.
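A minimal sketch of such an encoder is given below, assuming the Keras layers of TensorFlow (the embodiment only names TensorFlow as the platform, so the concrete API is an assumption); it uses the stated word-vector dimension of 128 and 256 hidden units per direction, and concatenates the forward and backward final states.

```python
import tensorflow as tf

VOCAB_SIZE = 200_000 + 4   # basic word list plus the four special marks
EMB_DIM = 128              # word-vector dimension from the description
HIDDEN = 256               # hidden-layer units per direction

embedding = tf.keras.layers.Embedding(VOCAB_SIZE, EMB_DIM)
encoder = tf.keras.layers.Bidirectional(
    tf.keras.layers.LSTM(HIDDEN, return_sequences=True, return_state=True))

def encode(input_ids):
    """Encode an id sequence and concatenate the forward and backward final states."""
    emb = embedding(input_ids)                      # (batch, N, 128)
    outputs, fw_h, fw_c, bw_h, bw_c = encoder(emb)  # outputs: (batch, N, 512)
    final_h = tf.concat([fw_h, bw_h], axis=-1)      # (batch, 512)
    final_c = tf.concat([fw_c, bw_c], axis=-1)
    return outputs, final_h, final_c
```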
Further, the decoding step process is as follows:
receiving an input word vector and the hidden layer state of the previous moment, and obtaining the hidden layer state s_t of the current moment through a single-layer unidirectional LSTM neural network; the number of hidden units is 256.
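One decoding step can be sketched as a single recurrent cell call; the use of a Keras LSTMCell below is an assumption for illustration.

```python
import tensorflow as tf

HIDDEN = 256
decoder_cell = tf.keras.layers.LSTMCell(HIDDEN)

def decode_step(word_vec, prev_state):
    """One decoder step: previous word vector + previous state [h, c] -> s_t."""
    output, state = decoder_cell(word_vec, prev_state)  # state = [h_t, c_t]
    s_t = state[0]
    return s_t, state
```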
Further, the context vector u_t is calculated as follows:
e_t,i = v^T tanh(W_h h_i + W_s s_t + b_att)
a_t = softmax(e_t)
u_t = Σ_{i=1}^{N} a_t,i h_i
wherein v, W_h, W_s and b_att are parameters to be learned, h_i is the hidden layer state of the encoder, and N is the length of the input sequence.
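A NumPy sketch of this attention calculation follows; the parameter shapes and the fact that the encoder states, decoder state and weights are passed in explicitly are illustrative assumptions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention(h, s_t, W_h, W_s, v, b_att):
    """h: (N, enc_dim) encoder hidden states; s_t: decoder state at step t.
    Returns the attention distribution a_t and the context vector u_t."""
    e_t = np.array([v @ np.tanh(W_h @ h_i + W_s @ s_t + b_att) for h_i in h])  # (N,)
    a_t = softmax(e_t)                    # attention distribution over the input
    u_t = (a_t[:, None] * h).sum(axis=0)  # weighted sum of encoder states
    return a_t, u_t
```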
Furthermore, the selection strategy means that, in the testing stage, the 4 results with the highest probability are selected at each step using a beam search algorithm until the summary sequence with the highest overall probability is finally obtained, while in the training stage only the word with the highest probability is selected at each step, and the generated summary is compared and evaluated against the reference summary once it is complete.
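The selection strategy can be illustrated with the simplified beam-search sketch below; decode_step is a hypothetical function returning log-probabilities over the word list for a partial summary, and the beam width of 4 and maximum length of 100 follow the description.

```python
import numpy as np

BEAM_WIDTH = 4
MAX_LEN = 100

def beam_search(decode_step, start_id, stop_id):
    """Keep the 4 highest-probability partial summaries at every step.
    decode_step(seq) -> log-probabilities over the word list (hypothetical)."""
    beams = [([start_id], 0.0)]                 # (sequence, accumulated log-prob)
    for _ in range(MAX_LEN):
        candidates = []
        for seq, score in beams:
            if seq[-1] == stop_id:              # finished beams are carried over
                candidates.append((seq, score))
                continue
            log_p = decode_step(seq)
            for w in np.argsort(log_p)[-BEAM_WIDTH:]:
                candidates.append((seq + [int(w)], score + float(log_p[w])))
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:BEAM_WIDTH]
        if all(seq[-1] == stop_id for seq, _ in beams):
            break
    return beams[0][0]                          # highest-probability summary sequence
```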
Further, in the abstract generation step, only one word is generated at each step and the maximum length of the generated abstract is 100; that is, the loop from the decoding step to the abstract generation step runs at most 100 times, and generation stops when the end flag is output or the maximum length is reached. The probability distribution over the word list is calculated as follows:
p_v = softmax(V1(V2[s_t, u_t] + b2) + b1)
wherein V1, V2, b1 and b2 are parameters to be learned, and p_v provides the basis for predicting the next word.
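The two linear layers and the softmax can be sketched as follows; the parameter shapes are illustrative.

```python
import numpy as np

def vocab_distribution(s_t, u_t, V1, V2, b1, b2):
    """Map the decoder state and context vector to a distribution over the word list:
    p_v = softmax(V1 (V2 [s_t, u_t] + b2) + b1)."""
    x = np.concatenate([s_t, u_t])
    hidden = V2 @ x + b2            # first linear layer
    logits = V1 @ hidden + b1       # second linear layer, vocabulary-sized
    e = np.exp(logits - logits.max())
    return e / e.sum()
```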
Further, the abstract generation step further includes: performing a semantic similarity (Rel) calculation between the finally obtained predicted abstract and the source text sequence, and penalizing abstracts with low semantic relevance during training. The calculation is as follows (the two equations, and the symbols for the forward and backward hidden layer states, appear as images in the original and are not reproduced here): G_t is the encoder hidden layer state, λ is an adjustable factor, M is the length of the generated abstract sequence, and loss_t is the loss of each step, which is combined with the semantic similarity Rel to form the total loss.
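Because the Rel and total-loss equations appear only as images in the source text, the sketch below is an assumption about their likely form: the source text and the generated abstract are each pooled into a single vector, Rel is their cosine similarity, and the averaged per-step loss is reduced by λ·Rel so that abstracts with low semantic relevance are penalized.

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def total_loss(step_losses, encoder_states, summary_states, lam=1.0):
    """Assumed form of the enhanced-semantics objective: average per-step loss
    minus lambda times the cosine similarity (Rel) between pooled source and
    summary representations (both assumed to have the same dimensionality)."""
    source_vec = encoder_states.mean(axis=0)   # pooled source representation (assumption)
    summary_vec = summary_states.mean(axis=0)  # pooled summary representation (assumption)
    rel = cosine(source_vec, summary_vec)
    m = len(step_losses)
    return sum(step_losses) / m - lam * rel, rel
```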
Compared with the prior art, the invention has the following advantages and effects:
the invention constructs an automatic text abstract model based on an LSTM based on a seq2seq model, introduces an attention mechanism to obtain a context vector at each moment when a decoder is used, introduces semantic similarity to enhance the semantic relevance between a generated abstract and a source text, fuses the similarity into a loss function during training, avoids model bias and improves the quality of the abstract.
Drawings
FIG. 1 is a flow chart of the steps of the enhanced semantic based automatic text summarization method of the present invention;
FIG. 2 is a diagram of a semantic similarity calculation structure in the present invention;
FIG. 3 is a flowchart of the per-step algorithm for generating abstract words during decoding according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Examples
As shown in FIG. 1, the automatic text summarization method based on enhanced semantics comprises the steps of text preprocessing, encoding, decoding, attention calculation and abstract generation, wherein:
In the text preprocessing step, the text data may be a corpus crawled by a crawler or an open-source corpus, for example CNN/Daily Mail, composed of article-abstract pairs, in which each article has 780 words on average and each abstract has 56 words on average. The source text is segmented into words, morphologically restored and reference-resolved, the first 200k words ranked by word frequency are taken as the basic word list, the special marks [PAD], [UNK], [START] and [STOP] are added to the word list, and the words of each text are converted into ids, so that each text corresponds to an id sequence; the abstracts are processed in the same way. The training set contains 287226 samples, the validation set 13368 samples, and the test set 11490 samples.
In the encoding step, word embedding is performed on the input sequence to obtain 128-dimensional word vectors, and a text representation vector carrying the text sequence information is obtained through a neural network.
The input sequence is an id sequence obtained by converting an article, the maximum length is 700, and the minimum length is 30.
The neural network in the encoding step is composed of a single-layer bidirectional LSTM, the number of hidden layer units is 256, and the forward and reverse hidden layer states h are concatenated to obtain the final hidden layer state.
In the decoding step, the input word vector and the hidden layer state of the previous moment are received and passed through a single-layer unidirectional LSTM neural network to obtain the hidden layer state s_t of the current moment; the number of hidden units is 256.
In the attention calculation step, the decoding state s_t obtained at the current moment is combined with the hidden layer states of the input sequence from the encoding step to obtain the context vector u_t at the current moment.
The context vector at time t is calculated as follows:
e_t,i = v^T tanh(W_h h_i + W_s s_t + b_att)
a_t = softmax(e_t)
u_t = Σ_{i=1}^{N} a_t,i h_i
wherein v, W_h, W_s and b_att are parameters to be learned, h_i is the hidden layer state of the encoder, and N is the length of the input sequence.
In the abstract generation step, the output obtained in the decoding step is mapped through two linear layers into a vector whose dimension equals the size of the word list, where each dimension represents the probability of the corresponding word in the word list, and candidate words are selected with a selection strategy.
The selection strategy means that, in the testing stage, the 4 results with the highest probability are selected at each step by the beam search algorithm until the summary sequence with the highest overall probability is finally obtained; the training stage takes only the word with the highest probability at each step, and the summary is compared and evaluated against the reference summary once it is completely generated.
The maximum length of the generated abstract is 100, and the probability over the word list is calculated as follows:
p_v = softmax(V1(V2[s_t, u_t] + b2) + b1)
wherein V1, V2, b1 and b2 are parameters to be learned, and p_v provides the basis for predicting the next word.
The abstract generation step also includes performing a semantic similarity (Rel) calculation between the finally obtained predicted abstract and the source text sequence, and penalizing abstracts with low semantic relevance during training. The calculation is as follows (the two equations, and the symbols for the forward and backward hidden layer states, appear as images in the original and are not reproduced here): G_t is the encoder hidden layer state, λ is an adjustable factor that defaults to 1, M is the length of the generated summary sequence, and loss_t is the loss of each step, which is combined with the similarity to make up the total loss.
In the training process, a back-propagation algorithm is adopted with an Adagrad optimizer; the learning rate is 0.15 and the initial accumulator value is 0.1.
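A one-line sketch of this optimizer configuration, assuming the Keras Adagrad optimizer as an equivalent of the one used in the embodiment:

```python
import tensorflow as tf

# Adagrad with learning rate 0.15 and initial accumulator value 0.1, as described.
optimizer = tf.keras.optimizers.Adagrad(learning_rate=0.15,
                                        initial_accumulator_value=0.1)
```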
The decoding step differs between the training stage and the testing stage: the training stage takes the reference abstract as input, while the testing stage takes the output of the previous moment as the input of the current moment.
The reference abstracts and predicted abstracts are evaluated with the ROUGE metrics. The experiments are run on a GPU under a Linux operating system; the programming language used is Python and the platform is TensorFlow. The model with semantic similarity introduced runs for about 4 days, with about 380000 iterations, and the experimental results are shown in the following table.
TABLE 1 comparison of the results of the three models
Experimental model ROUGE-1 ROUGE-2 ROUGE-L
Basic LSTM model 0.2896 0.1028 0.2613
LSTM+Attention 0.3116 0.1127 0.2920
LSTM+Attention+Rel 0.3493 0.1390 0.3342
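The ROUGE-1/2/L scores in the table above can be computed, for example, with the open-source rouge-score package; the patent does not name the evaluation tool, so this choice is an assumption.

```python
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)

def evaluate(reference_summary, predicted_summary):
    """Return ROUGE-1/2/L F1 scores for one reference/prediction pair."""
    scores = scorer.score(reference_summary, predicted_summary)
    return {name: s.fmeasure for name, s in scores.items()}
```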
By fusing the attention mechanism, the method fully exploits the ability of the seq2seq model to mine deep semantic information from text, so that decoding can focus on the information in the input sequence that is most useful for the current output when generating the abstract; by fusing semantic similarity into the loss calculation, the model also attends to its semantic similarity with the source text while generating the abstract, producing sentences that better match the semantics of the original text. Compared with traditional statistics-based automatic summarization methods, the deep-learning-based model has stronger representation ability and a clear advantage on the task of automatic text summarization.
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.

Claims (7)

1. An automatic text summarization method based on enhanced semantics, characterized in that the automatic text summarization method comprises:
a text preprocessing step, namely performing word segmentation, morphological restoration and reference resolution on the text, sorting words from high to low according to word frequency, and converting the words into an id sequence;
coding, namely coding an input sequence, and obtaining a hidden layer state vector carrying text sequence information through a neural network;
a decoding step, namely initializing the decoder with the last hidden layer state obtained by the encoder and decoding to obtain the hidden layer state s_t of each step;
an attention distribution calculation step, namely combining the hidden layer states of the input sequence with the hidden layer state s_t obtained by decoding at the current moment to calculate the context vector u_t at the current time t;
a summary generation step, namely mapping the output obtained in the decoding step through two linear layers into a vector whose dimension equals the size of the word list, where each dimension represents the probability of a word in the word list, selecting candidate words with a selection strategy, and generating the summary; the selection strategy means that, in the testing stage, the 4 results with the highest probability are selected at each step using a beam search algorithm until the summary sequence with the highest overall probability is finally obtained, while in the training stage only the word with the highest probability is selected, and the summary is compared and evaluated against a reference summary after being completely generated;
the abstract generation step further comprises: performing a semantic similarity (Rel) calculation between the finally obtained predicted abstract and the source text sequence, and penalizing abstracts with low semantic relevance during training, the calculation being as follows (the two equations, and the symbols for the forward and backward hidden layer states, appear as images in the original and are not reproduced here): G_t is the encoder hidden layer state, λ is an adjustable factor, M is the length of the generated abstract sequence, and loss_t is the loss of each step, which is combined with the semantic similarity Rel to form the total loss;
in the abstract generation step, only one word is generated at each step and the maximum length of the generated abstract is 100, that is, the loop from the decoding step to the abstract generation step runs at most 100 times, and generation stops when the end mark is output or the maximum length is reached; the probability is calculated as follows:
p_v = softmax(V1(V2[s_t, u_t] + b2) + b1)
wherein V1, V2, b1 and b2 are parameters to be learned, and p_v provides the basis for predicting the next word.
2. The method for automatically abstracting text based on enhanced semantics as claimed in claim 1, wherein the data of the text in the text preprocessing step is a corpus crawled by a crawler or an open-source corpus, and is composed of article-abstract pairs.
3. The method for automatically summarizing text based on enhanced semantics of claim 1, wherein in the text preprocessing step, the top 200k words are taken as the basic word list, the special marks [PAD], [UNK], [START] and [STOP] are added into the word list, and the words of the text are converted into ids, each text corresponding to an id sequence.
4. The method for automatically abstracting text based on enhanced semantics of claim 1, wherein the input sequence is a word vector corresponding to an id sequence obtained by converting a text, the dimension of the word vector is 128, and the maximum length of the sequence is 700.
5. The method according to claim 1, wherein the neural network is a single-layer bi-directional LSTM, the number of hidden layer units is 256, and forward and reverse hidden layer states h are connected to obtain a final hidden layer state.
6. The method for automatic text summarization based on enhanced semantics of claim 1 wherein the decoding step is performed as follows:
receiving an input word vector and the hidden layer state of the previous moment, and obtaining the hidden layer state s_t of the current moment through a single-layer unidirectional LSTM neural network; the number of hidden units is 256.
7. The method for automatic text summarization based on enhanced semantics according to claim 1, wherein the context vector u_t is calculated as follows:
e_t,i = v^T tanh(W_h h_i + W_s s_t + b_att)
a_t = softmax(e_t)
u_t = Σ_{i=1}^{N} a_t,i h_i
wherein v, W_h, W_s and b_att are parameters to be learned, h_i is the hidden layer state of the encoder, and N is the length of the input sequence.
CN201810281684.5A 2018-04-02 2018-04-02 Automatic text summarization method based on enhanced semantics Expired - Fee Related CN108804495B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810281684.5A CN108804495B (en) 2018-04-02 2018-04-02 Automatic text summarization method based on enhanced semantics

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810281684.5A CN108804495B (en) 2018-04-02 2018-04-02 Automatic text summarization method based on enhanced semantics

Publications (2)

Publication Number Publication Date
CN108804495A CN108804495A (en) 2018-11-13
CN108804495B true CN108804495B (en) 2021-10-22

Family

ID=64095279

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810281684.5A Expired - Fee Related CN108804495B (en) 2018-04-02 2018-04-02 Automatic text summarization method based on enhanced semantics

Country Status (1)

Country Link
CN (1) CN108804495B (en)

Families Citing this family (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109800390B (en) * 2018-12-21 2023-08-18 北京石油化工学院 Method and device for calculating personalized emotion abstract
CN109620205B (en) * 2018-12-26 2022-10-28 上海联影智能医疗科技有限公司 Electrocardiogram data classification method and device, computer equipment and storage medium
CN111460109B (en) * 2019-01-22 2023-12-26 阿里巴巴集团控股有限公司 Method and device for generating abstract and dialogue abstract
CN109829161B (en) * 2019-01-30 2023-08-04 延边大学 Method for automatically abstracting multiple languages
CN109885673A (en) * 2019-02-13 2019-06-14 北京航空航天大学 A kind of Method for Automatic Text Summarization based on pre-training language model
CN109947931B (en) * 2019-03-20 2021-05-14 华南理工大学 Method, system, device and medium for automatically abstracting text based on unsupervised learning
CN110119444B (en) * 2019-04-23 2023-06-30 中电科大数据研究院有限公司 Drawing type and generating type combined document abstract generating model
CN110134782B (en) * 2019-05-14 2021-05-18 南京大学 Text summarization model based on improved selection mechanism and LSTM variant and automatic text summarization method
CN110209801B (en) * 2019-05-15 2021-05-14 华南理工大学 Text abstract automatic generation method based on self-attention network
CN110222840B (en) * 2019-05-17 2023-05-05 中山大学 Cluster resource prediction method and device based on attention mechanism
CN110209802B (en) * 2019-06-05 2021-12-28 北京金山数字娱乐科技有限公司 Method and device for extracting abstract text
CN110334362B (en) * 2019-07-12 2023-04-07 北京百奥知信息科技有限公司 Method for solving and generating untranslated words based on medical neural machine translation
CN110390103B (en) * 2019-07-23 2022-12-27 中国民航大学 Automatic short text summarization method and system based on double encoders
CN110688479B (en) * 2019-08-19 2022-06-17 中国科学院信息工程研究所 Evaluation method and sequencing network for generating abstract
CN110532554B (en) * 2019-08-26 2023-05-05 南京信息职业技术学院 Chinese abstract generation method, system and storage medium
CN110765264A (en) * 2019-10-16 2020-02-07 北京工业大学 Text abstract generation method for enhancing semantic relevance
CN110795556B (en) * 2019-11-01 2023-04-18 中山大学 Abstract generation method based on fine-grained plug-in decoding
CN111078866B (en) * 2019-12-30 2023-04-28 华南理工大学 Chinese text abstract generation method based on sequence-to-sequence model
CN111339763B (en) * 2020-02-26 2022-06-28 四川大学 English mail subject generation method based on multi-level neural network
CN111414505B (en) * 2020-03-11 2023-10-20 上海爱数信息技术股份有限公司 Quick image abstract generation method based on sequence generation model
CN111563160B (en) * 2020-04-15 2023-03-31 华南理工大学 Text automatic summarization method, device, medium and equipment based on global semantics
CN111708877B (en) * 2020-04-20 2023-05-09 中山大学 Text abstract generation method based on key information selection and variational potential variable modeling
CN111639174B (en) * 2020-05-15 2023-12-22 民生科技有限责任公司 Text abstract generation system, method, device and computer readable storage medium
CN111797196B (en) * 2020-06-01 2021-11-02 武汉大学 Service discovery method combining attention mechanism LSTM and neural topic model
CN112364157A (en) * 2020-11-02 2021-02-12 北京中科凡语科技有限公司 Multi-language automatic abstract generation method, device, equipment and storage medium
CN113157855B (en) * 2021-02-22 2023-02-21 福州大学 Text summarization method and system fusing semantic and context information
CN113221577A (en) * 2021-04-28 2021-08-06 西安交通大学 Education text knowledge induction method, system, equipment and readable storage medium
CN113111663B (en) * 2021-04-28 2024-09-06 东南大学 Abstract generation method for fusing key information
CN113407711B (en) * 2021-06-17 2023-04-07 成都崇瑚信息技术有限公司 Gibbs limited text abstract generation method by using pre-training model

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107291699A (en) * 2017-07-04 2017-10-24 湖南星汉数智科技有限公司 A kind of sentence semantic similarity computational methods

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107484017B (en) * 2017-07-25 2020-05-26 天津大学 Supervised video abstract generation method based on attention model
CN107844469B (en) * 2017-10-26 2020-06-26 北京大学 Text simplification method based on word vector query model
CN107832300A (en) * 2017-11-17 2018-03-23 合肥工业大学 Towards minimally invasive medical field text snippet generation method and device

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107291699A (en) * 2017-07-04 2017-10-24 湖南星汉数智科技有限公司 A kind of sentence semantic similarity computational methods

Also Published As

Publication number Publication date
CN108804495A (en) 2018-11-13

Similar Documents

Publication Publication Date Title
CN108804495B (en) Automatic text summarization method based on enhanced semantics
CN110119765B (en) Keyword extraction method based on Seq2Seq framework
CN111897949B (en) Guided text abstract generation method based on Transformer
CN111061862B (en) Method for generating abstract based on attention mechanism
Ji et al. Representation learning for text-level discourse parsing
CN110348016A (en) Text snippet generation method based on sentence association attention mechanism
JP5128629B2 (en) Part-of-speech tagging system, part-of-speech tagging model training apparatus and method
CN110020438A (en) Enterprise or tissue Chinese entity disambiguation method and device based on recognition sequence
CN112215013B (en) Clone code semantic detection method based on deep learning
CN111241816A (en) Automatic news headline generation method
CN112183058B (en) Poetry generation method and device based on BERT sentence vector input
CN111061861A (en) XLNET-based automatic text abstract generation method
CN112732862B (en) Neural network-based bidirectional multi-section reading zero sample entity linking method and device
CN113609284A (en) Method and device for automatically generating text abstract fused with multivariate semantics
CN111984782A (en) Method and system for generating text abstract of Tibetan language
CN114281982B (en) Book propaganda abstract generation method and system adopting multi-mode fusion technology
Tomer et al. STV-BEATS: skip thought vector and bi-encoder based automatic text summarizer
CN117708644A (en) Method and system for generating judicial judge document abstract
Zhang et al. Extractive Document Summarization based on hierarchical GRU
CN116069924A (en) Text abstract generation method and system integrating global and local semantic features
CN109992774A (en) The key phrase recognition methods of word-based attribute attention mechanism
CN114996442A (en) Text abstract generation system combining abstract degree judgment and abstract optimization
CN114358006A (en) Text content abstract generation method based on knowledge graph
CN114357154A (en) Chinese abstract generation method based on double-coding-pointer hybrid network
KR102214754B1 (en) Method and apparatus for generating product evaluation criteria

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20211022