CN114547289A - NLP technology-based Chinese abstract automatic generation method and system

Info

Publication number
CN114547289A
CN114547289A (application number CN202210204288.9A)
Authority
CN
China
Prior art keywords
abstract
text
word
generate
automatically generating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210204288.9A
Other languages
Chinese (zh)
Inventor
王峥
段京华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanxi Jubo Tianhao Technology Co ltd
Original Assignee
Shanxi Jubo Tianhao Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanxi Jubo Tianhao Technology Co ltd filed Critical Shanxi Jubo Tianhao Technology Co ltd
Priority to CN202210204288.9A priority Critical patent/CN114547289A/en
Publication of CN114547289A publication Critical patent/CN114547289A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval of unstructured textual data
    • G06F16/34 Browsing; Visualisation therefor
    • G06F16/345 Summarisation for human users
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3346 Query execution using probabilistic model
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/047 Probabilistic or stochastic networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to the field of automatic abstract generation, and in particular provides a method and system for automatically generating Chinese abstracts based on NLP (natural language processing) technology, comprising the following steps: S1: performing target training on the text for which an abstract is to be generated, maximizing the probability of generating each target word; S2: establishing the evaluation indexes for automatic generation; S3: evaluating the text using the established evaluation indexes; S4: extracting sentences from the text with an abstract generation model to generate the abstract. The invention generates abstracts automatically through natural language processing: given one or more documents, it produces a passage that retains the key information of the input text and is semantically fluent, concise and accurate. Automatic text summarization can produce abstracts quickly, accurately and in real time, overcoming the shortcomings of manual summarization.

Description

NLP technology-based Chinese abstract automatic generation method and system
Technical Field
The invention relates to the field of automatic generation of abstracts, in particular to a Chinese abstract automatic generation method and system based on an NLP technology.
Background
Natural Language Processing (NLP) is the discipline that studies language problems in human-computer interaction. A representative NLP application is the automatic tutoring and answering system; by difficulty of technical implementation, such systems can be divided into three types: simple matching, fuzzy matching, and paragraph understanding. A simple-matching tutoring and answering system matches the questions posed by students against the answer items in an answer library through simple keyword matching, thereby answering questions or providing related tutoring automatically. A fuzzy-matching system adds synonym and antonym matching on top of simple matching: even when the original keywords of a question find no directly matching answer in the answer library, a relevant answer item can still be found if words synonymous or antonymous with those keywords can be matched. A paragraph-understanding system is the most ideal and the only truly intelligent tutoring and answering system (strictly speaking, the simple-matching and fuzzy-matching types can only be called "automatic", not "intelligent", tutoring and answering systems). However, such a system involves paragraph-level understanding of natural language; for Chinese, this requires complex NLP techniques such as automatic word segmentation, part-of-speech analysis, syntactic analysis and semantic analysis, and is therefore difficult to realize. In recent years, automatic text summarization has become an important research direction in artificial intelligence and natural language processing. It aims to extract the key information of an original text and generate a semantically fluent, concise and accurate summary, improving the efficiency with which users browse information. With the development of deep learning, today's automatic text summarization models are mainly built on the sequence-to-sequence framework. However, applying the sequence-to-sequence framework to automatic text summarization still faces many problems, such as difficulty in generating out-of-vocabulary words, inability to effectively model the relationships between words, and lack of modeling of the key-information extraction process.
Disclosure of Invention
The invention mainly aims to provide a Chinese abstract automatic generation method and system based on NLP technology, so as to solve the problems in the related art.
In order to achieve the above object, according to one aspect of the present invention, there is provided a method and system for automatically generating Chinese abstracts based on NLP technology, the method comprising the following steps:
S1: performing target training on the text for which an abstract is to be generated, and maximizing the probability of generating each target word;
S2: establishing the evaluation indexes for automatic generation;
S3: evaluating the text for which the abstract is to be generated using the established evaluation indexes;
S4: extracting sentences from the text with an abstract generation model to generate the abstract;
Further, the target training of the text for which an abstract is to be generated specifically comprises:
L(θ) = Σ_{(x,y)∈D} Σ_{t=1}^{|y|} log p(y_t | y_{<t}, x; θ)
wherein L(θ) is the objective maximizing the probability of generating each target word, D is the training data set, x is the input text, y is the target abstract, and θ denotes the model parameters.
Further, the evaluation index for automatic generation is ROUGE-N, ROUGE-L, or a combination of the two.
Further, the ROUGE-N index is specifically as follows:
ROUGE-N = ( Σ_{S∈{Ref}} Σ_{gram_n∈S} Count_match(gram_n) ) / ( Σ_{S∈{Ref}} Σ_{gram_n∈S} Count(gram_n) )
where S represents a sentence in the reference abstract, gram_n represents an n-gram, Count(gram_n) represents the number of n-grams in S, and Count_match(gram_n) represents the number of n-grams shared by the model-generated abstract and the reference abstract.
Further, the ROUGE-L index is specifically as follows:
R_lcs = LCS(X, Y) / m
P_lcs = LCS(X, Y) / n
F_lcs = ( (1 + β^2) · R_lcs · P_lcs ) / ( R_lcs + β^2 · P_lcs )
wherein X is the reference abstract, m is its length, Y is the abstract generated by the model, n is its length, LCS(X, Y) is the length of the longest common subsequence of X and Y, and β is a weight balancing recall against precision.
Further, sentence extraction from the text specifically comprises: representing the text content as a set of feature items, extracting a topic from the set according to the feature items, extracting a word from the word distribution corresponding to the extracted topic, and repeating this process until the abstract is generated.
Further, representing the text content as a set of feature items is specifically: Doc(t_1, t_2, …, t_n), where t_k is a feature item; the text is represented as a vector of feature items and their corresponding weights, of the form Doc((t_1, w_1), (t_2, w_2), …, (t_n, w_n)), where w_k is the weight of feature item t_k.
Further, extracting a topic from the set according to the feature items and extracting a word from the word distribution corresponding to the extracted topic specifically comprise:
p(w|d) = Σ_t p(w|t) × p(t|d)
and, for a single extracted topic t,
p(w|d) = p(w|t) × p(t|d)
where t is the extracted topic, w is the extracted word, d is the feature-item set being summarized, and p denotes the corresponding probability.
In another aspect, a system for automatically generating Chinese abstracts based on NLP technology comprises a text input unit, an encoding unit and a decoding unit, wherein the text input unit is used for inputting, through a user terminal, the text for which an abstract is to be generated, the encoding unit is used for encoding that text to obtain a text representation, and the decoding unit is used for decoding the text representation of the input text to generate the abstract.
Further, the coding unit is formed by stacking N identical coding layers, and the coding process of the l-th layer of the coding unit is as follows:
ã_i^l = Self-Attn(h_1^{l-1}, h_2^{l-1}, …, h_n^{l-1})
a_i^l = LayerNorm(h_i^{l-1} + ã_i^l)
h_i^l = LayerNorm(a_i^l + FFN(a_i^l))
wherein h_i^{l-1} denotes the output of layer l-1 of the encoder for the i-th word x_i of the input text x, the output of layer l-1 being the input of layer l; Self-Attn denotes applying the self-attention mechanism to the input, LayerNorm denotes layer normalization, FFN denotes a feed-forward neural network, and ã_i^l and a_i^l are intermediate results of the calculation;
the decoding unit obtains the output word of the current step through a probability distribution P_vocab, where P_vocab = softmax(W_o·S + b_o), W_o and b_o being trainable parameters, S the output of the last layer of the decoder, and softmax the softmax function.
Compared with the prior art, the invention has the following beneficial effects: the invention generates abstracts automatically through natural language processing, producing, from one or more documents, a passage that retains the key information of the input text and is semantically fluent, concise and accurate. Automatic text summarization can produce abstracts quickly, accurately and in real time, overcoming the shortcomings of manual summarization.
Drawings
FIG. 1 is a schematic view of the overall process of the present invention;
FIG. 2 is an overall system block diagram of the present invention;
FIG. 3 is a schematic diagram of a portion of the modules of the present invention.
In the figures: 100, text input unit; 200, encoding unit; 300, decoding unit.
Detailed Description
In order to make the objects, features and advantages of the present invention more obvious and understandable, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the embodiments described below are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the description of the present invention, it is to be understood that the terms "upper", "lower", "top", "bottom", "inner", "outer", and the like, indicate orientations or positional relationships based on those shown in the drawings, and are used only for convenience in describing the present invention and for simplicity in description, and do not indicate or imply that the referenced devices or elements must have a particular orientation, be constructed and operated in a particular orientation, and thus, are not to be construed as limiting the present invention. It should be noted that when one component is referred to as being "connected" to another component, it can be directly connected to the other component or intervening components may also be present.
The technical scheme of the invention is further explained by the specific implementation mode in combination with the attached drawings.
An automatic Chinese abstract generation method based on NLP technology comprises the following steps:
S1: performing target training on the text for which an abstract is to be generated, and maximizing the probability of generating each target word;
S2: establishing the evaluation indexes for automatic generation;
S3: evaluating the text for which the abstract is to be generated using the established evaluation indexes;
S4: extracting sentences from the text with an abstract generation model to generate the abstract;
Further, the target training of the text for which an abstract is to be generated specifically comprises:
L(θ) = Σ_{(x,y)∈D} Σ_{t=1}^{|y|} log p(y_t | y_{<t}, x; θ)
wherein L(θ) is the objective maximizing the probability of generating each target word, D is the training data set, x is the input text, y is the target abstract, and θ denotes the model parameters.
The model of this embodiment is trained in two steps: pre-training and fine-tuning. To better exploit existing pre-trained models under limited hardware conditions, the pre-training step is replaced by the pre-trained MASS model, which is then fine-tuned on the text summarization data set. During fine-tuning, maximum likelihood estimation is used to maximize the conditional probability of generating each target word given the model parameters θ and the input text x, which is equivalent to minimizing the negative log-likelihood between the model-generated words and the target words.
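A minimal sketch of this fine-tuning objective (function and parameter names are illustrative, not taken from the patent): minimizing the cross-entropy below is exactly minimizing the negative log-likelihood -Σ_t log p(y_t | y_<t, x; θ) averaged over the non-padding target positions.

```python
import torch
import torch.nn.functional as F

def summary_nll_loss(logits: torch.Tensor, target_ids: torch.Tensor,
                     pad_id: int = 0) -> torch.Tensor:
    """Negative log-likelihood of the target abstract tokens.

    logits:     (batch, seq_len, vocab_size) decoder outputs
    target_ids: (batch, seq_len) gold abstract token ids
    """
    return F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),  # flatten to (batch*seq_len, vocab_size)
        target_ids.reshape(-1),               # flatten to (batch*seq_len,)
        ignore_index=pad_id,                  # padding positions do not contribute
    )
```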
Further, the evaluation index for automatic generation is ROUGE-N, ROUGE-L, or a combination of the two.
Further, the ROUGE-N index is specifically as follows:
ROUGE-N = ( Σ_{S∈{Ref}} Σ_{gram_n∈S} Count_match(gram_n) ) / ( Σ_{S∈{Ref}} Σ_{gram_n∈S} Count(gram_n) )
where S represents a sentence in the reference abstract, gram_n represents an n-gram, Count(gram_n) represents the number of n-grams in S, and Count_match(gram_n) represents the number of n-grams shared by the model-generated abstract and the reference abstract. This index measures the co-occurrence recall between the n-grams of the reference abstract and those of the model-generated abstract.
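A small self-contained sketch of this ROUGE-N computation (tokenization into word lists and all names are illustrative):

```python
from collections import Counter

def rouge_n(candidate: list[str], reference: list[str], n: int = 2) -> float:
    """Co-occurrence recall of n-grams between generated and reference abstracts."""
    def ngrams(tokens: list[str]) -> Counter:
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    cand, ref = ngrams(candidate), ngrams(reference)
    count_match = sum(min(c, cand[g]) for g, c in ref.items())  # Count_match(gram_n)
    count_ref = sum(ref.values())                               # Count(gram_n) in reference
    return count_match / count_ref if count_ref else 0.0
```

For example, rouge_n("the cat sat on the mat".split(), "the cat sat there".split(), n=2) returns 2/3, since two of the three reference bigrams also occur in the candidate.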
Further, the ROUGE-L index is specifically as follows:
R_lcs = LCS(X, Y) / m
P_lcs = LCS(X, Y) / n
F_lcs = ( (1 + β^2) · R_lcs · P_lcs ) / ( R_lcs + β^2 · P_lcs )
wherein X is the reference abstract, m is its length, Y is the abstract generated by the model, n is its length, LCS(X, Y) is the length of the longest common subsequence of X and Y, and β is a weight balancing recall against precision. This index measures the quality of the model-generated abstract by the longest common subsequence between the generated abstract and the reference abstract.
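ROUGE-L can be sketched in the same style; the dynamic-programming LCS is the standard construction, and the default β below is an illustrative choice:

```python
def lcs_length(x: list[str], y: list[str]) -> int:
    """Length of the longest common subsequence, by dynamic programming."""
    dp = [[0] * (len(y) + 1) for _ in range(len(x) + 1)]
    for i, xi in enumerate(x, 1):
        for j, yj in enumerate(y, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if xi == yj else max(dp[i - 1][j], dp[i][j - 1])
    return dp[-1][-1]

def rouge_l(candidate: list[str], reference: list[str], beta: float = 1.2) -> float:
    lcs = lcs_length(reference, candidate)
    if lcs == 0:
        return 0.0
    r_lcs = lcs / len(reference)   # R_lcs: recall against reference of length m
    p_lcs = lcs / len(candidate)   # P_lcs: precision over candidate of length n
    return (1 + beta ** 2) * r_lcs * p_lcs / (r_lcs + beta ** 2 * p_lcs)  # F_lcs
```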
Further, sentence extraction from the text specifically comprises: representing the text content as a set of feature items, extracting a topic from the set according to the feature items, extracting a word from the word distribution corresponding to the extracted topic, and repeating this process until the abstract is generated.
Further, representing the text content as a set of feature items is specifically: Doc(t_1, t_2, …, t_n), where t_k is a feature item; the text is represented as a vector of feature items and their corresponding weights, of the form Doc((t_1, w_1), (t_2, w_2), …, (t_n, w_n)), where w_k is the weight of feature item t_k.
In this embodiment, the topic model extends the BOW (bag-of-words) model by introducing the "topic" as a latent variable, abstracting the association between words and documents: the topic model maps words or phrases sharing the same topic to the same dimension. The criterion for judging that two different words belong to the same topic is: two different words are likely to be generated by the same topic if they have a high probability of occurring simultaneously in the same document, or if, given a topic, both have a higher generation probability than other words. The topic model is a special probabilistic graphical model with a complete mathematical foundation, and inference based on Gibbs sampling is simple and effective. Assuming there are K topics (K is generally set manually, which is a potential weakness of such models), an article is represented as a K-dimensional vector; each dimension of the vector corresponds to a topic, and its weight is the probability that the article belongs to that topic. In this way, the topic model computes the word distribution of each topic over the text corpus and the topic distribution of each article.
The text features are extracted from the original text and may be characters, words, phrases, sentences or other units; these form the nodes, with identical feature items forming a single node, so the total number of nodes equals the number of distinct feature items in the text, giving a node set V. Edges are formed by relationships between the nodes in V, the simplest being co-occurrence: if two feature items appear within one window, such as a sentence, a fixed number of characters, or a document, an edge connects their corresponding nodes. The text graph may be directed or undirected, and the set of all edges forms an edge set E. Besides the co-occurrence relationships that form a textual co-occurrence graph, a grammatical-relationship or semantic-relationship graph of the text can be constructed in the same way.
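A small sketch of building such a co-occurrence graph (the window size and all names are illustrative; here the window is a span of consecutive tokens within a sentence):

```python
from collections import defaultdict

def cooccurrence_graph(sentences: list[list[str]], window: int = 5):
    """Nodes are the distinct feature items; an undirected weighted edge links
    two items that co-occur within `window` consecutive tokens of a sentence."""
    edges: dict[tuple[str, ...], int] = defaultdict(int)
    for tokens in sentences:
        for i, u in enumerate(tokens):
            for v in tokens[i + 1:i + window]:   # items in the window after u
                if u != v:
                    edges[tuple(sorted((u, v)))] += 1
    nodes = {t for tokens in sentences for t in tokens}  # node set V
    return nodes, dict(edges)                            # edge set E with weights
```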
Further, extracting a topic from the set according to the feature items and extracting a word from the word distribution corresponding to the extracted topic specifically comprise:
p(w|d) = Σ_t p(w|t) × p(t|d)
and, for a single extracted topic t,
p(w|d) = p(w|t) × p(t|d)
where t is the extracted topic, w is the extracted word, d is the feature-item set being summarized, and p denotes the corresponding probability.
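A toy sketch of this generative process (the distributions below are illustrative placeholders; in practice p(w|t) and p(t|d) come from a fitted topic model such as LDA):

```python
import numpy as np

rng = np.random.default_rng(0)

# 3 topics over a 5-word vocabulary, and one document's topic mixture p(t|d).
p_w_given_t = np.array([[0.5, 0.2, 0.1, 0.1, 0.1],   # p(w | t=0)
                        [0.1, 0.5, 0.2, 0.1, 0.1],   # p(w | t=1)
                        [0.1, 0.1, 0.1, 0.2, 0.5]])  # p(w | t=2)
p_t_given_d = np.array([0.6, 0.3, 0.1])

def sample_word() -> int:
    """The generative step described above: draw a topic from p(t|d),
    then draw a word from that topic's word distribution p(w|t)."""
    t = rng.choice(len(p_t_given_d), p=p_t_given_d)
    return int(rng.choice(p_w_given_t.shape[1], p=p_w_given_t[t]))

# Marginal word distribution of the document: p(w|d) = sum_t p(w|t) * p(t|d)
p_w_given_d = p_t_given_d @ p_w_given_t
```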
In another aspect, a system for automatically generating Chinese abstracts based on NLP technology comprises a text input unit 100, an encoding unit 200 and a decoding unit 300, wherein the text input unit 100 is used for inputting, through a user terminal, the text for which an abstract is to be generated, the encoding unit 200 is used for encoding that text to obtain a text representation, and the decoding unit 300 is used for decoding the text representation of the input text to generate the abstract.
Further, the encoding unit 200 is formed by stacking N identical encoding layers, and the encoding process of the l-th layer of the encoding unit 200 is as follows:
ã_i^l = Self-Attn(h_1^{l-1}, h_2^{l-1}, …, h_n^{l-1})
a_i^l = LayerNorm(h_i^{l-1} + ã_i^l)
h_i^l = LayerNorm(a_i^l + FFN(a_i^l))
wherein h_i^{l-1} denotes the output of layer l-1 of the encoder for the i-th word x_i of the input text x, the output of layer l-1 being the input of layer l; Self-Attn denotes applying the self-attention mechanism to the input, LayerNorm denotes layer normalization, FFN denotes a feed-forward neural network, and ã_i^l and a_i^l are intermediate results of the calculation.
The decoding unit 300 obtains the output word of the current step through a probability distribution P_vocab, where P_vocab = softmax(W_o·S + b_o), W_o and b_o are trainable parameters, S is the output of the last layer of the decoder, and softmax is the softmax function.
Spatially relative terms, such as "above", "over", "on the upper surface of", "upper", and the like, may be used herein for ease of description to describe the spatial relationship of one device or feature to another device or feature as illustrated in the figures. It will be understood that the spatially relative terms are intended to encompass different orientations of the device in use or operation in addition to the orientation depicted in the figures. For example, if a device in the figures is turned over, devices described as "above" or "over" other devices or features would then be oriented "below" or "under" the other devices or features. Thus, the exemplary term "above" can encompass both an orientation of "above" and "below". The device may be otherwise oriented (rotated 90 degrees or at other orientations) and the spatially relative descriptors used herein interpreted accordingly.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged under appropriate circumstances such that, for example, embodiments of the application described herein may be implemented in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. An automatic Chinese abstract generation method based on NLP technology is characterized by comprising the following steps:
S1: performing target training on the text for which an abstract is to be generated, and maximizing the probability of generating each target word;
S2: establishing the evaluation indexes for automatic generation;
S3: evaluating the text for which the abstract is to be generated using the established evaluation indexes;
S4: extracting sentences from the text with an abstract generation model to generate the abstract.
2. The method for automatically generating Chinese abstracts based on NLP technology according to claim 1, wherein the target training of the text for which an abstract is to be generated specifically comprises:
L(θ) = Σ_{(x,y)∈D} Σ_{t=1}^{|y|} log p(y_t | y_{<t}, x; θ)
wherein L(θ) is the objective maximizing the probability of generating each target word, D is the training data set, x is the input text, y is the target abstract, and θ denotes the model parameters.
3. The method for automatically generating Chinese abstracts according to claim 1, wherein the evaluation index for automatic generation is ROUGE-N, ROUGE-L, or a combination of the two.
4. The method for automatically generating Chinese abstracts based on NLP technology according to claim 3, wherein the ROUGE-N index is specifically:
ROUGE-N = ( Σ_{S∈{Ref}} Σ_{gram_n∈S} Count_match(gram_n) ) / ( Σ_{S∈{Ref}} Σ_{gram_n∈S} Count(gram_n) )
where S represents a sentence in the reference abstract, gram_n represents an n-gram, Count(gram_n) represents the number of n-grams in S, and Count_match(gram_n) represents the number of n-grams shared by the model-generated abstract and the reference abstract.
5. The method for automatically generating Chinese abstracts according to claim 3, wherein the ROUGE-L index is specifically:
R_lcs = LCS(X, Y) / m
P_lcs = LCS(X, Y) / n
F_lcs = ( (1 + β^2) · R_lcs · P_lcs ) / ( R_lcs + β^2 · P_lcs )
wherein X is the reference abstract, m is its length, Y is the abstract generated by the model, n is its length, LCS(X, Y) is the length of the longest common subsequence of X and Y, and β is a weight balancing recall against precision.
6. the method according to claim 1, wherein the sentence extraction of the text specifically includes representing the text content as a set of feature items, extracting a topic from the set according to the feature items, extracting a word from a word distribution corresponding to the extracted topic, and repeating the above process until the abstract is generated.
7. The method for automatically generating Chinese abstracts based on NLP technology according to claim 1, wherein representing the text content as a set of feature items is specifically: Doc(t_1, t_2, …, t_n), where t_k is a feature item; the text is represented as a vector of feature items and their corresponding weights, of the form Doc((t_1, w_1), (t_2, w_2), …, (t_n, w_n)), where w_k is the weight of feature item t_k.
8. The method for automatically generating Chinese abstracts based on NLP technology according to claim 7, wherein extracting a topic from the set according to the feature items and extracting a word from the word distribution corresponding to the extracted topic specifically comprise:
p(w|d) = Σ_t p(w|t) × p(t|d)
and, for a single extracted topic t,
p(w|d) = p(w|t) × p(t|d)
where t is the extracted topic, w is the extracted word, d is the feature-item set being summarized, and p denotes the corresponding probability.
9. A Chinese abstract automatic generation system based on NLP technology, comprising a text input unit (100), an encoding unit (200) and a decoding unit (300), wherein the text input unit (100) is used for inputting, through a user terminal, the text for which an abstract is to be generated, the encoding unit (200) is used for encoding that text to obtain a text representation, and the decoding unit (300) is used for decoding the text representation of the input text to generate the abstract.
10. The system for automatically generating Chinese abstracts based on NLP technology according to claim 9, wherein the encoding unit (200) is formed by stacking N identical encoding layers, and the encoding process of the l-th layer of the encoding unit (200) is as follows:
ã_i^l = Self-Attn(h_1^{l-1}, h_2^{l-1}, …, h_n^{l-1})
a_i^l = LayerNorm(h_i^{l-1} + ã_i^l)
h_i^l = LayerNorm(a_i^l + FFN(a_i^l))
wherein h_i^{l-1} denotes the output of layer l-1 of the encoder for the i-th word x_i of the input text x, the output of layer l-1 being the input of layer l; Self-Attn denotes applying the self-attention mechanism to the input, LayerNorm denotes layer normalization, FFN denotes a feed-forward neural network, and ã_i^l and a_i^l are intermediate results of the calculation;
the decoding unit (300) obtains the output word of the current step through a probability distribution P_vocab, where P_vocab = softmax(W_o·S + b_o), W_o and b_o being trainable parameters, S the output of the last layer of the decoder, and softmax the softmax function.
CN202210204288.9A (priority date: 2022-03-03; filing date: 2022-03-03) NLP technology-based Chinese abstract automatic generation method and system. Status: Pending. Published as CN114547289A.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210204288.9A CN114547289A (en) 2022-03-03 2022-03-03 NLP technology-based Chinese abstract automatic generation method and system


Publications (1)

Publication Number Publication Date
CN114547289A 2022-05-27

Family

ID=81660896

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210204288.9A Pending CN114547289A (en) 2022-03-03 2022-03-03 NLP technology-based Chinese abstract automatic generation method and system

Country Status (1)

Country Link
CN (1) CN114547289A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116541505A (en) * 2023-07-05 2023-08-04 华东交通大学 Dialogue abstract generation method based on self-adaptive dialogue segmentation
CN116541505B (en) * 2023-07-05 2023-09-19 华东交通大学 Dialogue abstract generation method based on self-adaptive dialogue segmentation


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination