CN114547289A - NLP technology-based Chinese abstract automatic generation method and system - Google Patents
NLP technology-based Chinese abstract automatic generation method and system Download PDFInfo
- Publication number
- CN114547289A CN114547289A CN202210204288.9A CN202210204288A CN114547289A CN 114547289 A CN114547289 A CN 114547289A CN 202210204288 A CN202210204288 A CN 202210204288A CN 114547289 A CN114547289 A CN 114547289A
- Authority
- CN
- China
- Prior art keywords
- abstract
- text
- word
- generate
- automatically generating
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/34—Browsing; Visualisation therefor
- G06F16/345—Summarisation for human users
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3346—Query execution using probabilistic model
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Artificial Intelligence (AREA)
- Biophysics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Probability & Statistics with Applications (AREA)
- Databases & Information Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Machine Translation (AREA)
Abstract
The invention relates to the field of automatic abstract generation, and in particular provides a method and system for automatically generating Chinese abstracts based on NLP (natural language processing) technology, comprising the following steps: S1: perform target training on the text for which an abstract is to be generated, maximizing the probability of generating each target word; S2: automatically generate an evaluation index; S3: evaluate the text using the automatically generated evaluation index; S4: extract sentences from the text with an abstract generation model to generate the abstract. The invention automatically generates abstracts through natural language processing: given one or more documents, it produces a short abstract that retains the key information of the input text and is semantically fluent, concise, and accurate. Automatic text summarization can generate abstracts quickly, accurately, and in real time, overcoming the drawbacks of manual summarization.
Description
Technical Field
The invention relates to the field of automatic abstract generation, and in particular to a method and system for automatically generating Chinese abstracts based on NLP technology.
Background
Natural Language Processing (NLP) is the discipline that studies the language problems arising in human-computer interaction. By difficulty of technical implementation, question-answering systems of this kind can be divided into three types: simple matching, fuzzy matching, and paragraph understanding. A simple-matching tutoring and answering system matches questions posed by students against answer items in an answer library through simple keyword matching, thereby answering questions or providing related tutoring automatically. A fuzzy-matching system adds synonym and antonym matching on top of the simple-matching type: even if the original keywords in a student's question find no directly matching answer in the library, a relevant answer item can still be found if words synonymous or antonymous with those keywords match. A paragraph-understanding system is the most ideal, genuinely intelligent tutoring and answering system (strictly speaking, the simple-matching and fuzzy-matching types can only be called "automatic" rather than "intelligent" answering systems). However, such a system requires paragraph-level understanding of natural language; for Chinese, this involves complex NLP technologies such as automatic word segmentation, part-of-speech analysis, syntactic analysis, and semantic analysis, making it difficult to realize. In recent years, automatic text summarization has become one of the important research directions in artificial intelligence and natural language processing.
Automatic text summarization aims to extract the key information of an original text and generate a semantically fluent, concise, and accurate summary, improving users' information-browsing efficiency. With the development of deep learning, today's automatic text summarization models are mainly built on the sequence-to-sequence framework. However, applying the sequence-to-sequence framework to automatic summarization still faces many problems, such as difficulty generating out-of-vocabulary words, inability to effectively model the connections between words, and a lack of modeling of the key-information extraction process.
Disclosure of Invention
The invention mainly aims to provide a Chinese abstract automatic generation method and system based on NLP technology, so as to solve the problems in the related technology.
In order to achieve the above object, according to one aspect of the present invention, there is provided a method and system for automatically generating a Chinese abstract based on NLP technology, comprising the following steps:
s1: performing target training on the text needing to generate the abstract, and maximizing the probability of generating each target word;
s2: automatically generating an evaluation index;
s3: evaluating the text needing to generate the abstract by adopting an automatic generation evaluation index;
s4: extracting sentences from the text by adopting an abstract generation model to generate an abstract;
Further, the target training of the text for which an abstract is to be generated specifically comprises maximizing:

$$\mathcal{L}(\theta) = \sum_{(x,y)\in D} \log P(y \mid x;\ \theta)$$

where $\mathcal{L}(\theta)$ is the objective maximizing the probability of generating each target word, D is the training data set, x is the input text, y is the target abstract, and θ is a parameter of the model.
Further, the automatically generated evaluation index is ROUGE-N, ROUGE-L, or a combination of the two.
Further, the ROUGE-N index is specifically:

$$\mathrm{ROUGE\text{-}N} = \frac{\sum_{S\in\{\mathrm{Ref}\}}\sum_{gram_n\in S}\mathrm{Count}_{match}(gram_n)}{\sum_{S\in\{\mathrm{Ref}\}}\sum_{gram_n\in S}\mathrm{Count}(gram_n)}$$

where S represents a sentence in the reference abstract, $gram_n$ represents an n-gram, $\mathrm{Count}(gram_n)$ represents the number of n-grams in S, and $\mathrm{Count}_{match}(gram_n)$ represents the number of n-grams shared by the model-generated abstract and the reference abstract.
Further, the ROUGE-L index is specifically:

$$R_{lcs} = \frac{LCS(X,Y)}{m},\qquad P_{lcs} = \frac{LCS(X,Y)}{n},\qquad F_{lcs} = \frac{(1+\beta^2)\,R_{lcs}\,P_{lcs}}{R_{lcs}+\beta^2 P_{lcs}}$$

where X is the reference abstract, m is its length, Y is the model-generated abstract, n is its length, LCS(X, Y) is the length of their longest common subsequence, and β weights recall against precision.
further, the sentence extraction of the text specifically includes representing the text content as a set composed of feature items, extracting a topic from the set according to the feature items, extracting a word from a word distribution corresponding to the extracted topic, and repeating the above process until the abstract is generated.
Further, representing the text content as a set of feature items is specifically: Doc(t₁, t₂, …, tₙ), where tₖ is a feature item; the text is represented as a vector of feature items and their corresponding weights, of the form Doc((t₁, w₁), (t₂, w₂), …, (tₙ, wₙ)), where wₖ is the weight of feature item tₖ.
Further, the extracting a topic from the set according to the feature items, and the extracting a word from the word distribution corresponding to the extracted topic specifically includes:
p(w|d)=p(w|t)×p(t|d)
where t is the extracted topic, w is the extracted word, d is the set being extracted from, and p denotes the corresponding probability distribution.
On the other hand, the system for automatically generating the Chinese abstract based on the NLP technology comprises a text input unit, an encoding unit and a decoding unit, wherein the text input unit is used for inputting a text needing to generate the abstract through a user terminal, the encoding unit is used for encoding the text needing to generate the abstract to obtain a text representation, and the decoding unit is used for decoding the text representation of the input text to generate the abstract.
Further, the coding unit is formed by stacking N identical coding layers, and the coding process of the l-th layer of the coding unit is:

$$\hat{h}_i^l = \mathrm{LayerNorm}\big(h_i^{l-1} + \mathrm{SelfAttn}(h_i^{l-1})\big),\qquad h_i^l = \mathrm{LayerNorm}\big(\hat{h}_i^l + \mathrm{FFN}(\hat{h}_i^l)\big)$$

where $h_i^{l-1}$ denotes the output of encoder layer l−1 for the i-th word $x_i$ of the input text x, the output of layer l−1 being the input of layer l; SelfAttn denotes applying a self-attention mechanism to the input, LayerNorm denotes layer normalization, FFN denotes a feed-forward neural network, and $\hat{h}_i^l$ is an intermediate result of the calculation;
the decoding unit passes a probability distribution PuocabObtaining the output word of the current step, the probability distribution Puocab=softmax(WoS+bo) Wherein W isoAnd boFor trainable parameters, S is the output of the last layer of the decoder, and softmax is a softmax function.
Compared with the prior art, the invention has the following beneficial effects: the invention automatically generates the abstract through a natural language processing technology, and refers to automatically generating a segment of abstract which retains key information in an input text and has smooth, concise and accurate semantics according to one or more documents. The automatic text summarization can generate the summarization quickly, accurately and in real time, and overcomes the defects of manual summarization.
Drawings
FIG. 1 is a schematic view of the overall process of the present invention;
FIG. 2 is an overall system block diagram of the present invention;
FIG. 3 is a schematic diagram of a portion of the modules of the present invention.
In the figure: 100. a text input unit; 200. an encoding unit; 300. and a decoding unit.
Detailed Description
In order to make the objects, features and advantages of the present invention more obvious and understandable, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the embodiments described below are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the description of the present invention, it is to be understood that the terms "upper", "lower", "top", "bottom", "inner", "outer", and the like, indicate orientations or positional relationships based on those shown in the drawings, and are used only for convenience in describing the present invention and for simplicity in description, and do not indicate or imply that the referenced devices or elements must have a particular orientation, be constructed and operated in a particular orientation, and thus, are not to be construed as limiting the present invention. It should be noted that when one component is referred to as being "connected" to another component, it can be directly connected to the other component or intervening components may also be present.
The technical scheme of the invention is further explained by the specific implementation mode in combination with the attached drawings.
An automatic Chinese abstract generation method based on NLP technology comprises the following steps:
s1: performing target training on the text needing to generate the abstract, and maximizing the probability of generating each target word;
s2: automatically generating an evaluation index;
s3: evaluating the text needing to generate the abstract by adopting an automatic generation evaluation index;
s4: extracting sentences from the text by adopting an abstract generation model to generate an abstract;
Further, the target training of the text for which an abstract is to be generated specifically comprises maximizing:

$$\mathcal{L}(\theta) = \sum_{(x,y)\in D} \log P(y \mid x;\ \theta)$$

where $\mathcal{L}(\theta)$ is the objective maximizing the probability of generating each target word, D is the training data set, x is the input text, y is the target abstract, and θ is a parameter of the model.
The model of this embodiment is trained in two steps: pre-training and fine-tuning. To better exploit pre-trained models under limited hardware conditions, the inventors use the pre-trained MASS model in place of the pre-training step, and then fine-tune it on a text summarization data set. During fine-tuning, maximum likelihood estimation is used to maximize the conditional probability of generating each target word given the model parameters θ and the input text x, which is equivalent to minimizing the negative log-likelihood between the model-generated words and the target words.
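The fine-tuning objective just described can be sketched in a few lines. This is an illustrative toy, not the embodiment's actual training code: given the model's probability for each target word under teacher forcing, the negative log-likelihood is the sum of per-word negative log-probabilities, and minimizing it maximizes the product of the per-word probabilities.

```python
import math

def sequence_nll(step_probs):
    """Negative log-likelihood of a target sequence, given the model's
    probability for each target word (teacher forcing). Minimizing this
    is equivalent to maximizing the product of per-word probabilities."""
    return -sum(math.log(p) for p in step_probs)

# Toy example: a 3-word target summary whose words the model
# predicts with probabilities 0.5, 0.8 and 0.4 (invented values).
loss = sequence_nll([0.5, 0.8, 0.4])
```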
Further, the automatically generated evaluation index is ROUGE-N, ROUGE-L, or a combination of the two.
Further, the ROUGE-N index is specifically:

$$\mathrm{ROUGE\text{-}N} = \frac{\sum_{S\in\{\mathrm{Ref}\}}\sum_{gram_n\in S}\mathrm{Count}_{match}(gram_n)}{\sum_{S\in\{\mathrm{Ref}\}}\sum_{gram_n\in S}\mathrm{Count}(gram_n)}$$

where S represents a sentence in the reference abstract, $gram_n$ represents an n-gram, $\mathrm{Count}(gram_n)$ represents the number of n-grams in S, and $\mathrm{Count}_{match}(gram_n)$ represents the number of n-grams shared by the model-generated abstract and the reference abstract. This index counts the co-occurrence recall between the n-grams of the reference abstract and those of the model-generated abstract.
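The ROUGE-N recall just described can be sketched as follows; the tokenized English example sentences are illustrative stand-ins for segmented Chinese text.

```python
from collections import Counter

def rouge_n_recall(reference, candidate, n=2):
    """ROUGE-N recall: matched n-grams / n-grams in the reference."""
    def ngrams(tokens, n):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    ref, cand = ngrams(reference, n), ngrams(candidate, n)
    matched = sum((ref & cand).values())  # clipped co-occurrence counts
    total = sum(ref.values())
    return matched / total if total else 0.0

# Invented toy sentences: 3 of the reference's 5 bigrams reappear.
ref = "the cat sat on the mat".split()
cand = "the cat lay on the mat".split()
score = rouge_n_recall(ref, cand, n=2)
```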
Further, the ROUGE-L index is specifically:

$$R_{lcs} = \frac{LCS(X,Y)}{m},\qquad P_{lcs} = \frac{LCS(X,Y)}{n},\qquad F_{lcs} = \frac{(1+\beta^2)\,R_{lcs}\,P_{lcs}}{R_{lcs}+\beta^2 P_{lcs}}$$

where X is the reference abstract, m is its length, Y is the model-generated abstract, n is its length, LCS(X, Y) is the length of their longest common subsequence, and β weights recall against precision. This index measures the quality of the model-generated abstract by the longest common subsequence between it and the reference abstract.
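A minimal sketch of ROUGE-L as described: compute the longest common subsequence by dynamic programming, then combine LCS-based recall and precision into an F-measure. The β value below is an assumption (conventionally set above 1 to favor recall), and the example sentences are invented.

```python
def lcs_length(x, y):
    """Longest common subsequence length via dynamic programming."""
    dp = [[0] * (len(y) + 1) for _ in range(len(x) + 1)]
    for i, xi in enumerate(x, 1):
        for j, yj in enumerate(y, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if xi == yj else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(x)][len(y)]

def rouge_l(reference, candidate, beta=1.2):
    """F_lcs combining LCS recall (vs. reference) and precision (vs. candidate)."""
    lcs = lcs_length(reference, candidate)
    r, p = lcs / len(reference), lcs / len(candidate)
    if r + p == 0:
        return 0.0
    return (1 + beta ** 2) * r * p / (r + beta ** 2 * p)

# Invented toy pair: the candidate is a 5-word subsequence of the reference.
ref = "the cat sat on the mat".split()
cand = "the cat on the mat".split()
score = rouge_l(ref, cand)
```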
Further, the sentence extraction of the text specifically includes representing the text content as a set composed of feature items, extracting a topic from the set according to the feature items, extracting a word from a word distribution corresponding to the extracted topic, and repeating the above process until the abstract is generated.
Further, representing the text content as a set of feature items is specifically: Doc(t₁, t₂, …, tₙ), where tₖ is a feature item; the text is represented as a vector of feature items and their corresponding weights, of the form Doc((t₁, w₁), (t₂, w₂), …, (tₙ, wₙ)), where wₖ is the weight of feature item tₖ.
In this embodiment, the topic model extends the bag-of-words (BOW) model by introducing a "topic" as a hidden variable, abstracting the association between words and documents. The topic model maps words or phrases with the same topic to the same dimension. The basis for judging that two different words belong to the same topic is: the two words frequently occur together in the same documents, or, given a topic, both have a higher generation probability than other words. The topic model is a special probabilistic graphical model with a complete mathematical foundation, and inference based on Gibbs sampling is simple and effective. Assuming there are K topics (usually set by hand, which is a potential weakness of the model), an article is represented as a K-dimensional vector; each dimension of the vector corresponds to a topic, and its weight is the probability that the article belongs to that topic. The topic model thus computes the word distribution of each topic in the text corpus and the topic distribution of each article.
Text features are extracted from the original text; they may be characters, words, phrases, sentences, or other units, and form the nodes of a graph. Identical feature items form a single node, so the total number of nodes is the number of distinct feature items in the text; together they form the node set V. Edges are formed by relationships among the nodes in V, the simplest being co-occurrence: if two feature items appear within the same window (a sentence, a span of a fixed number of characters, a document, and so on), an edge connects their corresponding nodes. The text graph may be directed or undirected, and the set of all edges forms the edge set E. Besides the co-occurrence relationships that form a textual co-occurrence graph, a grammatical-relationship or semantic-relationship graph for the text can be constructed similarly.
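The co-occurrence graph construction just described can be sketched as follows, using a sentence as the co-occurrence window; words stand in for generic feature items, and the example sentences are invented.

```python
from itertools import combinations

def cooccurrence_graph(sentences):
    """Build an undirected text graph: each distinct feature item (here, a
    word) is one node; an edge links two items that co-occur in the same
    window (here, a sentence). Identical items collapse into one node."""
    nodes, edges = set(), set()
    for sent in sentences:
        items = set(sent.split())
        nodes.update(items)
        for a, b in combinations(items, 2):
            edges.add(frozenset((a, b)))  # undirected, deduplicated
    return nodes, edges

# Invented two-sentence corpus sharing the node "mice".
sents = ["cats chase mice", "mice eat cheese"]
V, E = cooccurrence_graph(sents)
```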
Further, the extracting a topic from the set according to the feature items, and the extracting a word from the word distribution corresponding to the extracted topic specifically includes:
p(w|d)=p(w|t)×p(t|d)
where t is the extracted topic, w is the extracted word, d is the set being extracted from, and p denotes the corresponding probability distribution.
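The factorization above can be sketched as follows. With several topics, p(w|d) marginalizes the product p(w|t)·p(t|d) over topics t, reducing to the single product in the text when there is one topic; the topic names and probabilities below are invented for illustration.

```python
def word_prob(word, doc, p_w_given_t, p_t_given_d):
    """p(w|d): sum over topics t of p(w|t) * p(t|d), matching the
    factorization in the text and marginalizing when K > 1 topics."""
    return sum(p_w_given_t[t].get(word, 0.0) * p_t_given_d[doc].get(t, 0.0)
               for t in p_w_given_t)

# Invented toy distributions: two topics, one document.
p_w_given_t = {"sports": {"ball": 0.5}, "food": {"ball": 0.1}}
p_t_given_d = {"d1": {"sports": 0.8, "food": 0.2}}
prob = word_prob("ball", "d1", p_w_given_t, p_t_given_d)
```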
On the other hand, the system for automatically generating the Chinese abstract based on the NLP technology comprises a text input unit 100, an encoding unit 200 and a decoding unit 300, wherein the text input unit 100 is used for inputting a text needing to generate the abstract through a user terminal, the encoding unit 200 is used for encoding the text needing to generate the abstract to obtain a text representation, and the decoding unit 300 is used for decoding the text representation of the input text to generate the abstract.
Further, the encoding unit 200 is formed by stacking N identical encoding layers, and the encoding process of the l-th layer of the encoding unit 200 is:

$$\hat{h}_i^l = \mathrm{LayerNorm}\big(h_i^{l-1} + \mathrm{SelfAttn}(h_i^{l-1})\big),\qquad h_i^l = \mathrm{LayerNorm}\big(\hat{h}_i^l + \mathrm{FFN}(\hat{h}_i^l)\big)$$

where $h_i^{l-1}$ denotes the output of encoder layer l−1 for the i-th word $x_i$ of the input text x, the output of layer l−1 being the input of layer l; SelfAttn denotes applying a self-attention mechanism to the input, LayerNorm denotes layer normalization, FFN denotes a feed-forward neural network, and $\hat{h}_i^l$ is an intermediate result of the calculation;
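The two-sublayer encoding process just described can be sketched per position as follows. The identity sublayers passed in at the end are placeholders that merely exercise the residual-plus-LayerNorm wiring; they are not a real self-attention or feed-forward network.

```python
import math

def layer_norm(v, eps=1e-5):
    """Normalize a vector to zero mean and unit variance."""
    mean = sum(v) / len(v)
    var = sum((x - mean) ** 2 for x in v) / len(v)
    return [(x - mean) / math.sqrt(var + eps) for x in v]

def encoder_layer(h, self_attn, ffn):
    """One encoder layer for a single position:
    h_hat = LayerNorm(h + SelfAttn(h)); out = LayerNorm(h_hat + FFN(h_hat))."""
    h_hat = layer_norm([a + b for a, b in zip(h, self_attn(h))])
    return layer_norm([a + b for a, b in zip(h_hat, ffn(h_hat))])

# Placeholder identity sublayers, just to exercise the wiring.
out = encoder_layer([1.0, 2.0, 3.0], lambda v: v, lambda v: v)
```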
the decoding unit 300 passes through a probability distribution PuocabObtaining the output word of the current step, the probability distribution Puocab=softmax(WoS+bo) Wherein W isoAnd boFor trainable parameters, S is the output of the last layer of the decoder, and softmax is a softmax function.
Spatially relative terms, such as "above … …," "above … …," "above … … surface," "above," and the like, may be used herein for ease of description to describe one device or feature's spatial relationship to another device or feature as illustrated in the figures. It will be understood that the spatially relative terms are intended to encompass different orientations of the device in use or operation in addition to the orientation depicted in the figures. For example, if a device in the figures is turned over, devices described as "above" or "on" other devices or configurations would then be oriented "below" or "under" the other devices or configurations. Thus, the exemplary term "above … …" can include both an orientation of "above … …" and "below … …". The device may be otherwise variously oriented (rotated 90 degrees or at other orientations) and the spatially relative descriptors used herein interpreted accordingly.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged under appropriate circumstances such that, for example, embodiments of the application described herein may be implemented in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (10)
1. An automatic Chinese abstract generation method based on NLP technology is characterized by comprising the following steps:
s1: performing target training on the text needing to generate the abstract, and maximizing the probability of generating each target word;
s2: automatically generating an evaluation index;
s3: evaluating the text needing to generate the abstract by adopting an automatic generation evaluation index;
s4: and adopting an abstract generation model to extract sentences of the text to generate an abstract.
2. The method for automatically generating a Chinese abstract based on NLP technology according to claim 1, wherein the target training of the text for which an abstract is to be generated specifically comprises maximizing:

$$\mathcal{L}(\theta) = \sum_{(x,y)\in D} \log P(y \mid x;\ \theta)$$

wherein $\mathcal{L}(\theta)$ is the objective maximizing the probability of generating each target word, D is the training data set, x is the input text, y is the target abstract, and θ is a parameter of the model.
3. The method for automatically generating a Chinese abstract according to claim 1, wherein the automatically generated evaluation index is ROUGE-N, ROUGE-L, or a combination of the two.
4. The method and system for automatically generating a Chinese abstract based on NLP technology according to claim 3, wherein the ROUGE-N index is specifically:

$$\mathrm{ROUGE\text{-}N} = \frac{\sum_{S\in\{\mathrm{Ref}\}}\sum_{gram_n\in S}\mathrm{Count}_{match}(gram_n)}{\sum_{S\in\{\mathrm{Ref}\}}\sum_{gram_n\in S}\mathrm{Count}(gram_n)}$$

wherein S represents a sentence in the reference abstract, $gram_n$ represents an n-gram, $\mathrm{Count}(gram_n)$ represents the number of n-grams in S, and $\mathrm{Count}_{match}(gram_n)$ represents the number of n-grams shared by the model-generated abstract and the reference abstract.
6. the method according to claim 1, wherein the sentence extraction of the text specifically includes representing the text content as a set of feature items, extracting a topic from the set according to the feature items, extracting a word from a word distribution corresponding to the extracted topic, and repeating the above process until the abstract is generated.
7. The method for automatically generating a Chinese abstract based on NLP technology according to claim 1, wherein representing the text content as a set of feature items is specifically: Doc(t₁, t₂, …, tₙ), where tₖ is a feature item; the text is represented as a vector of feature items and their corresponding weights, of the form Doc((t₁, w₁), (t₂, w₂), …, (tₙ, wₙ)), where wₖ is the weight of feature item tₖ.
8. The method for automatically generating a chinese abstract based on NLP technology according to claim 7, wherein said extracting a topic from the set according to the feature items, and extracting a word from the word distribution corresponding to the extracted topic specifically comprises:
p(w|d)=p(w|t)×p(t|d)
where t is the extracted topic, w is the extracted word, d is the set being extracted from, and p denotes the corresponding probability distribution.
9. The Chinese abstract automatic generation system based on the NLP technology comprises a text input unit (100), an encoding unit (200) and a decoding unit (300), wherein the text input unit (100) is used for inputting a text needing to be abstracted through a user terminal, the encoding unit (200) is used for encoding the text needing to be abstracted to obtain a text representation, and the decoding unit (300) is used for decoding the text representation of the input text to generate the abstract.
10. The system for automatically generating a Chinese abstract based on NLP technology according to claim 9, wherein the encoding unit (200) is formed by stacking N identical encoding layers, and the encoding process of the l-th layer of the encoding unit (200) is:

$$\hat{h}_i^l = \mathrm{LayerNorm}\big(h_i^{l-1} + \mathrm{SelfAttn}(h_i^{l-1})\big),\qquad h_i^l = \mathrm{LayerNorm}\big(\hat{h}_i^l + \mathrm{FFN}(\hat{h}_i^l)\big)$$

wherein $h_i^{l-1}$ denotes the output of encoder layer l−1 for the i-th word $x_i$ of the input text x, the output of layer l−1 being the input of layer l; SelfAttn denotes applying a self-attention mechanism to the input, LayerNorm denotes layer normalization, FFN denotes a feed-forward neural network, and $\hat{h}_i^l$ is an intermediate result of the calculation;
the decoding unit (300) passes a probability distribution PuocabObtaining the output word of the current step, the probability distribution Puocab=softmax(WoS+bo) Wherein W isoAnd boS is the output of the last layer of the decoder, and softmax is the softmax function, for trainable parameters.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210204288.9A CN114547289A (en) | 2022-03-03 | 2022-03-03 | NLP technology-based Chinese abstract automatic generation method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210204288.9A CN114547289A (en) | 2022-03-03 | 2022-03-03 | NLP technology-based Chinese abstract automatic generation method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114547289A true CN114547289A (en) | 2022-05-27 |
Family
ID=81660896
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210204288.9A Pending CN114547289A (en) | 2022-03-03 | 2022-03-03 | NLP technology-based Chinese abstract automatic generation method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114547289A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116541505A (en) * | 2023-07-05 | 2023-08-04 | 华东交通大学 | Dialogue abstract generation method based on self-adaptive dialogue segmentation |
-
2022
- 2022-03-03 CN CN202210204288.9A patent/CN114547289A/en active Pending
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116541505A (en) * | 2023-07-05 | 2023-08-04 | 华东交通大学 | Dialogue abstract generation method based on self-adaptive dialogue segmentation |
CN116541505B (en) * | 2023-07-05 | 2023-09-19 | 华东交通大学 | Dialogue abstract generation method based on self-adaptive dialogue segmentation |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Zhang et al. | A text sentiment classification modeling method based on coordinated CNN‐LSTM‐attention model | |
CN108363743B (en) | Intelligent problem generation method and device and computer readable storage medium | |
CN111738003B (en) | Named entity recognition model training method, named entity recognition method and medium | |
Gallant et al. | Representing objects, relations, and sequences | |
CN110851599B (en) | Automatic scoring method for Chinese composition and teaching assistance system | |
CN110532395B (en) | Semantic embedding-based word vector improvement model establishing method | |
CN113255320A (en) | Entity relation extraction method and device based on syntax tree and graph attention machine mechanism | |
CN108733647B (en) | Word vector generation method based on Gaussian distribution | |
CN110362819A (en) | Text emotion analysis method based on convolutional neural networks | |
US20230169271A1 (en) | System and methods for neural topic modeling using topic attention networks | |
CN114925687B (en) | Chinese composition scoring method and system based on dynamic word vector characterization | |
Suyanto | Synonyms-based augmentation to improve fake news detection using bidirectional LSTM | |
CN115130538A (en) | Training method of text classification model, text processing method, equipment and medium | |
CN116049387A (en) | Short text classification method, device and medium based on graph convolution | |
US20230259708A1 (en) | System and methods for key-phrase extraction | |
CN116010553A (en) | Viewpoint retrieval system based on two-way coding and accurate matching signals | |
CN109325243B (en) | Character-level Mongolian word segmentation method based on sequence model and word segmentation system thereof | |
CN113051886B (en) | Test question duplicate checking method, device, storage medium and equipment | |
CN114547289A (en) | NLP technology-based Chinese abstract automatic generation method and system | |
CN114970557B (en) | Knowledge enhancement-based cross-language structured emotion analysis method | |
CN111008529A (en) | Chinese relation extraction method based on neural network | |
CN115146031A (en) | Short text position detection method based on deep learning and assistant features | |
CN113935308A (en) | Method and system for automatically generating text abstract facing field of geoscience | |
Fan et al. | Multi-label Chinese question classification based on word2vec | |
Cui et al. | Aspect level sentiment classification based on double attention mechanism |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination |