CN114254175A - Method for extracting generative abstract of power policy file - Google Patents
Method for extracting generative abstract of power policy file

- Publication number: CN114254175A
- Application number: CN202111550623.2A
- Authority
- CN
- China
- Legal status: Pending
Classifications

- G06F16/951 — Indexing; web crawling techniques
- G06F40/216 — Parsing using statistical methods
- G06F40/242 — Dictionaries
- G06F40/289 — Phrasal analysis, e.g. finite state techniques or chunking
- G06N3/044 — Recurrent networks, e.g. Hopfield networks
- G06N3/045 — Combinations of networks
- G06N3/08 — Learning methods
Abstract
The invention discloses a method for extracting a generative abstract of a power policy file, comprising the following steps: step S10, acquiring an electronic document of the power policy file using web crawler technology; step S11, performing word segmentation on the electronic document, forming initial embeddings from a word vector model, and inputting them into a pre-trained abstract generation model; step S12, adding position encodings to the bottom-layer embeddings of the encoder and the decoder; and step S13, automatically generating the abstract content using the generation probability of a pointer-generator network, obtained by concatenating the decoder outputs at the current and previous time steps with the attention distribution. The invention improves both the efficiency and the accuracy of abstract generation.
Description
Technical Field
The invention relates to the technical field of natural language processing, and in particular to a method for extracting a generative automatic abstract of a power policy file.
Background
For power supply enterprises, strengthening electricity price management is an important guarantee that sales income is realized and profit levels improve. Strictly executing national electricity price policies and regulations and standardizing the order of electricity price management are of great significance for implementing national industrial policy, saving energy, and safeguarding the economic interests of both the power supplier and the power consumer. Electricity price policies therefore need to be tracked in a timely manner, so that reasonable electricity marketing strategies can be formulated and the development of power enterprises promoted.
At present, the rise of artificial intelligence and deep learning is rapidly changing established habits in daily work and life, and the strengths of deep-learning-based automatic summarization can be brought to bear on electricity price policy management. Electricity price policy information is generally published on authoritative national-level websites, so electricity price policy documents can be obtained from those websites in electronic form. To help managers quickly grasp the key content of an electricity price policy text, the key information in the electronic document must be extracted and a summary document automatically generated from it, helping policy makers obtain the relevant information more efficiently.
In the prior art, automatic summarization is mainly realized by extractive summarization and abstractive (generative) summarization. Extractive methods typically adopt the TextRank ranking algorithm and, being concise and efficient, are widely used in industry. However, extractive summarization mainly considers word frequency, carries little semantic information, and cannot build complete semantic information across text paragraphs.
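For reference, the TextRank ranking mentioned above can be sketched in a few lines; this follows the classic overlap-based sentence similarity of Mihalcea and Tarau and is an illustrative sketch of the prior art, not the invention's method (all names are ours):

```python
import math

def textrank_sentences(sentences, d=0.85, iters=50):
    """Rank tokenized sentences by TextRank; each sentence is a list of words.
    Similarity = word overlap normalized by the log lengths of the two sentences."""
    n = len(sentences)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and len(sentences[i]) > 1 and len(sentences[j]) > 1:
                overlap = len(set(sentences[i]) & set(sentences[j]))
                sim[i][j] = overlap / (math.log(len(sentences[i])) +
                                       math.log(len(sentences[j])))
    scores = [1.0] * n
    for _ in range(iters):                   # power iteration, as in PageRank
        new = []
        for i in range(n):
            rank = 0.0
            for j in range(n):
                if sim[j][i] > 0:
                    denom = sum(sim[j][k] for k in range(n))
                    if denom > 0:
                        rank += sim[j][i] / denom * scores[j]
            new.append((1 - d) + d * rank)
        scores = new
    return scores
```

The top-scoring sentences are then emitted verbatim as the extractive summary, which is why such methods cannot paraphrase or compress.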
Abstractive text summarization is mainly realized with deep neural network structures. The basic framework is the Sequence-to-Sequence (Seq2Seq) model proposed by the Google Brain team in 2014, which adopts an encoder-decoder architecture: in the classical form, both the encoder and the decoder consist of several layers of RNN/LSTM cells, the encoder encodes the original text into a vector, and the decoder extracts information from that vector, recovers the semantics, and generates the text summary.
However, when existing automatic summarization techniques are applied to texts with a fixed writing format, such as scientific papers and policy documents, the summarization quality is still insufficient; the main problems are out-of-vocabulary (OOV) words, repetitive or incoherent text, and long-distance dependencies.
Disclosure of Invention
The technical problem to be solved by the present invention is to provide a method for extracting a generative abstract of a power policy document that solves the above-mentioned problems and improves the efficiency and accuracy of abstract generation.
To solve the above technical problem, as an aspect of the present invention, there is provided a method for extracting a generative abstract of a power policy file, comprising the following steps:
step S10, acquiring an electronic document of the power policy file from a specific website by adopting a crawler technology;
step S11, performing word segmentation processing on the electronic document, forming initial embedded data according to a word vector model, and inputting the initial embedded data into a pre-trained abstract generation model;
wherein the abstract generation model employs an encoder-decoder framework in which an attention-based bidirectional Transformer model is used as the language representation model;
the encoder consists of two sub-layers, a multi-head attention layer and a fully-connected feed-forward neural network layer, with residual connections between the sub-layers followed by layer normalization; the decoder comprises at least a masked multi-head attention layer, a multi-head attention layer and a fully-connected feed-forward neural network layer, likewise connected by residual connections and layer-normalized;
step S12, adding position coding in the bottom embedding of the encoder and the decoder;
step S13, obtaining the generation probability of a pointer-generator network by concatenating the decoder outputs at the current and previous time steps with the attention distribution, and using this probability to control whether content is copied from the source text of the electronic document or corresponding abstract content is generated according to the attention.
Preferably, further comprising:
and constructing a summary generation model in advance and training to obtain the trained summary generation model.
Preferably, the pre-constructing and training of the summary generation model to obtain the trained summary generation model further includes:
constructing a summary generation model adopting an encoder-decoder framework, wherein a bidirectional converter model based on an attention mechanism is used in both an encoder and a decoder;
counting all words in the training corpus and generating a dictionary file; and forming a training set;
and embedding the words of the dictionary file over the training set into the encoder of the summary generation model through the word vector model as initial embeddings, training the summary generation model, and finally obtaining the trained summary generation model.
Preferably, in step S13, the generation probability is obtained in the pointer-generator network as follows:

computing the attention product of each embedded input word with the decoder output, normalizing to obtain the weights, and taking the weighted sum to obtain the attention score e_i:

e_i = v^T tanh(W_h h_i + W_s s_t + W_c c_t);

computing the content vector c_i by multiplying the attention weights with the encoder hidden-layer states, together with the vocabulary distribution P_vocab:

P_vocab = softmax(L(s_t, c_i))

where h_i denotes the encoder hidden-layer state of the i-th word, c_i the content vector, and s_t the decoder hidden-layer state at time step t;

computing the generation probability p_gen by:

p_gen = σ(W_c' c_i + W_h' h_i + W_x x_t + b_ptr)

and obtaining the final vocabulary probability distribution by combining the vocabulary distribution with the attention distribution.
Preferably, the pointer generation network employs a hierarchical pointer generation network.
The implementation of the invention has the following beneficial effects:
the invention provides a method for extracting a generative abstract of a power policy file, which adopts a Seq2Seq framework integrated with an attention mechanism as a basic model for generating the abstract, and adds a pointer to generate a network at the same time, so that words are directly copied from a source document to solve the OOV problem;
and then combining a hierarchical structure of the policy document, and adding language model modeling language segment information of a language segment level (section level) on the basis of a pointer generation network. In the technology of language model modeling language segment information, the invention abandons the traditional RNN and LSTM structures, introduces a bidirectional converter model as a language representation model in a Seq2Seq framework integrated with an attention mechanism, and effectively solves the problem of long-distance dependence. The invention designs an improved attention mechanism to solve the problems of incoherent irrelevant content and repeated sentences in long texts.
The invention designs an automatic abstract identification method suitable for the long text aiming at the characteristics of the electricity price policy text that the long text and the writing format are relatively fixed, and integrates the special format characteristics of the automatic abstract identification method. The efficiency and the accuracy of the abstract extraction process can be improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is within the scope of the present invention for those skilled in the art to obtain other drawings based on the drawings without inventive exercise.
Fig. 1 is a schematic main flow chart of an embodiment of the method for extracting a generative abstract of a power policy file according to the present invention;
fig. 2 is a schematic diagram of the hierarchical pointer generation network referred to in fig. 1.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments.
Fig. 1 is a main flow diagram illustrating an embodiment of a method for extracting a generative digest of a power policy file according to the present invention. Referring to fig. 2 together, in this embodiment, the method includes the following steps:
step S10, acquiring an electronic document of the power policy file from a specific website by adopting a crawler technology;
step S11, performing word segmentation processing on the electronic document, forming initial embedded data according to a word vector model, and inputting the initial embedded data into a pre-trained abstract generation model;
wherein the abstract generation model employs an encoder-decoder framework in which an attention-based bidirectional Transformer model is used as the language representation model;
the encoder consists of two sub-layers, a multi-head attention layer and a fully-connected feed-forward neural network layer, with residual connections between the sub-layers followed by layer normalization; the decoder comprises at least a masked multi-head attention layer, a multi-head attention layer and a fully-connected feed-forward neural network layer, likewise connected by residual connections and layer-normalized;
step S12, adding position coding in the bottom embedding of the encoder and the decoder;
step S13, obtaining the generation probability of a pointer-generator network by concatenating the decoder outputs at the current and previous time steps with the attention distribution, and using this probability to control whether content is copied from the source text of the electronic document or corresponding abstract content is generated according to the attention. Specifically, if the decoded word is not in the vocabulary distribution, it is copied using the multi-head attention distribution; if it is in the vocabulary distribution, its distributed representation is used.
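Step S10 above acquires the documents by crawling. A minimal sketch of the link-extraction part, using only the standard library, is shown below; the keyword filter and all names are illustrative assumptions (a real crawler would also fetch the pages, e.g. with urllib):

```python
from html.parser import HTMLParser

class PolicyLinkParser(HTMLParser):
    """Collect (href, anchor-text) pairs whose anchor text contains a keyword,
    e.g. '电价' (electricity price); the keyword filter is an assumption."""
    def __init__(self, keyword):
        super().__init__()
        self.keyword = keyword
        self._href = None          # href of the <a> tag currently open, if any
        self.links = []
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
    def handle_data(self, data):
        if self._href and self.keyword in data:
            self.links.append((self._href, data.strip()))
    def handle_endtag(self, tag):
        if tag == "a":
            self._href = None

def extract_policy_links(html, keyword="电价"):
    """Return candidate policy-document links from one listing page."""
    parser = PolicyLinkParser(keyword)
    parser.feed(html)
    return parser.links
```

The returned links would then be fetched and their body text passed to the word-segmentation step S11.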
It is understood that, in a specific example of the present invention, further comprising:
and constructing a summary generation model in advance and training to obtain the trained summary generation model.
In one example, the pre-constructing and training of the summary generation model to obtain the trained summary generation model further includes:
constructing a summary generation model adopting an encoder-decoder framework, wherein a bidirectional converter model based on an attention mechanism is used in both an encoder and a decoder;
counting all words in the training corpus and generating a dictionary file; and forming a training set;
and embedding the words of the dictionary file over the training set into the encoder of the summary generation model through the word vector model as initial embeddings, training the summary generation model, and finally obtaining the trained summary generation model.
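The dictionary-building step above (count every word in the segmented training corpus, then emit a word-to-id dictionary file) can be sketched as follows; the special tokens and the size limit are illustrative assumptions, not values from the patent:

```python
from collections import Counter

def build_vocab(segmented_corpus, max_size=50000,
                specials=("<PAD>", "<UNK>", "<SOS>", "<EOS>")):
    """Count all words in the segmented corpus and map word -> integer id,
    with the special tokens occupying the first ids."""
    counts = Counter(w for doc in segmented_corpus for w in doc)
    vocab = {tok: i for i, tok in enumerate(specials)}
    for w, _ in counts.most_common(max_size - len(specials)):
        vocab[w] = len(vocab)
    return vocab
```

The resulting ids index the word-vector table that supplies the initial embeddings fed into the encoder.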
Specifically, in step S13, the generation probability is obtained in the pointer-generator network as follows:

computing the attention product of each embedded input word with the decoder output, normalizing to obtain the weights, and taking the weighted sum to obtain the attention score e_i:

e_i = v^T tanh(W_h h_i + W_s s_t + W_c c_t);

computing the content vector c_i by multiplying the attention weights with the encoder hidden-layer states, together with the vocabulary distribution P_vocab:

P_vocab = softmax(L(s_t, c_i))

where h_i denotes the encoder hidden-layer state of the i-th word, c_i the content vector, and s_t the decoder hidden-layer state at time step t;

computing the generation probability p_gen by:

p_gen = σ(W_c' c_i + W_h' h_i + W_x x_t + b_ptr)

and obtaining the final vocabulary probability distribution by combining the vocabulary distribution with the attention distribution.
In one example of the present invention, the pointer generation network employs a hierarchical pointer generation network.
For better understanding, the following further describes each of the aspects of the present invention.
First, the embodiments provided herein adopt a Sequence-to-Sequence (Seq2Seq) framework with an attention mechanism, initially using RNNs as the encoder and the decoder. A pointer-generator network is added at the same time, and the initial word embeddings obtained from a pre-trained word vector model are used as the input of the model.
It can be understood that this differs from a plain attention-based Seq2Seq framework, in which the decoder generates a vocabulary distribution P_vocab through the softmax function: at the decoder stage, the pointer-generator network additionally performs an attention computation over the words of the source document, producing an attention distribution. The pointer-generator network computes the attention product of each embedded input word with the decoder output, normalizes to obtain the weights, and takes the weighted sum to obtain the attention score e_i:
e_i = v^T tanh(W_h h_i + W_s s_t + W_c c_t);

computing the content vector c_i by multiplying the attention weights with the encoder hidden-layer states, together with the vocabulary distribution P_vocab:

P_vocab = softmax(L(s_t, c_i))

where h_i denotes the encoder hidden-layer state of the i-th word, c_i the content vector, and s_t the decoder hidden-layer state at time step t;

computing the generation probability p_gen by:

p_gen = σ(W_c' c_i + W_h' h_i + W_x x_t + b_ptr)

and obtaining the final vocabulary probability distribution by combining the vocabulary distribution with the attention distribution;
where P_vocab is the vocabulary distribution and the normalized attention weights form the attention distribution. p_gen can be viewed as a soft switch that controls whether a word is copied from the input sequence or generated anew: for an unregistered (out-of-vocabulary) word, P_vocab is 0, so the word can only be obtained by copying; conversely, a word that does not appear in the input text can only be generated by the model.
The probability of each generated word is thereby obtained as the final output result.
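The combination of vocabulary and attention distributions can be sketched numerically as follows; the extended-vocabulary layout (the document's OOV words appended after the fixed vocabulary) and all names are illustrative assumptions:

```python
import math

def sigmoid(x):
    """The σ used to squash the p_gen logit into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def final_distribution(p_vocab, attention, src_ids, p_gen, n_oov):
    """P(w) = p_gen * P_vocab(w) + (1 - p_gen) * sum of attention over the
    source positions holding w.  OOV words have P_vocab = 0, so they are
    reachable only through the copy (attention) term."""
    p = [p_gen * pv for pv in p_vocab] + [0.0] * n_oov
    for a, wid in zip(attention, src_ids):   # src_ids index the extended vocabulary
        p[wid] += (1.0 - p_gen) * a
    return p
```

With a 2-word vocabulary, one in-document OOV word at source position 1, and p_gen = 0.6, the OOV word keeps 40% of its attention mass while in-vocabulary words mix both terms.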
Secondly, in the invention, a pointer with a hierarchical structure is adopted to generate a network.
Since electricity price policy articles are usually structured, they are organized into sections. When a summary is written, information is generally extracted from the different sections and then synthesized. The invention therefore adds section-level language model modeling of segment information on top of the prior art.
As shown in fig. 2, a schematic diagram of a hierarchical pointer generation network employed by the present invention is shown.
For the encoder:
the lowest, word-level RNN generates a representation of each section, where the superscript (s) denotes the section level, (t) the decoding step, (e) the encoder and (d) the decoder, while the subscript i denotes the word index and j the section index;
x_(j,i) denotes the word-embedding vector of the i-th word of the j-th section;
the section-level RNN then generates a representation of the whole document from these section representations.
For the decoder:
the context coefficient carries section-level information: it is obtained by first summing the attention-weighted encoder states within each section and then summing over all sections, where h^(e)_(j,i) denotes the encoder hidden-layer state of the i-th word of the j-th section.
The newly introduced variable is the section-level attention, computed from the encoder hidden-layer state h^(s)_j of the j-th section and the decoder hidden-layer state at time t-1.
In general, the attention coefficient of the i-th word of the j-th section is computed by scaling the word-level attention, obtained from h^(e)_(j,i) and the decoder hidden-layer state at time t-1, with the section-level attention of section j.
The coverage vector c_t accumulates the attention distributions of the previous decoding steps, and the final probability is computed from the decoder hidden-layer state at time t and the coverage vector c_t.
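A minimal numeric sketch of the "sum within a section, then over sections" computation follows. The factorized weighting (section-level attention multiplied by within-section word attention) and all names are our assumptions, since the source presents the exact formulas only graphically:

```python
import math

def softmax(xs):
    m = max(xs)                      # subtract max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]

def hierarchical_context(word_scores, section_scores, enc_states):
    """word_scores[j][i]: raw attention score of word i in section j;
    section_scores[j]: raw score of section j;
    enc_states[j][i]: encoder hidden vector of word i in section j.
    Combined weight = section attention * within-section word attention;
    the context vector sums inside each section, then over all sections."""
    beta = softmax(section_scores)
    dim = len(enc_states[0][0])
    context = [0.0] * dim
    weights = []
    for j, (scores_j, states_j) in enumerate(zip(word_scores, enc_states)):
        alpha_j = softmax(scores_j)          # within-section distribution
        row = [beta[j] * a for a in alpha_j]
        weights.append(row)
        for w, h in zip(row, states_j):
            for d in range(dim):
                context[d] += w * h[d]
    return context, weights
```

Because each within-section distribution sums to 1 and the section weights sum to 1, the combined weights again form a distribution over all words of the document.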
Third, in the present invention, the RNN model in the encoder-decoder framework is replaced with a bidirectional Transformer model (Transformer).
The bidirectional Transformer model effectively solves the long-distance dependency problem and supports parallel computation. It is built on self-attention layers and is divided into an encoder part and a decoder part; this structure can be combined with the model structures described above.
At the encoder side, the input word-embedding vector and the corresponding elements of the position-embedding vector are added, so that the model learns more information about word positions within a sentence and can ultimately distinguish words at different positions. The result is fed into the self-attention layer, the attention coefficients are computed, and finally a vector Z is output to the next encoder.
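The position information added to the bottom-layer embeddings (step S12) is commonly realized with sinusoidal encodings; the sketch below follows that standard scheme and is illustrative rather than the patent's exact construction:

```python
import math

def positional_encoding(max_len, d_model):
    """Sinusoidal position encodings:
    PE[pos, 2i]   = sin(pos / 10000^(2i/d_model))
    PE[pos, 2i+1] = cos(pos / 10000^(2i/d_model))"""
    pe = [[0.0] * d_model for _ in range(max_len)]
    for pos in range(max_len):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)
    return pe

def add_position(embeddings, pe):
    """Element-wise addition of position encodings to bottom-layer embeddings."""
    return [[e + p for e, p in zip(e_row, p_row)]
            for e_row, p_row in zip(embeddings, pe)]
```

Because each position yields a distinct sin/cos pattern, identical words at different positions receive different input vectors.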
Here the input of the attention mechanism is the query Q, while the key-value pairs (K, V) store the context. In self-attention, Q, K and V are all derived from the same text, and the coefficients are computed from the similarity of the text with itself. The outputs of the multiple heads are concatenated, so that multi-head attention lets the current word exhibit richer relationships with the other words.
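A single-head version of this similarity computation can be sketched as follows (a toy, dependency-free illustration; in self-attention the same matrix is passed for Q, K and V):

```python
import math

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V for a single head;
    Q, K, V are lists of row vectors."""
    d_k = len(K[0])
    out = []
    for q in Q:
        scores = [sum(qd * kd for qd, kd in zip(q, k)) / math.sqrt(d_k)
                  for k in K]
        m = max(scores)                      # subtract max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]
        out.append([sum(w * v[d] for w, v in zip(weights, V))
                    for d in range(len(V[0]))])
    return out
```

In the multi-head case, this computation is repeated with separate learned projections of Q, K and V per head, and the head outputs are concatenated.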
Then an element-wise addition is performed through an Add layer, followed by layer normalization to stabilize training. A feed-forward neural network then maps the attention matrix Z into a higher-dimensional space, applies the ReLU non-linearity, and finally projects back to the same dimensionality as Z.
After six identical encoder blocks, a vector R is finally output, representing all the encoded information of the source sequence. R is converted into the two vectors K and V, the key-value pairs that store the context; these are used in the computation of the encoder-decoder attention layer in the decoder part, thereby integrating the information of the encoder and the decoder.
In the decoder part, the processing before the Linear layer is the same as in the encoder. Because the decoder performs the prediction process, a Linear layer must expand the dimensionality of the output vector to the length of the dictionary; softmax normalization then yields the final probability distribution over the whole dictionary, and the index with the maximum probability is selected to obtain the corresponding generated word.
This word is then input as the next predicted word, and so on until the sentence end flag < EOS > is generated, at which point the decoder portion ends.
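The decoding loop just described — feed the generated word back in and take the argmax each step until `<EOS>` — can be sketched as follows; `step_fn`, the per-step model call, is a hypothetical stand-in for the trained decoder plus Linear and softmax layers:

```python
def greedy_decode(step_fn, sos_id, eos_id, max_len=100):
    """Feed the generated prefix back into the model, append the
    highest-probability word each step, and stop once <EOS> appears.
    step_fn maps the generated prefix to a probability list over the dictionary."""
    out = [sos_id]
    for _ in range(max_len):
        probs = step_fn(out)
        next_id = max(range(len(probs)), key=probs.__getitem__)
        out.append(next_id)
        if next_id == eos_id:
            break
    return out[1:]   # drop the start-of-sequence token
```

The `max_len` cap guards against the model never emitting the end flag; beam search would keep several prefixes instead of the single argmax.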
It can be appreciated that the method of the present invention solves the OOV problem in policy documents by adopting the attention-based Seq2Seq framework together with an improved pointer-generator network; compared with a plain attention-based Seq2Seq framework, the pointer-generator network can copy words directly from the source document, which works very well for generating OOV words. Since policy articles are usually structured into sections, and people normally summarize by extracting information from the different sections and then synthesizing it, the invention adds section-level language model modeling of segment information to the basic pointer-generator network, yielding a pointer-generator network with a hierarchical structure.
Meanwhile, a bidirectional Transformer model is introduced into the attention-based Seq2Seq framework as the language representation model to address the tendency of existing attention mechanisms to produce repetitive and incoherent long texts. The abstract generation model designed by the invention does not over-attend to any specific part and therefore does not generate repeated sentences. First, the attention layer at the encoder computes a weight for every word of the input, so the generated content covers the original text instead of focusing on one particular passage. The attention layer at the decoder likewise computes weights over the words already generated, which avoids producing duplicate content. After attention is applied at the encoder and the decoder respectively, the two are concatenated and decoded to generate the next word, so that the generation of repeated sentences is avoided.
The implementation of the invention has the following beneficial effects:
the invention provides a method for extracting a generative abstract of a power policy file, which adopts a Seq2Seq framework integrated with an attention mechanism as a basic model for generating the abstract, and adds a pointer to generate a network at the same time, so that words are directly copied from a source document to solve the OOV problem;
and then combining a hierarchical structure of the policy document, and adding language model modeling language segment information of a language segment level (section level) on the basis of a pointer generation network. In the technology of language model modeling language segment information, the invention abandons the traditional RNN and LSTM structures, introduces a bidirectional converter model as a language representation model in a Seq2Seq framework integrated with an attention mechanism, and effectively solves the problem of long-distance dependence. The invention designs an improved attention mechanism to solve the problems of incoherent irrelevant content and repeated sentences in long texts.
The invention designs an automatic abstract identification method suitable for the long text aiming at the characteristics of the electricity price policy text that the long text and the writing format are relatively fixed, and integrates the special format characteristics of the automatic abstract identification method. The efficiency and the accuracy of the abstract extraction process can be improved.
While the invention has been described in connection with what is presently considered to be the most practical and preferred embodiment, it is to be understood that the invention is not to be limited to the disclosed embodiment, but on the contrary, is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.
Claims (5)
1. A method for extracting a generative abstract of a power policy document is characterized by comprising the following steps:
step S10, acquiring an electronic document of the power policy file from a specific website by adopting a crawler technology;
step S11, performing word segmentation processing on the electronic document, forming initial embedded data according to a word vector model, and inputting the initial embedded data into a pre-trained abstract generation model;
wherein the abstract generation model employs an encoder-decoder framework in which an attention-based bidirectional Transformer model is used as the language representation model;
the encoder consists of two sub-layers, a multi-head attention layer and a fully-connected feed-forward neural network layer, with residual connections between the sub-layers followed by layer normalization; the decoder comprises at least a masked multi-head attention layer, a multi-head attention layer and a fully-connected feed-forward neural network layer, likewise connected by residual connections and layer-normalized;
step S12, adding position coding in the bottom embedding of the encoder and the decoder;
step S13, obtaining the generation probability of a pointer-generator network by concatenating the decoder outputs at the current and previous time steps with the attention distribution, and using this probability to control whether content is copied from the source text of the electronic document or corresponding abstract content is generated according to the attention.
2. The method of claim 1, further comprising:
and constructing a summary generation model in advance and training to obtain the trained summary generation model.
3. The method of claim 2, wherein the pre-constructing and training of the summary generation model to obtain the trained summary generation model further comprises:
constructing a summary generation model adopting an encoder-decoder framework, wherein a bidirectional converter model based on an attention mechanism is used in both an encoder and a decoder;
counting all words in the training corpus and generating a dictionary file; and forming a training set;
and initially embedding the dictionary files in the training set into an encoder of the abstract generating model through a vector model, training the abstract generating model, and finally obtaining the trained abstract generating model.
4. The method of claim 1, wherein in step S13, the generation probability is obtained in the pointer generation network by:
calculating the attention product of each word-embedding input with the decoder output, and normalizing to obtain the attention weights, where the attention score e_i is:

e_i = ν^T tanh(W_h h_i + W_s s_t + W_c c_t);
calculating the content vector c_i by multiplying the attention weights with the encoder hidden states, and the vocabulary distribution P_vocab:

P_vocab = softmax(L(s_t, c_i))

where h_i denotes the encoder hidden-layer state of the i-th word, c_i denotes the content vector, and s_i denotes the decoder hidden-layer state of the i-th word;
calculating the generation probability p_gen by:

p_gen = σ(W_c' c_i + W_h' h_i + W_x x_t + b_ptr)
obtaining the final vocabulary probability distribution by combining the vocabulary distribution with the attention distribution.
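The mixing step of claim 4 can be sketched in NumPy as below. This is an illustrative sketch only: all shapes and weights are random placeholders, and p_gen is simplified to condition on c_t and s_t (the claim's formula also includes h_i and the decoder input x_t). The point shown is the final mixture p_gen · P_vocab + (1 − p_gen) · attention, scattered over the source-token vocabulary ids:

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)
V, T, d = 8, 5, 4                    # vocab size, source length, hidden size

h = rng.normal(size=(T, d))          # encoder hidden states h_i
s_t = rng.normal(size=d)             # decoder state at step t
W_h, W_s = rng.normal(size=(d, d)), rng.normal(size=(d, d))
v = rng.normal(size=d)

# attention scores e_i = v^T tanh(W_h h_i + W_s s_t), normalized to weights a
e = np.tanh(h @ W_h.T + s_t @ W_s.T) @ v
a = softmax(e)

# content (context) vector c_t: attention-weighted sum of encoder states
c_t = a @ h

# vocabulary distribution P_vocab from decoder state and context vector
L_proj = rng.normal(size=(V, 2 * d))
P_vocab = softmax(L_proj @ np.concatenate([s_t, c_t]))

# generation probability (simplified): p_gen = sigma(w_c·c_t + w_s·s_t + b)
w_c, w_s_vec, b_ptr = rng.normal(size=d), rng.normal(size=d), 0.0
p_gen = 1 / (1 + np.exp(-(w_c @ c_t + w_s_vec @ s_t + b_ptr)))

# final distribution: generate from the vocabulary with probability p_gen,
# copy a source token (via its attention weight) with probability 1 - p_gen
src_ids = np.array([3, 1, 4, 1, 5])  # vocab id of each source word
P_final = p_gen * P_vocab
np.add.at(P_final, src_ids, (1 - p_gen) * a)   # accumulate repeated ids
print(P_final.sum())                           # ≈ 1.0, a valid distribution
```

Because both P_vocab and the attention weights each sum to 1, the mixture is itself a probability distribution, which is what lets the model copy out-of-vocabulary source words.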
5. The method of any of claims 1 to 4, wherein the pointer generation network employs a hierarchical pointer generation network.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111550623.2A CN114254175A (en) | 2021-12-17 | 2021-12-17 | Method for extracting generative abstract of power policy file |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114254175A true CN114254175A (en) | 2022-03-29 |
Family
ID=80795597
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111550623.2A Pending CN114254175A (en) | 2021-12-17 | 2021-12-17 | Method for extracting generative abstract of power policy file |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114254175A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116933785A (en) * | 2023-06-30 | 2023-10-24 | 国网湖北省电力有限公司武汉供电公司 | Transformer-based electronic file abstract generation method, system and medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110134771B (en) | Implementation method of multi-attention-machine-based fusion network question-answering system | |
US11741109B2 (en) | Dialogue system, a method of obtaining a response from a dialogue system, and a method of training a dialogue system | |
US11210306B2 (en) | Dialogue system, a method of obtaining a response from a dialogue system, and a method of training a dialogue system | |
CN113158665B (en) | Method for improving dialog text generation based on text abstract generation and bidirectional corpus generation | |
CN111460092B (en) | Multi-document-based automatic complex problem solving method | |
CN110020438A (en) | Enterprise or tissue Chinese entity disambiguation method and device based on recognition sequence | |
CN110413768B (en) | Automatic generation method of article titles | |
CN109992775B (en) | Text abstract generation method based on high-level semantics | |
CN112765345A (en) | Text abstract automatic generation method and system fusing pre-training model | |
CN111666756B (en) | Sequence model text abstract generation method based on theme fusion | |
CN112417901A (en) | Non-autoregressive Mongolian machine translation method based on look-around decoding and vocabulary attention | |
CN110807324A (en) | Video entity identification method based on IDCNN-crf and knowledge graph | |
CN112818698A (en) | Fine-grained user comment sentiment analysis method based on dual-channel model | |
CN115062140A (en) | Method for generating abstract of BERT SUM and PGN fused supply chain ecological district length document | |
CN115600581B (en) | Controlled text generation method using syntactic information | |
CN114139497A (en) | Text abstract extraction method based on BERTSUM model | |
CN114218928A (en) | Abstract text summarization method based on graph knowledge and theme perception | |
CN113239666A (en) | Text similarity calculation method and system | |
CN112417138A (en) | Short text automatic summarization method combining pointer generation type and self-attention mechanism | |
CN115048511A (en) | Bert-based passport layout analysis method | |
Qiu et al. | Text summarization based on multi-head self-attention mechanism and pointer network | |
Wang et al. | Vector-to-sequence models for sentence analogies | |
CN114281982B (en) | Book propaganda abstract generation method and system adopting multi-mode fusion technology | |
CN116663578A (en) | Neural machine translation method based on strategy gradient method improvement | |
CN114254175A (en) | Method for extracting generative abstract of power policy file |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||