CN115589446A - Meeting abstract generation method and system based on pre-training and prompting - Google Patents
- Publication number
- CN115589446A CN115589446A CN202211172546.6A CN202211172546A CN115589446A CN 115589446 A CN115589446 A CN 115589446A CN 202211172546 A CN202211172546 A CN 202211172546A CN 115589446 A CN115589446 A CN 115589446A
- Authority
- CN
- China
- Prior art keywords
- conference
- model
- text
- training
- vector
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Images
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/42—Systems providing special services or facilities to subscribers
- H04M3/56—Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities
- H04M3/568—Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities audio processing specific to telephonic conferencing, e.g. spatial distribution, mixing of participants
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/16—Speech classification or search using artificial neural networks
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
Abstract
The invention discloses a meeting abstract generation method and system based on pre-training and prompting. The method comprises the following steps: acquiring conference text data and performing mapping and segmentation processing to obtain a segmented conference text vector sequence; pre-training a Transformer-XL model on an unlabeled text dataset to obtain a trained Transformer-XL model; and inputting the segmented conference text vector sequence into the trained Transformer-XL model, guided by a task prompt method, to obtain a conference abstract. The invention can predict the text information of a conference and thereby generate an abstract that better matches the conference content. The method and system can be widely applied in the technical field of natural language processing.
Description
Technical Field
The invention relates to the technical field of natural language processing, in particular to a meeting abstract generation method and system based on pre-training and prompting.
Background
The two mainstream approaches to automatic meeting summarization are extractive methods, represented by TextRank, and generative methods, represented by improved Transformer algorithms. An extractive method simply truncates passages of the meeting-record text, so the resulting abstract easily omits important information and contains no further understanding or reasoning. In particular, for meeting records that mainly take the form of multi-person dialogue, an extractive method cannot capture the information conveyed in the participants' exchanges; it can only intercept a few explicit statements. The resulting abstract is therefore inflexible, and the parts produced without understanding may not match what was actually expressed in the meeting.
Disclosure of Invention
To solve the above technical problems, an object of the present invention is to provide a method and system for generating a meeting summary based on pre-training and prompting, which can predict the text information of a meeting so as to generate a meeting summary that better matches the meeting content.
The first technical scheme adopted by the invention is as follows: a conference abstract generation method based on pre-training and prompting comprises the following steps:
acquiring conference text data, and performing mapping and segmentation processing to obtain a fragmented conference text vector sequence;
pre-training the Transformer-XL model based on the label-free text data set to obtain a trained Transformer-XL model;
and inputting the segmented conference text vector sequence into a trained Transformer-XL model based on a task prompt method to obtain a conference abstract.
Further, the step of obtaining the conference text data and performing mapping and segmentation processing to obtain a fragmented conference text vector sequence specifically includes:
acquiring audio data of a conference through audio equipment;
converting the audio data of the conference through an API (application programming interface) to obtain conference text data;
mapping the conference text data to obtain a conference text vector sequence;
and segmenting the conference text vector sequence to obtain a segmented conference text vector sequence.
Further, the step of mapping the conference text data to obtain a conference text vector sequence specifically includes:
splitting the conference text data according to characters of the conference text data to obtain a conference text character sequence;
constructing a dictionary, wherein the dictionary comprises numerical indexes for mapping the conference text character sequence;
and assigning the numerical indexes in the dictionary to the conference text character sequence and representing the sequence by these indexes to obtain a conference text vector sequence.
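The dictionary-based mapping described above can be sketched as follows; this is a minimal illustration, and the function names are hypothetical rather than taken from the patent:

```python
def build_vocab(text):
    """Assign each unique character a numerical index starting from 0."""
    vocab = {}
    for ch in text:
        if ch not in vocab:
            vocab[ch] = len(vocab)
    return vocab

def encode(text, vocab):
    """Convert a character string into its numerical-index representation."""
    return [vocab[ch] for ch in text]

transcript = "meeting begins"
vocab = build_vocab(transcript)
token_sequence = encode(transcript, vocab)  # each character replaced by its index
```

Each character of the conference text is thereby replaced by its dictionary index, yielding the conference text vector sequence.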
Further, the pre-training of the Transformer-XL model based on the label-free text data set to obtain the trained Transformer-XL model specifically includes:
acquiring a label-free text data set, and performing data preprocessing to obtain a training set;
inputting the training set into a Transformer-XL model, wherein the Transformer-XL model comprises an input layer, a hidden layer and an output layer;
performing word embedding and position embedding processing on a training set based on an input layer of a Transformer-XL model to obtain a hidden state vector of the training set;
based on a hidden layer of a Transformer-XL model, carrying out data hiding operation processing on hidden state vectors of a training set to obtain hidden vectors with context information;
on the basis of an output layer of a Transformer-XL model, carrying out projection processing on a hidden vector with context information to obtain an output result;
and updating the Transformer-XL model by a gradient descent method based on the output result to obtain the trained Transformer-XL model.
Further, the hidden layer of the Transformer-XL pre-training model is expressed as follows (formula not reproduced in the original text):
where h_τ^j denotes the hidden-state vector output after the τ-th segment token sequence has passed through j layers of the Transformer-XL model; W_o denotes the output projection matrix and two further matrices denote projections; 1_k^⊤ denotes the transpose of the k-dimensional all-ones vector; and μ denotes an arbitrary constant.
Further, the output layer of the Transformer-XL pre-training model is expressed as follows (formula not reproduced in the original text):
where W_u denotes a projection matrix, h^m denotes the hidden-state vector output after passing through the m identical hidden layers, Y denotes the token sequence of the subsequent output text, and X denotes the token sequence of the input text.
Further, the step of inputting the segmented conference text vector sequence into the trained Transformer-XL model based on a task prompt method to obtain a conference abstract specifically includes:
acquiring a conference abstract data set and inputting the conference abstract data set into a trained Transformer-XL model for fine tuning processing to obtain a task prompt;
guided by the task prompt, sequentially inputting the segmented conference text vector sequences into the trained Transformer-XL model to obtain the vector sequence of the conference abstract text;
and converting the vector sequence of the text of the conference abstract to obtain the conference abstract.
Further, the step of obtaining a meeting summary data set and inputting the meeting summary data set to a trained Transformer-XL model to obtain a task prompt specifically includes:
acquiring a conference abstract data set and carrying out data preprocessing to obtain a conference abstract vector;
setting a constant matrix of the context range of the conference abstract task and carrying out mapping processing through the mapping matrix to obtain an initial memory vector sequence;
inputting the initial memory vector sequence and the conference abstract vector into the trained Transformer-XL model to obtain a predicted value;
and updating parameters of the mapping matrix through a gradient descent method based on the predicted value to finally obtain a task prompt vector.
Further, the step of sequentially inputting the segmented conference text vector sequence into the trained Transformer-XL model based on the task prompt to obtain the vector sequence of the conference abstract text specifically includes:
after the single hidden layer full-connection neural network model is trained through data, generating an initial memory vector;
splicing the initial memory vector with the text vector sequence of the fragmented conference to obtain a spliced vector;
updating the initial memory vectors with the spliced vectors respectively and inputting them into the trained Transformer-XL model;
outputting hidden state vectors of a hidden layer by a Transformer-XL model after task prompt guide training;
splicing the hidden state vector of the hidden layer and the updated memory vector and inputting the spliced hidden state vector and the updated memory vector into a trained Transformer-XL model for traversal training;
judging whether the number of traversals reaches the number of hidden layers of the trained Transformer-XL model;
and if the layer-number requirement is not met, repeating the splicing, updating and model-input steps until it is met, traversing all the segmented conference text vector sequences, and outputting the vector sequence of the conference abstract text.
The second technical scheme adopted by the invention is as follows: a meeting abstract generating system based on pre-training and prompting comprises:
the acquisition module acquires the conference text data and performs mapping and segmentation processing to obtain a fragmented conference text vector sequence;
the training module is used for pre-training the Transformer-XL model based on the label-free text data set to obtain the trained Transformer-XL model;
and the output module is used for inputting the segmented conference text vector sequence into the trained Transformer-XL model based on a task prompt method to obtain a conference abstract.
The method and the system have the following beneficial effects: by constructing a Transformer-XL model and pre-training it on a large amount of conference text, the pre-trained Transformer-XL model acquires the ability to process information segment by segment. Conference-text abstracts are then used to fine-tune the trained model, which prompts the pre-trained model about the summarization task so that relevant knowledge can be elicited from it, saving the computational cost of model training. Because a generative summarization method is adopted, the dialogue in meeting-record texts can be understood and reasoned about, so the generated abstract better matches the meeting content.
Drawings
FIG. 1 is a flowchart illustrating the steps of a method for generating a meeting summary based on pre-training and prompting according to the present invention;
FIG. 2 is a block diagram of a conference summary generation system based on pre-training and prompting according to the present invention;
FIG. 3 is a flowchart of the steps of generating the abstract after the Transformer-XL pre-training model of the present invention obtains the task prompt.
Detailed Description
The invention is described in further detail below with reference to the figures and the specific embodiments. The step numbers in the following embodiments are provided only for convenience of illustration, the order between the steps is not limited at all, and the execution order of each step in the embodiments can be adapted according to the understanding of those skilled in the art.
Referring to fig. 1, the invention provides a conference abstract generating method based on pre-training and prompting, which comprises the following steps:
s1, conference text data are obtained and are subjected to mapping and segmentation processing, and a fragmented conference text vector sequence is obtained;
specifically, a conference site is recorded in real time until the conference is finished to obtain a recording file of the whole conference content, a voice-to-text API service interface provided by a network platform is directly called (here, API service provided on the network is directly called, such as Korea communication voice transfer, the recording of the whole conference content is converted into a plain text, each character of a plain text character string is unpacked to form a character sequence, a dictionary is constructed for mapping the character of the character string type to a character sequenceIn the numerical indexes starting from 0, according to a dictionary, each unique character is allocated with a numerical index, a character sequence split by a text character string is converted from a character string representation to a numerical index representation, a final token sequence can be obtained, and the token sequence X = (X =) is obtained 1 ,x 2 ,…,x n ) The method is divided according to the following formula, and the expression is as follows:
in the above formula, k represents the size of the division, x i ∈X,The fragment sequence X' = (X) can then be obtained 1 ,x 2 ,…,x n′ )。
S2, pre-training the Transformer-XL model based on the label-free text data set to obtain a trained Transformer-XL model;
specifically, a large-scale label-free data set (generally, text data such as hundredth known, question and answer known, hundredth encyclopedia and the like collected on a network) is established, data cleaning operations such as de-duplication, missing value removal and the like are carried out on the data, text preprocessing is carried out, the data is converted into token sequences, a transform-XL autoregressive model, namely a transform-XL pre-training model is established, and tasks of the model can be formalized by giving a token sequence X = (X) representing an input text 1 ,x 2 ,…,x n ) Outputting token sequence Y = (Y) of subsequent text 1 ,y 2 ,…,y n ) When the conditional probability P (Y | X) of token sequence Y = (Y) is obtained 1 ,y 2 ,…,y n ) The probability of (c) is the maximum, and the token sequence Y = (Y) of the subsequent text can be obtained finally 1 ,y 2 ,…,y n );
S21, training an input layer of a Transformer-XL pre-training model;
specifically, a token sequence X = (X) of the entire input text 1 ,x 2 ,…,x n ) Divided into shorter pieces of fixed sizeInputting the fragments into a Transformer-XL model one by one, enabling the length of each fragment to be k, and setting the sequence x of the tau fragment token τ =(x τ,1 ,x τ,2 ,…,x τ,k ) In the input layer of the transform-XL model, the following operations are performed:
in the above formula, the first and second carbon atoms are,representative token sequence x representing input layer output τ =(x τ,1 ,x τ,2 ,…,x τ,k ) Hidden state vector of (2), W e Word-embedding matrix, W, representing token p Position-embedding matrix, W, representing token e 、W p Trainable parameters of the input layer are represented.
S22, training a hidden layer of a Transformer-XL pre-training model;
specifically, the transform-XL model comprises a set of m identical hidden layers, and for the j ∈ {2, \ 8230;, m +1} layers of the transform-XL model, there are:
in the above formula, the first and second carbon atoms are,represents a hidden state vector, W, output after the tau segment token sequence passes through j layers of a Transformer-XL model o A matrix of output projections is represented which,anda matrix of projections is represented which,denotes an inverse matrix of all 1 vectors of the k dimension, and μ denotes an arbitrary constant.
S23, training an output layer of a Transformer-XL pre-training model;
specifically, for the transform-XL model output layer, there is an expression trained as follows:
in the above formula, W u A matrix of projections is represented which,is represented byOutputting hidden state vectors after passing through m same hidden layers, wherein Y represents outputting a subsequent text mark sequence, and X represents a text mark sequence representing text input;
An objective function max_φ P(Y | X; φ) is established, where φ denotes all trainable parameters of the Transformer-XL model, and the model parameters φ are updated by the gradient descent method:

φ ← φ + η ∇_φ log P(Y | X; φ)

where φ denotes the updatable parameters of the Transformer-XL model and η denotes the learning rate. After multiple iterations, the trained Transformer-XL model parameters φ are obtained, i.e. the pre-trained Transformer-XL model.
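The gradient-descent update of step S23 can be illustrated on a toy output layer; the setup below (dimensions, learning rate, a single training example) is purely illustrative and not the patent's actual training procedure:

```python
import numpy as np

rng = np.random.default_rng(1)
d, V = 4, 6                                # hidden size, vocab size (illustrative)
W_u = rng.standard_normal((d, V)) * 0.1    # output projection, part of phi
h = rng.standard_normal(d)                 # hidden vector from the last hidden layer
target = 3                                 # index of the true next token

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

eta = 0.5
for _ in range(100):
    p = softmax(h @ W_u)
    # gradient of -log P(target | h) with respect to W_u
    grad = np.outer(h, p - np.eye(V)[target])
    W_u -= eta * grad                      # phi <- phi - eta * grad

p_final = softmax(h @ W_u)[target]         # probability of the true next token
```

Repeating such updates over the whole training set drives up log P(Y | X; φ), which is exactly the objective established above.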
And S3, inputting the segmented conference text vector sequence into the trained Transformer-XL model based on a task prompt method to obtain a conference abstract.
S31, acquiring a prompt, namely a meeting task prompt;
specifically, referring to fig. 3, a data cleansing operation such as deduplication and missing value removal is performed on the conference summary data set, and a text is performedPreprocessing, converting the data into token sequences, wherein for each piece of data, the token sequence x represents the text of the meeting record and belongs to the input of the model, and the token sequence y represents the meeting abstract content summarized by related professionals according to the meeting record without settingRepresenting a range of contextual contexts that describe the task of summarization, then:
in the above formula, W θ Represents a mapping matrix, B θ Representing the mapping matrix, | representing the l-th layer of the Transformer-XL pre-training model,representing an initial memory vector;
wherein W_θ and B_θ are the training parameters of the single-hidden-layer fully connected neural network model, and the initial memory vector sequence m_0 is the initial input of the Transformer-XL pre-training model;
the initial memory vector sequenceInputting Transformer-XL pre-training model with sequence xThe predicted value y' can be obtained, so that the objective function can be established as follows:
in the above formula, z = [ x; y is]Means x and Y are spliced, i represents time step, Y idz Context range, h, representing y i A contextual context representing all histories;
The autoregressive Transformer-XL pre-training model computes h_i as a function of z_i and its past left-side context h_{<i}, so that:

h_i = P_θ[i, :] if i ∈ P_idx, and h_i = LM_φ(z_i, h_{<i}) otherwise

where P_idx denotes the prompt index range and P_θ denotes a learnable parameter matrix. Here h_i (for all i) is taken directly from P_θ when i ∈ P_idx; when i ∉ P_idx, h_i still depends on P_θ, because the initial memory vectors provided by P_θ are always in the left (i.e. historical) context and therefore affect the representation of every subsequent context to their right;
The model parameter θ is updated by the gradient descent method; the update expression is as follows:

θ ← θ + η ∇_θ Σ_{i ∈ Y_idx} log P(z_i | h_{<i}; θ)

After multiple iterations, the trained parameter matrix P_θ can be obtained, and from it the initial memory vector sequence that represents the context prompting the summarization task.
S32, after the prompt is obtained, the segment sequences representing the segmented conference-content text are input sequentially into the Transformer-XL pre-training model;
in particular, token sequence x for the 1 st text segment 1 After passing through an input layer of a Transformer-XL neural network, an input layer hidden state vector can be obtainedHidden state vector of input layer of Transformer-XL neural network in memory vector sequenceAfter stitching, as input for the layer of the next Transformer-XL pre-training model, we can:
in the above formula, L ∈ {1,2, \ 8230;, L } represents the L-th layer of the transform-XL pre-training model, when τ -1=0, the hidden state vector is from the initial memory vector sequence, layer (·) represents the specific operation of the layer of the transform-XL pre-training model, and the vector which represents the output hidden state of the L-th layer of the transform-XL pre-training model in the memory vector sequence is directly replaced into
After the τ-th segment token sequence x_τ is input and has passed through the L layers of the Transformer-XL pre-training model, the hidden-state vector output by the last hidden layer, h_τ^L, and an updated memory vector sequence are obtained.

After each segment of the segment sequence X′ = (x′_1, x′_2, …, x′_{n′}) has been processed according to the above steps, the final hidden-state vector of the last segment and the updated memory vector sequence are obtained.
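Carrying the memory across segments, as in the loop above, can be sketched with a stand-in layer in place of the real Transformer-XL attention; all names and sizes here are illustrative:

```python
import numpy as np

L_layers, k, d = 2, 4, 8                     # layers, segment length, hidden size
rng = np.random.default_rng(3)
Ws = [rng.standard_normal((d, d)) * 0.1 for _ in range(L_layers)]

def layer(h_cat, l):
    """Stand-in for one Transformer-XL layer: maps the spliced
    [memory ; segment] states and keeps only the current-segment positions."""
    return np.tanh(h_cat @ Ws[l])[-k:]

def forward_segments(segments, memory):
    """memory[l] caches the previous segment's input states for layer l."""
    for h in segments:                       # h: (k, d) input-layer states
        new_memory = []
        for l in range(L_layers):
            h_cat = np.concatenate([memory[l], h], axis=0)  # splice memory
            new_memory.append(h)             # cached for the next segment
            h = layer(h_cat, l)
        memory = new_memory
    return h, memory

init_mem = [np.zeros((k, d)) for _ in range(L_layers)]  # initial memory vectors
segs = [rng.standard_normal((k, d)) for _ in range(3)]
out, mem = forward_segments(segs, init_mem)  # final hidden states + updated memory
```

Because each layer attends over the spliced memory, information from earlier segments can influence later segments even though the segments are processed one at a time.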
S33, after the token sequences representing the segmented conference-content text have been input into the Transformer-XL pre-training model, the tokens of the conference abstract are generated continuously in an autoregressive manner;
Specifically, the hidden-state vector is input into the output layer of the Transformer-XL neural network to obtain the distribution of the first abstract-text token, and thus the first abstract-text token.

Combining the above steps, the hidden-state vector is input into the output layer of the Transformer-XL neural network to obtain an abstract-text token; the token and the memory vector sequence are input into the Transformer-XL pre-training model to obtain the distribution of the next abstract-text token and the current latest memory vector sequence.

The above steps are repeated until the model generates the token representing the end symbol. At that point, the token sequence from the first abstract-text token to the end-symbol token is the token sequence of the abstract of the entire conference content; converting this token sequence back into text form finally yields the abstract text of the entire conference content.
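The autoregressive decoding loop of step S33 can be sketched as follows; `fake_model` is a deterministic stand-in for the trained model's next-token prediction, so every name here is illustrative:

```python
END = 0   # token index representing the end symbol (illustrative)

def fake_model(tokens):
    """Stand-in for the trained model: a real implementation would take the
    current tokens plus the memory vectors and return the most probable
    next token from the output-layer distribution."""
    return tokens[-1] - 1 if tokens[-1] > 0 else END

def generate_summary(context, max_len=50):
    """Generate abstract tokens autoregressively until the end symbol."""
    tokens = list(context)
    summary = []
    for _ in range(max_len):
        nxt = fake_model(tokens)
        if nxt == END:
            break
        summary.append(nxt)
        tokens.append(nxt)
    return summary
```

With the stand-in model, `generate_summary([3])` yields `[2, 1]`; in the real system the loop would also thread the updated memory-vector sequence through each step and finally decode the token sequence back into text.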
Referring to fig. 2, a conference summary generation system based on pre-training and prompting includes:
the acquisition module acquires conference text data and performs mapping and segmentation processing to obtain a fragmented conference text vector sequence;
the training module is used for pre-training the Transformer-XL model based on the label-free text data set to obtain the trained Transformer-XL model;
and the output module is used for inputting the segmented conference text vector sequence into the trained Transformer-XL model based on a task prompt method to obtain a conference abstract.
The contents in the method embodiments are all applicable to the system embodiments, the functions specifically implemented by the system embodiments are the same as those in the method embodiments, and the beneficial effects achieved by the system embodiments are also the same as those achieved by the method embodiments.
While the preferred embodiments of the present invention have been illustrated and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.
Claims (10)
1. A conference abstract generating method based on pre-training and prompting is characterized by comprising the following steps:
acquiring conference text data, and mapping and segmenting the conference text data to obtain a fragmented conference text vector sequence;
pre-training the Transformer-XL model based on the label-free text data set to obtain a trained Transformer-XL model;
and inputting the segmented conference text vector sequence into a trained Transformer-XL model based on a task prompt method to obtain a conference abstract.
2. The method for generating the conference abstract based on the pre-training and prompting as claimed in claim 1, wherein the step of obtaining the conference text data and performing mapping and segmentation processing to obtain a fragmented conference text vector sequence specifically comprises:
acquiring audio data of a conference through audio equipment;
converting the audio data of the conference through an API (application programming interface) to obtain conference text data;
mapping the conference text data to obtain a conference text vector sequence;
and segmenting the conference text vector sequence to obtain a segmented conference text vector sequence.
3. The method for generating the conference abstract based on the pre-training and the prompting as claimed in claim 2, wherein the step of mapping the conference text data to obtain the conference text vector sequence specifically comprises:
splitting the conference text data according to characters of the conference text data to obtain a conference text character sequence;
constructing a dictionary, wherein the dictionary comprises numerical indexes for mapping the conference text character sequence;
and assigning the numerical indexes in the dictionary to the conference text character sequence and representing the sequence by these indexes to obtain a conference text vector sequence.
4. The method for generating a conference summary based on pre-training and prompting as claimed in claim 3, wherein the step of pre-training a Transformer-XL model based on a label-free text dataset to obtain a trained Transformer-XL model specifically includes:
acquiring a label-free text data set, and performing data preprocessing to obtain a training set;
inputting the training set into a Transformer-XL model, wherein the Transformer-XL model comprises an input layer, a hidden layer and an output layer;
performing word embedding and position embedding processing on the training set based on the input layer of the Transformer-XL model to obtain hidden state vectors of the training set;
based on the hidden layer of the Transformer-XL model, carrying out data hiding operation processing on the hidden state vectors of the training set to obtain hidden vectors with context information;
on the basis of an output layer of a Transformer-XL model, carrying out projection processing on a hidden vector with context information to obtain an output result;
and updating the Transformer-XL model by a gradient descent method based on the output result to obtain the trained Transformer-XL model.
5. The method for generating a conference summary based on pre-training and prompting of claim 4, wherein the hidden layer of the Transformer-XL pre-training model is expressed as follows (formula not reproduced in the original text):

where h_τ^j denotes the hidden-state vector output after the τ-th segment token sequence has passed through j layers of the Transformer-XL model; W_o denotes the output projection matrix and two further matrices denote projections; 1_k^⊤ denotes the transpose of the k-dimensional all-ones vector; and μ denotes an arbitrary constant.
6. The method for generating a meeting abstract based on pre-training and prompting of claim 4, wherein the output layer of the Transformer-XL pre-training model is expressed as follows (formula not reproduced in the original text):

where W_u denotes a projection matrix, h^m denotes the hidden-state vector output after passing through the m identical hidden layers, Y denotes the token sequence of the subsequent output text, and X denotes the token sequence of the input text.
7. The method for generating a meeting abstract based on pre-training and prompting as claimed in claim 4, wherein the step of inputting the segmented meeting text vector sequence into the trained Transformer-XL model based on a task prompt method to obtain a meeting abstract specifically comprises:
acquiring a conference abstract data set and inputting the conference abstract data set into a trained Transformer-XL model for fine adjustment processing to obtain a task prompt;
guided by the task prompt, sequentially inputting the segmented conference text vector sequences into the trained Transformer-XL model to obtain the vector sequence of the conference abstract text;
and converting the vector sequence of the text of the conference abstract to obtain the conference abstract.
8. The method of claim 7, wherein the step of acquiring a conference summary data set and inputting it into the trained Transformer-XL model to obtain a task prompt specifically comprises:
acquiring a conference summary data set and performing data preprocessing to obtain conference summary vectors;
setting a constant matrix over the context range of the conference summary task and mapping it through the mapping matrix to obtain an initial memory vector sequence;
inputting the initial memory vector sequence and the conference summary vectors into the trained Transformer-XL model to obtain a predicted value;
and updating the parameters of the mapping matrix by gradient descent based on the predicted value, finally obtaining the task prompt vector.
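The steps of claim 8 amount to prompt tuning: the pre-trained model stays frozen, and only the mapping matrix that produces the initial memory (prompt) vectors is updated by gradient descent. A minimal sketch of that loop, with all shapes, names, and the toy frozen layer being illustrative assumptions rather than the patent's actual model:

```python
# Illustrative prompt-tuning sketch: only `mapping` is trained; `frozen_w`
# (a stand-in for the pre-trained Transformer-XL) is never updated.
import random

random.seed(0)

DIM = 4          # hidden size (illustrative)
PROMPT_LEN = 2   # number of prompt/memory vectors (illustrative)
LR = 0.1

# Frozen pre-trained weights (stand-in for the trained Transformer-XL layer).
frozen_w = [[0.5 if i == j else 0.1 for j in range(DIM)] for i in range(DIM)]

# Trainable mapping matrix applied to a constant context matrix (claim 8).
mapping = [[random.uniform(-0.1, 0.1) for _ in range(DIM)] for _ in range(PROMPT_LEN)]
context = [[1.0] * DIM for _ in range(PROMPT_LEN)]

def matvec(w, x):
    return [sum(w[i][j] * x[j] for j in range(len(x))) for i in range(len(w))]

def prompt_vectors():
    # Initial memory vector sequence = mapping matrix (x) constant context matrix.
    return [[mapping[i][j] * context[i][j] for j in range(DIM)]
            for i in range(PROMPT_LEN)]

def forward(prompt, x):
    # Crude "splice" of prompt vectors into the input, then one frozen layer;
    # the mean of the output serves as a scalar predicted value.
    h = x[:]
    for p in prompt:
        h = [hi + pi for hi, pi in zip(h, p)]
    h = matvec(frozen_w, h)
    return sum(h) / len(h)

x = [1.0, 0.0, -1.0, 0.5]
target = 1.0

for step in range(200):
    err = forward(prompt_vectors(), x) - target
    # Gradient descent on the mapping matrix only (chain rule through the
    # frozen layer); the frozen weights are left untouched.
    for i in range(PROMPT_LEN):
        for j in range(DIM):
            grad = err * context[i][j] * sum(frozen_w[k][j] for k in range(DIM)) / DIM
            mapping[i][j] -= LR * grad

final_err = abs(forward(prompt_vectors(), x) - target)
print(f"final |error| = {final_err:.4f}")
```

The design point mirrored here is that the task prompt is the only trainable component, which is what makes the method cheap to adapt per task.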
9. The method of claim 8, wherein the step of, guided by the task prompt, sequentially inputting the segmented conference text vector sequences into the trained Transformer-XL model to obtain the vector sequence of the conference summary text specifically comprises:
obtaining an initial memory vector after fine-tuning based on the task prompting method;
splicing the initial memory vector with a segmented conference text vector sequence to obtain a spliced vector;
updating the initial memory vector with the spliced vector and inputting it into the trained Transformer-XL model;
outputting, by the Transformer-XL model trained under task prompt guidance, the hidden state vector of a hidden layer;
splicing the hidden state vector of the hidden layer with the updated memory vector and inputting the result into the trained Transformer-XL model for traversal;
judging whether the number of traversals reaches the number of hidden layers of the trained Transformer-XL model;
and if the required number of layers is not reached, repeating the splicing, updating, and model-input steps until it is reached; after all segmented conference text vector sequences have been traversed, outputting the vector sequence of the conference summary text.
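The traversal in claim 9 is the Transformer-XL segment-level recurrence: each layer's hidden states are cached as memory and spliced onto the next segment's states at the same layer. A toy sketch of that control flow, where `layer_forward` is a deliberately trivial stand-in (a running mean) for an actual Transformer-XL hidden layer:

```python
# Illustrative segment-recurrence sketch (not the patent's exact computation).
N_LAYERS = 3   # number of hidden layers (illustrative)
SEG_LEN = 4    # tokens per segment (illustrative)

def layer_forward(spliced):
    # Stand-in for one hidden layer: each of the last SEG_LEN positions
    # attends to (here: averages over) everything before it in the splice.
    out = []
    for i in range(len(spliced) - SEG_LEN, len(spliced)):
        window = spliced[: i + 1]
        out.append(sum(window) / len(window))
    return out

def run_segments(segments):
    # memory[l] caches the previous segment's states entering layer l.
    memory = [[] for _ in range(N_LAYERS)]
    outputs = []
    for seg in segments:
        h = seg
        for l in range(N_LAYERS):
            spliced = memory[l] + h   # splice cached memory with current states
            memory[l] = h             # update memory for the next segment
            h = layer_forward(spliced)
        outputs.extend(h)             # top-layer states feed summary generation
    return outputs

segments = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]
outs = run_segments(segments)
print(len(outs))  # one top-layer state per input token
```

The loop terminates for each segment exactly when all N_LAYERS hidden layers have been traversed, matching the layer-count check in the claim.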
10. A conference summary generation system based on pre-training and prompting, characterized by comprising the following modules:
an acquisition module, which acquires conference text data and performs mapping and segmentation processing to obtain segmented conference text vector sequences;
a training module, which pre-trains the Transformer-XL model on an unlabeled text data set to obtain a trained Transformer-XL model;
and an output module, which inputs the segmented conference text vector sequences into the trained Transformer-XL model based on a task prompting method to obtain the conference summary.
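The three modules of claim 10 can be sketched as a skeleton; every class and method name below is a hypothetical illustration, and the vectorization, training, and summarization bodies are trivial placeholders for the real mapping, pre-training, and prompt-guided generation:

```python
# Hypothetical skeleton of the claimed three-module system.
from dataclasses import dataclass

@dataclass
class AcquisitionModule:
    seg_len: int = 4
    def acquire(self, text: str):
        # Map characters to toy scalar "vectors" and split into segments.
        vecs = [float(ord(c)) for c in text]
        return [vecs[i:i + self.seg_len] for i in range(0, len(vecs), self.seg_len)]

@dataclass
class TrainingModule:
    trained: bool = False
    def pretrain(self, unlabeled_corpus):
        # Stand-in for Transformer-XL pre-training on unlabeled text.
        self.trained = len(unlabeled_corpus) > 0
        return self

@dataclass
class OutputModule:
    model: TrainingModule
    def summarize(self, segments):
        assert self.model.trained, "model must be pre-trained first"
        # Stand-in for prompt-guided generation: one value per segment.
        return [sum(s) / len(s) for s in segments]

acq = AcquisitionModule()
trainer = TrainingModule().pretrain(["some unlabeled text"])
summary = OutputModule(trainer).summarize(acq.acquire("meeting notes"))
print(len(summary))
```

The module boundaries mirror the claim: acquisition produces segmented vector sequences, training yields the model, and output consumes both.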
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211172546.6A CN115589446A (en) | 2022-09-26 | 2022-09-26 | Meeting abstract generation method and system based on pre-training and prompting |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211172546.6A CN115589446A (en) | 2022-09-26 | 2022-09-26 | Meeting abstract generation method and system based on pre-training and prompting |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115589446A true CN115589446A (en) | 2023-01-10 |
Family
ID=84777960
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211172546.6A Pending CN115589446A (en) | 2022-09-26 | 2022-09-26 | Meeting abstract generation method and system based on pre-training and prompting |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115589446A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115965033A (en) * | 2023-03-16 | 2023-04-14 | 安徽大学 | Generation type text summarization method and device based on sequence level prefix prompt |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111897949A (en) * | 2020-07-28 | 2020-11-06 | 北京工业大学 | Guided text abstract generation method based on Transformer |
CN112765345A (en) * | 2021-01-22 | 2021-05-07 | 重庆邮电大学 | Text abstract automatic generation method and system fusing pre-training model |
CN112862662A (en) * | 2021-03-12 | 2021-05-28 | 云知声智能科技股份有限公司 | Method and equipment for distributed training of transform-xl language model |
CN113282750A (en) * | 2021-05-27 | 2021-08-20 | 成都数之联科技有限公司 | Model training method, system, device and medium |
CN114372140A (en) * | 2021-12-31 | 2022-04-19 | 北京海联捷讯科技股份有限公司 | Layered conference abstract generation model training method, generation method and device |
- 2022-09-26 CN CN202211172546.6A patent/CN115589446A/en active Pending
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111897949A (en) * | 2020-07-28 | 2020-11-06 | 北京工业大学 | Guided text abstract generation method based on Transformer |
CN112765345A (en) * | 2021-01-22 | 2021-05-07 | 重庆邮电大学 | Text abstract automatic generation method and system fusing pre-training model |
CN112862662A (en) * | 2021-03-12 | 2021-05-28 | 云知声智能科技股份有限公司 | Method and equipment for distributed training of transform-xl language model |
CN113282750A (en) * | 2021-05-27 | 2021-08-20 | 成都数之联科技有限公司 | Model training method, system, device and medium |
CN114372140A (en) * | 2021-12-31 | 2022-04-19 | 北京海联捷讯科技股份有限公司 | Layered conference abstract generation model training method, generation method and device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108415977B (en) | Deep neural network and reinforcement learning-based generative machine reading understanding method | |
JP7419508B2 (en) | Contrastive pre-training for language tasks | |
CN111783474B (en) | Comment text viewpoint information processing method and device and storage medium | |
CN107273503B (en) | Method and device for generating parallel text in same language | |
CN111897933B (en) | Emotion dialogue generation method and device and emotion dialogue model training method and device | |
CN112435656B (en) | Model training method, voice recognition method, device, equipment and storage medium | |
CN111444340A (en) | Text classification and recommendation method, device, equipment and storage medium | |
CN112214604A (en) | Training method of text classification model, text classification method, device and equipment | |
CN111966800B (en) | Emotion dialogue generation method and device and emotion dialogue model training method and device | |
CN110162766B (en) | Word vector updating method and device | |
CN111930914B (en) | Problem generation method and device, electronic equipment and computer readable storage medium | |
CN109344242B (en) | Dialogue question-answering method, device, equipment and storage medium | |
US20220383206A1 (en) | Task Augmentation and Self-Training for Improved Few-Shot Learning | |
CN113435211B (en) | Text implicit emotion analysis method combined with external knowledge | |
CN111767694B (en) | Text generation method, apparatus and computer readable storage medium | |
CN112632244A (en) | Man-machine conversation optimization method and device, computer equipment and storage medium | |
CN112183106B (en) | Semantic understanding method and device based on phoneme association and deep learning | |
CN118043885A (en) | Contrast twin network for semi-supervised speech recognition | |
CN114443899A (en) | Video classification method, device, equipment and medium | |
CN114091466A (en) | Multi-modal emotion analysis method and system based on Transformer and multi-task learning | |
CN115589446A (en) | Meeting abstract generation method and system based on pre-training and prompting | |
CN114169408A (en) | Emotion classification method based on multi-mode attention mechanism | |
CN116860943A (en) | Multi-round dialogue method and system for dialogue style perception and theme guidance | |
WO2023009740A1 (en) | Contrastive learning and masked modeling for end-to-end self-supervised pre-training | |
CN115422329A (en) | Knowledge-driven multi-channel screening fusion dialogue generation method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||