CN115589446A - Meeting abstract generation method and system based on pre-training and prompting - Google Patents


Info

Publication number
CN115589446A
CN115589446A (application CN202211172546.6A)
Authority
CN
China
Prior art keywords
conference
model
text
training
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211172546.6A
Other languages
Chinese (zh)
Inventor
罗彦卓
李滔
孟伟
林超纯
张秀屏
麦永钦
卓汉强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Black Box Technology Guangzhou Co ltd
Original Assignee
Black Box Technology Guangzhou Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Black Box Technology Guangzhou Co ltd filed Critical Black Box Technology Guangzhou Co ltd
Priority to CN202211172546.6A priority Critical patent/CN115589446A/en
Publication of CN115589446A publication Critical patent/CN115589446A/en
Pending legal-status Critical Current


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04MTELEPHONIC COMMUNICATION
    • H04M3/00Automatic or semi-automatic exchanges
    • H04M3/42Systems providing special services or facilities to subscribers
    • H04M3/56Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities
    • H04M3/568Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities audio processing specific to telephonic conferencing, e.g. spatial distribution, mixing of participants
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/06Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063Training
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/08Speech classification or search
    • G10L15/16Speech classification or search using artificial neural networks
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/26Speech to text systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Artificial Intelligence (AREA)
  • Signal Processing (AREA)
  • Evolutionary Computation (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a meeting abstract generation method and system based on pre-training and prompting. The method comprises the following steps: acquiring conference text data and performing mapping and segmentation processing to obtain a segmented conference text vector sequence; pre-training a Transformer-XL model based on a label-free text dataset to obtain a trained Transformer-XL model; and inputting the segmented conference text vector sequence into the trained Transformer-XL model based on a task prompt method to obtain a conference abstract. With the invention, the text information of the conference can be predicted, so that a conference abstract that better matches the conference content is generated. The meeting abstract generation method and system based on pre-training and prompting can be widely applied in the technical field of natural language processing.

Description

Meeting abstract generation method and system based on pre-training and prompting
Technical Field
The invention relates to the technical field of natural language processing, in particular to a meeting abstract generation method and system based on pre-training and prompting.
Background
The existing mainstream automatic meeting-abstract techniques mainly comprise extractive methods, represented by TextRank, and generative methods, represented by improved Transformer algorithms. An extractive method directly truncates the meeting-record text, so the generated abstract easily omits important information and involves no further understanding or reasoning. In particular, for meeting records that consist mainly of multi-person dialogue, an extractive method cannot capture the information contained in the exchanges between participants; it can only intercept some explicit viewpoints. The resulting abstract is inflexible, and the lack of understanding may cause the abstract to diverge from what was actually expressed in the meeting.
Disclosure of Invention
In order to solve the above technical problems, an object of the present invention is to provide a method and a system for generating a meeting abstract based on pre-training and prompting, which can predict the text information of a meeting so as to generate a meeting abstract that better matches the meeting content.
The first technical scheme adopted by the invention is as follows: a conference abstract generation method based on pre-training and prompting comprises the following steps:
acquiring conference text data, and performing mapping and segmentation processing to obtain a fragmented conference text vector sequence;
pre-training the Transformer-XL model based on the label-free text data set to obtain a trained Transformer-XL model;
and inputting the segmented conference text vector sequence into a trained Transformer-XL model based on a task prompt method to obtain a conference abstract.
Further, the step of obtaining the conference text data and performing mapping and segmentation processing to obtain a fragmented conference text vector sequence specifically includes:
acquiring audio data of a conference through audio equipment;
converting the audio data of the conference through an API (application programming interface) to obtain conference text data;
mapping the conference text data to obtain a conference text vector sequence;
and segmenting the conference text vector sequence to obtain a segmented conference text vector sequence.
Further, the step of mapping the conference text data to obtain a conference text vector sequence specifically includes:
splitting the conference text data according to characters of the conference text data to obtain a conference text character sequence;
constructing a dictionary, wherein the dictionary comprises numerical indexes for mapping the conference text character sequence;
and allocating the number index in the dictionary to the conference text character sequence and expressing the number index to obtain a conference text vector sequence.
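As an illustrative sketch (not from the patent; names are hypothetical), the dictionary-based character-to-index mapping described in these steps could look like:

```python
def build_vocab(text):
    """Assign each unique character a numerical index starting from 0."""
    vocab = {}
    for ch in text:
        if ch not in vocab:
            vocab[ch] = len(vocab)
    return vocab

def encode(text, vocab):
    """Convert a character sequence to its numerical-index representation."""
    return [vocab[ch] for ch in text]

transcript = "abcab"           # stand-in for the conference text
vocab = build_vocab(transcript)
tokens = encode(transcript, vocab)
# vocab = {'a': 0, 'b': 1, 'c': 2}; tokens = [0, 1, 2, 0, 1]
```

The resulting list of indexes is the "conference text vector sequence" that the later steps segment and feed to the model.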
Further, the pre-training of the Transformer-XL model based on the label-free text data set to obtain the trained Transformer-XL model specifically includes:
acquiring a label-free text data set, and performing data preprocessing to obtain a training set;
inputting the training set into a Transformer-XL model, wherein the Transformer-XL model comprises an input layer, a hidden layer and an output layer;
performing word embedding and position embedding processing on a training set based on an input layer of a Transformer-XL model to obtain a hidden state vector of the training set;
based on a hidden layer of a Transformer-XL model, carrying out data hiding operation processing on hidden state vectors of a training set to obtain hidden vectors with context information;
on the basis of an output layer of a Transformer-XL model, carrying out projection processing on a hidden vector with context information to obtain an output result;
and updating the Transformer-XL model by a gradient descent method based on the output result to obtain the trained Transformer-XL model.
Further, the expression of the hidden layer of the Transformer-XL pre-training model is as follows:
[The hidden-layer formula appears in the source only as an image and is not reproduced here.]
In that formula, h^j_τ represents the hidden state vector output after the τ-th segment token sequence passes through j layers of the Transformer-XL model, W_o represents the output projection matrix, W_q and W_k represent projection matrices, 1_k^T represents the transpose of the k-dimensional all-ones vector, and μ represents an arbitrary constant.
Further, the expression of the output layer of the Transformer-XL pre-training model is as follows:
P(Y | X) = softmax(W_u · h^m_τ)
In the above formula, W_u represents a projection matrix, h^m_τ represents the hidden state vector output after passing through the m identical hidden layers, Y represents the output subsequent text token sequence, and X represents the input text token sequence.
Further, the step of inputting the segmented conference text vector sequence into the trained Transformer-XL model based on the task prompt method to obtain the conference abstract specifically includes:
acquiring a conference abstract data set and inputting the conference abstract data set into a trained Transformer-XL model for fine tuning processing to obtain a task prompt;
guiding based on the task prompt, and sequentially inputting the segmented conference text vector sequences into the trained Transformer-XL model to obtain the vector sequence of the conference abstract text;
and converting the vector sequence of the text of the conference abstract to obtain the conference abstract.
Further, the step of obtaining a meeting summary data set and inputting the meeting summary data set to a trained Transformer-XL model to obtain a task prompt specifically includes:
acquiring a conference abstract data set and carrying out data preprocessing to obtain a conference abstract vector;
setting a constant matrix of the context range of the conference abstract task and carrying out mapping processing through the mapping matrix to obtain an initial memory vector sequence;
inputting the initial memory vector sequence and the conference abstract vector into the trained Transformer-XL model to obtain a predicted value;
and updating parameters of the mapping matrix through a gradient descent method based on the predicted value to finally obtain a task prompt vector.
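As a hedged sketch of the mapping step above (the patent maps a constant matrix through a trainable single-hidden-layer network; this stand-in simplifies that to one linear map, with illustrative names and values):

```python
def map_prompt(c, W, B):
    """p = W @ c + B: map a constant context vector c to an initial
    memory vector via a mapping matrix W and bias B (both trainable)."""
    return [sum(w * x for w, x in zip(row, c)) + b for row, b in zip(W, B)]

c = [1.0, 0.0]                # constant context vector (illustrative)
W = [[2.0, 0.0], [0.0, 3.0]]  # mapping matrix (stand-in for W_theta)
B = [0.5, -0.5]               # bias (stand-in for B_theta)
p0 = map_prompt(c, W, B)      # initial memory vector for one layer
# p0 = [2.5, -0.5]
```

In the patent, gradient descent on the summary data updates these mapping parameters so that `p0` becomes the task prompt vector.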
Further, the step of sequentially inputting the segmented conference text vector sequence into the trained Transformer-XL model based on the task prompt to obtain the vector sequence of the conference abstract text specifically includes:
after the single hidden layer full-connection neural network model is trained through data, generating an initial memory vector;
splicing the initial memory vector with the text vector sequence of the fragmented conference to obtain a spliced vector;
updating the initial memory vectors with the spliced vectors, respectively, and inputting them into the trained Transformer-XL model;
outputting the hidden state vectors of the hidden layers from the Transformer-XL model trained under task-prompt guidance;
splicing the hidden state vector of the hidden layer with the updated memory vector and inputting the spliced result into the trained Transformer-XL model for traversal training;
judging whether the number of traversals reaches the number of hidden layers of the trained Transformer-XL model;
and if the layer-number requirement is not met, repeating the splicing, updating and model-input steps until it is met; then traversing all the segmented conference text vector sequences and outputting the vector sequence of the conference abstract text.
The second technical scheme adopted by the invention is as follows: a meeting abstract generating system based on pre-training and prompting comprises:
the acquisition module is used for acquiring conference text data and performing mapping and segmentation processing to obtain a segmented conference text vector sequence;
the training module is used for pre-training the Transformer-XL model based on the label-free text data set to obtain the trained Transformer-XL model;
and the output module is used for inputting the segmented conference text vector sequence into the trained Transformer-XL model based on a task prompt method to obtain a conference abstract.
The method and the system have the following beneficial effects: by constructing a Transformer-XL model and pre-training it on a large amount of conference text, the pre-trained Transformer-XL model gains the ability to process information segment by segment; conference-text abstracts are then used to fine-tune the trained Transformer-XL model, which prompts the pre-trained model about the abstract task so that relevant knowledge can be drawn out of the pre-trained model, saving the computational cost of model training; and because a generative abstract method is adopted, the dialogues in the meeting-record text can be understood and reasoned about, so that the generated abstract better matches the conference content.
Drawings
FIG. 1 is a flowchart illustrating the steps of a method for generating a meeting summary based on pre-training and prompting according to the present invention;
FIG. 2 is a block diagram of a conference summary generation system based on pre-training and prompting according to the present invention;
FIG. 3 is a flowchart of the steps of generating the abstract after the Transformer-XL pre-training model of the present invention obtains the task prompt.
Detailed Description
The invention is described in further detail below with reference to the figures and the specific embodiments. The step numbers in the following embodiments are provided only for convenience of illustration, the order between the steps is not limited at all, and the execution order of each step in the embodiments can be adapted according to the understanding of those skilled in the art.
Referring to fig. 1, the invention provides a conference abstract generating method based on pre-training and prompting, which comprises the following steps:
s1, conference text data are obtained and are subjected to mapping and segmentation processing, and a fragmented conference text vector sequence is obtained;
Specifically, the conference site is recorded in real time until the conference ends, yielding an audio file of the entire conference content. A speech-to-text API service provided by a network platform is called directly (i.e., an API service available on the network, such as a commercial speech-transcription service) to convert the recording of the entire conference into plain text. Each character of the plain-text string is split off to form a character sequence, and a dictionary is constructed to map characters of string type to numerical indexes starting from 0. According to the dictionary, each unique character is assigned a numerical index, and the character sequence split from the text string is converted from its string representation to a numerical-index representation, giving the final token sequence X = (x_1, x_2, …, x_n). This sequence is divided according to the following formula:
x'_i = (x_{(i-1)k+1}, …, x_{ik}), i = 1, 2, …, n', where n' = ⌈n/k⌉
In the above formula, k represents the segment size and x_i ∈ X. The fragment sequence X' = (x'_1, x'_2, …, x'_{n'}) can then be obtained.
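The segmentation into fixed-size fragments can be sketched as follows (illustrative code, not from the patent):

```python
def segment(tokens, k):
    """Split a token sequence into consecutive fragments of size k;
    the last fragment may be shorter than k."""
    return [tokens[i:i + k] for i in range(0, len(tokens), k)]

X = list(range(10))        # stand-in token sequence x_1..x_10
fragments = segment(X, 4)  # ceil(10 / 4) = 3 fragments
# fragments = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Each fragment is then fed to the Transformer-XL model one at a time, which is what lets the model handle meeting transcripts longer than a single context window.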
S2, pre-training the Transformer-XL model based on the label-free text data set to obtain a trained Transformer-XL model;
Specifically, a large-scale label-free dataset is established (typically text data collected from the web, such as Baidu Zhidao, Zhihu question-and-answer, and Baidu Baike), data-cleaning operations such as deduplication and missing-value removal are performed on the data, the text is preprocessed, and the data are converted into token sequences. A Transformer-XL autoregressive model, i.e., a Transformer-XL pre-training model, is established. The task of the model can be formalized as: given a token sequence X = (x_1, x_2, …, x_n) representing the input text, output the token sequence Y = (y_1, y_2, …, y_n) of the subsequent text; when the conditional probability P(Y | X) is maximal, the token sequence Y = (y_1, y_2, …, y_n) of the subsequent text is finally obtained;
S21, training an input layer of a Transformer-XL pre-training model;
Specifically, the token sequence X = (x_1, x_2, …, x_n) of the entire input text is divided into shorter fragments of fixed size, and the fragments are input into the Transformer-XL model one by one; let each fragment have length k, and let the τ-th fragment token sequence be x_τ = (x_{τ,1}, x_{τ,2}, …, x_{τ,k}). In the input layer of the Transformer-XL model, the following operation is performed:
h^0_τ = W_e x_τ + W_p
In the above formula, h^0_τ represents the hidden state vector of the input token sequence x_τ = (x_{τ,1}, x_{τ,2}, …, x_{τ,k}) output by the input layer, W_e represents the word-embedding matrix of the tokens, W_p represents the position-embedding matrix of the tokens, and W_e and W_p are trainable parameters of the input layer.
S22, training a hidden layer of a Transformer-XL pre-training model;
Specifically, the Transformer-XL model comprises a stack of m identical hidden layers; for layer j ∈ {2, …, m+1} of the Transformer-XL model:
[The hidden-layer formula appears in the source only as an image and is not reproduced here.]
In that formula, h^j_τ represents the hidden state vector output after the τ-th segment token sequence passes through j layers of the Transformer-XL model, W_o represents the output projection matrix, W_q and W_k represent projection matrices, 1_k^T represents the transpose of the k-dimensional all-ones vector, and μ represents an arbitrary constant.
S23, training an output layer of a Transformer-XL pre-training model;
Specifically, the output layer of the Transformer-XL model is trained with the following expression:
P(Y | X) = softmax(W_u · h^m_τ)
In the above formula, W_u represents a projection matrix, h^m_τ represents the hidden state vector output after passing through the m identical hidden layers, Y represents the output subsequent text token sequence, and X represents the input text token sequence;
an objective function max_φ P(Y | X; φ) is established, where φ represents all trainable parameters of the Transformer-XL model, and the model parameters φ are updated by gradient descent:
φ ← φ + η · ∇_φ log P(Y | X; φ)
In the above formula, φ represents the updatable parameters of the Transformer-XL model and η represents the learning rate;
after multiple iterations, the trained Transformer-XL model parameters φ are obtained, yielding the pre-trained Transformer-XL model.
And S3, inputting the segmented conference text vector sequence into the trained Transformer-XL model based on a task prompt method to obtain a conference abstract.
S31, acquiring a prompt, namely a meeting task prompt;
Specifically, referring to fig. 3, data-cleaning operations such as deduplication and missing-value removal are performed on the conference abstract dataset, the text is preprocessed, and the data are converted into token sequences. For each piece of data, the token sequence x represents the text of the meeting record and is the input of the model, while the token sequence y represents the meeting-abstract content summarized by relevant professionals from the meeting record. A constant matrix representing the contextual range that describes the summarization task is set; then:
[The mapping formula appears in the source only as an image and is not reproduced here.]
In that formula, W_θ and B_θ represent mapping matrices, l denotes the l-th layer of the Transformer-XL pre-training model, and p^0_l represents an initial memory vector;
wherein W_θ and B_θ are the training parameters of a single-hidden-layer fully connected neural network, and the initial memory vector sequence p^0 = (p^0_1, …, p^0_L) is the initial input of the Transformer-XL pre-training model;
the initial memory vector sequence p^0 is input into the Transformer-XL pre-training model together with the sequence x, and the predicted value y' is obtained, so that the objective function can be established as:
max_θ Σ_{i ∈ Y_idx} log P(z_i | h_{<i})
In the above formula, z = [x; y] means that x and y are spliced, i represents the time step, Y_idx represents the index range of y within z, and h_{<i} represents the contextual context of all history;
in the autoregressive Transformer-XL pre-training model, h_i is computed as a function of z_i and the past context h_{<i} to its left; then:
h_i = P_θ[i, :] if i ∈ P_idx; otherwise h_i is computed by the Transformer-XL model from z_i and h_{<i}
In the above formula, P_idx represents the index range of the prompt and P_θ represents a learnable parameter matrix; when i ∈ P_idx, h_i is taken directly from P_θ; when i ∉ P_idx, h_i still depends on P_θ, because the initial memory vectors provided by P_θ always lie in the left-hand (i.e., historical) context and therefore affect the representation of every subsequent position to their right;
the model parameters θ are updated by gradient descent, with the update expressed as:
θ ← θ + η · ∇_θ Σ_{i ∈ Y_idx} log P(z_i | h_{<i})
After multiple iterations, the trained parameter matrix P_θ is obtained, and thus the initial memory vector sequence p^0 representing the context of the abstract-task prompt can be obtained.
S32, after the prompt is obtained, the fragment sequences representing the segmented conference-content text are sequentially input into the Transformer-XL pre-training model;
Specifically, for the token sequence x_1 of the 1st text fragment, after passing through the input layer of the Transformer-XL neural network, the input-layer hidden state vector h^0_1 is obtained. The input-layer hidden state vector is spliced with the corresponding vector in the memory vector sequence and serves as the input of the next layer of the Transformer-XL pre-training model:
h^l_τ = Layer^l([m^{l-1}_{τ-1}; h^{l-1}_τ])
In the above formula, l ∈ {1, 2, …, L} represents the l-th layer of the Transformer-XL pre-training model; when τ − 1 = 0, the memory vector comes from the initial memory vector sequence; Layer(·) represents the specific operation of that layer of the Transformer-XL pre-training model; and the hidden state vector output by the l-th layer directly replaces the corresponding vector in the memory vector sequence.
After the input τ-th fragment token sequence x_τ passes through the L layers of the Transformer-XL pre-training model, the hidden state vector output by the last hidden layer of the Transformer-XL pre-training model and the updated memory vector sequence are obtained. After every fragment in the fragment sequence X' = (x'_1, x'_2, …, x'_{n'}) has been processed according to the above steps, the final hidden state vector of the last fragment and the updated memory vector sequence are obtained.
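The segment-by-segment memory splicing and updating described above can be sketched with stub layers (illustrative stand-ins, not the patent's model; the real layers would be Transformer-XL attention blocks):

```python
def run_segments(fragments, mems, layer_fns):
    """Feed fragments through L stub layers; at each layer, splice the
    cached memory with the current hidden state, then cache the current
    hidden state as memory for the next fragment (Transformer-XL style)."""
    for frag in fragments:
        h = frag                   # stand-in for the input-layer output
        new_mems = []
        for l, layer in enumerate(layer_fns):
            spliced = mems[l] + h  # concatenate memory and hidden state
            new_mems.append(h)     # current hidden state becomes next memory
            h = layer(spliced)
        mems = new_mems
    return h, mems

# Toy stand-ins: each "layer" just sums its spliced input into a 1-vector.
layers = [lambda s: [sum(s)]] * 2
final_h, final_mems = run_segments([[1, 2], [3]], mems=[[], []], layer_fns=layers)
```

The point of the sketch is the data flow: the second fragment sees the first fragment's hidden states through the memory splice, exactly as the recurrence above prescribes.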
S33, after the token sequences representing the segmented conference-content text have been input into the Transformer-XL pre-training model, the tokens of the conference abstract are generated continuously in an autoregressive manner;
Specifically, the final hidden state vector is input into the output layer of the Transformer-XL neural network to obtain the distribution of the first abstract-text token, and thus the first abstract-text token;
combining the above steps, the abstract-text token obtained from the output layer and the memory vector sequence are input back into the Transformer-XL pre-training model to obtain the distribution of the next abstract-text token and the current latest memory vector sequence;
these steps are repeated until the model generates the token representing the end symbol. At this point, the token sequence from the first abstract-text token to the end-symbol token is the token sequence of the abstract of the entire conference content; converting this token sequence of the conference abstract back into text form finally yields the abstract text of the entire conference content.
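The autoregressive generation loop just described can be sketched as follows (the model is replaced by a toy stand-in, and the end-token value and all names are illustrative):

```python
END = -1  # illustrative end-of-sequence token id

def greedy_decode(first_token, next_token_fn, max_len=50):
    """Autoregressively extend the abstract: starting from the first
    abstract token, repeatedly ask the model for the next token until
    it emits the end symbol (or a length cap is reached)."""
    summary = [first_token]
    while summary[-1] != END and len(summary) < max_len:
        summary.append(next_token_fn(summary))
    return summary

# Toy stand-in for the model: count up from the last token, stop at 3.
def toy_next(tokens):
    return END if tokens[-1] >= 3 else tokens[-1] + 1

tokens = greedy_decode(1, toy_next)
# tokens = [1, 2, 3, -1]
```

In the patent's setting, `next_token_fn` would sample from the distribution produced by the Transformer-XL output layer while also threading the memory vector sequence through each call.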
Referring to fig. 2, a conference summary generation system based on pre-training and prompting includes:
the acquisition module is used for acquiring conference text data and performing mapping and segmentation processing to obtain a segmented conference text vector sequence;
the training module is used for pre-training the Transformer-XL model based on the label-free text data set to obtain the trained Transformer-XL model;
and the output module is used for inputting the segmented conference text vector sequence into the trained Transformer-XL model based on a task prompt method to obtain a conference abstract.
The contents in the method embodiments are all applicable to the system embodiments, the functions specifically implemented by the system embodiments are the same as those in the method embodiments, and the beneficial effects achieved by the system embodiments are also the same as those achieved by the method embodiments.
While the preferred embodiments of the present invention have been illustrated and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A conference abstract generating method based on pre-training and prompting is characterized by comprising the following steps:
acquiring conference text data, and mapping and segmenting the conference text data to obtain a fragmented conference text vector sequence;
pre-training the Transformer-XL model based on the label-free text data set to obtain a trained Transformer-XL model;
and inputting the segmented conference text vector sequence into a trained Transformer-XL model based on a task prompt method to obtain a conference abstract.
2. The method for generating the conference abstract based on the pre-training and prompting as claimed in claim 1, wherein the step of obtaining the conference text data and performing mapping and segmentation processing to obtain a fragmented conference text vector sequence specifically comprises:
acquiring audio data of a conference through audio equipment;
converting audio data of the conference through an API (application program interface) interface to obtain conference text data;
mapping the conference text data to obtain a conference text vector sequence;
and segmenting the conference text vector sequence to obtain a segmented conference text vector sequence.
3. The method for generating the conference abstract based on the pre-training and the prompting as claimed in claim 2, wherein the step of mapping the conference text data to obtain the conference text vector sequence specifically comprises:
splitting the conference text data according to characters of the conference text data to obtain a conference text character sequence;
constructing a dictionary, wherein the dictionary comprises numerical indexes for mapping the conference text character sequence;
and allocating the number indexes in the dictionary to the conference text character sequence and expressing the number indexes to obtain a conference text vector sequence.
4. The method for generating a conference abstract based on pre-training and prompting as claimed in claim 3, wherein the step of pre-training a Transformer-XL model based on a label-free text dataset to obtain a trained Transformer-XL model specifically includes:
acquiring a label-free text data set, and performing data preprocessing to obtain a training set;
inputting the training set into a Transformer-XL model, wherein the Transformer-XL model comprises an input layer, a hidden layer and an output layer;
performing word embedding and position embedding processing on the training set based on an input layer of a transform-XL model to obtain a hidden state vector of the training set;
based on a hidden layer of a transform-XL model, carrying out data hiding operation processing on the hidden state vector of the training set to obtain a hidden vector with context information;
on the basis of an output layer of a Transformer-XL model, carrying out projection processing on a hidden vector with context information to obtain an output result;
and updating the Transformer-XL model by a gradient descent method based on the output result to obtain the trained Transformer-XL model.
5. The method for generating a conference summary based on pre-training and hinting of claim 4, wherein the expression of the hidden layer of the transform-XL pre-training model is as follows:
[The hidden-layer formula appears in the source only as an image and is not reproduced here.]
In that formula, h^j_τ represents the hidden state vector output after the τ-th segment token sequence passes through j layers of the Transformer-XL model, W_o represents the output projection matrix, W_q and W_k represent projection matrices, 1_k^T represents the transpose of the k-dimensional all-ones vector, and μ represents an arbitrary constant.
6. The method for generating a meeting abstract based on pre-training and prompting of claim 4, wherein the expression of the output layer of the transform-XL pre-training model is as follows:
P(Y | X) = softmax(W_u · h^m_τ)
In the above formula, W_u represents a projection matrix, h^m_τ represents the hidden state vector output after passing through the m identical hidden layers, Y represents the output subsequent text token sequence, and X represents the input text token sequence.
7. The method for generating a conference summary based on pre-training and prompting as claimed in claim 4, wherein inputting the segmented conference text vector sequences into the trained Transformer-XL model based on the task prompting method to obtain the conference summary specifically comprises:
acquiring a conference summary data set and inputting it into the trained Transformer-XL model for fine-tuning to obtain a task prompt;
guided by the task prompt, sequentially inputting the segmented conference text vector sequences into the trained Transformer-XL model to obtain the vector sequence of the conference summary text;
and converting the vector sequence of the conference summary text to obtain the conference summary.
8. The method of claim 7, wherein the step of acquiring a conference summary data set and inputting it into the trained Transformer-XL model for fine-tuning to obtain a task prompt specifically comprises:
acquiring a conference summary data set and performing data preprocessing to obtain conference summary vectors;
setting a constant matrix for the context range of the conference summarization task and mapping it through a mapping matrix to obtain an initial memory vector sequence;
inputting the initial memory vector sequence and the conference summary vectors into the trained Transformer-XL model to obtain a predicted value;
and updating the parameters of the mapping matrix by a gradient descent method based on the predicted value to finally obtain the task prompt vector.
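The prompt-tuning steps above can be sketched in miniature: a frozen constant matrix is mapped through a trainable mapping matrix, and only the mapping matrix is updated by gradient descent. The sizes, the all-ones constant matrix, and the squared-error training signal are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
d, k = 16, 4  # hidden size and prompt length (hypothetical)

# Frozen constant matrix fixing the summarization task's context range
C = np.ones((k, d))
# Mapping matrix: the only parameter that prompt tuning updates
M = rng.normal(0.0, 0.02, (d, d))

def prompt_vectors(M):
    # Mapping the constant matrix yields the initial memory (task-prompt) vectors
    return C @ M

# A toy regression target stands in for the model's fine-tuning signal
target = rng.normal(0.0, 1.0, (k, d))

def loss(M):
    return float(np.mean((prompt_vectors(M) - target) ** 2))

initial_loss = loss(M)
lr = 0.1
for _ in range(200):
    # Gradient of the mean-squared error with respect to M alone
    grad = 2.0 * C.T @ (prompt_vectors(M) - target) / (k * d)
    M = M - lr * grad

final_loss = loss(M)
assert final_loss < initial_loss  # the task-prompt vectors improved
```

In the claimed method the loss would come from the frozen Transformer-XL's predictions on the conference summary data, not from a fixed target as here.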
9. The method of claim 8, wherein the step of sequentially inputting the segmented conference text vector sequences into the trained Transformer-XL model, guided by the task prompt, to obtain the vector sequence of the conference summary text specifically comprises:
obtaining an initial memory vector after fine-tuning with the task prompting method;
splicing the initial memory vector with the segmented conference text vector sequence to obtain a spliced vector;
updating the initial memory vector with the spliced vector and inputting the spliced vector into the trained Transformer-XL model;
outputting, by the Transformer-XL model trained under task prompt guidance, the hidden state vector of the hidden layer;
splicing the hidden state vector of the hidden layer with the updated memory vector and inputting the result into the trained Transformer-XL model for traversal;
judging whether the number of traversals has reached the number of hidden layers of the trained Transformer-XL model;
and if the layer-number requirement is not met, repeating the splicing, updating and model-input steps until it is met, traversing all the segmented conference text vector sequences, and outputting the vector sequence of the conference summary text.
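The traversal above — splice cached memory into each layer, update the cache, and carry it across segments — can be sketched as follows. Layer count, memory length, and the tanh stand-in layers are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
d, mem_len, n_layers = 8, 4, 3  # hypothetical sizes

W = [rng.normal(0.0, 0.1, (d, d)) for _ in range(n_layers)]

def layer(x, i):
    return np.tanh(x @ W[i])  # stand-in for one Transformer-XL hidden layer

def summarize_segments(segments, init_memory):
    """Run each segment through every layer, splicing the memory cached
    from the previous segment into the layer input (toy version of the
    claimed splice/update/traverse loop)."""
    memory = [init_memory.copy() for _ in range(n_layers)]
    outputs = []
    for seg in segments:
        h = seg
        for i in range(n_layers):
            spliced = np.vstack([memory[i], h])  # splice memory + segment
            memory[i] = spliced[-mem_len:]       # update the cached memory
            h = layer(spliced, i)[-len(seg):]    # keep current positions only
        outputs.append(h)
    return np.vstack(outputs)

segments = [rng.normal(0.0, 1.0, (5, d)) for _ in range(3)]
init_memory = np.zeros((mem_len, d))  # e.g. the tuned task-prompt vectors
summary_vecs = summarize_segments(segments, init_memory)
assert summary_vecs.shape == (15, d)
```

In the claimed method the initial memory would be the task-prompt vectors from claim 8 rather than zeros, and the final vector sequence would be converted back into summary text.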
10. A conference summary generation system based on pre-training and prompting, characterized by comprising the following modules:
the acquisition module, which acquires the conference text data and performs mapping and segmentation to obtain a segmented conference text vector sequence;
the training module, which pre-trains the Transformer-XL model on a label-free text data set to obtain a trained Transformer-XL model;
and the output module, which inputs the segmented conference text vector sequence into the trained Transformer-XL model based on a task prompting method to obtain the conference summary.
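The three claimed modules map naturally onto a small pipeline skeleton. Everything concrete below — the whitespace tokenizer, the toy "embedding", and the identity model — is an illustrative stand-in, not the filed implementation:

```python
class AcquisitionModule:
    """Acquires conference text, maps it to vectors, and segments it."""
    def __init__(self, embed, segment_len):
        self.embed = embed
        self.segment_len = segment_len

    def run(self, meeting_text):
        vecs = [self.embed(tok) for tok in meeting_text.split()]
        n = self.segment_len
        return [vecs[i:i + n] for i in range(0, len(vecs), n)]

class TrainingModule:
    """Pre-trains the Transformer-XL model on a label-free text data set."""
    def run(self, unlabeled_corpus):
        return lambda segment: segment  # identity stand-in for a trained model

class OutputModule:
    """Feeds segmented vectors through the trained model to get the summary."""
    def run(self, model, segments):
        return [model(seg) for seg in segments]

embed = len  # toy "embedding": token length stands in for a real vector
segments = AcquisitionModule(embed, segment_len=3).run(
    "quarterly results were discussed at length")
model = TrainingModule().run(unlabeled_corpus=None)
summary_vectors = OutputModule().run(model, segments)
```

The point of the sketch is the module boundary: acquisition owns tokenization and segmentation, training owns the model, and output owns prompt-guided inference.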
CN202211172546.6A 2022-09-26 2022-09-26 Meeting abstract generation method and system based on pre-training and prompting Pending CN115589446A (en)


Publications (1)

Publication Number Publication Date
CN115589446A true CN115589446A (en) 2023-01-10

Family

ID=84777960


Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115965033A (en) * 2023-03-16 2023-04-14 安徽大学 Generation type text summarization method and device based on sequence level prefix prompt

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111897949A (en) * 2020-07-28 2020-11-06 北京工业大学 Guided text abstract generation method based on Transformer
CN112765345A (en) * 2021-01-22 2021-05-07 重庆邮电大学 Text abstract automatic generation method and system fusing pre-training model
CN112862662A (en) * 2021-03-12 2021-05-28 云知声智能科技股份有限公司 Method and equipment for distributed training of transform-xl language model
CN113282750A (en) * 2021-05-27 2021-08-20 成都数之联科技有限公司 Model training method, system, device and medium
CN114372140A (en) * 2021-12-31 2022-04-19 北京海联捷讯科技股份有限公司 Layered conference abstract generation model training method, generation method and device



Similar Documents

Publication Publication Date Title
CN108415977B (en) Deep neural network and reinforcement learning-based generative machine reading understanding method
JP7419508B2 (en) Contrastive pre-training for language tasks
CN111783474B (en) Comment text viewpoint information processing method and device and storage medium
CN107273503B (en) Method and device for generating parallel text in same language
CN111897933B (en) Emotion dialogue generation method and device and emotion dialogue model training method and device
CN112435656B (en) Model training method, voice recognition method, device, equipment and storage medium
CN111444340A (en) Text classification and recommendation method, device, equipment and storage medium
CN112214604A (en) Training method of text classification model, text classification method, device and equipment
CN111966800B (en) Emotion dialogue generation method and device and emotion dialogue model training method and device
CN110162766B (en) Word vector updating method and device
CN111930914B (en) Problem generation method and device, electronic equipment and computer readable storage medium
CN109344242B (en) Dialogue question-answering method, device, equipment and storage medium
US20220383206A1 (en) Task Augmentation and Self-Training for Improved Few-Shot Learning
CN113435211B (en) Text implicit emotion analysis method combined with external knowledge
CN111767694B (en) Text generation method, apparatus and computer readable storage medium
CN112632244A (en) Man-machine conversation optimization method and device, computer equipment and storage medium
CN112183106B (en) Semantic understanding method and device based on phoneme association and deep learning
CN118043885A (en) Contrast twin network for semi-supervised speech recognition
CN114443899A (en) Video classification method, device, equipment and medium
CN114091466A (en) Multi-modal emotion analysis method and system based on Transformer and multi-task learning
CN115589446A (en) Meeting abstract generation method and system based on pre-training and prompting
CN114169408A (en) Emotion classification method based on multi-mode attention mechanism
CN116860943A (en) Multi-round dialogue method and system for dialogue style perception and theme guidance
WO2023009740A1 (en) Contrastive learning and masked modeling for end-to-end self-supervised pre-training
CN115422329A (en) Knowledge-driven multi-channel screening fusion dialogue generation method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination