CN115589446A - Meeting abstract generation method and system based on pre-training and prompting - Google Patents
- Publication number
- CN115589446A CN115589446A CN202211172546.6A CN202211172546A CN115589446A CN 115589446 A CN115589446 A CN 115589446A CN 202211172546 A CN202211172546 A CN 202211172546A CN 115589446 A CN115589446 A CN 115589446A
- Authority
- CN
- China
- Prior art keywords
- conference
- model
- text
- training
- vector
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Images
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/42—Systems providing special services or facilities to subscribers
- H04M3/56—Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities
- H04M3/568—Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities audio processing specific to telephonic conferencing, e.g. spatial distribution, mixing of participants
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/16—Speech classification or search using artificial neural networks
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
Abstract
The invention discloses a meeting abstract generation method and system based on pre-training and prompting. The method comprises the following steps: acquiring conference text data and performing mapping and segmentation processing to obtain a segmented conference text vector sequence; pre-training a Transformer-XL model on an unlabeled text dataset to obtain a trained Transformer-XL model; and inputting the segmented conference text vector sequence into the trained Transformer-XL model, guided by a task prompt method, to obtain a conference abstract. The invention can predict the text information of a conference and thereby generate an abstract that better matches the conference content. The method and system can be widely applied in the technical field of natural language processing.
Description
Technical Field
The invention relates to the technical field of natural language processing, in particular to a meeting abstract generation method and system based on pre-training and prompting.
Background
The two mainstream approaches to automatic meeting summarization are extractive methods, represented by TextRank, and generative methods, represented by improved Transformer algorithms. An extractive method simply truncates passages of the meeting-record text, so the resulting abstract easily omits important information and contains no further understanding or reasoning. In particular, for meeting records that mainly take the form of multi-person dialogue, an extractive method cannot capture the information conveyed in the participants' exchanges; it can only intercept a few explicit statements. The resulting abstract is therefore inflexible, and the parts produced without understanding may not match what was actually expressed in the meeting.
Disclosure of Invention
To solve the above technical problems, an object of the present invention is to provide a method and system for generating a meeting summary based on pre-training and prompting, which can predict the text information of a meeting so as to generate a meeting summary that better matches the meeting content.
The first technical scheme adopted by the invention is as follows: a conference abstract generation method based on pre-training and prompting comprises the following steps:
acquiring conference text data, and performing mapping and segmentation processing to obtain a fragmented conference text vector sequence;
pre-training the Transformer-XL model based on the label-free text data set to obtain a trained Transformer-XL model;
and inputting the segmented conference text vector sequence into a trained Transformer-XL model based on a task prompt method to obtain a conference abstract.
Further, the step of obtaining the conference text data and performing mapping and segmentation processing to obtain a fragmented conference text vector sequence specifically includes:
acquiring audio data of a conference through audio equipment;
converting the audio data of the conference through an API (application programming interface) to obtain conference text data;
mapping the conference text data to obtain a conference text vector sequence;
and segmenting the conference text vector sequence to obtain a segmented conference text vector sequence.
Further, the step of mapping the conference text data to obtain a conference text vector sequence specifically includes:
splitting the conference text data according to characters of the conference text data to obtain a conference text character sequence;
constructing a dictionary, wherein the dictionary comprises numerical indexes for mapping the conference text character sequence;
and assigning the numerical indexes in the dictionary to the conference text character sequence and representing the sequence by these indexes to obtain a conference text vector sequence.
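The dictionary-based mapping described above can be sketched as follows; this is a minimal illustration, and the function names are hypothetical rather than taken from the patent:

```python
def build_vocab(text):
    """Assign each unique character a numerical index starting from 0."""
    vocab = {}
    for ch in text:
        if ch not in vocab:
            vocab[ch] = len(vocab)
    return vocab

def encode(text, vocab):
    """Convert a character string into its numerical-index representation."""
    return [vocab[ch] for ch in text]

transcript = "meeting begins"
vocab = build_vocab(transcript)
token_sequence = encode(transcript, vocab)  # each character replaced by its index
```

Each character of the conference text is thereby replaced by its dictionary index, yielding the conference text vector sequence.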
Further, the pre-training of the Transformer-XL model based on the label-free text data set to obtain the trained Transformer-XL model specifically includes:
acquiring a label-free text data set, and performing data preprocessing to obtain a training set;
inputting the training set into a Transformer-XL model, wherein the Transformer-XL model comprises an input layer, a hidden layer and an output layer;
performing word embedding and position embedding processing on a training set based on an input layer of a Transformer-XL model to obtain a hidden state vector of the training set;
based on a hidden layer of a Transformer-XL model, carrying out data hiding operation processing on hidden state vectors of a training set to obtain hidden vectors with context information;
on the basis of an output layer of a Transformer-XL model, carrying out projection processing on a hidden vector with context information to obtain an output result;
and updating the Transformer-XL model by a gradient descent method based on the output result to obtain the trained Transformer-XL model.
Further, the hidden layer of the Transformer-XL pre-training model is expressed as follows (formula not reproduced in the original text):
where h_τ^j denotes the hidden-state vector output after the τ-th segment token sequence has passed through j layers of the Transformer-XL model; W_o denotes the output projection matrix and two further matrices denote projections; 1_k^⊤ denotes the transpose of the k-dimensional all-ones vector; and μ denotes an arbitrary constant.
Further, the output layer of the Transformer-XL pre-training model is expressed as follows (formula not reproduced in the original text):
where W_u denotes a projection matrix, h^m denotes the hidden-state vector output after passing through the m identical hidden layers, Y denotes the token sequence of the subsequent output text, and X denotes the token sequence of the input text.
Further, the step of inputting the segmented conference text vector sequence into the trained Transformer-XL model based on a task prompt method to obtain a conference abstract specifically includes:
acquiring a conference abstract data set and inputting the conference abstract data set into a trained Transformer-XL model for fine tuning processing to obtain a task prompt;
guided by the task prompt, sequentially inputting the segmented conference text vector sequences into the trained Transformer-XL model to obtain the vector sequence of the conference abstract text;
and converting the vector sequence of the text of the conference abstract to obtain the conference abstract.
Further, the step of obtaining a meeting summary data set and inputting the meeting summary data set to a trained Transformer-XL model to obtain a task prompt specifically includes:
acquiring a conference abstract data set and carrying out data preprocessing to obtain a conference abstract vector;
setting a constant matrix of the context range of the conference abstract task and carrying out mapping processing through the mapping matrix to obtain an initial memory vector sequence;
inputting the initial memory vector sequence and the conference abstract vector into the trained Transformer-XL model to obtain a predicted value;
and updating parameters of the mapping matrix through a gradient descent method based on the predicted value to finally obtain a task prompt vector.
Further, the step of sequentially inputting the segmented conference text vector sequence into the trained Transformer-XL model based on the task prompt to obtain the vector sequence of the conference abstract text specifically includes:
after the single hidden layer full-connection neural network model is trained through data, generating an initial memory vector;
splicing the initial memory vector with the text vector sequence of the fragmented conference to obtain a spliced vector;
updating the initial memory vectors with the spliced vectors respectively and inputting them into the trained Transformer-XL model;
outputting hidden state vectors of a hidden layer by a Transformer-XL model after task prompt guide training;
splicing the hidden state vector of the hidden layer and the updated memory vector and inputting the spliced hidden state vector and the updated memory vector into a trained Transformer-XL model for traversal training;
judging whether the number of traversals reaches the number of hidden layers of the trained Transformer-XL model;
and if the layer-number requirement is not met, repeating the splicing, updating and model-input steps until it is met, traversing all the segmented conference text vector sequences, and outputting the vector sequence of the conference abstract text.
The second technical scheme adopted by the invention is as follows: a meeting abstract generating system based on pre-training and prompting comprises:
the acquisition module acquires the conference text data and performs mapping and segmentation processing to obtain a fragmented conference text vector sequence;
the training module is used for pre-training the Transformer-XL model based on the label-free text data set to obtain the trained Transformer-XL model;
and the output module is used for inputting the segmented conference text vector sequence into the trained Transformer-XL model based on a task prompt method to obtain a conference abstract.
The method and the system have the following beneficial effects: by constructing a Transformer-XL model and pre-training it on a large amount of conference text, the pre-trained Transformer-XL model acquires the ability to process information segment by segment. Conference-text abstracts are then used to fine-tune the trained model, which prompts the pre-trained model about the summarization task so that relevant knowledge can be elicited from it, saving the computational cost of model training. Because a generative summarization method is adopted, the dialogue in meeting-record texts can be understood and reasoned about, so the generated abstract better matches the meeting content.
Drawings
FIG. 1 is a flowchart illustrating the steps of a method for generating a meeting summary based on pre-training and prompting according to the present invention;
FIG. 2 is a block diagram of a conference summary generation system based on pre-training and prompting according to the present invention;
FIG. 3 is a flowchart of the steps of generating the abstract after the Transformer-XL pre-training model of the present invention obtains the task prompt.
Detailed Description
The invention is described in further detail below with reference to the figures and the specific embodiments. The step numbers in the following embodiments are provided only for convenience of illustration, the order between the steps is not limited at all, and the execution order of each step in the embodiments can be adapted according to the understanding of those skilled in the art.
Referring to fig. 1, the invention provides a conference abstract generating method based on pre-training and prompting, which comprises the following steps:
s1, conference text data are obtained and are subjected to mapping and segmentation processing, and a fragmented conference text vector sequence is obtained;
specifically, a conference site is recorded in real time until the conference is finished to obtain a recording file of the whole conference content, a voice-to-text API service interface provided by a network platform is directly called (here, API service provided on the network is directly called, such as Korea communication voice transfer, the recording of the whole conference content is converted into a plain text, each character of a plain text character string is unpacked to form a character sequence, a dictionary is constructed for mapping the character of the character string type to a character sequenceIn the numerical indexes starting from 0, according to a dictionary, each unique character is allocated with a numerical index, a character sequence split by a text character string is converted from a character string representation to a numerical index representation, a final token sequence can be obtained, and the token sequence X = (X =) is obtained 1 ,x 2 ,…,x n ) The method is divided according to the following formula, and the expression is as follows:
in the above formula, k represents the size of the division, x i ∈X,The fragment sequence X' = (X) can then be obtained 1 ,x 2 ,…,x n′ )。
S2, pre-training the Transformer-XL model based on the label-free text data set to obtain a trained Transformer-XL model;
specifically, a large-scale label-free data set (generally, text data such as hundredth known, question and answer known, hundredth encyclopedia and the like collected on a network) is established, data cleaning operations such as de-duplication, missing value removal and the like are carried out on the data, text preprocessing is carried out, the data is converted into token sequences, a transform-XL autoregressive model, namely a transform-XL pre-training model is established, and tasks of the model can be formalized by giving a token sequence X = (X) representing an input text 1 ,x 2 ,…,x n ) Outputting token sequence Y = (Y) of subsequent text 1 ,y 2 ,…,y n ) When the conditional probability P (Y | X) of token sequence Y = (Y) is obtained 1 ,y 2 ,…,y n ) The probability of (c) is the maximum, and the token sequence Y = (Y) of the subsequent text can be obtained finally 1 ,y 2 ,…,y n );
S21, training an input layer of a Transformer-XL pre-training model;
specifically, a token sequence X = (X) of the entire input text 1 ,x 2 ,…,x n ) Divided into shorter pieces of fixed sizeInputting the fragments into a Transformer-XL model one by one, enabling the length of each fragment to be k, and setting the sequence x of the tau fragment token τ =(x τ,1 ,x τ,2 ,…,x τ,k ) In the input layer of the transform-XL model, the following operations are performed:
in the above formula, the first and second carbon atoms are,representative token sequence x representing input layer output τ =(x τ,1 ,x τ,2 ,…,x τ,k ) Hidden state vector of (2), W e Word-embedding matrix, W, representing token p Position-embedding matrix, W, representing token e 、W p Trainable parameters of the input layer are represented.
S22, training a hidden layer of a Transformer-XL pre-training model;
specifically, the transform-XL model comprises a set of m identical hidden layers, and for the j ∈ {2, \ 8230;, m +1} layers of the transform-XL model, there are:
in the above formula, the first and second carbon atoms are,represents a hidden state vector, W, output after the tau segment token sequence passes through j layers of a Transformer-XL model o A matrix of output projections is represented which,anda matrix of projections is represented which,denotes an inverse matrix of all 1 vectors of the k dimension, and μ denotes an arbitrary constant.
S23, training an output layer of a Transformer-XL pre-training model;
specifically, for the transform-XL model output layer, there is an expression trained as follows:
in the above formula, W u A matrix of projections is represented which,is represented byOutputting hidden state vectors after passing through m same hidden layers, wherein Y represents outputting a subsequent text mark sequence, and X represents a text mark sequence representing text input;
An objective function max_φ P(Y | X; φ) is established, where φ denotes all trainable parameters of the Transformer-XL model, and the model parameters φ are updated by the gradient descent method:

φ ← φ + η ∇_φ log P(Y | X; φ)

where φ denotes the updatable parameters of the Transformer-XL model and η denotes the learning rate. After multiple iterations, the trained Transformer-XL model parameters φ are obtained, i.e. the pre-trained Transformer-XL model.
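The gradient-descent update of step S23 can be illustrated on a toy output layer; the setup below (dimensions, learning rate, a single training example) is purely illustrative and not the patent's actual training procedure:

```python
import numpy as np

rng = np.random.default_rng(1)
d, V = 4, 6                                # hidden size, vocab size (illustrative)
W_u = rng.standard_normal((d, V)) * 0.1    # output projection, part of phi
h = rng.standard_normal(d)                 # hidden vector from the last hidden layer
target = 3                                 # index of the true next token

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

eta = 0.5
for _ in range(100):
    p = softmax(h @ W_u)
    # gradient of -log P(target | h) with respect to W_u
    grad = np.outer(h, p - np.eye(V)[target])
    W_u -= eta * grad                      # phi <- phi - eta * grad

p_final = softmax(h @ W_u)[target]         # probability of the true next token
```

Repeating such updates over the whole training set drives up log P(Y | X; φ), which is exactly the objective established above.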
And S3, inputting the segmented conference text vector sequence into the trained Transformer-XL model based on a task prompt method to obtain a conference abstract.
S31, acquiring a prompt, namely a meeting task prompt;
specifically, referring to fig. 3, a data cleansing operation such as deduplication and missing value removal is performed on the conference summary data set, and a text is performedPreprocessing, converting the data into token sequences, wherein for each piece of data, the token sequence x represents the text of the meeting record and belongs to the input of the model, and the token sequence y represents the meeting abstract content summarized by related professionals according to the meeting record without settingRepresenting a range of contextual contexts that describe the task of summarization, then:
in the above formula, W θ Represents a mapping matrix, B θ Representing the mapping matrix, | representing the l-th layer of the Transformer-XL pre-training model,representing an initial memory vector;
wherein W_θ and B_θ are the training parameters of the single-hidden-layer fully connected neural network model, and the initial memory vector sequence m_0 is the initial input of the Transformer-XL pre-training model;
the initial memory vector sequenceInputting Transformer-XL pre-training model with sequence xThe predicted value y' can be obtained, so that the objective function can be established as follows:
in the above formula, z = [ x; y is]Means x and Y are spliced, i represents time step, Y idz Context range, h, representing y i A contextual context representing all histories;
The autoregressive Transformer-XL pre-training model computes h_i as a function of z_i and its past left-side context h_{<i}, so that:

h_i = P_θ[i, :] if i ∈ P_idx, and h_i = LM_φ(z_i, h_{<i}) otherwise

where P_idx denotes the prompt index range and P_θ denotes a learnable parameter matrix. Here h_i (for all i) is taken directly from P_θ when i ∈ P_idx; when i ∉ P_idx, h_i still depends on P_θ, because the initial memory vectors provided by P_θ are always in the left (i.e. historical) context and therefore affect the representation of every subsequent context to their right;
The model parameter θ is updated by the gradient descent method; the update expression is as follows:

θ ← θ + η ∇_θ Σ_{i ∈ Y_idx} log P(z_i | h_{<i}; θ)

After multiple iterations, the trained parameter matrix P_θ can be obtained, and from it the initial memory vector sequence that represents the context prompting the summarization task.
S32, after the prompt is obtained, the segment sequences representing the segmented conference-content text are input sequentially into the Transformer-XL pre-training model;
in particular, token sequence x for the 1 st text segment 1 After passing through an input layer of a Transformer-XL neural network, an input layer hidden state vector can be obtainedHidden state vector of input layer of Transformer-XL neural network in memory vector sequenceAfter stitching, as input for the layer of the next Transformer-XL pre-training model, we can:
in the above formula, L ∈ {1,2, \ 8230;, L } represents the L-th layer of the transform-XL pre-training model, when τ -1=0, the hidden state vector is from the initial memory vector sequence, layer (·) represents the specific operation of the layer of the transform-XL pre-training model, and the vector which represents the output hidden state of the L-th layer of the transform-XL pre-training model in the memory vector sequence is directly replaced into
After the τ-th segment token sequence x_τ is input and has passed through the L layers of the Transformer-XL pre-training model, the hidden-state vector output by the last hidden layer, h_τ^L, and an updated memory vector sequence are obtained.

After each segment of the segment sequence X′ = (x′_1, x′_2, …, x′_{n′}) has been processed according to the above steps, the final hidden-state vector of the last segment and the updated memory vector sequence are obtained.
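Carrying the memory across segments, as in the loop above, can be sketched with a stand-in layer in place of the real Transformer-XL attention; all names and sizes here are illustrative:

```python
import numpy as np

L_layers, k, d = 2, 4, 8                     # layers, segment length, hidden size
rng = np.random.default_rng(3)
Ws = [rng.standard_normal((d, d)) * 0.1 for _ in range(L_layers)]

def layer(h_cat, l):
    """Stand-in for one Transformer-XL layer: maps the spliced
    [memory ; segment] states and keeps only the current-segment positions."""
    return np.tanh(h_cat @ Ws[l])[-k:]

def forward_segments(segments, memory):
    """memory[l] caches the previous segment's input states for layer l."""
    for h in segments:                       # h: (k, d) input-layer states
        new_memory = []
        for l in range(L_layers):
            h_cat = np.concatenate([memory[l], h], axis=0)  # splice memory
            new_memory.append(h)             # cached for the next segment
            h = layer(h_cat, l)
        memory = new_memory
    return h, memory

init_mem = [np.zeros((k, d)) for _ in range(L_layers)]  # initial memory vectors
segs = [rng.standard_normal((k, d)) for _ in range(3)]
out, mem = forward_segments(segs, init_mem)  # final hidden states + updated memory
```

Because each layer attends over the spliced memory, information from earlier segments can influence later segments even though the segments are processed one at a time.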
S33, after the token sequences representing the segmented conference-content text have been input into the Transformer-XL pre-training model, the tokens of the conference abstract are generated continuously in an autoregressive manner;
Specifically, the hidden-state vector is input into the output layer of the Transformer-XL neural network to obtain the distribution of the first abstract-text token, and thus the first abstract-text token.

Combining the above steps, the hidden-state vector is input into the output layer of the Transformer-XL neural network to obtain an abstract-text token; the token and the memory vector sequence are input into the Transformer-XL pre-training model to obtain the distribution of the next abstract-text token and the current latest memory vector sequence.

The above steps are repeated until the model generates the token representing the end symbol. At that point, the token sequence from the first abstract-text token to the end-symbol token is the token sequence of the abstract of the entire conference content; converting this token sequence back into text form finally yields the abstract text of the entire conference content.
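The autoregressive decoding loop of step S33 can be sketched as follows; `fake_model` is a deterministic stand-in for the trained model's next-token prediction, so every name here is illustrative:

```python
END = 0   # token index representing the end symbol (illustrative)

def fake_model(tokens):
    """Stand-in for the trained model: a real implementation would take the
    current tokens plus the memory vectors and return the most probable
    next token from the output-layer distribution."""
    return tokens[-1] - 1 if tokens[-1] > 0 else END

def generate_summary(context, max_len=50):
    """Generate abstract tokens autoregressively until the end symbol."""
    tokens = list(context)
    summary = []
    for _ in range(max_len):
        nxt = fake_model(tokens)
        if nxt == END:
            break
        summary.append(nxt)
        tokens.append(nxt)
    return summary
```

With the stand-in model, `generate_summary([3])` yields `[2, 1]`; in the real system the loop would also thread the updated memory-vector sequence through each step and finally decode the token sequence back into text.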
Referring to fig. 2, a conference summary generation system based on pre-training and prompting includes:
the acquisition module acquires conference text data and performs mapping and segmentation processing to obtain a fragmented conference text vector sequence;
the training module is used for pre-training the Transformer-XL model based on the label-free text data set to obtain the trained Transformer-XL model;
and the output module is used for inputting the segmented conference text vector sequence into the trained Transformer-XL model based on a task prompt method to obtain a conference abstract.
The contents in the method embodiments are all applicable to the system embodiments, the functions specifically implemented by the system embodiments are the same as those in the method embodiments, and the beneficial effects achieved by the system embodiments are also the same as those achieved by the method embodiments.
While the preferred embodiments of the present invention have been illustrated and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.
Claims (10)
1. A conference abstract generating method based on pre-training and prompting is characterized by comprising the following steps:
acquiring conference text data, and mapping and segmenting the conference text data to obtain a fragmented conference text vector sequence;
pre-training the Transformer-XL model based on the label-free text data set to obtain a trained Transformer-XL model;
and inputting the segmented conference text vector sequence into a trained Transformer-XL model based on a task prompt method to obtain a conference abstract.
2. The method for generating the conference abstract based on the pre-training and prompting as claimed in claim 1, wherein the step of obtaining the conference text data and performing mapping and segmentation processing to obtain a fragmented conference text vector sequence specifically comprises:
acquiring audio data of a conference through audio equipment;
converting the audio data of the conference through an API (application programming interface) to obtain conference text data;
mapping the conference text data to obtain a conference text vector sequence;
and segmenting the conference text vector sequence to obtain a segmented conference text vector sequence.
3. The method for generating the conference abstract based on the pre-training and the prompting as claimed in claim 2, wherein the step of mapping the conference text data to obtain the conference text vector sequence specifically comprises:
splitting the conference text data according to characters of the conference text data to obtain a conference text character sequence;
constructing a dictionary, wherein the dictionary comprises numerical indexes for mapping the conference text character sequence;
and assigning the numerical indexes in the dictionary to the conference text character sequence and representing the sequence by these indexes to obtain a conference text vector sequence.
4. The method for generating a conference summary based on pre-training and prompting as claimed in claim 3, wherein the step of pre-training a Transformer-XL model based on a label-free text dataset to obtain a trained Transformer-XL model specifically includes:
acquiring a label-free text data set, and performing data preprocessing to obtain a training set;
inputting the training set into a Transformer-XL model, wherein the Transformer-XL model comprises an input layer, a hidden layer and an output layer;
performing word embedding and position embedding processing on the training set based on the input layer of the Transformer-XL model to obtain hidden state vectors of the training set;
based on the hidden layer of the Transformer-XL model, carrying out data hiding operation processing on the hidden state vectors of the training set to obtain hidden vectors with context information;
on the basis of an output layer of a Transformer-XL model, carrying out projection processing on a hidden vector with context information to obtain an output result;
and updating the Transformer-XL model by a gradient descent method based on the output result to obtain the trained Transformer-XL model.
5. The method for generating a conference summary based on pre-training and prompting of claim 4, wherein the hidden layer of the Transformer-XL pre-training model is expressed as follows (formula not reproduced in the original text):

where h_τ^j denotes the hidden-state vector output after the τ-th segment token sequence has passed through j layers of the Transformer-XL model; W_o denotes the output projection matrix and two further matrices denote projections; 1_k^⊤ denotes the transpose of the k-dimensional all-ones vector; and μ denotes an arbitrary constant.
6. The method for generating a meeting abstract based on pre-training and prompting of claim 4, wherein the output layer of the Transformer-XL pre-training model is expressed as follows (formula not reproduced in the original text):

where W_u denotes a projection matrix, h^m denotes the hidden-state vector output after passing through the m identical hidden layers, Y denotes the token sequence of the subsequent output text, and X denotes the token sequence of the input text.
7. The method for generating a meeting abstract based on pre-training and prompting as claimed in claim 4, wherein the step of inputting the segmented meeting text vector sequence into the trained Transformer-XL model based on a task prompt method to obtain a meeting abstract specifically comprises:
acquiring a conference abstract data set and inputting the conference abstract data set into a trained Transformer-XL model for fine adjustment processing to obtain a task prompt;
guided by the task prompt, sequentially inputting the segmented conference text vector sequences into the trained Transformer-XL model to obtain the vector sequence of the conference abstract text;
and converting the vector sequence of the text of the conference abstract to obtain the conference abstract.
8. The method of claim 7, wherein the step of acquiring a conference summary data set and inputting it into the trained Transformer-XL model to obtain a task prompt specifically comprises:
acquiring a conference summary data set and performing data preprocessing to obtain conference summary vectors;
setting a constant matrix over the context range of the conference summary task and mapping it through the mapping matrix to obtain an initial memory vector sequence;
inputting the initial memory vector sequence and the conference summary vectors into the trained Transformer-XL model to obtain a predicted value;
and updating the parameters of the mapping matrix by gradient descent based on the predicted value, finally obtaining the task prompt vector.
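The steps of claim 8 amount to prompt tuning: the pre-trained model stays frozen, and only the mapping matrix that produces the initial memory (prompt) vectors is updated by gradient descent. A minimal sketch of that loop, with all shapes, names, and the toy frozen layer being illustrative assumptions rather than the patent's actual model:

```python
# Illustrative prompt-tuning sketch: only `mapping` is trained; `frozen_w`
# (a stand-in for the pre-trained Transformer-XL) is never updated.
import random

random.seed(0)

DIM = 4          # hidden size (illustrative)
PROMPT_LEN = 2   # number of prompt/memory vectors (illustrative)
LR = 0.1

# Frozen pre-trained weights (stand-in for the trained Transformer-XL layer).
frozen_w = [[0.5 if i == j else 0.1 for j in range(DIM)] for i in range(DIM)]

# Trainable mapping matrix applied to a constant context matrix (claim 8).
mapping = [[random.uniform(-0.1, 0.1) for _ in range(DIM)] for _ in range(PROMPT_LEN)]
context = [[1.0] * DIM for _ in range(PROMPT_LEN)]

def matvec(w, x):
    return [sum(w[i][j] * x[j] for j in range(len(x))) for i in range(len(w))]

def prompt_vectors():
    # Initial memory vector sequence = mapping matrix (x) constant context matrix.
    return [[mapping[i][j] * context[i][j] for j in range(DIM)]
            for i in range(PROMPT_LEN)]

def forward(prompt, x):
    # Crude "splice" of prompt vectors into the input, then one frozen layer;
    # the mean of the output serves as a scalar predicted value.
    h = x[:]
    for p in prompt:
        h = [hi + pi for hi, pi in zip(h, p)]
    h = matvec(frozen_w, h)
    return sum(h) / len(h)

x = [1.0, 0.0, -1.0, 0.5]
target = 1.0

for step in range(200):
    err = forward(prompt_vectors(), x) - target
    # Gradient descent on the mapping matrix only (chain rule through the
    # frozen layer); the frozen weights are left untouched.
    for i in range(PROMPT_LEN):
        for j in range(DIM):
            grad = err * context[i][j] * sum(frozen_w[k][j] for k in range(DIM)) / DIM
            mapping[i][j] -= LR * grad

final_err = abs(forward(prompt_vectors(), x) - target)
print(f"final |error| = {final_err:.4f}")
```

The design point mirrored here is that the task prompt is the only trainable component, which is what makes the method cheap to adapt per task.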
9. The method of claim 8, wherein the step of, guided by the task prompt, sequentially inputting the segmented conference text vector sequences into the trained Transformer-XL model to obtain the vector sequence of the conference summary text specifically comprises:
obtaining an initial memory vector after fine-tuning based on the task prompting method;
splicing the initial memory vector with a segmented conference text vector sequence to obtain a spliced vector;
updating the initial memory vector with the spliced vector and inputting it into the trained Transformer-XL model;
outputting, by the Transformer-XL model trained under task prompt guidance, the hidden state vector of a hidden layer;
splicing the hidden state vector of the hidden layer with the updated memory vector and inputting the result into the trained Transformer-XL model for traversal;
judging whether the number of traversals reaches the number of hidden layers of the trained Transformer-XL model;
and if the required number of layers is not reached, repeating the splicing, updating, and model-input steps until it is reached; after all segmented conference text vector sequences have been traversed, outputting the vector sequence of the conference summary text.
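The traversal in claim 9 is the Transformer-XL segment-level recurrence: each layer's hidden states are cached as memory and spliced onto the next segment's states at the same layer. A toy sketch of that control flow, where `layer_forward` is a deliberately trivial stand-in (a running mean) for an actual Transformer-XL hidden layer:

```python
# Illustrative segment-recurrence sketch (not the patent's exact computation).
N_LAYERS = 3   # number of hidden layers (illustrative)
SEG_LEN = 4    # tokens per segment (illustrative)

def layer_forward(spliced):
    # Stand-in for one hidden layer: each of the last SEG_LEN positions
    # attends to (here: averages over) everything before it in the splice.
    out = []
    for i in range(len(spliced) - SEG_LEN, len(spliced)):
        window = spliced[: i + 1]
        out.append(sum(window) / len(window))
    return out

def run_segments(segments):
    # memory[l] caches the previous segment's states entering layer l.
    memory = [[] for _ in range(N_LAYERS)]
    outputs = []
    for seg in segments:
        h = seg
        for l in range(N_LAYERS):
            spliced = memory[l] + h   # splice cached memory with current states
            memory[l] = h             # update memory for the next segment
            h = layer_forward(spliced)
        outputs.extend(h)             # top-layer states feed summary generation
    return outputs

segments = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]
outs = run_segments(segments)
print(len(outs))  # one top-layer state per input token
```

The loop terminates for each segment exactly when all N_LAYERS hidden layers have been traversed, matching the layer-count check in the claim.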
10. A conference summary generation system based on pre-training and prompting, characterized by comprising the following modules:
an acquisition module, which acquires conference text data and performs mapping and segmentation processing to obtain segmented conference text vector sequences;
a training module, which pre-trains the Transformer-XL model on an unlabeled text data set to obtain a trained Transformer-XL model;
and an output module, which inputs the segmented conference text vector sequences into the trained Transformer-XL model based on a task prompting method to obtain the conference summary.
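The three modules of claim 10 can be sketched as a skeleton; every class and method name below is a hypothetical illustration, and the vectorization, training, and summarization bodies are trivial placeholders for the real mapping, pre-training, and prompt-guided generation:

```python
# Hypothetical skeleton of the claimed three-module system.
from dataclasses import dataclass

@dataclass
class AcquisitionModule:
    seg_len: int = 4
    def acquire(self, text: str):
        # Map characters to toy scalar "vectors" and split into segments.
        vecs = [float(ord(c)) for c in text]
        return [vecs[i:i + self.seg_len] for i in range(0, len(vecs), self.seg_len)]

@dataclass
class TrainingModule:
    trained: bool = False
    def pretrain(self, unlabeled_corpus):
        # Stand-in for Transformer-XL pre-training on unlabeled text.
        self.trained = len(unlabeled_corpus) > 0
        return self

@dataclass
class OutputModule:
    model: TrainingModule
    def summarize(self, segments):
        assert self.model.trained, "model must be pre-trained first"
        # Stand-in for prompt-guided generation: one value per segment.
        return [sum(s) / len(s) for s in segments]

acq = AcquisitionModule()
trainer = TrainingModule().pretrain(["some unlabeled text"])
summary = OutputModule(trainer).summarize(acq.acquire("meeting notes"))
print(len(summary))
```

The module boundaries mirror the claim: acquisition produces segmented vector sequences, training yields the model, and output consumes both.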
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211172546.6A CN115589446A (en) | 2022-09-26 | 2022-09-26 | Meeting abstract generation method and system based on pre-training and prompting |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211172546.6A CN115589446A (en) | 2022-09-26 | 2022-09-26 | Meeting abstract generation method and system based on pre-training and prompting |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115589446A true CN115589446A (en) | 2023-01-10 |
Family
ID=84777960
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211172546.6A Pending CN115589446A (en) | 2022-09-26 | 2022-09-26 | Meeting abstract generation method and system based on pre-training and prompting |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115589446A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115965033A (en) * | 2023-03-16 | 2023-04-14 | 安徽大学 | Generation type text summarization method and device based on sequence level prefix prompt |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111897949A (en) * | 2020-07-28 | 2020-11-06 | 北京工业大学 | Guided text abstract generation method based on Transformer |
CN112765345A (en) * | 2021-01-22 | 2021-05-07 | 重庆邮电大学 | Text abstract automatic generation method and system fusing pre-training model |
CN112862662A (en) * | 2021-03-12 | 2021-05-28 | 云知声智能科技股份有限公司 | Method and equipment for distributed training of transform-xl language model |
CN113282750A (en) * | 2021-05-27 | 2021-08-20 | 成都数之联科技有限公司 | Model training method, system, device and medium |
CN114372140A (en) * | 2021-12-31 | 2022-04-19 | 北京海联捷讯科技股份有限公司 | Layered conference abstract generation model training method, generation method and device |
- 2022-09-26 CN CN202211172546.6A patent/CN115589446A/en active Pending
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111897949A (en) * | 2020-07-28 | 2020-11-06 | 北京工业大学 | Guided text abstract generation method based on Transformer |
CN112765345A (en) * | 2021-01-22 | 2021-05-07 | 重庆邮电大学 | Text abstract automatic generation method and system fusing pre-training model |
CN112862662A (en) * | 2021-03-12 | 2021-05-28 | 云知声智能科技股份有限公司 | Method and equipment for distributed training of transform-xl language model |
CN113282750A (en) * | 2021-05-27 | 2021-08-20 | 成都数之联科技有限公司 | Model training method, system, device and medium |
CN114372140A (en) * | 2021-12-31 | 2022-04-19 | 北京海联捷讯科技股份有限公司 | Layered conference abstract generation model training method, generation method and device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108415977B (en) | Deep neural network and reinforcement learning-based generative machine reading understanding method | |
JP7419508B2 (en) | Contrastive pre-training for language tasks | |
CN111783474B (en) | Comment text viewpoint information processing method and device and storage medium | |
CN107273503B (en) | Method and device for generating parallel text in same language | |
CN111897933B (en) | Emotion dialogue generation method and device and emotion dialogue model training method and device | |
CN112435656B (en) | Model training method, voice recognition method, device, equipment and storage medium | |
CN111444340A (en) | Text classification and recommendation method, device, equipment and storage medium | |
CN112214604A (en) | Training method of text classification model, text classification method, device and equipment | |
CN111966800B (en) | Emotion dialogue generation method and device and emotion dialogue model training method and device | |
CN110162766B (en) | Word vector updating method and device | |
CN111930914B (en) | Problem generation method and device, electronic equipment and computer readable storage medium | |
CN109344242B (en) | Dialogue question-answering method, device, equipment and storage medium | |
US20220383206A1 (en) | Task Augmentation and Self-Training for Improved Few-Shot Learning | |
CN113435211B (en) | Text implicit emotion analysis method combined with external knowledge | |
CN111767694B (en) | Text generation method, apparatus and computer readable storage medium | |
CN112632244A (en) | Man-machine conversation optimization method and device, computer equipment and storage medium | |
CN112183106B (en) | Semantic understanding method and device based on phoneme association and deep learning | |
CN118043885A (en) | Contrast twin network for semi-supervised speech recognition | |
CN114443899A (en) | Video classification method, device, equipment and medium | |
CN114091466A (en) | Multi-modal emotion analysis method and system based on Transformer and multi-task learning | |
CN115589446A (en) | Meeting abstract generation method and system based on pre-training and prompting | |
CN114169408A (en) | Emotion classification method based on multi-mode attention mechanism | |
CN116860943A (en) | Multi-round dialogue method and system for dialogue style perception and theme guidance | |
WO2023009740A1 (en) | Contrastive learning and masked modeling for end-to-end self-supervised pre-training | |
CN115422329A (en) | Knowledge-driven multi-channel screening fusion dialogue generation method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||