CN116070638A - Training updating method and system for Chinese sentence feature construction - Google Patents


Info

Publication number
CN116070638A
Authority
CN
China
Prior art keywords
feature
training
information
increment
information increment
Prior art date
Legal status
Granted
Application number
CN202310001746.3A
Other languages
Chinese (zh)
Other versions
CN116070638B (en)
Inventor
杜浩鹏
徐圣兵
王振友
谢锐
吴宇佳
Current Assignee
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date
Filing date
Publication date
Application filed by Guangdong University of Technology
Priority to CN202310001746.3A
Publication of CN116070638A
Application granted
Publication of CN116070638B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a training updating method and system for Chinese sentence feature construction. The method comprises the following steps: extracting features from a training set and an application set to obtain their respective feature matrices; taking the mean of the summed differences between the feature vector of every word and the feature vector of the current word as the information increment of the current word, and constructing an information increment model based on this increment; performing fine-tuning training of the information increment factor on the feature matrix of the training set to obtain a trained information increment model; and applying the trained information increment model to update the features of the application set, obtaining a new application-set feature matrix. The system comprises a feature extraction module, a model construction module, a fine-tuning training module and an application updating module. The invention can dynamically update the feature vector of a word and improve the accuracy with which new senses of a word are expressed in different sentences. The training updating method and system for Chinese sentence feature construction can be widely applied in the field of natural language processing.

Description

Training updating method and system for Chinese sentence feature construction
Technical Field
The invention relates to the technical field of natural language processing, in particular to a training updating method and system for Chinese sentence feature construction.
Background
Natural language processing is widely used in communication, spoken dialogue, document processing and many other aspects of daily life. With the development of technology, natural language processing has become an important technique for connecting human communication with computer data, and an important means of enabling humans to communicate directly with machines. How to make machines serve humans more precisely remains a constant concern of researchers.
Natural language processing today is based on pre-trained models. In processing text, the most important steps are feature extraction and representation. In Chinese, the meaning of a word varies with context, and sometimes varies over time and by region. Because a pre-trained model performs unsupervised learning over a large corpus, it forms a fixed feature-vector library: the feature vector obtained for each word is fixed, and when words are combined into sentences for feature extraction, the words are treated as independent of one another. When some words of one Chinese sentence are recombined into another Chinese sentence, the feature vectors of those words are still represented by the same fixed vectors. As time passes, certain words acquire new senses in different sentences; directly applying pre-trained feature extraction to such words causes the sentence to lose part of its original information, and the new senses cannot be captured accurately.
Disclosure of Invention
In order to solve the technical problems, the invention aims to provide a training updating method and a training updating system for the feature construction of Chinese sentences, which can dynamically update the feature vectors of words and improve the accuracy of new semantic expression of the words in different sentences.
The first technical scheme adopted by the invention is as follows: a training updating method for Chinese sentence characteristic construction comprises the following steps:
extracting features of the training set and the application set to obtain respective feature matrixes;
taking the mean of the summed differences between the feature vector of each word and the feature vector of the current word as the information increment of the current word, and constructing an information increment model based on the information increment;
performing fine tuning training on the information increment factors in the information increment model based on the feature matrix of the training set to obtain the trained information increment factors;
and updating the feature matrix of the application set by using the information increment model of the trained increment factor to obtain an updated application feature matrix.
Further, the step of extracting features from the training set and the application set to obtain respective feature matrices specifically includes:
splitting the Chinese character vectors in the training set and the application set according to the Chinese characters to obtain Chinese character coding vectors;
according to the coding number of the Chinese character coding vector, carrying out feature extraction on each word in the training set and the application set sentences by utilizing a pre-training feature word library to obtain a feature vector of each word;
and superposing the feature vector of each word in the training set and the application set according to the coding number to obtain feature matrixes of the training set and the application set.
Through this preferred step, the feature information of a sentence is represented as a matrix, the feature information of each word as a row vector, and each feature dimension of the words as a column vector. This representation expresses the feature information of a sentence clearly, so that the subsequent feature update neither introduces redundant, ambiguous information nor loses part of the information.
Further, in the step of taking the mean of the summed differences between the feature vector of each word and the feature vector of the current word as the information increment of the current word, and constructing an information increment model based on the information increment, the expressions are as follows:
The information increment is:
Δx_i = (1/m) Σ_{l=1}^{m} (α_l − α_i)
where α_i = (x_i1 x_i2 … x_iK) represents the feature vector of the Chinese character b_i (i = 1, 2, …, m), m represents the number of Chinese characters in the sentence, K represents the number of features of the current Chinese character b_i, and α_l (l = 1, 2, …, m) represents the feature vector of any Chinese character in the sentence.
The information increment model is:
α_i′ = α_i + λΔx_i
X_t′ = X_t + λΔX_t
where α_i is the feature vector of a word, λ is the information increment factor, Δx_i is the information increment of the word, α_i′ is the updated feature vector of the word after applying the information increment model, X_t is the feature matrix of the sentence, ΔX_t is the feature-information increment matrix of the sentence, X_t′ is the updated feature matrix of the sentence after applying the information increment model, and
ΔX_t = (Δx_1ᵀ Δx_2ᵀ … Δx_mᵀ)ᵀ, i.e. the m × K matrix whose i-th row is Δx_i.
through the optimization step, the feature vector of each word is updated in an increment, and then the feature matrix of the sentence is updated.
Further, the step of performing fine-tuning training on the information increment factor in the information increment model based on the feature matrix of the training set, to obtain the trained information increment factor, specifically comprises:
setting an initial value and a threshold for the information increment factor, an initial value of the accuracy, and an increment step;
updating the feature matrix of the training set with the information increment model to obtain an updated training feature matrix;
performing sample prediction on the new training feature matrix based on the fine-tuning function to obtain predicted labels;
comparing the predicted labels with the correct labels of the training set and calculating the accuracy;
fine-tuning the information increment factor in the information increment model to obtain a fine-tuned information increment model;
and repeating the steps of feature-matrix updating, sample prediction, accuracy calculation and increment-factor fine-tuning until the accuracy and the information increment factor satisfy the stopping conditions, obtaining the trained information increment factor.
Through this preferred step, a trained information increment factor that maximizes the accuracy is obtained; the trained information increment model constructed with this factor improves the accuracy of new-sense expression when the application-set features are used.
Further, in the step of updating the feature matrix of the application set with the information increment model using the trained increment factor to obtain an updated application feature matrix, the new application feature matrix is expressed as:
X_s′ = X_s + λ*ΔX_s
where X_s′ is the new application feature matrix, X_s is the feature matrix of the application set, λ* is the trained information increment factor, and ΔX_s is the information increment matrix of the application set.
Through this preferred step, real-time updating of the application-set features is achieved, and new senses are captured accurately.
The second technical scheme adopted by the invention is as follows: a training updating system for Chinese sentence feature construction, comprising:
the feature extraction module is used for extracting features of the training set and the application set to obtain respective feature matrixes;
the model construction module is used for taking the mean of the summed differences between the feature vector of each word and the feature vector of the current word as the information increment of the current word, to obtain an information increment model;
the fine tuning training module is used for carrying out fine tuning training on the information increment factors in the information increment model based on the feature matrix of the training set to obtain the trained information increment factors;
and the application updating module is used for updating the feature matrix of the application set by using the information increment model of the trained increment factor to obtain a new application feature matrix.
The method and the system have the following beneficial effects: the invention proposes an information increment matrix and an information increment model, which construct and fuse increments for words appearing in different sentences, dynamically updating the word vectors and capturing new information. The update yields a more accurate sentence feature representation, so that each word expresses its precise meaning in different contexts, and the accuracy of subsequent natural language processing tasks is improved.
Drawings
FIG. 1 is a flow chart showing the steps of a training update method for Chinese sentence feature construction according to the present invention;
FIG. 2 is a block diagram of a training update system for Chinese sentence feature construction according to the present invention;
FIG. 3 is a fine-tuning training flowchart of a first embodiment of the training updating method of Chinese sentence feature construction according to the present invention;
FIG. 4 is a fine-tuning training flowchart of a second embodiment of the training updating method of Chinese sentence feature construction according to the present invention.
Detailed Description
The invention will now be described in further detail with reference to the drawings and specific embodiments. The step numbers in the following embodiments are set for convenience of illustration only; the order between the steps is not limited in any way, and the execution order of the steps in the embodiments may be adaptively adjusted as understood by those skilled in the art.
Referring to fig. 1, the invention provides a training updating method for Chinese sentence feature construction, which comprises the following steps:
s1, acquiring a training set, a pre-training characteristic word library, an application set and a fine-tuning technical function of an existing pre-training model;
pre-training feature word stock: the character library matrix E comprises M character feature vectors, wherein the length of each feature vector is K.
Figure BDA0004035216650000041
Training set: contains n sentences { S } 1 S 2 … S n Each sentence corresponds to a length { m } 1 m 2 … m n -correct label Y { Y }, corresponding thereto 1 y 2 … y n }. Since the lengths of the sentences are different, it is necessary to perform 0-compensating or cutting operations for each sentence so that the lengths are identical, and the length is set to be m. If the training sets of different tasks are different, or a new training set is added, corresponding updating and retraining are carried out.
Application set: and according to the set given by the natural language processing task to be completed, a plurality of sentences are contained in the set.
Fine tuning the technical function: the downstream task technology of the pre-training model, the fine tuning technology according to different natural language processing tasks, is also different, and the process is described as a function, and the expression of the function is as follows:
{y 1 ′ y 2 ′ … y n ′}=f{X 1 X 2 … X n }
s2, extracting features of a training set and an application set based on a pre-training feature word stock to obtain respective feature matrixes;
s2.1 Chinese sentence S in given training set t T=1, 2, …, n, split Chinese character vector according to Chinese character
Figure BDA0004035216650000042
Wherein m represents S t Number of Chinese characters after cutting or supplementing 0, c i Representing the i-th Chinese character, i=1, 2, …, m, obtaining Chinese character coding vector, and using S for the Chinese character coding vector b Indicating (I)>
Figure BDA0004035216650000043
S2.2, extracting each character feature of the sentence by utilizing the pre-training feature word library according to the coding number of the coding vector, ifThe number of the ith word is b i B of word stock matrix i The lines being the eigenvectors of the word, i.e
Figure BDA0004035216650000044
X t Expressed as sentence S t Is m x K, x ij Representing Chinese character b i The j-th eigenvalue (i=1, 2, …, m, j=1, 2, …, K) of the training set chinese sentence S t Is expressed as: />
Figure BDA0004035216650000051
S2.3, performing the same feature extraction on the application set as on the training set to obtain the feature matrix of the application set.
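By way of illustration only, steps S2.1 and S2.2 can be sketched as follows; the word-stock matrix E, the character-to-number mapping `vocab`, and all values are invented placeholders, not the patent's actual pre-trained word stock:

```python
import numpy as np

def extract_features(sentence, vocab, E, m):
    """S2.1: split a Chinese sentence into characters and map each to its
    coding number b_i; S2.2: take row b_i of the word-stock matrix E (M x K)
    as the character's feature vector, truncating or zero-padding so that
    every sentence yields an m x K feature matrix X_t."""
    codes = [vocab[c] for c in list(sentence)[:m]]   # coding vector S_b, cut to m
    K = E.shape[1]
    X = E[codes] if codes else np.empty((0, K))      # stack the rows alpha_i
    pad = np.zeros((m - X.shape[0], K))              # zero-padding to length m
    return np.vstack([X, pad])

# Toy word stock: M = 4 characters, K = 3 features (random placeholder rows).
rng = np.random.default_rng(0)
E = rng.normal(size=(4, 3))
vocab = {"我": 0, "爱": 1, "你": 2, "。": 3}
X_t = extract_features("我爱你", vocab, E, m=5)      # 3 real rows + 2 zero rows
```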
S3, taking the mean of the summed differences between the feature vector of each word and the feature vector of the current word as the information increment of the current word, and constructing an information increment model based on the information increment;
Define α_i = (x_i1 x_i2 … x_iK) as the feature vector of the Chinese character b_i (i = 1, 2, …, m), so that
X_t = (α_1ᵀ α_2ᵀ … α_mᵀ)ᵀ, the m × K matrix whose i-th row is α_i.
Define
β_j = (x_1j x_2j … x_mj)ᵀ
representing the j-th feature of sentence S_t (j = 1, 2, …, K); then X_t = (β_1 β_2 … β_K).
The information increment of each word is defined as:
Δx_i = (1/m) Σ_{l=1}^{m} (α_l − α_i)
The information increment model is constructed as:
α_i′ = α_i + λΔx_i
X_t′ = X_t + λΔX_t
where α_i is the feature vector of a word, m represents the number of Chinese characters in the sentence, α_l (l = 1, 2, …, m) represents the feature vector of any Chinese character in the sentence, λ is the information increment factor, Δx_i is the information increment of the word, α_i′ is the updated feature vector of the word after applying the information increment model, X_t is the feature matrix of the sentence, ΔX_t is the feature-information increment matrix of the sentence, X_t′ is the updated feature matrix of the sentence after applying the information increment model, and
ΔX_t = (Δx_1ᵀ Δx_2ᵀ … Δx_mᵀ)ᵀ
Applying the information increment model to all sentences of the training set gives the new feature representation {X_1′, X_2′, …, X_n′} of each sentence in the training set.
S4, performing fine adjustment training on the information increment factors in the information increment model based on the feature matrix of the training set to obtain the trained information increment factors;
setting an initial value, a threshold value, an initial value of accuracy and an increment value of the information increment factor;
updating the feature matrix of the training set by using the information increment model to obtain an updated training feature matrix;
sample prediction is carried out on the new training feature matrix based on the fine tuning technology function, and a prediction label is obtained;
comparing the predicted label with the correct label of the training set, and calculating the accuracy;
finely adjusting the information increment factors in the information increment model to obtain a finely adjusted information increment model;
and (3) circulating the steps of feature matrix updating, sample prediction, accuracy calculation and information increment factor fine adjustment until the accuracy and the information increment factor meet the conditions, and obtaining the trained information increment factor.
A first embodiment of the judgment conditions is shown in FIG. 3:
setting the initial value of the information increment factor to 0, the maximum threshold to λ_max, the initial maximum accuracy to Acc_max, and the increment step to Δλ;
updating the feature matrix of each sentence of the training set with the information increment model to obtain the new feature representation set {X_1′, X_2′, …, X_n′};
performing sample prediction on the new training feature matrices based on the fine-tuning function {y_1′, y_2′, …, y_n′} = f{X_1, X_2, …, X_n} to obtain the predicted labels Y′ = {y_1′, y_2′, …, y_n′};
comparing the predicted labels with the correct labels of the training set and calculating the accuracy, whose expression is:
Acc = n_correct / n
where n_correct is the number of correctly predicted labels and n represents the total number of labels in the training set;
judging whether the accuracy is greater than the current maximum accuracy. If so, replacing the current maximum accuracy with this accuracy and then judging the information increment factor: if the factor is greater than or equal to the maximum threshold, outputting it as the trained information increment factor; otherwise, adding one increment step to it. If the accuracy is not greater than the current maximum accuracy, judging the information increment factor directly: if the factor is greater than or equal to the maximum threshold, outputting it as the trained information increment factor; otherwise, adding one increment step to it;
substituting the incremented information increment factor into the information increment model in place of the original factor to obtain a fine-tuned information increment model;
and repeating the above steps until the trained information increment factor is obtained.
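A hedged sketch of this first embodiment's search loop follows; `predict_fn` stands in for the task-specific fine-tuning function f, and all names and values are assumptions of this sketch rather than the patent's implementation:

```python
import numpy as np

def increment_update(X, lam):
    """X' = X + lam * DeltaX, where row i of DeltaX is mean_l(alpha_l) - alpha_i."""
    return X + lam * (X.mean(axis=0) - X)

def tune_increment_factor(sentences, labels, predict_fn, lam_max=1.0, d_lam=0.1):
    """Sweep the increment factor lambda from 0 to lam_max in steps of d_lam,
    tracking the highest accuracy seen; return the best lambda and accuracy."""
    best_lam, acc_max = 0.0, -1.0
    lam = 0.0
    while lam <= lam_max + 1e-9:                      # stop once lambda exceeds lam_max
        preds = [predict_fn(increment_update(X, lam)) for X in sentences]
        acc = float(np.mean([p == y for p, y in zip(preds, labels)]))
        if acc > acc_max:                             # keep the best accuracy so far
            acc_max, best_lam = acc, lam
        lam += d_lam
    return best_lam, acc_max

# Toy downstream "task": predict label 1 iff the mean feature of the matrix is >= 0.
sentences = [np.array([[1.0, -1.0], [2.0, 0.0]]),
             np.array([[-1.0, -2.0], [0.0, -1.0]])]
labels = [1, 0]
best_lam, best_acc = tune_increment_factor(
    sentences, labels, predict_fn=lambda X: int(X.mean() >= 0.0))
```

Because the increment update preserves each sentence's mean feature vector, this toy predictor is insensitive to λ; a real downstream classifier would not be.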
A second embodiment of the judgment conditions is shown in FIG. 4:
setting the initial value of the information increment factor to 0, the maximum threshold to λ_max, the initial maximum accuracy to Acc_max, and the increment step to Δλ;
updating the feature matrix of each sentence of the training set with the information increment model to obtain the new feature representation set {X_1′, X_2′, …, X_n′};
performing sample prediction on the new training feature matrices based on the fine-tuning function {y_1′, y_2′, …, y_n′} = f{X_1, X_2, …, X_n} to obtain the predicted labels Y′ = {y_1′, y_2′, …, y_n′};
comparing the predicted labels with the correct labels of the training set and calculating the accuracy, whose expression is:
Acc = n_correct / n
where n_correct is the number of correctly predicted labels and n represents the total number of labels in the training set;
judging whether the information increment factor is greater than or equal to the maximum threshold: if so, outputting it as the trained information increment factor; if not, judging whether the accuracy is greater than or equal to the current maximum accuracy, and if so, replacing the current maximum accuracy with this accuracy and adding one increment step to the information increment factor; if that condition is not met, adding one increment step to the information increment factor directly, leaving the current maximum accuracy unchanged;
substituting the incremented information increment factor into the information increment model in place of the original factor to obtain a fine-tuned information increment model;
and repeating the above steps until the trained information increment factor is obtained.
S5, updating the feature matrix of the application set with the information increment model using the trained increment factor to obtain a new application feature matrix;
the new application feature matrix is expressed as:
X_s′ = X_s + λ*ΔX_s
where X_s′ is the new application feature matrix, X_s is the feature matrix of the application set, λ* is the trained information increment factor, and ΔX_s is the information increment matrix of the application set.
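For illustration, step S5 reduces to a single broadcast update; λ* = 0.3 and the matrix values below are invented for this sketch:

```python
import numpy as np

# Step S5: X_s' = X_s + lambda* . DeltaX_s, using the trained factor lambda*.
lam_star = 0.3                              # made-up trained increment factor
X_s = np.array([[0.0, 2.0],
                [4.0, 0.0]])                # application-set sentence (m=2, K=2)
delta_X_s = X_s.mean(axis=0) - X_s          # row i: (1/m) sum_l (alpha_l - alpha_i)
X_s_new = X_s + lam_star * delta_X_s        # updated application feature matrix
```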
As shown in FIG. 2, the present invention provides a training update system for Chinese sentence feature construction, the system comprising:
the feature extraction module is used for extracting features of the training set and the application set to obtain respective feature matrixes;
the model construction module is used for taking the mean of the summed differences between the feature vector of each word and the feature vector of the current word as the information increment of the current word, to obtain an information increment model;
the fine tuning training module is used for carrying out fine tuning training on the information increment factors in the information increment model based on the feature matrix of the training set to obtain the information increment factors after training;
and the application updating module is used for updating the feature matrix of the application set by using the information increment model of the trained increment factor to obtain a new application feature matrix.
The content of the method embodiment is applicable to the system embodiment; the functions specifically realized by the system embodiment are the same as those of the method embodiment, and the beneficial effects achieved are the same as those of the method embodiment.
While the preferred embodiment of the present invention has been described in detail, the invention is not limited to the embodiment, and various equivalent modifications and substitutions can be made by those skilled in the art without departing from the spirit of the invention, and these modifications and substitutions are intended to be included in the scope of the present invention as defined in the appended claims.

Claims (8)

1. A training updating method for Chinese sentence feature construction, characterized by comprising the following steps:
extracting features of the training set and the application set to obtain respective feature matrixes;
taking the mean of the summed differences between the feature vector of each word and the feature vector of the current word as the information increment of the current word, and constructing an information increment model based on the information increment;
performing fine tuning training on the information increment factors in the information increment model based on the feature matrix of the training set to obtain the trained information increment factors;
and updating the feature matrix of the application set by using the information increment model of the trained increment factor to obtain an updated application feature matrix.
2. The training updating method of Chinese sentence feature construction according to claim 1, wherein the step of extracting features from the training set and the application set to obtain respective feature matrices comprises:
splitting the Chinese character vectors in the training set and the application set according to the Chinese characters to obtain Chinese character coding vectors;
according to the coding number of the Chinese character coding vector, carrying out feature extraction on each word of the sentence of the training set and the application set by utilizing a pre-training feature word library to obtain a feature vector of each word;
and superposing the feature vector of each word in the training set and the application set according to the coding number to obtain feature matrixes of the training set and the application set.
3. The training updating method of Chinese sentence feature construction according to claim 1, wherein the expression of the information increment is as follows:
Δx_i = (1/m) Σ_{l=1}^{m} (α_l − α_i)
where α_i = (x_i1 x_i2 … x_iK) represents the feature vector of the current Chinese character b_i (i = 1, 2, …, m), m represents the number of Chinese characters in the sentence, K represents the number of features of the current Chinese character b_i, and α_l (l = 1, 2, …, m) represents the feature vector of any Chinese character in the sentence.
4. The training updating method of Chinese sentence feature construction according to claim 1, wherein the expression of the information increment model is:
α_i′ = α_i + λΔx_i
X_t′ = X_t + λΔX_t
where α_i is the feature vector of a word, λ is the information increment factor, Δx_i is the information increment of the word, α_i′ is the updated feature vector of the word after applying the information increment model, X_t is the feature matrix of the sentence, ΔX_t is the feature-information increment matrix of the sentence, X_t′ is the updated feature matrix of the sentence after applying the information increment model, and
ΔX_t = (Δx_1ᵀ Δx_2ᵀ … Δx_mᵀ)ᵀ
5. the method for updating the feature structure of a chinese sentence according to claim 1, wherein the step of performing fine tuning training on the information increment factor in the information increment model based on the feature matrix of the training set to obtain the trained information increment factor specifically comprises:
setting an initial value, a threshold value, an initial value of accuracy and an increment value of the information increment factor;
updating the feature matrix of the training set by using the information increment model to obtain an updated training feature matrix;
sample prediction is carried out on the new training feature matrix based on the fine tuning technology function, and a prediction label is obtained;
comparing the predicted label with the correct label of the training set, and calculating the accuracy;
finely adjusting the information increment factors in the information increment model to obtain a finely adjusted information increment model;
and (3) circulating the steps of feature matrix updating, sample prediction, accuracy calculation and information increment factor fine adjustment until the accuracy and the information increment factor meet the conditions, and obtaining the trained information increment factor.
6. The training updating method of Chinese sentence feature construction according to claim 5, wherein the accuracy is expressed as follows:
Acc = n_correct / n
where n_correct is the number of correctly predicted labels among the correct labels of the training set, and n represents the total number of labels in the training set.
7. The training updating method of Chinese sentence feature construction according to claim 1, wherein the step of updating the feature matrix of the application set by the information increment model applying the trained increment factor to obtain an updated application feature matrix is expressed by the following formula:

X_s′ = X_s + λ*ΔX_s

wherein X_s′ is the updated application feature matrix, X_s is the feature matrix of the application set, λ* is the trained information increment factor, and ΔX_s is the information increment matrix of the application set.
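The application-set update of claim 7 is an element-wise operation once ΔX_s is available; a minimal sketch, assuming the matrices are lists of equal-length rows (names and shapes are illustrative assumptions):

```python
def apply_trained_factor(X_s, delta_X_s, lam_star):
    """Claim 7 update X_s' = X_s + lam_star * Delta_X_s, applied element-wise."""
    return [[x + lam_star * d for x, d in zip(row_x, row_d)]
            for row_x, row_d in zip(X_s, delta_X_s)]
```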
8. A training updating system for Chinese sentence feature construction, comprising:
the feature extraction module is used for extracting features of the training set and the application set to obtain respective feature matrixes;
the model construction module is used for taking the average value of the sum vector of the differences between the feature vector of each word and the feature vector of the current word as the increment information of the current word to obtain an information increment model;
the fine tuning training module is used for carrying out fine tuning training on the information increment factors in the information increment model based on the feature matrix of the training set to obtain the trained information increment factors;
and the application updating module is used for updating the feature matrix of the application set by using the information increment model with the trained increment factor to obtain an updated application feature matrix.
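The four modules of the system claim map naturally onto one class. The skeleton below is purely illustrative: the class and method names, the toy bag-of-characters feature extractor, and the small λ grid are assumptions, not the patented implementation; only the module responsibilities follow claim 8.

```python
class TrainingUpdateSystem:
    """Illustrative skeleton of the claimed four-module system."""

    def __init__(self, lam=0.1):
        self.lam = lam  # information increment factor

    def extract_features(self, sentences):
        # feature extraction module (toy bag-of-characters stand-in)
        vocab = sorted({ch for s in sentences for ch in s})
        return [[s.count(ch) for ch in vocab] for s in sentences]

    def increment_model(self, X):
        # model construction module: alpha_i' = alpha_i + lam * (mean - alpha_i)
        n = len(X)
        mean = [sum(col) / n for col in zip(*X)]
        return [[a + self.lam * (m - a) for a, m in zip(row, mean)] for row in X]

    def fine_tune(self, X, y, predict, grid=(0.0, 0.1, 0.2)):
        # fine-tuning training module: keep the factor with best training accuracy
        best = max(grid, key=lambda lam: sum(
            p == t for p, t in zip(
                [predict(r) for r in TrainingUpdateSystem(lam).increment_model(X)],
                y)))
        self.lam = best
        return best

    def update_application(self, X_s):
        # application updating module: reuse the trained factor on the application set
        return self.increment_model(X_s)
```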
CN202310001746.3A 2023-01-03 2023-01-03 Training updating method and system for Chinese sentence feature construction Active CN116070638B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310001746.3A CN116070638B (en) 2023-01-03 2023-01-03 Training updating method and system for Chinese sentence feature construction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310001746.3A CN116070638B (en) 2023-01-03 2023-01-03 Training updating method and system for Chinese sentence feature construction

Publications (2)

Publication Number Publication Date
CN116070638A true CN116070638A (en) 2023-05-05
CN116070638B CN116070638B (en) 2023-09-08

Family

ID=86178054

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310001746.3A Active CN116070638B (en) 2023-01-03 2023-01-03 Training updating method and system for Chinese sentence feature construction

Country Status (1)

Country Link
CN (1) CN116070638B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104504015A (en) * 2014-12-11 2015-04-08 中国科学院遥感与数字地球研究所 Learning algorithm based on dynamic incremental dictionary update
CN105068996A (en) * 2015-09-21 2015-11-18 哈尔滨工业大学 Chinese participle increment learning method
CN106547735A (en) * 2016-10-25 2017-03-29 复旦大学 The structure and using method of the dynamic word or word vector based on the context-aware of deep learning
US20190325029A1 (en) * 2018-04-18 2019-10-24 HelpShift, Inc. System and methods for processing and interpreting text messages
CN110705302A (en) * 2019-10-11 2020-01-17 掌阅科技股份有限公司 Named entity recognition method, electronic device and computer storage medium
CN111488423A (en) * 2020-03-05 2020-08-04 北京一览群智数据科技有限责任公司 Index data-based natural language processing method and system
CN112464674A (en) * 2020-12-16 2021-03-09 四川长虹电器股份有限公司 Word-level text intention recognition method
CN114398855A (en) * 2022-01-13 2022-04-26 北京快确信息科技有限公司 Text extraction method, system and medium based on fusion pre-training

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
范宇中 (Fan Yuzhong), 张玉峰 (Zhang Yufeng): "A Preliminary Study on Automatic Classification Methods for Textual Knowledge", 情报科学 (Information Science), no. 01, pages 103 - 105 *

Also Published As

Publication number Publication date
CN116070638B (en) 2023-09-08

Similar Documents

Publication Publication Date Title
CN112270379A (en) Training method of classification model, sample classification method, device and equipment
CN111062217B (en) Language information processing method and device, storage medium and electronic equipment
CN110705592A (en) Classification model training method, device, equipment and computer readable storage medium
CN114091460A (en) Multitask Chinese entity naming identification method
CN110598210B (en) Entity recognition model training, entity recognition method, entity recognition device, entity recognition equipment and medium
CN114153971B (en) Error correction recognition and classification equipment for Chinese text containing errors
CN109086265A (en) A kind of semanteme training method, multi-semantic meaning word disambiguation method in short text
CN112084336A (en) Entity extraction and event classification method and device for expressway emergency
CN113486175B (en) Text classification method, text classification device, computer device, and storage medium
Moeng et al. Canonical and surface morphological segmentation for nguni languages
CN111597807B (en) Word segmentation data set generation method, device, equipment and storage medium thereof
CN116258137A (en) Text error correction method, device, equipment and storage medium
CN115827819A (en) Intelligent question and answer processing method and device, electronic equipment and storage medium
CN112016300A (en) Pre-training model processing method, pre-training model processing device, downstream task processing device and storage medium
WO2023134085A1 (en) Question answer prediction method and prediction apparatus, electronic device, and storage medium
CN117115564B (en) Cross-modal concept discovery and reasoning-based image classification method and intelligent terminal
US20220138425A1 (en) Acronym definition network
CN113779994A (en) Element extraction method and device, computer equipment and storage medium
CN113158678A (en) Identification method and device applied to electric power text named entity
CN111368056B (en) Ancient poetry generating method and device
CN113297374A (en) Text classification method based on BERT and word feature fusion
CN112307749A (en) Text error detection method and device, computer equipment and storage medium
CN116070638B (en) Training updating method and system for Chinese sentence feature construction
CN117033961A (en) Multi-mode image-text classification method for context awareness
CN114239575B (en) Statement analysis model construction method, statement analysis method, device, medium and computing equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant