WO2021139233A1 - Data augmentation hybrid strategy generation method, apparatus and computer equipment - Google Patents

Data augmentation hybrid strategy generation method, apparatus and computer equipment

Info

Publication number
WO2021139233A1
WO2021139233A1 PCT/CN2020/118140 CN2020118140W
Authority
WO
WIPO (PCT)
Prior art keywords
data
strategy
training data
hybrid strategy
training
Prior art date
Application number
PCT/CN2020/118140
Other languages
English (en)
French (fr)
Inventor
Zhu Wei (朱威)
Li Tianjing (李恬静)
Original Assignee
Ping An Technology (Shenzhen) Co., Ltd. (平安科技(深圳)有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology (Shenzhen) Co., Ltd.
Publication of WO2021139233A1 publication Critical patent/WO2021139233A1/zh

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/237Lexical tools
    • G06F40/247Thesauruses; Synonyms
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/40Processing or translation of natural language
    • G06F40/58Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Definitions

  • This application relates to the field of artificial intelligence technology, and in particular to a method, device, computer equipment, and storage medium for generating a data expansion hybrid strategy.
  • Data augmentation is a common data processing method in machine learning and deep learning. It can generate more data from limited data, increase the number and diversity of training samples (noise data), and improve the robustness of the model.
  • Common data expansion methods include synonym substitution and back-translation.
  • The inventor realizes that, at present, collecting annotated data for natural language processing tasks requires substantial labor costs, and the collected data has limitations.
  • The data expansion hybrid strategy is usually designed manually; the strategy is often unsuited to the data set, or the amount of expansion is too large, which causes the trained model to overfit and makes the expansion of natural language data inefficient.
  • a method for generating a data expansion hybrid strategy includes:
  • expanding the training data according to the data expansion hybrid strategy, and obtaining the expanded training data includes:
  • the training data containing the predicted character is used as the expanded training data.
  • expanding the training data according to the data expansion hybrid strategy, and obtaining the expanded training data includes:
  • expanding the training data according to the data expansion hybrid strategy, and obtaining the expanded training data includes:
  • the pre-trained generative model is used to generate new training data, and the expanded training data is obtained.
  • the pre-trained generative model is trained based on historical sentence data.
  • a pre-trained generative model is used to generate new training data, and the expanded training data obtained includes:
  • a pre-trained generative model is used to predict the corresponding new characters, and the expanded training data is obtained.
  • the step of inputting the strategy feedback data of the current time into the preset hybrid strategy search model to update the data expansion hybrid strategy includes:
  • updating the parameters of the preset hybrid strategy search model includes:
  • the parameters of the preset hybrid strategy search model are updated.
  • a data expansion hybrid strategy generating device includes:
  • the data acquisition module is used to acquire strategy feedback data and training data at the current time
  • the hybrid strategy acquisition module is used to input the strategy feedback data of the current time into the preset hybrid strategy search model to obtain the data expansion hybrid strategy of the current time;
  • the data expansion module is used to expand the training data according to the data expansion hybrid strategy to obtain the expanded training data
  • the strategy feedback data update module is used to input the expanded training data into the preset loop neural network for training, and obtain the strategy feedback data corresponding to the data expansion hybrid strategy;
  • the hybrid strategy update module is used to take the strategy feedback data corresponding to the data expansion hybrid strategy as the strategy feedback data of the current time, and to trigger the hybrid strategy acquisition module to again input the strategy feedback data of the current time into the preset hybrid strategy search model so as to update the data expansion hybrid strategy, until the number of training iterations of the preset hybrid strategy search model reaches the preset number, at which point the optimal data expansion hybrid strategy is obtained.
  • a computer device includes a memory and a processor, the memory stores a computer program, and the processor implements the following steps when the computer program is executed:
  • the computer program is executed by a processor, the following steps are implemented:
  • the above-mentioned data expansion hybrid strategy generation method, device, computer equipment, and storage medium input the strategy feedback data into the preset hybrid strategy search model to initially generate the data expansion hybrid strategy; the training data is then expanded according to the generated strategy, and the expanded training data is input into the preset recurrent neural network to update the strategy feedback data. The above steps are looped, with the updated strategy feedback data fed back into the preset hybrid strategy search model to update its parameters so that the model converges, and the optimal data expansion hybrid strategy is obtained.
  • the above solution reduces the time spent on strategy search and can automatically construct the optimal data expansion hybrid strategy based on the training data, improving the accuracy and robustness of the model, thereby improving the efficiency of natural language data expansion and saving labor and computing costs.
  • FIG. 1 is an application environment diagram of a method for generating a data expansion hybrid strategy in an embodiment
  • FIG. 2 is a schematic flowchart of a method for generating a data expansion hybrid strategy in an embodiment
  • FIG. 3 is a schematic flowchart of a step of expanding training data according to a data expansion hybrid strategy in an embodiment
  • FIG. 4 is a schematic diagram of another process of expanding training data according to a data expansion hybrid strategy
  • FIG. 5 is a structural block diagram of an apparatus for generating a data expansion hybrid strategy in an embodiment
  • Fig. 6 is an internal structure diagram of a computer device in an embodiment.
  • the method for generating a data expansion hybrid strategy provided in this application can be applied to the application environment as shown in FIG. 1.
  • the terminal 102 communicates with the server 104 through the network.
  • the user uploads training data and strategy feedback data constructed from natural language data to the server 104 through the terminal 102, performs the corresponding operations on the operation interface of the terminal 102, and sends a data expansion hybrid strategy generation message to the server 104. In response to the message, the server 104 obtains the strategy feedback data and training data at the current time, inputs the strategy feedback data of the current time into the preset hybrid strategy search model to obtain the data expansion hybrid strategy of the current time, and expands the training data according to that strategy to obtain the expanded training data. The expanded training data is then input into the preset recurrent neural network for training to obtain the strategy feedback data corresponding to the data expansion hybrid strategy, which is used as the strategy feedback data of the current time.
  • the terminal 102 may be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices.
  • the server 104 may be implemented by an independent server or a server cluster composed of multiple servers.
  • a method for generating a data augmentation hybrid strategy is provided. Taking the method applied to the server in FIG. 1 as an example for description, the method includes the following steps:
  • Step 202 Obtain strategy feedback data and training data at the current time.
  • the data enhancement strategy plays an important role in increasing the amount of training sample data, improving the stability and robustness of the model, and improving the adaptability and generalization of the model to the real world.
  • a training set and a development set (i.e., a validation set) are used.
  • the purpose of this application is to search for a data expansion (enhancement) strategy on the training data and, through a feedback mechanism, to test performance on the validation set so as to find the optimal data expansion hybrid strategy.
  • the strategy feedback data at the current time is the pre-collected initial data augmentation mixed strategy feedback data.
  • the so-called initial data expansion hybrid strategy feedback data refers to the feedback data obtained from the performance, on the development set, of a historical data expansion hybrid strategy evaluated with the preset recurrent neural network.
  • the training data is the data to be expanded, which can be different types of data.
  • corresponding recurrent neural networks for training the training data are preset; if a certain type of training data is selected, the recurrent neural network used to train that type of data is selected accordingly.
  • the training data is a classification task data set
  • the recurrent neural network for training the classification data can be a Text-CNN model for data classification, where the Text-CNN network parameters are shared during the data expansion hybrid strategy search process.
  • Step 204 Input the strategy feedback data of the current time into the preset hybrid strategy search model to obtain the data expansion hybrid strategy of the current time.
  • the hybrid strategy search model is a controller composed of a recurrent neural network.
  • the hybrid strategy search model is deployed with a defined data expansion sub-strategy, and the number of data expansion sub-strategies is multiple.
  • the strategy feedback data of the current time is used as the input data of the hybrid strategy search model.
  • the hidden state of each step of the network is input to a classifier to determine each parameter of the hybrid strategy.
  • the controller is initialized randomly, and a data expansion hybrid strategy for the current time is randomly generated.
  • Step 206 Expand the training data according to the data expansion hybrid strategy to obtain expanded training data.
  • the training data is expanded according to the data expansion hybrid strategy, and then the expanded training data is obtained to achieve the effect of updating the training data.
  • the data expansion hybrid strategy can include any combination of sub-strategies such as back-translation, generating new sentences with a generative model, synonym replacement based on enhanced semantics, and predicted-character replacement.
  • Step 208 Input the expanded training data into the preset recurrent neural network for training, and obtain strategy feedback data corresponding to the data expansion hybrid strategy.
  • take the case where the training data is a classification task data set as an example.
  • the recurrent neural network corresponding to the classification task data set may be a classification network, specifically, a Text-CNN network.
  • after the data is expanded according to the data expansion hybrid strategy, the Text-CNN network is trained to classify the expanded training data, and its performance on the development set is then used to obtain the feedback data.
  • the model predicts the labels of the data in the development set, compares them with the ground-truth answers, and scores the result based on accuracy and other metrics to obtain the strategy feedback data.
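As a minimal sketch of this feedback step, dev-set accuracy can serve directly as the strategy's reward. The function below is an illustrative stand-in, assuming a simple accuracy-only score rather than the patent's exact (unspecified) scoring formula:

```python
def strategy_feedback_score(predicted_labels, true_labels):
    """Score a data expansion strategy by development-set accuracy.

    The task model's predicted labels are compared with the ground-truth
    labels, and the fraction of correct predictions becomes the reward
    (strategy feedback data) fed back to the search model.
    """
    if len(predicted_labels) != len(true_labels):
        raise ValueError("prediction/label length mismatch")
    correct = sum(p == t for p, t in zip(predicted_labels, true_labels))
    return correct / len(true_labels)
```

For example, two correct predictions out of four yield a reward of 0.5.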
  • Step 210 Use the strategy feedback data corresponding to the data expansion hybrid strategy as the strategy feedback data of the current time, and return to step 204 to update the data expansion hybrid strategy until the number of training iterations of the preset hybrid strategy search model reaches the preset number, obtaining the optimal data expansion hybrid strategy.
  • the feedback data can be input into the hybrid strategy search model again as the reward for the data expansion hybrid strategy of the current time, so as to update the parameters of the hybrid strategy search model and thereby update the data expansion hybrid strategy of the current time.
  • the strategy feedback data is input into the preset hybrid strategy search model to generate the data expansion hybrid strategy; the training data is then expanded according to the generated strategy, and the expanded training data is further input into the preset recurrent neural network to update the strategy feedback data. The above steps are looped, and the updated strategy feedback data is input into the preset hybrid strategy search model to update its parameters so that the model converges, yielding the optimal data expansion hybrid strategy.
  • the above solution reduces the time spent on strategy search and can automatically construct the optimal data expansion hybrid strategy based on the training data, improving the accuracy and robustness of the model, thereby improving the efficiency of natural language data expansion and saving labor and computing costs.
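The search procedure of steps 202-210 can be sketched as a single outer loop. The sketch below assumes hypothetical interfaces (`controller.propose`, `controller.update`, `augment`, `train_and_score`) standing in for the hybrid strategy search model, the expansion step, and the recurrent-network training step; it is an illustration of the loop, not the patent's implementation:

```python
def search_hybrid_strategy(controller, augment, train_and_score, n_iters=50):
    """Outer loop of the data expansion hybrid strategy search.

    controller.propose(feedback) -> strategy   (step 204)
    augment(strategy)            -> expanded   (step 206)
    train_and_score(expanded)    -> feedback   (step 208, dev-set reward)
    controller.update(...)                     (step 210, reward signal)
    """
    feedback = 0.0              # initial strategy feedback data
    best_strategy, best_reward = None, float("-inf")
    for _ in range(n_iters):
        strategy = controller.propose(feedback)
        expanded = augment(strategy)
        feedback = train_and_score(expanded)
        controller.update(strategy, feedback)
        if feedback > best_reward:
            best_strategy, best_reward = strategy, feedback
    return best_strategy
```

The loop returns the best-scoring strategy seen once the iteration budget (the preset number of training times) is exhausted.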
  • the hybrid strategy search model deploys multiple data expansion sub-strategies
  • expanding the training data according to the data expansion hybrid strategy and obtaining the expanded training data includes:
  • Step 226 Use the trained MLM model to replace any character in the sentence in the training data with a mask character
  • Step 246 Predict the character corresponding to the mask character according to the pre-trained language model to obtain the predicted character
  • Step 266 If the confidence of the predicted character is greater than the preset threshold, use the training data containing the predicted character as the expanded training data.
  • the generated data expansion hybrid strategy includes a combination of multiple data expansion sub-strategies.
  • the data expansion sub-strategy includes a sentence data expansion strategy using an MLM model. Specifically, a trained MLM (Masked Language Model) can be used to replace a word in a sentence of the training data with the "[MASK]" character (mask character); the pre-trained language model then predicts which word should appear at the removed position, yielding the predicted character. If the confidence of that word is greater than 0.85, the new sentence containing the predicted character is added to obtain the expanded training data. Specifically, the new character is predicted based on the pre-trained language model's LMHead parameters.
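A minimal sketch of this MLM sub-strategy, using the 0.85 confidence threshold from the description. The `predict_masked(tokens, i)` callback is an assumed interface standing in for a real masked-language-model head (e.g. an LMHead); it returns a `(word, confidence)` pair:

```python
import random

MASK = "[MASK]"
CONFIDENCE_THRESHOLD = 0.85  # threshold stated in the description above

def mlm_expand(sentence, predict_masked, rng=None):
    """Mask one random word, ask the language model for a replacement,
    and keep the new sentence only when the prediction's confidence
    exceeds the threshold; otherwise discard the candidate."""
    rng = rng or random.Random(0)
    tokens = sentence.split()
    i = rng.randrange(len(tokens))
    masked = tokens[:i] + [MASK] + tokens[i + 1:]
    word, confidence = predict_masked(masked, i)
    if confidence > CONFIDENCE_THRESHOLD:
        return " ".join(tokens[:i] + [word] + tokens[i + 1:])
    return None  # low confidence: do not add this sentence
```

A production version would call a real pre-trained model here; the stub keeps the control flow visible.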
  • expanding the training data according to the data expansion hybrid strategy and obtaining the expanded training data includes:
  • Step 216 Express the words in the training data as word vectors
  • Step 236 Randomly represent a byte fragment of any sentence in the training data as a target vector;
  • Step 256 Calculate the similarity between the target vector and the word vector, and find the synonym vector of the target vector based on the similarity;
  • Step 276 Replace the byte segments with the words corresponding to the synonym vector to obtain the expanded training data.
  • the data expansion sub-strategy also includes a synonym replacement strategy based on enhanced semantics.
  • a synonym replacement strategy based on enhanced semantics can expand the training data as follows: first, fine-tune a pre-trained model, where the fine-tuning task is to determine whether two phrases (words) are synonyms; then, represent all words in a preset knowledge base as word vectors using the pre-trained model.
  • the similarity threshold may also be 0.96, 0.97, or another value, which may be determined according to actual conditions and is not limited herein.
  • the sentences in the training data are expanded to obtain expanded training data, which enriches the training data.
  • expanding the training data according to the data expansion hybrid strategy, and obtaining the expanded training data includes:
  • the pre-trained generative model is used to generate new training data, and the expanded training data is obtained.
  • the pre-trained generative model is trained based on historical sentence data.
  • the data expansion sub-strategy may also include sentences based on training data, and use the generative model to generate new sentence strategies.
  • it may be as follows: randomly remove a byte fragment from any sentence in the training data to obtain the target sentence, and use the pre-trained generative model to predict new characters corresponding to the removed byte fragment in the target sentence, obtaining the expanded training data. Specifically, a 3-gram (byte fragment) may first be randomly selected from the second half of a sentence in the training data and removed; the pre-trained generative model then predicts 3 new characters for the removed 3-gram to form a new sentence. In this embodiment, the sentences in the training data are expanded in this way to generate new sentences, so the expanded training data can be obtained quickly and the training data enlarged.
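The 3-gram removal step can be sketched as follows. The `generate(prefix, suffix, n)` callback is an assumed interface standing in for the pre-trained generative model; it returns `n` replacement tokens for the removed span:

```python
import random

def three_gram_expand(sentence, generate, rng=None):
    """Drop a random 3-gram from the second half of the sentence and let
    a generative model fill the gap, producing a new training sentence.
    Returns None when the sentence is too short to hold a 3-gram in its
    second half."""
    rng = rng or random.Random(0)
    tokens = sentence.split()
    if len(tokens) < 6:
        return None
    half = len(tokens) // 2
    start = rng.randrange(half, len(tokens) - 2)  # 3-gram fits at start
    prefix, suffix = tokens[:start], tokens[start + 3:]
    new_tokens = generate(prefix, suffix, 3)
    return " ".join(prefix + new_tokens + suffix)
```

This operates on whitespace tokens for illustration; the patent's "byte fragment" granularity (e.g. characters for Chinese text) would change only the splitting step.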
  • the step of inputting the strategy feedback data of the current time into the preset hybrid strategy search model to update the data expansion hybrid strategy includes:
  • the feedback data at the current time can be obtained as follows: the development set is expanded based on the historical data expansion hybrid strategy to obtain the expanded data, and the model is trained on the expanded data for one round to obtain the training result (label data). The obtained label data is then compared with the previously known standard label data to obtain the accuracy, a score is computed based on the accuracy, and the score is used as the reward of the data expansion hybrid strategy; that is, the feedback data is re-input into the preset hybrid strategy search model to update the parameters of the network, so that the network generates a new data expansion hybrid strategy.
  • updating the parameters of the preset hybrid strategy search model includes: updating the parameters of the preset hybrid strategy search model according to the REINFORCE policy gradient algorithm.
  • the preset hybrid strategy search model gradually converges as its parameters are updated and can generate better data expansion hybrid strategies; 50-80 epochs of this loop are enough to complete training well, where the number of training iterations is adjusted according to available resources. Generally speaking, one pass over the training set during training is called an epoch; a model typically takes about 100 epochs to train fully, and algorithm engineers generally spend around 200 epochs when choosing a data expansion strategy at random.
  • the mixed data expansion strategy space has more than 1e+5 options; if all the possibilities were simply trained exhaustively, about 1e+5 × 100 epochs of training time would be required.
  • the above-mentioned REINFORCE policy gradient algorithm is used to update the parameters of the model; after 50 training iterations, the hybrid strategy search model is well trained, and the trained model is finally used to generate the optimal data expansion hybrid strategy. In this way, an optimized data expansion hybrid strategy can be obtained at far less than the training cost of an ordinary neural network, and the accuracy of the model can be significantly improved. Updating the parameters with the REINFORCE policy gradient algorithm fits the data better and trains the strategy quickly.
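A toy REINFORCE update over a discrete set of sub-strategies makes the mechanism concrete. This is a minimal sketch, not the patent's recurrent controller: it keeps one logit per strategy and uses the softmax policy-gradient identity d log π(a)/d logit_i = 1[i = a] − π_i, scaled by the reward minus a moving-average baseline:

```python
import math
import random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

class ReinforceController:
    """Minimal REINFORCE policy-gradient controller (illustrative)."""

    def __init__(self, n_strategies, lr=0.1):
        self.logits = [0.0] * n_strategies
        self.lr = lr
        self.baseline = 0.0  # moving-average reward baseline

    def sample(self, rng):
        """Sample a strategy index from the current softmax policy."""
        probs = softmax(self.logits)
        r, acc = rng.random(), 0.0
        for i, p in enumerate(probs):
            acc += p
            if r <= acc:
                return i
        return len(probs) - 1

    def update(self, action, reward):
        """One REINFORCE step: raise the log-probability of actions that
        scored above the baseline, lower it for the rest."""
        advantage = reward - self.baseline
        self.baseline = 0.9 * self.baseline + 0.1 * reward
        probs = softmax(self.logits)
        for i in range(len(self.logits)):
            grad = (1.0 if i == action else 0.0) - probs[i]
            self.logits[i] += self.lr * advantage * grad
```

After repeated updates, the policy concentrates its probability mass on the sub-strategies that earn the highest dev-set reward.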
  • d_0 = 2 means randomly removing two words from a sentence; therefore, the two values p_i and d_i need to be determined.
  • p_i can be discretized into 10 numbers that are equally spaced between 0 and 5.
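The discretization of p_i can be sketched generically. The endpoints 0 and 5 and the count of 10 are taken from the text above; `discretize` itself is an illustrative helper, not an API from the patent:

```python
def discretize(low, high, n):
    """Return n equally spaced values covering [low, high], endpoints
    included, for turning a continuous strategy parameter into a small
    discrete search space."""
    step = (high - low) / (n - 1)
    return [low + i * step for i in range(n)]
```

For example, `discretize(0.0, 5.0, 10)` yields 10 candidate values for p_i with spacing 5/9.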
  • a data expansion strategy generation device including: a data acquisition module 510, a hybrid strategy acquisition module 520, a data expansion module 530, a strategy feedback data update module 540, and a hybrid strategy Update module 550, where:
  • the data acquisition module 510 is used to acquire strategy feedback data and training data at the current time.
  • the hybrid strategy acquisition module 520 is configured to input the strategy feedback data of the current time into the preset hybrid strategy search model to obtain the data expansion hybrid strategy of the current time.
  • the data expansion module 530 is used to expand the training data according to the data expansion hybrid strategy to obtain the expanded training data.
  • the strategy feedback data update module 540 is used to input the expanded training data into the preset recurrent neural network for training, and obtain strategy feedback data corresponding to the data expansion hybrid strategy.
  • the hybrid strategy update module 550 is configured to take the strategy feedback data corresponding to the data expansion hybrid strategy as the strategy feedback data of the current time, and to trigger the hybrid strategy acquisition module to again input the strategy feedback data of the current time into the preset hybrid strategy search model, so as to update the data expansion hybrid strategy until the number of training iterations of the preset hybrid strategy search model reaches the preset number, obtaining the optimal data expansion hybrid strategy.
  • the data expansion module 530 is also used to use the trained MLM model to replace any character in a sentence of the training data with a mask character, and to predict the character corresponding to the mask character with the pre-trained language model to obtain the predicted character; if the confidence of the predicted character is greater than the preset threshold, the training data containing the predicted character is used as the expanded training data.
  • the data expansion module 530 is also used to represent the words in the training data as word vectors, randomly represent a byte fragment of any sentence in the training data as the target vector, calculate the similarity between the target vector and the word vectors, find the synonym vector of the target vector based on the similarity, and replace the byte fragment with the word corresponding to the synonym vector to obtain the expanded training data.
  • the data expansion module 530 is also used to generate new training data based on the training data using a pre-trained generative model to obtain expanded training data, and the pre-trained generative model is trained based on historical sentence data.
  • the data expansion module 530 is also used to randomly remove a byte fragment from any sentence in the training data to obtain the target sentence, and to use the pre-trained generative model to predict the new characters corresponding to the removed byte fragment in the target sentence, obtaining the expanded training data.
  • the hybrid strategy update module 550 is further configured to input the feedback data of the current time into the preset hybrid strategy search model again as reward data, update the parameters of the preset hybrid strategy search model, and generate a new data expansion hybrid strategy based on the updated hybrid strategy search model.
  • the hybrid strategy update module 550 is further configured to update the parameters of the preset hybrid strategy search model according to the REINFORCE policy gradient algorithm.
  • Each module in the above-mentioned data expansion hybrid strategy generating device can be implemented in whole or in part by software, hardware, and combinations thereof.
  • the above-mentioned modules may be embedded in the form of hardware or independent of the processor in the computer equipment, or may be stored in the memory of the computer equipment in the form of software, so that the processor can call and execute the operations corresponding to the above-mentioned modules.
  • a computer device is provided.
  • the computer device may be a server, and its internal structure diagram may be as shown in FIG. 6.
  • the computer device includes a processor, a memory, and a network interface connected through a system bus, where the processor of the computer device provides computing and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, a computer program, and a database.
  • the internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage medium.
  • the database of the computer equipment is used to store feedback data, training data, and hybrid strategy search models.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
  • the computer program is executed by the processor to realize a data expansion hybrid strategy generation method.
  • FIG. 6 is only a block diagram of part of the structure related to the solution of the present application, and does not constitute a limitation on the computer device to which the solution of the present application is applied.
  • the specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
  • a computer device including a memory and a processor, and a computer program is stored in the memory.
  • when the processor executes the computer program, the following steps are implemented: acquire the strategy feedback data and training data at the current time; input the strategy feedback data of the current time into the preset hybrid strategy search model to obtain the data expansion hybrid strategy of the current time; expand the training data according to the data expansion hybrid strategy to obtain the expanded training data; input the expanded training data into the preset recurrent neural network for training to obtain the strategy feedback data corresponding to the data expansion hybrid strategy; use that strategy feedback data as the strategy feedback data of the current time, and return to the step of inputting the strategy feedback data of the current time into the preset hybrid strategy search model so as to update the data expansion hybrid strategy, until the number of training iterations of the preset hybrid strategy search model reaches the preset number, obtaining the optimal data expansion hybrid strategy.
  • the processor further implements the following steps when executing the computer program: use the trained MLM model to replace any character in a sentence of the training data with a mask character, and predict the character corresponding to the mask character according to the pre-trained language model to obtain the predicted character; if the confidence of the predicted character is greater than the preset threshold, the training data containing the predicted character is used as the expanded training data.
  • the processor further implements the following steps when executing the computer program: represent the words in the training data as word vectors, randomly represent a byte fragment of any sentence in the training data as the target vector, calculate the similarity between the target vector and the word vectors, find the synonym vector of the target vector based on the similarity, and replace the byte fragment with the word corresponding to the synonym vector to obtain the expanded training data.
  • the processor further implements the following steps when executing the computer program: based on the training data, use the pre-trained generative model to generate new training data to obtain the expanded training data, where the pre-trained generative model is trained on historical sentence data.
  • the processor also implements the following steps when executing the computer program: randomly remove a byte fragment from any sentence in the training data to obtain the target sentence, and use the pre-trained generative model to predict the new characters corresponding to the removed byte fragment in the target sentence, obtaining the expanded training data.
  • the processor further implements the following steps when executing the computer program: input the feedback data of the current time into the preset hybrid strategy search model again as reward data, update the parameters of the preset hybrid strategy search model, and generate a new data expansion hybrid strategy based on the updated hybrid strategy search model.
  • the processor further implements the following steps when executing the computer program: update the parameters of the preset hybrid strategy search model according to the REINFORCE policy gradient algorithm.
  • a computer-readable storage medium is provided, and the above-mentioned storage medium may be a non-volatile storage medium or a volatile storage medium.
  • a computer program is stored on it, and when the computer program is executed by the processor, the following steps are implemented: obtain the strategy feedback data and training data at the current time; input the strategy feedback data of the current time into the preset hybrid strategy search model to obtain the data expansion hybrid strategy of the current time; expand the training data according to the data expansion hybrid strategy to obtain the expanded training data; input the expanded training data into the preset recurrent neural network for training to obtain the strategy feedback data corresponding to the data expansion hybrid strategy; use that strategy feedback data as the strategy feedback data of the current time, and return to the step of inputting the strategy feedback data of the current time into the preset hybrid strategy search model so as to update the data expansion hybrid strategy, until the number of training iterations of the preset hybrid strategy search model reaches the preset number, obtaining the optimal data expansion hybrid strategy.
  • the following steps are also implemented: use the trained MLM model to replace any character of a sentence in the training data with a mask character, predict the character corresponding to the mask character according to the pre-trained language model to obtain a predicted character, and, if the confidence of the predicted character is greater than the preset threshold, take the training data containing the predicted character as the expanded training data.
  • the following steps are also implemented: represent the words in the training data as word vectors, randomly represent a byte fragment of any sentence in the training data as a target vector, calculate the similarity between the target vector and the word vectors, find a synonym vector of the target vector based on the similarity, and replace the byte fragment with the word corresponding to the synonym vector to obtain the expanded training data.
  • the following steps are also implemented: based on the training data, generate new training data with a pre-trained generative model to obtain the expanded training data, the pre-trained generative model being trained on historical sentence data.
  • the following steps are also implemented: randomly remove a byte fragment from any sentence in the training data to obtain a target sentence, and, for the removed byte fragment in the target sentence, predict corresponding new characters with the pre-trained generative model to obtain the expanded training data.
  • the following steps are also implemented: feed the feedback data of the current time back into the preset hybrid strategy search model as reward data, update the parameters of the preset hybrid strategy search model, and generate a new data augmentation hybrid strategy based on the hybrid strategy search model with updated parameters.
  • the following steps are also implemented: update the parameters of the preset hybrid strategy search model according to the REINFORCE strategy gradient algorithm.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Machine Translation (AREA)

Abstract

This application relates to the field of artificial intelligence and provides a data augmentation hybrid strategy generation method, apparatus, and computer device. The method includes: obtaining the strategy feedback data and training data of the current time; inputting the strategy feedback data of the current time into a preset hybrid strategy search model to obtain a data augmentation hybrid strategy; augmenting the training data according to the data augmentation hybrid strategy to obtain augmented training data; inputting the augmented training data into a preset recurrent neural network for training to obtain the strategy feedback data corresponding to the data augmentation hybrid strategy; taking that strategy feedback data as the strategy feedback data of the current time and returning to the step of inputting the strategy feedback data of the current time into the preset hybrid strategy search model, until the number of training rounds of the preset hybrid strategy search model reaches a preset number, yielding the optimal data augmentation hybrid strategy. The method improves data augmentation efficiency.

Description

Data augmentation hybrid strategy generation method, apparatus and computer device
This application claims priority to the Chinese patent application filed with the China Patent Office on July 16, 2020, with application number 202010686538.8 and entitled "数据扩充混合策略生成方法、装置和计算机设备" (Data augmentation hybrid strategy generation method, apparatus and computer device), the entire content of which is incorporated herein by reference.
Technical Field
This application relates to the field of artificial intelligence, and in particular to a data augmentation hybrid strategy generation method, apparatus, computer device, and storage medium.
Background
With the continuous development of artificial intelligence, deep learning and machine learning have seen rapid growth. Deep learning algorithms such as neural network models require large amounts of training data to ensure the generalization ability of the model. Data augmentation (data expansion) is a common data processing technique in machine learning and deep learning: it derives more data from a limited dataset, increasing the number and diversity (noise data) of training samples and improving model robustness. In natural language processing tasks, common data augmentation methods include synonym replacement and back-translation.
The inventors realized that, at present, collecting annotated data for natural language processing tasks requires considerable manual labor, the collected data is limited, and data augmentation hybrid strategies are usually designed by hand. Such a strategy often fits the dataset poorly, or the augmentation volume is too large, causing the trained model to overfit, so natural language data augmentation remains inefficient.
Technical Problem
Accordingly, in view of the above technical problem, it is necessary to provide a data augmentation hybrid strategy generation method, apparatus, computer device, and storage medium that can improve the efficiency of natural language data augmentation.
Technical Solution
A data augmentation hybrid strategy generation method, the method including:
obtaining the strategy feedback data and training data of the current time;
inputting the strategy feedback data of the current time into a preset hybrid strategy search model to obtain the data augmentation hybrid strategy of the current time;
augmenting the training data according to the data augmentation hybrid strategy to obtain augmented training data;
inputting the augmented training data into a preset recurrent neural network for training to obtain the strategy feedback data corresponding to the data augmentation hybrid strategy;
taking the strategy feedback data corresponding to the data augmentation hybrid strategy as the strategy feedback data of the current time, and returning to the step of inputting the strategy feedback data of the current time into the preset hybrid strategy search model to update the data augmentation hybrid strategy, until the number of training rounds of the preset hybrid strategy search model reaches a preset number, yielding the optimal data augmentation hybrid strategy.
在其中一个实施例中,根据数据扩充混合策略扩充训练数据,得到扩充后的训练数据包括:
使用已训练的MLM模型将训练数据中句子中的任一字符替换为掩码字符;
根据预训练的语言模型,预测掩码字符所对应的字符,得到预测字符;
若预测字符的置信度大于预设阈值,则将包含预测字符的训练数据作为扩充后的训练数据。
在其中一个实施例中,根据数据扩充混合策略扩充训练数据,得到扩充后的训练数据包括:
将训练数据中的词语表示为词向量;
随机将训练数据中任一句子的字节片段表示为目标向量;
计算目标向量与词向量的相似度、并基于相似度查找出目标向量的同义词向量;
将字节片段替换为同义词向量对应的词语,得到扩充后的训练数据。
在其中一个实施例中,根据数据扩充混合策略扩充训练数据,得到扩充后的训练数据包括:
基于训练数据,使用预训练的生成模型生成新的训练数据,得到扩充后的训练数据,预训练的生成模型基于历史句子数据训练得到。
在其中一个实施例中,基于训练数据,使用预训练的生成模型生成新的训练数据,得到扩充后的训练数据包括:
随机去除训练数据中任一句子的字节片段,得到目标句子;
针对目标句子中去除的字节片段,采用预训练的生成模型预测出对应的新字符,得到扩充后的训练数据。
在其中一个实施例中,将当前时间的策略反馈数据输入至预设混合策略搜索模型的步骤,以更新数据扩充混合策略包括:
将当前时间的反馈数据作为回报数据再次输入至预设混合策略搜索模型,更新预设混合策略搜索模型的参数;
基于参数更新后的混合策略搜索模型,生成新的数据扩充混合策略。
在其中一个实施例中,更新预设混合策略搜索模型的参数包括:
根据REINFORCE策略梯度算法,更新预设混合策略搜索模型的参数。
一种数据扩充混合策略生成装置,装置包括:
数据获取模块,用于获取当前时间的策略反馈数据和训练数据;
混合策略获取模块,用于将当前时间的策略反馈数据输入至预设混合策略搜索模型,得到当前时间的数据扩充混合策略;
数据扩充模块,用于根据数据扩充混合策略扩充训练数据,得到扩充后的训练数据;
策略反馈数据更新模块,用于将扩充后的训练数据输入至预设循环神经网络进行训练,得到数据扩充混合策略对应的策略反馈数据;
混合策略更新模块,用于将数据扩充混合策略对应的策略反馈数据作为当前时间的策略反馈数据,唤醒混合策略获取模块执行将当前时间的策略反馈数据输入至预设混合策略搜索模型的操作,以更新数据扩充混合策略,直至预设混合策略搜索模型的训练次数达到预设训练次数,得到最优的数据扩充混合策略。
一种计算机设备,包括存储器和处理器,存储器存储有计算机程序,处理器执行计算机程序时实现以下步骤:
获取当前时间的策略反馈数据和训练数据;
将当前时间的策略反馈数据输入至预设混合策略搜索模型,得到当前时间的数据扩充混合策略;
根据数据扩充混合策略扩充训练数据,得到扩充后的训练数据;
将扩充后的训练数据输入至预设循环神经网络进行训练,得到数据扩充混合策略对应的策略反馈数据;
将数据扩充混合策略对应的策略反馈数据作为当前时间的策略反馈数据,返回将当前时间的策略反馈数据输入至预设混合策略搜索模型的步骤,以更新数据扩充混合策略,直至预设混合策略搜索模型的训练次数达到预设训练次数,得到最优的数据扩充混合策略。
一种计算机可读存储介质,其上存储有计算机程序,计算机程序被处理器执行时实现以下步骤:
获取当前时间的策略反馈数据和训练数据;
将当前时间的策略反馈数据输入至预设混合策略搜索模型,得到当前时间的数据扩充混合策略;
根据数据扩充混合策略扩充训练数据,得到扩充后的训练数据;
将扩充后的训练数据输入至预设循环神经网络进行训练,得到数据扩充混合策略对应的策略反馈数据;
将数据扩充混合策略对应的策略反馈数据作为当前时间的策略反馈数据,返回将当前时间的策略反馈数据输入至预设混合策略搜索模型的步骤,以更新数据扩充混合策略,直至预设混合策略搜索模型的训练次数达到预设训练次数,得到最优的数据扩充混合策略。
Beneficial Effects
In the above data augmentation hybrid strategy generation method, apparatus, computer device, and storage medium, the strategy feedback data is input into the preset hybrid strategy search model to generate an initial data augmentation hybrid strategy; the training data is augmented according to the generated strategy; the augmented training data is then input into the preset recurrent neural network to update the strategy feedback data; and the above steps are repeated, feeding the updated strategy feedback data back into the hybrid strategy search model to update its parameters until the model matures, finally yielding the optimal data augmentation hybrid strategy. This scheme reduces the time spent on strategy search, automatically constructs the optimal data augmentation hybrid strategy from the training data, improves model accuracy and robustness, and thus increases the efficiency of natural language data augmentation while saving labor and computing costs.
Brief Description of the Drawings
Fig. 1 is a diagram of the application environment of the data augmentation hybrid strategy generation method in one embodiment;
Fig. 2 is a schematic flowchart of the data augmentation hybrid strategy generation method in one embodiment;
Fig. 3 is a schematic flowchart of the step of augmenting training data according to the data augmentation hybrid strategy in one embodiment;
Fig. 4 is a schematic flowchart of another step of augmenting training data according to the data augmentation hybrid strategy;
Fig. 5 is a structural block diagram of the data augmentation hybrid strategy generation apparatus in one embodiment;
Fig. 6 is a diagram of the internal structure of a computer device in one embodiment.
Best Mode for Carrying Out the Invention
The data augmentation hybrid strategy generation method provided by this application can be applied in the application environment shown in Fig. 1, in which a terminal 102 communicates with a server 104 over a network. Specifically, a user uploads training data constructed from natural language data, together with strategy feedback data, to the server 104 through the terminal 102, performs the corresponding operation on the terminal's interface, and sends a data augmentation hybrid strategy generation message to the server 104. In response, the server 104 obtains the strategy feedback data and training data of the current time, inputs the strategy feedback data of the current time into a preset hybrid strategy search model to obtain the data augmentation hybrid strategy of the current time, augments the training data according to the strategy to obtain augmented training data, inputs the augmented training data into a preset recurrent neural network for training to obtain the strategy feedback data corresponding to the strategy, takes that feedback data as the strategy feedback data of the current time, and returns to the step of inputting the strategy feedback data of the current time into the preset hybrid strategy search model to update the data augmentation hybrid strategy, until the number of training rounds of the preset hybrid strategy search model reaches the preset number, yielding the optimal data augmentation hybrid strategy. The terminal 102 may be, but is not limited to, a personal computer, notebook computer, smartphone, tablet, or portable wearable device, and the server 104 may be implemented as an independent server or a cluster of servers.
In one embodiment, as shown in Fig. 2, a data augmentation hybrid strategy generation method is provided. Taking its application to the server in Fig. 1 as an example, the method includes the following steps:
Step 202: obtain the strategy feedback data and training data of the current time.
Data augmentation strategies play an important role in enlarging the training sample set, improving model stability and robustness, and enhancing the model's adaptability and generalization to the real world. In the data preparation stage, a training set and a development set (i.e., validation set) are prepared. This application searches for the optimal data augmentation hybrid strategy by running augmentation strategy search on the training data and measuring performance on the validation set through a feedback mechanism. At the start of the algorithm, the strategy feedback data of the current time is pre-collected initial feedback data for the data augmentation hybrid strategy, i.e., feedback obtained from the preset recurrent neural network's performance on the development set under a historical data augmentation hybrid strategy. The training data is the data to be augmented and may be of different types. In this embodiment, for each type of training data a corresponding recurrent neural network for training that data is preset; if a certain type of training data is selected, the network for that type is selected accordingly. For example, if the training data is a classification task dataset, the network trained on it may be Text-CNN, a model for data classification, whose network parameters are shared throughout the hybrid strategy search process.
Step 204: input the strategy feedback data of the current time into the preset hybrid strategy search model to obtain the data augmentation hybrid strategy of the current time.
In this embodiment, the hybrid strategy search model is a controller composed of a recurrent neural network. The model is deployed with a set of predefined data augmentation sub-strategies. After the strategy feedback data of the current time is obtained, it is fed to the hybrid strategy search model as input; at each step, the network's hidden state is passed to a classifier that decides one parameter of the hybrid strategy. The controller is randomly initialized and randomly generates a data augmentation hybrid strategy for the current time.
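The controller step described above can be sketched as follows. This is a minimal numpy-only stand-in, not the patent's implementation: the class name `PolicyController`, the weight shapes, and the single-scalar feedback input are all illustrative assumptions. At each step a recurrent hidden state is passed through a classifier head, and one parameter of the hybrid policy is sampled.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

class PolicyController:
    """Toy stand-in for the RNN controller described in the text.

    At each step the hidden state is fed to a classifier head that
    decides one parameter of the hybrid augmentation policy."""

    def __init__(self, n_params, n_choices, hidden=16, seed=0):
        rng = np.random.default_rng(seed)   # random initialization, as in the text
        self.Wh = rng.normal(0, 0.1, (hidden, hidden))
        self.Wx = rng.normal(0, 0.1, (hidden, 1))
        self.Wo = rng.normal(0, 0.1, (n_choices, hidden))
        self.n_params = n_params

    def sample_policy(self, feedback, rng):
        h = np.zeros(self.Wh.shape[0])
        x = np.array([feedback])            # strategy feedback data as input
        policy = []
        for _ in range(self.n_params):
            h = np.tanh(self.Wh @ h + (self.Wx @ x).ravel())
            probs = softmax(self.Wo @ h)    # classifier over choices for this parameter
            policy.append(int(rng.choice(len(probs), p=probs)))
        return policy
```

A randomly initialized controller like this emits a random hybrid policy on its first call, matching the description; training (below) is what makes its samples improve.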
Step 206: augment the training data according to the data augmentation hybrid strategy to obtain augmented training data.
After the data augmentation hybrid strategy is obtained, the training data is augmented according to it, yielding augmented training data and thereby updating the training set. Specifically, the hybrid strategy may be any combination of augmentation techniques such as data translation, generating new sentences with a generative model, semantically enhanced synonym replacement, and predicted-character substitution.
Step 208: input the augmented training data into the preset recurrent neural network for training to obtain the strategy feedback data corresponding to the data augmentation hybrid strategy.
In this embodiment, taking a classification task dataset as the training data, the corresponding recurrent neural network may be a classification network, specifically a Text-CNN network. After the data is augmented with the hybrid strategy, the augmented data serves as new training data and is fed into the corresponding Text-CNN network, which performs classification training on it. The resulting model's performance on the development set is then compared to obtain feedback data: concretely, the model predicts labels for the development-set data, the predictions are compared with the reference answers, and a score based on accuracy becomes the strategy feedback data.
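The scoring step above (predict dev-set labels, compare with the reference answers, score by accuracy) reduces to a one-line accuracy computation; the helper name `policy_feedback` is illustrative:

```python
def policy_feedback(pred_labels, gold_labels):
    """Score a candidate augmentation policy by development-set accuracy:
    compare the model's predicted labels with the reference answers."""
    assert len(pred_labels) == len(gold_labels)
    correct = sum(p == g for p, g in zip(pred_labels, gold_labels))
    return correct / len(gold_labels)
```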
Step 210: take the strategy feedback data corresponding to the data augmentation hybrid strategy as the strategy feedback data of the current time, return to step 204 to update the data augmentation hybrid strategy, until the number of training rounds of the preset hybrid strategy search model reaches the preset number, yielding the optimal data augmentation hybrid strategy.
To select the optimal data augmentation hybrid strategy, after the feedback data for a strategy is obtained it can be fed back into the hybrid strategy search model as the reward for the current strategy, updating the model's parameters and thereby updating the strategy. The training data is then augmented with the newly generated strategy, the augmented data is fed into the Text-CNN network again to obtain new strategy feedback data, and that new feedback data is input into the hybrid strategy search model once more. These steps are repeated until the number of training rounds reaches the preset number, at which point training stops and, among the feedback data (accuracies) obtained in all rounds, the strategy with the highest accuracy is selected as the optimal data augmentation hybrid strategy.
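The sample-augment-train-score loop above can be sketched as a generic search driver. This is a schematic, not the patent's code: `sample_policy` and `evaluate` are caller-supplied stand-ins for the controller and for "train on the augmented data, then score on the development set".

```python
import random

def search_best_policy(sample_policy, evaluate, n_rounds=50, seed=0):
    """Repeatedly sample a hybrid policy, measure its dev-set feedback,
    and keep the highest-scoring policy once the round budget runs out."""
    rng = random.Random(seed)
    best_policy, best_reward = None, float("-inf")
    reward = 0.0                      # initial feedback before round 1
    for _ in range(n_rounds):
        policy = sample_policy(reward, rng)
        reward = evaluate(policy)     # train on augmented data, score dev set
        if reward > best_reward:
            best_policy, best_reward = policy, reward
    return best_policy, best_reward
```

In the patent's setting `evaluate` also feeds the reward back to update the controller parameters; that update step is sketched separately below the REINFORCE discussion.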
In the above data augmentation hybrid strategy generation method, strategy feedback data is input into the preset hybrid strategy search model to generate a data augmentation hybrid strategy; the training data is augmented according to the generated strategy; the augmented training data is then input into the preset recurrent neural network to update the strategy feedback data; and the cycle repeats, feeding the updated feedback into the hybrid strategy search model to update its parameters until the model matures, yielding the optimal data augmentation hybrid strategy. This reduces strategy search time, automatically constructs the optimal data augmentation hybrid strategy from the training data, improves model accuracy and robustness, increases the efficiency of natural language data augmentation, and saves labor and computing costs.
As shown in Fig. 3, in one embodiment, the hybrid strategy search model is deployed with multiple data augmentation sub-strategies;
augmenting the training data according to the data augmentation hybrid strategy to obtain augmented training data includes:
Step 226: use a trained MLM model to replace any character of a sentence in the training data with a mask character;
Step 246: predict the character corresponding to the mask character according to a pretrained language model to obtain a predicted character;
Step 266: if the confidence of the predicted character is greater than a preset threshold, take the training data containing the predicted character as augmented training data.
In a specific implementation, there are multiple data augmentation sub-strategies, and the generated hybrid strategy is a combination of several of them. In particular, the sub-strategies include a strategy of augmenting sentence data with an MLM model. Specifically, a trained MLM (Masked Language Model) replaces a character in a sentence of the training data with the "[MASK]" token (mask character), and a pretrained language model predicts what character should fill the removed position, yielding a predicted character; if the confidence of that character is greater than 0.85, the new sentence containing the predicted character is added, producing augmented training data. Concretely, the new character is predicted with the language model's pretrained LMHead parameters, which give the probability that the character at a "[MASK]" position is each word in the vocabulary; the character with the highest probability is taken as the prediction. In this embodiment, performing character substitution with an MLM model, as opposed to traditional augmentation methods, quickly generates new sentences and so augments the training data.
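The masking step above can be sketched as follows. A real implementation would query a pretrained masked language model (for example a fill-mask pipeline); here `fill_mask` is a hypothetical stand-in callable returning the top prediction and its confidence, and the 0.85 threshold follows the description.

```python
def mlm_augment(sentence, position, fill_mask, threshold=0.85):
    """Replace one character with [MASK], ask a masked LM for the most
    likely character, and keep the new sentence only when the model's
    confidence exceeds the threshold."""
    chars = list(sentence)
    chars[position] = "[MASK]"
    masked = "".join(chars)
    predicted, confidence = fill_mask(masked)
    if confidence <= threshold:
        return None                    # discard low-confidence predictions
    chars[position] = predicted
    return "".join(chars)
```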
As shown in Fig. 4, in one embodiment, augmenting the training data according to the data augmentation hybrid strategy to obtain augmented training data includes:
Step 216: represent the words in the training data as word vectors;
Step 236: randomly represent a byte fragment of any sentence in the training data as a target vector;
Step 256: compute the similarity between the target vector and the word vectors, and find a synonym vector of the target vector based on the similarity;
Step 276: replace the byte fragment with the word corresponding to the synonym vector to obtain augmented training data.
In a specific implementation, the data augmentation sub-strategies also include a semantically enhanced synonym replacement strategy, which may augment the training data as follows. First, a pretrained model is fine-tuned on the task of judging whether two phrases (words) are synonyms. All words in a preset knowledge base are then represented as word vectors by the pretrained model. Next, an n-gram is randomly selected from a sentence and represented as a vector with the pretrained model, yielding the target vector, which is used to search the preset knowledge base: the similarity between the target vector and the knowledge base's word vectors is computed to check whether a similar (synonymous) word vector exists. In this embodiment, if the vector similarity is above 0.95 the two are treated as synonyms, and the word corresponding to that vector in the knowledge base can replace the selected n-gram in the sentence, forming a new sentence. It is understood that in other embodiments the similarity threshold may also be 0.96, 0.97, or another value, depending on the actual situation, which is not limited here. In this way, the sentences in the training data are augmented, yielding richer augmented training data.
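The retrieval step above, comparing the target n-gram vector with every word vector in the knowledge base and accepting only matches above the 0.95 threshold, can be sketched with cosine similarity. The vocabulary, vectors, and helper names are illustrative; in the description the vectors would come from the fine-tuned pretrained model.

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def find_synonym(target_vec, vocab_vecs, threshold=0.95):
    """Return the knowledge-base word most similar to the target n-gram
    vector, but only if its similarity clears the threshold; else None."""
    best_word, best_sim = None, threshold
    for word, vec in vocab_vecs.items():
        sim = cosine(target_vec, vec)
        if sim > best_sim:
            best_word, best_sim = word, sim
    return best_word
```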
In one embodiment, augmenting the training data according to the data augmentation hybrid strategy to obtain augmented training data includes:
based on the training data, generating new training data with a pretrained generative model to obtain augmented training data, the pretrained generative model being trained on historical sentence data.
In practical applications, the data augmentation sub-strategies may also include a strategy of generating new sentences from the sentences in the training data with a generative model. In another embodiment, this may be: randomly remove a byte fragment from any sentence in the training data to obtain a target sentence, and, for the removed byte fragment, let the pretrained generative model predict corresponding new characters, obtaining augmented training data. Specifically, a 3-gram (byte fragment) in the second half of a sentence in the training data may first be randomly selected and removed, after which the pretrained generative model predicts 3 new characters for the removed 3-gram, composing a new sentence. In this way, the sentences in the training data are augmented with newly generated sentences, quickly expanding the training data.
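The removal-and-infill step above can be sketched as follows. `generate` is a hypothetical stand-in for the pretrained generative model: given the surviving prefix and suffix, it produces `n` new characters for the gap. Restricting the removed n-gram to the second half of the sentence follows the description.

```python
import random

def generative_augment(sentence, generate, n=3, seed=0):
    """Drop a random n-gram from the second half of a sentence and let a
    (stand-in) generative model fill the gap with n new characters."""
    rng = random.Random(seed)
    half = len(sentence) // 2
    if len(sentence) - half < n:
        return sentence                 # too short to remove an n-gram
    start = rng.randrange(half, len(sentence) - n + 1)
    prefix, suffix = sentence[:start], sentence[start + n:]
    new_chars = generate(prefix, suffix, n)
    return prefix + new_chars + suffix
```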
In one embodiment, the step of inputting the strategy feedback data of the current time into the preset hybrid strategy search model to update the data augmentation hybrid strategy includes:
feeding the feedback data of the current time back into the preset hybrid strategy search model as reward data and updating the parameters of the preset hybrid strategy search model;
generating a new data augmentation hybrid strategy based on the hybrid strategy search model with updated parameters.
In a specific implementation, the feedback data of the current time may be obtained as follows: after augmented data is produced by applying the historical data augmentation hybrid strategy, the model is trained for one round on the augmented data to obtain training results (label data); the obtained labels are compared with the known reference labels to obtain an accuracy, and a score based on this accuracy serves as the reward for that strategy, i.e., the feedback data, which is fed back into the preset hybrid strategy search model to update the network's parameters so that the network generates a new data augmentation hybrid strategy. In one embodiment, updating the parameters of the preset hybrid strategy search model includes: updating them according to the REINFORCE policy gradient algorithm. Specifically, the parameter update may be as follows. Let the parameters of the preset hybrid strategy search model be the vector θ and the policy be π(θ); the expected return it obtains is R = E[π(θ)·r | θ], where r is the feedback data of the current time (i.e., the reward), so the gradient of the expected return with respect to the parameters is E[∇_θ π(θ)·r | θ], with ∇_θ π(θ) the gradient of π(θ). In practice, ∇_{θ_i} π(θ_i)·r_i is used as an approximate estimate, so the parameter update is θ_{i+1} = θ_i − ∇_{θ_i} π(θ_i)·r_i. As its parameters are updated, the preset hybrid strategy search model gradually matures and generates better data augmentation hybrid strategies. After 50-80 epochs of this loop, training is largely complete; the number of rounds is adjusted to the available resources. Generally, one pass over the training set is called an epoch, and fully training a model takes around 100 epochs. An algorithm engineer typically spends about 200 epochs hand-picking a data augmentation strategy, yet such a strategy is usually far from optimal, with roughly 3-4 points of accuracy still to be gained. A hybrid data augmentation strategy has more than 1e+5 possible choices, and naively training all of them would require 1e+5 × 100 epochs. In this embodiment, by updating the model parameters with the above REINFORCE policy gradient algorithm, the hybrid strategy search model is well trained after about 50 rounds, and the trained model then generates the optimal data augmentation hybrid strategy. In this way, an optimized data augmentation hybrid strategy that significantly improves model accuracy is obtained in less time than training an ordinary neural network once. Updating parameters with the REINFORCE policy gradient algorithm fits the data better and trains the policy quickly.
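The update formula above reduces to one vector operation. This sketch mirrors the formula as written, θ_{i+1} = θ_i − ∇_{θ_i}π(θ_i)·r_i; the learning rate is an added assumption, and the sign convention follows the text (which treats the gradient as that of a loss rather than the usual gradient-ascent form of REINFORCE).

```python
import numpy as np

def reinforce_update(theta, grad, reward, lr=1.0):
    """One parameter update matching the formula in the description:
    theta_{i+1} = theta_i - lr * grad_pi(theta_i) * r_i."""
    return theta - lr * grad * reward
```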
In practical applications, the data augmentation sub-strategies are not limited to the three listed above. Denote the three classes of sub-strategies as s_0, s_1, s_2. For each strategy s_i, the probability of applying it to each piece of data is p_i (0 <= p_i <= 5), meaning the original training data is augmented by a factor of p_i under that strategy; when p_i < 1, only a random portion of the data is augmented. Its impact degree d_i (0 < d_i <= 5) specifies how many characters in a piece of data the strategy affects; for example, d_0 = 2 means randomly removing two characters from a sentence. Hence the two values p_i and d_i must be determined. For convenience, p_i can be discretized into 10 equally spaced numbers between 0 and 5. On this basis, a hybrid data augmentation strategy containing the three classes of strategies above has (10 × 5)^3 ≈ 1e+5 choices, allowing the model to produce the matching optimal data augmentation strategy for different data task sets.
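The search space described above, with a discretized multiple p_i (10 values) and an impact degree d_i (5 values) for each of three sub-strategies, can be enumerated directly; (10 × 5)^3 works out to 125,000, the roughly 1e+5 figure quoted. The helper name is illustrative.

```python
from itertools import product

def hybrid_policy_space(n_strategies=3, p_steps=10, d_steps=5):
    """Enumerate the hybrid-policy search space: each sub-strategy s_i
    gets a discretized augmentation multiple p_i (10 equally spaced
    values in [0, 5]) and an impact degree d_i (1..5), giving
    (p_steps * d_steps) ** n_strategies combinations."""
    p_values = [round(5 * k / (p_steps - 1), 2) for k in range(p_steps)]
    d_values = list(range(1, d_steps + 1))
    per_strategy = list(product(p_values, d_values))   # 50 (p, d) pairs
    return list(product(per_strategy, repeat=n_strategies))
```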
It should be understood that although the steps in the flowcharts of Figs. 2-4 are shown in sequence as indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, there is no strict ordering constraint on these steps, and they may be executed in other orders. Moreover, at least some of the steps in Figs. 2-4 may comprise multiple sub-steps or stages, which need not be completed at the same moment but may be executed at different times; their execution order likewise need not be sequential, and they may be performed in turn or alternately with other steps or with sub-steps or stages of other steps.
在其中一个实施例中,如图5所示,提供了一种数据扩充策略生成装置,包括:数据获取模块510、混合策略获取模块520、数据扩充模块530、策略反馈数据更新模块540和混合策略更新模块550,其中:
数据获取模块510,用于获取当前时间的策略反馈数据和训练数据。
混合策略获取模块520,用于将当前时间的策略反馈数据输入至预设混合策略搜索模型,得到当前时间的数据扩充混合策略。
数据扩充模块530,用于根据数据扩充混合策略扩充训练数据,得到扩充后的训练数据。
策略反馈数据更新模块540,用于将扩充后的训练数据输入至预设循环神经网络进行训练,得到数据扩充混合策略对应的策略反馈数据。
混合策略更新模块550,用于将数据扩充混合策略对应的策略反馈数据作为当前时间的策略反馈数据,唤醒混合策略获取模块执行将当前时间的策略反馈数据输入至预设混合策略搜索模型的操作,以更新数据扩充混合策略,直至预设混合策略搜索模型的训练次数达到预设训练次数,得到最优的数据扩充混合策略。
在其中一个实施例中,数据扩充模块530还用于使用已训练的MLM模型将训练数据中句子中的任一字符替换为掩码字符,根据预训练的语言模型,预测掩码字符所对应的字符,得到预测字符,若预测字符的置信度大于预设阈值,则将包含预测字符的训练数据作为扩充后的训练数据。
在其中一个实施例中,数据扩充模块530还用于将训练数据中的词语表示为词向量,随机将训练数据中任一句子的字节片段表示为目标向量,计算目标向量与词向量的相似度、并基于相似度查找出目标向量的同义词向量,将字节片段替换为同义词向量对应的词语,得到扩充后的训练数据。
在其中一个实施例中,数据扩充模块530还用于基于训练数据,使用预训练的生成模型生成新的训练数据,得到扩充后的训练数据,预训练的生成模型基于历史句子数据训练得到。
在其中一个实施例中,数据扩充模块530还用于随机去除训练数据中任一句子的字节片段,得到目标句子,针对目标句子中去除的字节片段,采用预训练的生成模型预测出对应的新字符,得到扩充后的训练数据。
在其中一个实施例中,混合策略更新模块550还用于将当前时间的反馈数据作为回报数据再次输入至预设混合策略搜索模型,更新预设混合策略搜索模型的参数;基于参数更新后的混合策略搜索模型,生成新的数据扩充混合策略。
在其中一个实施例中,混合策略更新模块550还用于根据REINFORCE策略梯度算法,更新预设混合策略搜索模型的参数。
关于数据扩充混合策略生成装置的具体限定可以参见上文中对于数据扩充混合策略生成方法的限定,在此不再赘述。上述数据扩充混合策略生成装置中的各个模块可全部或部分通过软件、硬件及其组合来实现。上述各模块可以硬件形式内嵌于或独立于计算机设备中的处理器中,也可以以软件形式存储于计算机设备中的存储器中,以便于处理器调用执行以上各个模块对应的操作。
在其中一个实施例中,提供了一种计算机设备,该计算机设备可以是服务器,其内部结构图可以如图6所示。该计算机设备包括通过系统总线连接的处理器、存储器和网络接口。其中,该计算机设备的处理器用于提供计算和控制能力。该计算机设备的存储器包括非易失性存储介质、内存储器。该非易失性存储介质存储有操作系统、计算机程序和数据库。该内存储器为非易失性存储介质中的操作系统和计算机程序的运行提供环境。该计算机设备的数据库用于存储反馈数据、训练数据以及混合策略搜索模型等。该计算机设备的网络接口用于与外部的终端通过网络连接通信。该计算机程序被处理器执行时以实现一种数据扩充混合策略生成方法。
本领域技术人员可以理解,图6中示出的结构,仅仅是与本申请方案相关的部分结构的框图,并不构成对本申请方案所应用于其上的计算机设备的限定,具体的计算机设备可以包括比图中所示更多或更少的部件,或者组合某些部件,或者具有不同的部件布置。
在其中一个实施例中,提供了一种计算机设备,包括存储器和处理器,存储器中存储有计算机程序,该处理器执行计算机程序时实现以下步骤:获取当前时间的策略反馈数据和训练数据,将当前时间的策略反馈数据输入至预设混合策略搜索模型,得到当前时间的数据扩充混合策略,根据数据扩充混合策略扩充训练数据,得到扩充后的训练数据,将扩充后的训练数据输入至预设循环神经网络进行训练,得到数据扩充混合策略对应的策略反馈数据,将数据扩充混合策略对应的策略反馈数据作为当前时间的策略反馈数据,返回将当前时间的策略反馈数据输入至预设混合策略搜索模型的步骤,以更新数据扩充混合策略,直至预设混合策略搜索模型的训练次数达到预设训练次数,得到最优的数据扩充混合策略。
在其中一个实施例中,处理器执行计算机程序时还实现以下步骤:使用已训练的MLM模型将训练数据中句子中的任一字符替换为掩码字符,根据预训练的语言模型,预测掩码字符所对应的字符,得到预测字符,若预测字符的置信度大于预设阈值,则将包含预测字符的训练数据作为扩充后的训练数据。
在其中一个实施例中,处理器执行计算机程序时还实现以下步骤:将训练数据中的词语表示为词向量,随机将训练数据中任一句子的字节片段表示为目标向量,计算目标向量与词向量的相似度、并基于相似度查找出目标向量的同义词向量,将字节片段替换为同义词向量对应的词语,得到扩充后的训练数据。
在其中一个实施例中,处理器执行计算机程序时还实现以下步骤:基于训练数据,使用预训练的生成模型生成新的训练数据,得到扩充后的训练数据,预训练的生成模型基于历史句子数据训练得到。
在其中一个实施例中,处理器执行计算机程序时还实现以下步骤:随机去除训练数据中任一句子的字节片段,得到目标句子,针对目标句子中去除的字节片段,采用预训练的生成模型预测出对应的新字符,得到扩充后的训练数据。
在其中一个实施例中,处理器执行计算机程序时还实现以下步骤:将当前时间的反馈数据作为回报数据再次输入至预设混合策略搜索模型,更新预设混合策略搜索模型的参数,基于参数更新后的混合策略搜索模型,生成新的数据扩充混合策略。
在其中一个实施例中,处理器执行计算机程序时还实现以下步骤:根据REINFORCE策略梯度算法,更新预设混合策略搜索模型的参数。
在一个实施例中,提供了一种计算机可读存储介质,上述存储介质可以是非易失性存储介质,也可以是易失性存储介质。其上存储有计算机程序,计算机程序被处理器执行时实现以下步骤:获取当前时间的策略反馈数据和训练数据,将当前时间的策略反馈数据输入至预设混合策略搜索模型,得到当前时间的数据扩充混合策略,根据数据扩充混合策略扩充训练数据,得到扩充后的训练数据,将扩充后的训练数据输入至预设循环神经网络进行训练,得到数据扩充混合策略对应的策略反馈数据,将数据扩充混合策略对应的策略反馈数据作为当前时间的策略反馈数据,返回将当前时间的策略反馈数据输入至预设混合策略搜索模型的步骤,以更新数据扩充混合策略,直至预设混合策略搜索模型的训练次数达到预设训练次数,得到最优的数据扩充混合策略。
在其中一个实施例中,计算机程序被处理器执行时还实现以下步骤:使用已训练的MLM模型将训练数据中句子中的任一字符替换为掩码字符,根据预训练的语言模型,预测掩码字符所对应的字符,得到预测字符,若预测字符的置信度大于预设阈值,则将包含预测字符的训练数据作为扩充后的训练数据。
在其中一个实施例中,计算机程序被处理器执行时还实现以下步骤:将训练数据中的词语表示为词向量,随机将训练数据中任一句子的字节片段表示为目标向量,计算目标向量与词向量的相似度、并基于相似度查找出目标向量的同义词向量,将字节片段替换为同义词向量对应的词语,得到扩充后的训练数据。
在其中一个实施例中,计算机程序被处理器执行时还实现以下步骤:基于训练数据,使用预训练的生成模型生成新的训练数据,得到扩充后的训练数据,预训练的生成模型基于历史句子数据训练得到。
在其中一个实施例中,计算机程序被处理器执行时还实现以下步骤:随机去除训练数据中任一句子的字节片段,得到目标句子,针对目标句子中去除的字节片段,采用预训练的生成模型预测出对应的新字符,得到扩充后的训练数据。
在其中一个实施例中,计算机程序被处理器执行时还实现以下步骤:将当前时间的反馈数据作为回报数据再次输入至预设混合策略搜索模型,更新预设混合策略搜索模型的参数,基于参数更新后的混合策略搜索模型,生成新的数据扩充混合策略。
在其中一个实施例中,计算机程序被处理器执行时还实现以下步骤:根据REINFORCE策略梯度算法,更新预设混合策略搜索模型的参数。

Claims (20)

  1. 一种数据扩充混合策略生成方法,其中,所述方法包括:
    获取当前时间的策略反馈数据和训练数据;
    将所述当前时间的策略反馈数据输入至预设混合策略搜索模型,得到当前时间的数据扩充混合策略;
    根据所述数据扩充混合策略扩充所述训练数据,得到扩充后的训练数据;
    将扩充后的训练数据输入至预设循环神经网络进行训练,得到数据扩充混合策略对应的策略反馈数据;
    将所述数据扩充混合策略对应的策略反馈数据作为当前时间的策略反馈数据,返回将当前时间的策略反馈数据输入至预设混合策略搜索模型的步骤,以更新数据扩充混合策略,直至所述预设混合策略搜索模型的训练次数达到预设训练次数,得到最优的数据扩充混合策略。
  2. 根据权利要求1所述的方法,其中,所述根据所述数据扩充混合策略扩充所述训练数据,得到扩充后的训练数据包括:
    使用已训练的MLM模型将所述训练数据中句子中的任一字符替换为掩码字符;
    根据预训练的语言模型,预测所述掩码字符所对应的字符,得到预测字符;
    若所述预测字符的置信度大于预设阈值,则将包含所述预测字符的训练数据作为扩充后的训练数据。
  3. 根据权利要求1所述的方法,其中,所述根据所述数据扩充混合策略扩充所述训练数据,得到扩充后的训练数据包括:
    将所述训练数据中的词语表示为词向量;
    随机将所述训练数据中任一句子的字节片段表示为目标向量;
    计算所述目标向量与所述词向量的相似度、并基于相似度查找出所述目标向量的同义词向量;
    将所述字节片段替换为所述同义词向量对应的词语,得到扩充后的训练数据。
  4. 根据权利要求1所述的方法,其中,所述根据所述数据扩充混合策略扩充所述训练数据,得到扩充后的训练数据包括:
    基于所述训练数据,使用预训练的生成模型生成新的训练数据,得到扩充后的训练数据,所述预训练的生成模型基于历史句子数据训练得到。
  5. 根据权利要求4所述的方法,其中,所述基于所述训练数据,使用预训练的生成模型生成新的训练数据,得到扩充后的训练数据包括:
    随机去除所述训练数据中任一句子的字节片段,得到目标句子;
    针对所述目标句子中去除的字节片段,采用预训练的生成模型预测出对应的新字符,得到扩充后的训练数据。
  6. 根据权利要求1所述的方法,其中,所述将当前时间的策略反馈数据输入至预设混合策略搜索模型的步骤,以更新数据扩充混合策略包括:
    将当前时间的反馈数据作为回报数据再次输入至预设混合策略搜索模型,更新预设混合策略搜索模型的参数;
    基于参数更新后的混合策略搜索模型,生成新的数据扩充混合策略。
  7. 根据权利要求6所述的方法,其中,所述更新预设混合策略搜索模型的参数包括:
    根据REINFORCE策略梯度算法,更新预设混合策略搜索模型的参数。
  8. 一种数据扩充混合策略生成装置,其中,所述装置包括:
    数据获取模块,用于获取当前时间的策略反馈数据和训练数据;
    混合策略获取模块,用于将所述当前时间的策略反馈数据输入至预设混合策略搜索模型,得到当前时间的数据扩充混合策略;
    数据扩充模块,用于根据所述数据扩充混合策略扩充所述训练数据,得到扩充后的训练数据;
    策略反馈数据更新模块,用于将扩充后的训练数据输入至预设循环神经网络进行训练,得到数据扩充混合策略对应的策略反馈数据;
    混合策略更新模块,用于将所述数据扩充混合策略对应的策略反馈数据作为当前时间的策略反馈数据,唤醒混合策略获取模块执行将当前时间的策略反馈数据输入至预设混合策略搜索模型的操作,以更新数据扩充混合策略,直至所述预设混合策略搜索模型的训练次数达到预设训练次数,得到最优的数据扩充混合策略。
  9. 一种计算机设备,包括存储器和处理器,所述存储器存储有计算机程序,其中,所述处理器执行所述计算机程序时实现数据扩充混合策略生成方法的步骤:
    获取当前时间的策略反馈数据和训练数据;
    将所述当前时间的策略反馈数据输入至预设混合策略搜索模型,得到当前时间的数据扩充混合策略;
    根据所述数据扩充混合策略扩充所述训练数据,得到扩充后的训练数据;
    将扩充后的训练数据输入至预设循环神经网络进行训练,得到数据扩充混合策略对应的策略反馈数据;
    将所述数据扩充混合策略对应的策略反馈数据作为当前时间的策略反馈数据,返回将当前时间的策略反馈数据输入至预设混合策略搜索模型的步骤,以更新数据扩充混合策略,直至所述预设混合策略搜索模型的训练次数达到预设训练次数,得到最优的数据扩充混合策略。
  10. 根据权利要求9所述的计算机设备,其中,所述根据所述数据扩充混合策略扩充所述训练数据,得到扩充后的训练数据包括:
    使用已训练的MLM模型将所述训练数据中句子中的任一字符替换为掩码字符;
    根据预训练的语言模型,预测所述掩码字符所对应的字符,得到预测字符;
    若所述预测字符的置信度大于预设阈值,则将包含所述预测字符的训练数据作为扩充后的训练数据。
  11. 根据权利要求9所述的计算机设备,其中,所述根据所述数据扩充混合策略扩充所述训练数据,得到扩充后的训练数据包括:
    将所述训练数据中的词语表示为词向量;
    随机将所述训练数据中任一句子的字节片段表示为目标向量;
    计算所述目标向量与所述词向量的相似度、并基于相似度查找出所述目标向量的同义词向量;
    将所述字节片段替换为所述同义词向量对应的词语,得到扩充后的训练数据。
  12. 根据权利要求9所述的计算机设备,其中,所述根据所述数据扩充混合策略扩充所述训练数据,得到扩充后的训练数据包括:
    基于所述训练数据,使用预训练的生成模型生成新的训练数据,得到扩充后的训练数据,所述预训练的生成模型基于历史句子数据训练得到。
  13. 根据权利要求12所述的计算机设备,其中,所述基于所述训练数据,使用预训练的生成模型生成新的训练数据,得到扩充后的训练数据包括:
    随机去除所述训练数据中任一句子的字节片段,得到目标句子;
    针对所述目标句子中去除的字节片段,采用预训练的生成模型预测出对应的新字符,得到扩充后的训练数据。
  14. 根据权利要求9所述的计算机设备,其中,所述将当前时间的策略反馈数据输入至预设混合策略搜索模型的步骤,以更新数据扩充混合策略包括:
    将当前时间的反馈数据作为回报数据再次输入至预设混合策略搜索模型,更新预设混合策略搜索模型的参数;
    基于参数更新后的混合策略搜索模型,生成新的数据扩充混合策略。
  15. 根据权利要求14所述的计算机设备,其中,所述更新预设混合策略搜索模型的参数包括:
    根据REINFORCE策略梯度算法,更新预设混合策略搜索模型的参数。
  16. 一种计算机可读存储介质,其上存储有计算机程序,其中,所述计算机程序被处理器执行时实现数据扩充混合策略生成方法的步骤:
    获取当前时间的策略反馈数据和训练数据;
    将所述当前时间的策略反馈数据输入至预设混合策略搜索模型,得到当前时间的数据扩充混合策略;
    根据所述数据扩充混合策略扩充所述训练数据,得到扩充后的训练数据;
    将扩充后的训练数据输入至预设循环神经网络进行训练,得到数据扩充混合策略对应的策略反馈数据;
    将所述数据扩充混合策略对应的策略反馈数据作为当前时间的策略反馈数据,返回将当前时间的策略反馈数据输入至预设混合策略搜索模型的步骤,以更新数据扩充混合策略,直至所述预设混合策略搜索模型的训练次数达到预设训练次数,得到最优的数据扩充混合策略。
  17. 根据权利要求16所述的计算机可读存储介质，其中，所述根据所述数据扩充混合策略扩充所述训练数据，得到扩充后的训练数据包括：
    使用已训练的MLM模型将所述训练数据中句子中的任一字符替换为掩码字符;
    根据预训练的语言模型,预测所述掩码字符所对应的字符,得到预测字符;
    若所述预测字符的置信度大于预设阈值,则将包含所述预测字符的训练数据作为扩充后的训练数据。
  18. 根据权利要求16所述的计算机可读存储介质,其中,所述根据所述数据扩充混合策略扩充所述训练数据,得到扩充后的训练数据包括:
    将所述训练数据中的词语表示为词向量;
    随机将所述训练数据中任一句子的字节片段表示为目标向量;
    计算所述目标向量与所述词向量的相似度、并基于相似度查找出所述目标向量的同义词向量;
    将所述字节片段替换为所述同义词向量对应的词语,得到扩充后的训练数据。
  19. 根据权利要求16所述的计算机可读存储介质,其中,所述根据所述数据扩充混合策略扩充所述训练数据,得到扩充后的训练数据包括:
    基于所述训练数据,使用预训练的生成模型生成新的训练数据,得到扩充后的训练数据,所述预训练的生成模型基于历史句子数据训练得到。
  20. 根据权利要求19所述的计算机可读存储介质,其中,所述基于所述训练数据,使用预训练的生成模型生成新的训练数据,得到扩充后的训练数据包括:
    随机去除所述训练数据中任一句子的字节片段,得到目标句子;
    针对所述目标句子中去除的字节片段,采用预训练的生成模型预测出对应的新字符,得到扩充后的训练数据。
PCT/CN2020/118140 2020-07-16 2020-09-27 数据扩充混合策略生成方法、装置和计算机设备 WO2021139233A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010686538.8 2020-07-16
CN202010686538.8A CN111931492A (zh) 2020-07-16 2020-07-16 数据扩充混合策略生成方法、装置和计算机设备

Publications (1)

Publication Number Publication Date
WO2021139233A1 true WO2021139233A1 (zh) 2021-07-15

Family

ID=73313221

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/118140 WO2021139233A1 (zh) 2020-07-16 2020-09-27 数据扩充混合策略生成方法、装置和计算机设备

Country Status (2)

Country Link
CN (1) CN111931492A (zh)
WO (1) WO2021139233A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116992830A (zh) * 2022-06-17 2023-11-03 北京聆心智能科技有限公司 文本数据处理方法、相关装置及计算设备

Families Citing this family (2)

Publication number Priority date Publication date Assignee Title
CN113177402B (zh) * 2021-04-26 2024-03-01 平安科技(深圳)有限公司 词语替换方法、装置、电子设备和存储介质
CN113268996A (zh) * 2021-06-02 2021-08-17 网易有道信息技术(北京)有限公司 用于扩充语料的方法和用于翻译模型的训练方法及产品

Citations (3)

Publication number Priority date Publication date Assignee Title
CN110796248A (zh) * 2019-08-27 2020-02-14 腾讯科技(深圳)有限公司 数据增强的方法、装置、设备及存储介质
CN110807109A (zh) * 2019-11-08 2020-02-18 北京金山云网络技术有限公司 数据增强策略的生成方法、数据增强方法和装置
CN111127364A (zh) * 2019-12-26 2020-05-08 吉林大学 图像数据增强策略选择方法及人脸识别图像数据增强方法

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
US11488055B2 (en) * 2018-07-26 2022-11-01 International Business Machines Corporation Training corpus refinement and incremental updating
CN110852438B (zh) * 2019-11-11 2023-08-04 北京百度网讯科技有限公司 模型生成方法和装置

Patent Citations (3)

Publication number Priority date Publication date Assignee Title
CN110796248A (zh) * 2019-08-27 2020-02-14 腾讯科技(深圳)有限公司 数据增强的方法、装置、设备及存储介质
CN110807109A (zh) * 2019-11-08 2020-02-18 北京金山云网络技术有限公司 数据增强策略的生成方法、数据增强方法和装置
CN111127364A (zh) * 2019-12-26 2020-05-08 吉林大学 图像数据增强策略选择方法及人脸识别图像数据增强方法

Cited By (2)

Publication number Priority date Publication date Assignee Title
CN116992830A (zh) * 2022-06-17 2023-11-03 北京聆心智能科技有限公司 文本数据处理方法、相关装置及计算设备
CN116992830B (zh) * 2022-06-17 2024-03-26 北京聆心智能科技有限公司 文本数据处理方法、相关装置及计算设备

Also Published As

Publication number Publication date
CN111931492A (zh) 2020-11-13

Similar Documents

Publication Publication Date Title
WO2021093449A1 (zh) 基于人工智能的唤醒词检测方法、装置、设备及介质
WO2021047286A1 (zh) 文本处理模型的训练方法、文本处理方法及装置
CN112288075B (zh) 一种数据处理方法及相关设备
WO2021139233A1 (zh) 数据扩充混合策略生成方法、装置和计算机设备
CN104143327B (zh) 一种声学模型训练方法和装置
WO2021208612A1 (zh) 数据处理的方法与装置
CN111344779A (zh) 训练和/或使用编码器模型确定自然语言输入的响应动作
WO2020233380A1 (zh) 缺失语义补全方法及装置
CN111192692B (zh) 一种实体关系的确定方法、装置、电子设备及存储介质
US20220083868A1 (en) Neural network training method and apparatus, and electronic device
CN111951805A (zh) 一种文本数据处理方法及装置
WO2020151310A1 (zh) 文本生成方法、装置、计算机设备及介质
CN111563144A (zh) 基于语句前后关系预测的用户意图识别方法及装置
WO2018032765A1 (zh) 序列转换方法及装置
CN113641830B (zh) 模型预训练方法、装置、电子设备和存储介质
CN112052318A (zh) 一种语义识别方法、装置、计算机设备和存储介质
US20220383119A1 (en) Granular neural network architecture search over low-level primitives
CN113157919A (zh) 语句文本方面级情感分类方法及系统
CN111241820A (zh) 不良用语识别方法、装置、电子装置及存储介质
JP2019200756A (ja) 人工知能プログラミングサーバおよびそのプログラム
CN115879450B (zh) 一种逐步文本生成方法、系统、计算机设备及存储介质
WO2024011885A1 (zh) 语音唤醒方法、装置、电子设备以及存储介质
CN111026848A (zh) 一种基于相似上下文和强化学习的中文词向量生成方法
US20220284891A1 (en) Noisy student teacher training for robust keyword spotting
JP2022088586A (ja) 音声認識方法、音声認識装置、電子機器、記憶媒体コンピュータプログラム製品及びコンピュータプログラム

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20912583

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20912583

Country of ref document: EP

Kind code of ref document: A1