WO2021169366A1 - Data enhancement method and device - Google Patents

Data enhancement method and device (数据增强方法和装置)

Info

Publication number
WO2021169366A1
Authority
WO
WIPO (PCT)
Prior art keywords
vector
target
neural network
network model
performance index
Prior art date
Application number
PCT/CN2020/125338
Other languages
English (en)
French (fr)
Inventor
张新雨
袁鹏
钟钊
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. (华为技术有限公司)
Publication of WO2021169366A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N 3/063 Physical realisation using electronic means
    • G06N 3/08 Learning methods

Definitions

  • This application relates to the field of artificial intelligence, and in particular to a data enhancement method and device.
  • Artificial intelligence is a theory, method, technology, and application system that uses digital computers, or machines controlled by digital computers, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results.
  • Artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a way similar to human intelligence.
  • Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning, and decision-making.
  • Research in the field of artificial intelligence includes robotics, natural language processing, computer vision, decision-making and reasoning, human-computer interaction, recommendation and search, and basic AI theories.
  • Neural network models can support the processing of various types of data, such as images, text, speech, and sequences, to achieve classification, regression, and prediction. By improving the quality, diversity, and quantity of the training data, the performance of a neural network model can be effectively improved.
  • This application provides a data enhancement method and device to automatically determine a corresponding enhancement strategy based on training data.
  • an embodiment of the present application provides a data enhancement method, which may include: acquiring first training data, at least one first vector, and a performance index corresponding to each first vector, where each first vector is used to represent a set of first enhancement strategies, and the performance index corresponding to each first vector includes the performance index of a first neural network model; the first neural network model is trained on second training data, and the second training data is training data obtained by performing enhancement processing on the first training data using the first enhancement strategies. At least one second vector is then determined according to the at least one first vector and the performance index corresponding to the at least one first vector, and each second vector is used to represent a group of second enhancement strategies.
  • the performance index corresponding to each second vector includes the performance index of the second neural network model.
  • the second neural network model is trained on third training data, and the third training data is training data obtained after the first training data is enhanced by using the second enhancement strategy.
  • the performance index corresponding to the at least one target vector is higher than the performance indexes corresponding to the vectors, among the at least one first vector and the at least one second vector, other than the at least one target vector; each target vector represents a group of target enhancement strategies, the at least one group of target enhancement strategies represented by the at least one target vector is used to perform enhancement processing on the first training data to obtain target training data, and the target training data is used to train a target neural network model.
  • the at least one second vector is predicted based on the at least one first vector and its corresponding performance index, and a vector with a higher performance index is then selected in combination with the actual performance index of the at least one second vector. The actual performance index is the performance index actually measured for the neural network model obtained by applying the enhancement strategies represented by the at least one second vector to the first training data. This makes it possible to automatically determine a corresponding enhancement strategy based on the training data, so as to expand the training data and improve the performance of the target neural network model.
  • determining the at least one second vector according to the at least one first vector and the corresponding performance index may include: mapping the at least one first vector from a discrete parameter space to a continuous parameter space to obtain at least one third vector, and determining the at least one second vector according to the at least one third vector and the performance index corresponding to the at least one first vector.
  • At least one third vector is obtained by mapping the at least one first vector from the discrete parameter space to the continuous parameter space.
  • the sampling efficiency can be improved, that is, an enhancement strategy with a higher performance index can be obtained with a small amount of sampling, and resource consumption can be reduced, which reduces the TPU or CPU resources required in the preprocessing process.
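The discrete-to-continuous mapping described above can be sketched minimally. Here a fixed random linear map stands in for the learned encoder (the fifth neural network model discussed later), and all sizes, names, and the mean-pooling choice are illustrative assumptions, not the patent's actual design:

```python
import numpy as np

# Assumed sizes: 16 op types x 10 probability levels x 11 magnitude levels.
VOCAB = 16 * 10 * 11   # number of distinct discrete (op, prob, magnitude) triples
EMBED_DIM = 8          # dimensionality of the continuous parameter space

rng = np.random.default_rng(0)
W = rng.standard_normal((VOCAB, EMBED_DIM))   # stand-in for learned encoder weights

def op_index(op, prob, mag):
    # Flatten a discrete operation triple into a single vocabulary index.
    return (op * 10 + prob) * 11 + mag

def encode(strategy):
    # strategy: list of (op, prob, mag) triples representing a first vector.
    # Returns a continuous "third vector" (mean of per-operation embeddings,
    # one simple pooling choice among many).
    idx = [op_index(*op) for op in strategy]
    return W[idx].mean(axis=0)

z = encode([(3, 5, 7), (12, 9, 0)])   # continuous representation of a strategy
```

Any learned encoder with this input/output shape would fill the same role; the point is only that each discrete strategy becomes a point in a continuous space where optimization is easier.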
  • determining the at least one second vector according to the at least one third vector and the performance index corresponding to the at least one first vector may include: determining, according to the at least one third vector and the performance index corresponding to the at least one first vector, a mapping relationship between the third vectors and the performance indexes, and determining the at least one second vector according to the mapping relationship.
  • At least one second vector with a higher performance index is searched for and determined, which can improve the search efficiency of the enhancement strategies and reduce resource consumption.
  • the mapping relationship between the third vectors and the performance indexes corresponding to the at least one first vector may be determined by inputting the at least one third vector and the performance index corresponding to the at least one first vector into a third neural network model, which outputs the mapping relationship between the third vectors and the performance indexes.
  • the mapping relationship between the third vector and the performance index is determined through a neural network model, which can improve the accuracy of the mapping relationship, and is beneficial to search for the at least one second vector with a higher prediction performance index.
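To illustrate what a mapping relationship from continuous vectors to performance indexes looks like, the sketch below fits a plain least-squares model on synthetic (vector, performance) pairs. The patent's third neural network model would play this predictor role; the linear model, the toy data, and all names here are assumptions for illustration only:

```python
import numpy as np

# Synthetic stand-ins: 20 continuous "third vectors" of dimension 8 and their
# observed performance indexes (linear ground truth plus small noise).
rng = np.random.default_rng(1)
Z = rng.standard_normal((20, 8))
true_w = rng.standard_normal(8)
perf = Z @ true_w + 0.01 * rng.standard_normal(20)

# Fit the surrogate "mapping relationship" by least squares; a neural network
# would be fitted here instead in the described method.
w_hat, *_ = np.linalg.lstsq(Z, perf, rcond=None)

def predict_perf(z):
    # Predicted performance index for a continuous strategy vector.
    return float(z @ w_hat)
```

Once such a surrogate exists, candidate vectors can be scored without the expensive train-and-evaluate loop, which is what makes searching in the continuous space cheap.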
  • determining at least one second vector according to the mapping relationship may include: determining at least one fourth vector according to the mapping relationship. The at least one fourth vector is mapped from the continuous parameter space to the discrete parameter space to obtain the at least one second vector.
  • determining at least one fourth vector according to the mapping relationship includes: adopting a gradient update manner, and determining the at least one fourth vector in the mapping relationship.
  • an enhancement strategy with a higher performance index is determined in a continuous parameter space by a gradient update method, which can improve the search efficiency of the enhancement strategy and the sampling efficiency of the enhancement strategy.
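The gradient-update idea can be sketched as gradient ascent on a differentiable surrogate of the performance index. The toy concave surrogate below (with optimum `z_star`) stands in for the learned mapping relationship; every name and constant is an illustrative assumption:

```python
import numpy as np

# Toy differentiable surrogate of the performance index in the continuous
# space; its maximum at z_star stands in for a high-performance strategy.
z_star = np.array([0.5, -1.0, 2.0])

def surrogate(z):
    return -np.sum((z - z_star) ** 2)

def grad(z):
    return -2.0 * (z - z_star)

z = np.zeros(3)              # start from the embedding of some first vector
lr = 0.1
for _ in range(200):
    z = z + lr * grad(z)     # ascend the predicted performance index

# z is now a "fourth vector": a continuous point with high predicted index,
# to be mapped back to a discrete enhancement strategy afterwards.
```

With a learned surrogate the gradient would come from automatic differentiation rather than a closed form, but the loop structure is the same.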
  • mapping the at least one fourth vector from the continuous parameter space to the discrete parameter space to obtain the at least one second vector may include: inputting the at least one fourth vector into the The fourth neural network model outputs the at least one second vector, and the fourth neural network model is used to map each fourth vector from a continuous parameter space to a discrete parameter space.
  • the fourth vector is mapped from the continuous parameter space to the discrete parameter space through the neural network model, which can improve the efficiency and accuracy of the mapping.
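A minimal sketch of the continuous-to-discrete direction: nearest-neighbour search over operation embeddings stands in for the fourth neural network model, with a toy vocabulary and all sizes assumed for illustration:

```python
import numpy as np

# Toy vocabulary of 50 discrete operations embedded in a 4-dimensional
# continuous space; E stands in for learned embeddings.
rng = np.random.default_rng(0)
n_ops, dim = 50, 4
E = rng.standard_normal((n_ops, dim))

def decode(z):
    # Map a continuous "fourth vector" to the index of the closest discrete
    # operation, i.e. one component of a discrete "second vector".
    d = np.linalg.norm(E - z, axis=1)
    return int(np.argmin(d))

k = decode(E[7] + 0.01)   # decode a point near embedding 7
```

A trained decoder network could replace the nearest-neighbour step, but both serve the same purpose: snapping a continuous optimum back onto a valid discrete enhancement strategy.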
  • the method may further include: determining whether a preset condition is satisfied; if the preset condition is not satisfied, using the at least one second vector and the at least one first vector as the new at least one first vector, and repeating the step of obtaining at least one first vector and the performance index corresponding to each first vector.
  • mapping the at least one first vector from the discrete parameter space to the continuous parameter space to obtain the at least one third vector may include: respectively inputting the at least one first vector into a fifth neural network model, which outputs the at least one third vector; the fifth neural network model is used to map each first vector from the discrete parameter space to the continuous parameter space.
  • the first vectors are mapped from the discrete parameter space to the continuous parameter space through a neural network model, which can improve the efficiency and accuracy of the mapping, so as to accurately determine the mapping relationship between the third vectors and the performance indexes corresponding to the at least one first vector, which is in turn used to predict at least one second vector with a higher performance index based on the mapping relationship.
  • obtaining the at least one first vector may include: randomly sampling in the search space of the data enhancement strategy to obtain the at least one first vector.
  • the method may further include: sending neural network model configuration information to the testing device, where the neural network model configuration information is used to configure the first neural network model. Receiving the performance index of the first neural network model sent by the testing device.
  • the performance index of the first neural network model is fed back through the testing device.
  • the first neural network model is trained with the training data obtained by applying the first enhancement strategies represented by the first vectors, which is conducive to accurately modeling the search space of the enhancement strategies, then determining enhancement strategies with higher performance indexes, and applying those strategies to the preprocessing stage of the training process to expand the training data and improve the performance of the target neural network model.
  • the method may further include: using the at least one set of target enhancement strategies to perform enhancement processing on the first training data to obtain target training data.
  • Send target model configuration information, and the target model configuration information is used to configure the target neural network model.
  • the training data can be expanded and the performance of the target neural network model can be improved.
  • Configuring the target neural network model obtained by training to a corresponding model application device, such as a server or terminal device, can improve the processing performance of the model application device.
  • an embodiment of the present application provides a data enhancement device, which is configured to execute the data enhancement method in the first aspect or any possible design of the first aspect.
  • the data enhancement device may include a module for executing the data enhancement method in the first aspect or any possible design of the first aspect. For example, acquisition module, prediction module, enhancement strategy determination module, etc.
  • an embodiment of the present application provides an electronic device that includes a memory and a processor; the memory is used to store instructions, the processor is used to execute the instructions stored in the memory, and execution of the instructions stored in the memory causes the processor to execute the data enhancement method in the first aspect or any possible design of the first aspect.
  • an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored, and when the program is executed by a processor, the method in the first aspect or any possible design of the first aspect is implemented.
  • the present application provides a computer program product.
  • the computer program product includes instructions that, when run on a computer, cause the computer to execute the method described in any one of the above-mentioned first aspects.
  • the present application provides a chip including a processor and a memory, where the memory is used to store a computer program, and the processor is used to call and run the computer program stored in the memory to execute the method of any one of the above-mentioned first aspects.
  • the first training data, at least one first vector, and the performance index corresponding to each first vector are acquired; at least one second vector is determined according to the at least one first vector and the corresponding performance indexes; at least one target vector is determined based on the performance index corresponding to the at least one first vector and the performance index corresponding to the at least one second vector, where the performance index corresponding to the at least one target vector is higher than the performance indexes corresponding to the vectors, among the at least one first vector and the at least one second vector, other than the at least one target vector; each target vector represents a set of target enhancement strategies; the at least one set of target enhancement strategies is used to perform enhancement processing on the first training data to obtain target training data; and the target training data is used to train the target neural network model. In this way, the corresponding enhancement strategy can be automatically determined based on the training data, so as to expand the training data and improve the performance of the target neural network model.
  • FIG. 1 is a schematic diagram of an artificial intelligence main body framework provided by an embodiment of this application.
  • FIG. 2A is a schematic diagram of an application environment provided by an embodiment of the application.
  • FIG. 2B is a schematic diagram of an application environment provided by an embodiment of this application.
  • Figure 3 is a schematic diagram of a group of enhancement strategies provided by an embodiment of the application.
  • FIG. 4 is a flowchart of a data enhancement method according to an embodiment of the application.
  • FIG. 5 is a flowchart of another data enhancement method according to an embodiment of the application.
  • FIG. 6A is a schematic diagram of a data enhancement device according to an embodiment of the application.
  • FIG. 6B is a schematic diagram of a data enhancement method according to an embodiment of the application.
  • FIG. 7 is a schematic diagram of an enhancement processing according to an embodiment of the application.
  • FIG. 8 is a flowchart of another data enhancement method according to an embodiment of the application.
  • FIG. 9 is a schematic diagram of a data enhancement device according to an embodiment of the application.
  • FIG. 10 is a schematic diagram of an electronic device according to an embodiment of the application.
  • Figure 1 shows a schematic diagram of an artificial intelligence main framework, which describes the overall workflow of the artificial intelligence system and is suitable for general artificial intelligence field requirements.
  • Intelligent Information Chain reflects a series of processes from data acquisition to processing. For example, it can be the general process of intelligent information perception, intelligent information representation and formation, intelligent reasoning, intelligent decision-making, intelligent execution and output. In this process, the data has gone through the condensing process of "data-information-knowledge-wisdom".
  • the infrastructure provides computing power support for the artificial intelligence system, realizes communication with the outside world, and realizes support through the basic platform.
  • Smart chips: hardware acceleration chips such as CPUs, NPUs, GPUs, ASICs, and FPGAs.
  • Basic platforms include distributed computing frameworks and network-related platform guarantees and support, which can include cloud storage and computing, interconnection networks, etc.
  • sensors communicate with the outside to obtain data, and these data are provided to the smart chip in the distributed computing system provided by the basic platform for calculation.
  • the data at the layer above the infrastructure is used to represent the data sources in the field of artificial intelligence.
  • the data involves graphics, images, voice, and text, as well as the Internet of Things data of traditional devices, including business data of existing systems and sensory data such as force, displacement, liquid level, temperature, and humidity.
  • Data processing usually includes data training, machine learning, deep learning, search, reasoning, decision-making and other methods.
  • machine learning and deep learning can perform symbolic and formalized intelligent information modeling, extraction, preprocessing, and training on data.
  • Reasoning refers to the process of simulating human intelligent reasoning in a computer or intelligent system, using formal information to conduct machine thinking and solving problems based on reasoning control strategies.
  • the typical function is search and matching.
  • Decision-making refers to the process of making decisions after intelligent information is reasoned, and usually provides functions such as classification, ranking, and prediction.
  • some general capabilities can be formed based on the results of the data processing, such as an algorithm or a general system, for example, translation, text analysis, computer vision processing, speech recognition, image recognition, and so on.
  • Intelligent products and industry applications refer to the products and applications of artificial intelligence systems in various fields. They are an encapsulation of the overall artificial intelligence solution, productizing intelligent information decision-making and realizing practical applications. The application fields mainly include intelligent manufacturing, intelligent transportation, smart home, smart healthcare, smart security, autonomous driving, safe city, smart terminals, etc.
  • an embodiment of the present application provides a system architecture 200.
  • the data collection device 260 is used to collect target data (hereinafter also referred to as training data) and store it in the database 230.
  • the training device 220 generates a target model/rule 201 based on the target data maintained in the database 230. The following will describe in detail how the training device 220 obtains the target model/rule 201 based on the target data.
  • the target model/rule 201 can be applied to computer vision (for example, image classification), speech recognition, text recognition, and the like.
  • the training device 220 optimizes the preprocessing process in the training process through the data enhancement method of the embodiment of the present application, so as to automatically determine the enhancement strategy corresponding to different training data, and there is no need to manually design the data enhancement method.
  • the training device 220 may determine the enhancement strategy used in the preprocessing process according to the training data, use the enhancement strategy to enhance the training data to obtain enhanced training data, and use the enhanced training data to train the neural network model to obtain the target model/rule. For example, using the enhanced training data, the neural network model structure and the loss function are searched in the search space to obtain the target model/rule.
  • the search space includes multiple neural network model structures, multiple loss functions, and so on.
  • the search space of the enhanced strategy may include multiple sets of enhanced strategies.
  • Each group of enhancement strategies may include N sub-strategies, and each sub-strategy includes two processing operations (operation 1 and operation 2) that are executed in sequence; each operation is associated with two parameters: 1) the probability of applying the operation, and 2) the intensity value of the operation.
  • the processing operation types may include 16 types, such as rotation, translation, and brightness adjustment.
  • the enhancement strategy includes 5 sub-strategies, and each sub-strategy includes two processing operations executed in sequence; one sub-strategy includes operation 1 and operation 2, and the processing operations included in the other sub-strategies are not shown.
  • in the search space of enhancement strategies where the training data is images, there are a total of (16 × 10 × 11)^10 optional enhancement strategies.
  • the search space of the enhancement strategies can be searched to determine an enhancement strategy with a better performance index, and that enhancement strategy is used to enhance the training data, so as to expand the training data and improve the performance of the neural network model.
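The strategy representation and search-space count above can be sketched as a small data structure. The granularities below (16 operation types, 10 probability levels, 11 magnitude levels, 5 sub-strategies of 2 operations each) follow the figures in this description; the function and variable names are illustrative:

```python
import random

# Assumed discrete granularity of the search space, matching the
# (16 * 10 * 11) ** 10 count given in the description.
OP_TYPES = 16          # rotation, translation, brightness adjustment, ...
PROB_LEVELS = 10       # discretized application probabilities
MAG_LEVELS = 11        # discretized operation intensity values
SUB_POLICIES = 5       # sub-strategies per strategy group
OPS_PER_SUB = 2        # operations executed in sequence per sub-strategy

def sample_strategy(rng):
    # One operation is a (type, probability level, magnitude level) triple;
    # a strategy group is SUB_POLICIES sub-strategies of OPS_PER_SUB operations.
    return [
        [(rng.randrange(OP_TYPES), rng.randrange(PROB_LEVELS), rng.randrange(MAG_LEVELS))
         for _ in range(OPS_PER_SUB)]
        for _ in range(SUB_POLICIES)
    ]

choices_per_op = OP_TYPES * PROB_LEVELS * MAG_LEVELS        # 1760 per operation
total_strategies = choices_per_op ** (SUB_POLICIES * OPS_PER_SUB)

example = sample_strategy(random.Random(0))   # one randomly sampled strategy group
```

Randomly sampling such strategy groups is exactly how the initial first vectors can be drawn from the search space.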
  • the target model/rule obtained by the training device 220 can be applied to different systems or devices.
  • the execution device 210 is equipped with an I/O interface 212 to perform data interaction with external devices.
  • the "user" can input data to the I/O interface 212 through the client device 240.
  • the execution device 210 can call data, codes, etc. in the data storage system 250, and can also store data, instructions, etc. in the data storage system 250.
  • the calculation module 211 uses the target model/rule 201 to process the input data, and returns the processing result to the client device 240 through the I/O interface 212, and provides it to the user.
  • the training device 220 can generate corresponding target models/rules 201 based on different data for different targets, so as to provide users with better results.
  • the user can manually specify the input data in the execution device 210, for example, to operate in the interface provided by the I/O interface 212.
  • the client device 240 can automatically input data to the I/O interface 212 and obtain the result. If the client device 240 automatically inputs data and needs the user's authorization, the user can set the corresponding authority in the client device 240.
  • the user can view the result output by the execution device 210 on the client device 240, and the specific presentation form may be a specific manner such as display, sound, and action.
  • the client device 240 may also serve as a data collection terminal to store the collected target data in the database 230.
  • FIG. 2A is only a schematic diagram of a system architecture provided by an embodiment of the present application, and the positional relationship between the devices, components, modules, etc. shown in the figure does not constitute any limitation.
  • the data storage system 250 is an external memory relative to the execution device 210. In other cases, the data storage system 250 may also be placed in the execution device 210.
  • an embodiment of the present application provides another system architecture 400.
  • the system architecture 400 may include a client device 410 and a server 420.
  • the client device 410 may establish a connection with the server 420, and the server 420 may use the data enhancement method of this application embodiment to preprocess the training data, generate a target model/rule based on the enhanced training data, and provide the target model/rule to the client device 410.
  • the client device 410 may configure the target model/rule on a corresponding execution device, for example, an embedded neural network processor (Neural-network Processing Unit, NPU).
  • This application can optimize the preprocessing process in the training process of the target neural network model through the data enhancement method described below.
  • The enhancement strategy used in the preprocessing process is determined according to the training data; the enhancement strategy is used to enhance the training data to obtain enhanced training data, and the enhanced training data is used to train the target neural network model.
  • the target neural network model can be applied to scenarios such as scene recognition, human attribute recognition, and automated machine learning (AutoML). Applied to scene recognition, for example, mobile phone album classification and scene recognition on mobile phones, the target neural network model can run on terminal devices (such as smart phones).
  • the target neural network model can also be applied to devices involved in smart cities (for example, camera equipment, computing centers, servers, etc.).
  • the target neural network model can be applied to the server involved in automated machine learning (AutoML) to provide users with customized data enhancement services.
  • FIG. 4 is a flowchart of a data enhancement method according to an embodiment of the application. As shown in FIG. 4, the method in this embodiment may be executed by the training device 220 or the processor of the training device 220 shown in FIG. 2A, or by the server 420 or the processor of the server 420 shown in FIG. 2B. The method of this embodiment may include the following steps.
  • Step 101: Acquire first training data, at least one first vector, and a performance index corresponding to each first vector.
  • the first training data may be original training data, that is, training data that has not undergone data enhancement processing.
  • the first training data may be training data maintained in the database 230 as shown in FIG. 2A.
  • the first training data may be the training data sent by the client device 410 to the server 420 as shown in FIG. 2B, so that the server 420 feeds back the target neural network model to the client device 410 based on the training data.
  • Each first vector in the at least one first vector is used to represent a group of first enhancement strategies, for example, a group of enhancement strategies as shown in FIG. 3.
  • one or more first vectors may be selected in the search space of the enhanced strategy, for example, one or more first vectors are selected by random sampling.
  • the performance index corresponding to each first vector includes the performance index of the first neural network model.
  • the performance index may include any one or a combination of accuracy, recall, latency, and the like.
  • the first neural network model is obtained by training on second training data, and the second training data is training data obtained by performing enhancement processing on the first training data using the first enhancement strategy.
  • the first enhancement strategy represented by the first vector may be used to perform enhancement processing on the first training data to obtain second training data (that is, training data after preprocessing).
  • Use the second training data to train the neural network model to obtain the first neural network model, and use the test data to determine the performance index of the first neural network model.
  • the performance index of the first neural network model is the performance index corresponding to the first vector.
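The flow of Step 101 can be sketched end to end. Augmentation, training, and evaluation are stubbed with placeholders here, so every function and value below is illustrative rather than the patent's actual procedure:

```python
# For each first vector: apply the enhancement strategy it represents to the
# first training data, train a model on the enhanced (second) training data,
# and record the trained model's performance index against that vector.
def augment(data, strategy_id):
    # Placeholder enhancement: pair each sample with the strategy id.
    return [(x, strategy_id) for x in data]

def train_and_evaluate(train_data):
    # Stand-in for real training plus test-set evaluation; returns a
    # deterministic pseudo performance index for the sketch.
    return 0.5 + 0.05 * train_data[0][1]

first_training_data = ["img0", "img1", "img2"]
first_vectors = [0, 1, 2]          # toy strategy identifiers

performance = {}
for v in first_vectors:
    second_training_data = augment(first_training_data, v)
    performance[v] = train_and_evaluate(second_training_data)
```

The resulting (first vector, performance index) pairs are the inputs the method needs for Step 102.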
  • Step 102: Determine at least one second vector according to the at least one first vector and the performance index corresponding to the at least one first vector.
  • the at least one second vector is obtained by prediction, and each second vector is used to represent a group of second enhancement strategies.
  • the corresponding relationship between the first vector and the performance index is determined, the performance index is optimized, and the at least one second vector is predicted.
  • the at least one second vector may be obtained by optimizing prediction in a discrete parameter space, or the at least one second vector may be predicted by using the following achievable manner.
  • An achievable way is to map the at least one first vector from the discrete parameter space to the continuous parameter space to obtain at least one third vector, and to determine the at least one second vector according to the at least one third vector and the performance index corresponding to the at least one first vector. Since each group of enhancement strategies in the search space is discrete, the enhancement strategies represented by the at least one first vector are enhancement strategies in a discrete parameter space. To optimize the performance index of the first vectors, the discrete enhancement strategies (first vectors) can be mapped to the continuous parameter space to obtain at least one third vector; the at least one third vector represents the enhancement strategies in the continuous parameter space and is also referred to as the continuous representation of the enhancement strategies.
  • At least one second vector can be searched for and predicted based on the at least one third vector in the continuous parameter space and the performance index corresponding to the at least one third vector.
  • the mapping relationship between the third vector and the performance indicator may be determined based on the at least one third vector and the performance indicator, and at least one second vector may be determined based on the mapping relationship.
  • The mapping relationship between the third vector and the performance index may be a mapping function in the continuous parameter space, and second vectors with higher performance may be searched and predicted using this mapping function.
  • An achievable way of determining the at least one second vector may be: determining at least one fourth vector according to the mapping relationship, and mapping the at least one fourth vector from the continuous parameter space to the discrete parameter space to obtain the at least one second vector.
  • A gradient update method may be used to determine the at least one fourth vector in the mapping relationship.
  • the fourth vector is an enhancement strategy in a continuous parameter space
  • the second vector is a representation of the fourth vector in a discrete parameter space.
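The discrete-to-continuous round trip described above can be sketched end to end. The following Python sketch is purely illustrative: the strategy encoding, the quadratic performance surrogate, and the step size are assumptions, not the application's models. A first vector of discrete choices is mapped to a continuous third vector, nudged by one gradient step into a fourth vector, and rounded back into a discrete second vector:

```python
import numpy as np

# hypothetical: a strategy is a sequence of discrete choices in [0, VOCAB)
VOCAB = 10

def to_continuous(first_vector):
    """Map a discrete strategy to the continuous parameter space (third vector)."""
    return np.asarray(first_vector, dtype=float)

def to_discrete(fourth_vector):
    """Map a continuous strategy back to the discrete parameter space (second vector)."""
    return [int(np.clip(round(x), 0, VOCAB - 1)) for x in fourth_vector]

first_vector = [2, 7, 5]
third_vector = to_continuous(first_vector)
# one gradient-ascent step on a toy performance surrogate peaked at all-sixes
fourth_vector = third_vector - 0.2 * 2.0 * (third_vector - 6.0)
second_vector = to_discrete(fourth_vector)
print(second_vector)  # [4, 7, 5]
```

In the real method the two mappings are learned neural network models rather than a cast and a rounding, but the data flow is the same.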
  • Step 103 Determine at least one target vector according to the performance index corresponding to the at least one first vector and the performance index corresponding to the at least one second vector.
  • the performance index corresponding to each second vector includes the performance index of the second neural network model, that is, the actual performance index corresponding to the second vector.
  • the second neural network model is trained on the third training data.
  • The third training data is the training data obtained by using the second enhancement strategy to perform enhancement processing on the first training data.
  • the performance index corresponding to the at least one target vector is higher than the performance index corresponding to the at least one first vector and the at least one second vector other than the at least one target vector.
  • The second enhancement strategy represented by the second vector can be used to enhance the first training data to obtain the third training data (that is, the training data after preprocessing).
  • M vectors with better performance indexes may be selected as the at least one target vector, and M is a positive integer.
  • the target enhancement strategy represented by the at least one target vector is used as a preprocessing operation in the training process for obtaining the target neural network model.
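Selecting the M vectors with the best performance indexes as target vectors can be sketched as follows; the candidate names, their measured indexes, and M = 2 are all hypothetical:

```python
def select_targets(vectors, perf_index, M=2):
    """Keep the M vectors whose measured performance index is highest."""
    ranked = sorted(vectors, key=perf_index, reverse=True)
    return ranked[:M]

# hypothetical measured performance indexes for three candidate vectors
perf_index = {"v1": 0.91, "v2": 0.88, "v3": 0.95}.get
targets = select_targets(["v1", "v2", "v3"], perf_index, M=2)
print(targets)  # ['v3', 'v1']
```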
  • In this embodiment, at least one second vector is determined according to the at least one first vector and the performance index corresponding to the at least one first vector, and at least one target vector is determined according to the performance index corresponding to the at least one first vector and the performance index corresponding to the at least one second vector; the performance index corresponding to the at least one target vector is higher than that of the remaining first vectors and second vectors.
  • each target vector represents a set of target enhancement strategies
  • at least one set of target enhancement strategies is used to perform enhancement processing on the first training data to obtain target training data
  • The target training data is used for training to obtain the target neural network model. In this way, the corresponding enhancement strategy can be determined automatically based on the training data, so as to expand the training data and improve the performance of the target neural network model.
  • The search space of the enhancement strategy is modeled, the first vectors in the discrete parameter space are mapped to a continuous parameter space, and an enhancement strategy with a higher predicted performance index is searched in the continuous parameter space. This enhancement strategy is applied to the preprocessing stage of the training process to expand the training data and improve the performance of the target neural network model.
  • the at least one third vector is obtained by mapping the at least one first vector from the discrete parameter space to the continuous parameter space.
  • Predicting at least one second vector with a higher performance index from the at least one third vector can improve the sampling efficiency, that is, a small number of samples suffices to obtain an enhancement strategy with a higher performance index; it can also reduce resource consumption, that is, the TPU or CPU resources required during prediction.
  • FIG. 5 is a flowchart of another data enhancement method according to an embodiment of the application.
  • The method in this embodiment may be executed by the training device 220 or the processor of the training device 220 shown in FIG. 2A, or by the server 420 or the processor of the server 420 shown in FIG. 2B.
  • In this embodiment, the enhancement strategies in the discrete parameter space can be mapped to a continuous parameter space through two neural network models, the relationship between the enhancement strategy and the performance index in the continuous parameter space can be obtained through learning, and an enhancement strategy with a higher performance index can then be obtained by searching based on that relationship. The method of this embodiment may include:
  • Step 201 Acquire first training data, at least one first vector, and a performance index corresponding to each first vector.
  • For the explanation of step 201, reference may be made to step 101 of the embodiment shown in FIG. 4, which will not be repeated here.
  • Step 202 Input the at least one first vector into a fifth neural network model, and output at least one third vector.
  • The fifth neural network model is used to map each first vector from the discrete parameter space to the continuous parameter space.
  • The fifth neural network model can be any neural network model; it can be obtained by training on vector representations of enhancement strategies in the discrete parameter space and the corresponding enhancement-strategy vectors in the continuous parameter space.
  • at least one first vector is respectively input to the fifth neural network model, and at least one third vector is output, that is, the vector representation of the enhancement strategy in the discrete parameter space in the continuous parameter space is output.
  • the fifth neural network model may include an embedding layer (Embedding), a long short-term memory network (Long Short-Term Memory, LSTM), and a fully connected layer (Linear).
  • the fifth neural network model may also be in other specific forms, and the embodiments of the present application are not illustrated one by one.
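As a rough illustration of the Embedding, LSTM, and fully connected structure named above, the following sketch maps a discrete strategy vector to a continuous one. It is a toy, not the application's model: a simple tanh recurrence stands in for the LSTM cell, and every size and weight is a made-up assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB = 16               # number of discrete choices per strategy slot (assumed)
EMB, HID, OUT = 8, 8, 4  # embedding, hidden, and output sizes (assumed)

W_emb = rng.normal(size=(VOCAB, EMB))      # embedding table
W_h = rng.normal(size=(HID, HID)) * 0.1    # recurrent weights (LSTM stand-in)
W_x = rng.normal(size=(EMB, HID)) * 0.1
W_fc = rng.normal(size=(HID, OUT)) * 0.1   # fully connected output layer

def encode(first_vector):
    """Embedding -> recurrence -> Linear: discrete strategy to third vector."""
    h = np.zeros(HID)
    for idx in first_vector:
        x = W_emb[idx]                  # embedding-layer lookup
        h = np.tanh(h @ W_h + x @ W_x)  # simplified recurrent step
    return h @ W_fc                     # fully connected layer

third_vector = encode([3, 7, 1, 12])
print(third_vector.shape)  # (4,)
```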
  • Step 203 Input at least one third vector and the performance index to a third neural network model, and output a mapping relationship between the third vector and the performance index.
  • The third neural network model can be any neural network model; it can be obtained by training on enhancement-strategy vector representations in the continuous parameter space, the corresponding performance indexes, and the mapping relationship between the two.
  • The at least one third vector and the performance index corresponding to the at least one first vector are input into the third neural network model, and the mapping relationship between the third vector and the performance index is output.
  • The third neural network model may be a multi-layer perceptron (MLP).
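A minimal MLP of the kind step 203 describes could look like this; the dimensions and random weights are hypothetical. It maps a continuous strategy representation to a scalar performance estimate, which is the mapping relationship the predictor learns:

```python
import numpy as np

rng = np.random.default_rng(1)

# hypothetical sizes: 4-d continuous strategy representation, 16 hidden units
W1, b1 = rng.normal(size=(4, 16)) * 0.1, np.zeros(16)
W2, b2 = rng.normal(size=(16, 1)) * 0.1, np.zeros(1)

def mlp_predict(z):
    """Estimate the performance index of a continuous strategy representation."""
    return (np.tanh(z @ W1 + b1) @ W2 + b2).item()  # one hidden layer, scalar out

score = mlp_predict(rng.normal(size=4))
print(type(score).__name__)  # float
```

In practice the weights would be fitted to the (third vector, performance index) pairs, e.g. by minimizing a squared error.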
  • Step 204 Determine at least one fourth vector in the mapping relationship in a gradient update manner.
  • The at least one fourth vector is the vector representation, in the continuous parameter space, of at least one group of enhancement strategies with higher predicted performance indexes; that is, each fourth vector is an enhancement strategy in the continuous parameter space.
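The gradient update of step 204 can be illustrated with a toy differentiable surrogate of the learned mapping relationship; the quadratic surrogate, its peak, and the step size are assumptions for illustration only. Starting from an existing third vector, gradient ascent climbs toward the surrogate's optimum, yielding a fourth vector:

```python
import numpy as np

# hypothetical smooth surrogate of the mapping relationship, peaked at z_star
z_star = np.array([0.5, -0.2, 0.1, 0.3])

def grad_predicted_perf(z):
    """Gradient of the surrogate performance -||z - z_star||^2."""
    return -2.0 * (z - z_star)

z = np.zeros(4)                           # start from an existing third vector
for _ in range(100):
    z = z + 0.1 * grad_predicted_perf(z)  # gradient-ascent update

fourth_vector = z
print(np.allclose(fourth_vector, z_star, atol=1e-6))  # True
```

With the real MLP, the gradient would be taken through the network with respect to its input rather than computed in closed form.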
  • Step 205 Input at least one fourth vector into the fourth neural network model, and output at least one second vector.
  • The fourth neural network model is used to map each fourth vector from the continuous parameter space to the discrete parameter space.
  • The fourth neural network model can be any neural network model; it can be obtained by training on vector representations of enhancement strategies in the continuous parameter space and the corresponding enhancement-strategy vectors in the discrete parameter space.
  • at least one fourth vector is respectively input to the fourth neural network model, and at least one second vector is output, that is, the vector representation of the enhancement strategy in the discrete parameter space is output.
  • the fourth neural network model may include a long short-term memory network (LSTM) and a fully connected layer (Linear).
  • the fourth neural network model may also be in other specific forms, and the embodiments of the present application are not illustrated one by one.
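Mirroring the encoder sketch, a toy LSTM-then-Linear decoder might map a continuous fourth vector back to discrete indices. Again a tanh recurrence stands in for the LSTM and every size and weight is an assumption:

```python
import numpy as np

rng = np.random.default_rng(2)

VOCAB, HID, STEPS = 16, 8, 4  # choices per slot, hidden size, slots (assumed)

W_in = rng.normal(size=(4, HID)) * 0.1    # projects the continuous input
W_h = rng.normal(size=(HID, HID)) * 0.1   # recurrent weights (LSTM stand-in)
W_fc = rng.normal(size=(HID, VOCAB)) * 0.1

def decode(fourth_vector):
    """Map a continuous strategy back to discrete indices (second vector)."""
    h = np.tanh(np.asarray(fourth_vector) @ W_in)  # seed state from the input
    out = []
    for _ in range(STEPS):
        h = np.tanh(h @ W_h)                 # simplified recurrent step
        logits = h @ W_fc                    # fully connected layer
        out.append(int(np.argmax(logits)))   # most likely discrete choice
    return out

second_vector = decode(rng.normal(size=4))
print(len(second_vector))  # 4
```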
  • the fifth neural network model and the fourth neural network model can also be jointly trained, that is, the output of the fifth neural network model is used as the input of the fourth neural network model.
  • The fifth neural network model and the fourth neural network model may be obtained by training on vector representations of enhancement strategies in the discrete parameter space, such that the input of the trained fifth neural network model is the same as the output of the fourth neural network model.
  • Step 206 Determine whether the preset condition is met. If the preset condition is not met, use the at least one second vector and the at least one first vector as the at least one first vector of the next iteration and perform step 201; if the preset condition is met, stop the search and execute step 207.
  • the preset conditions may include convergence conditions and search termination conditions.
  • The convergence condition may be that, as the search continues, no enhancement strategy with a better performance index can be found; for example, none of the performance indexes corresponding to the at least one second vector is higher than that of any first vector.
  • the search termination condition may be that the number of search (iteration) steps reaches a set threshold.
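The two preset conditions of step 206 amount to a simple stopping test, sketched here with hypothetical arguments (the thresholds and index values are illustrative):

```python
def should_stop(step, max_steps, best_new_perf, best_old_perf):
    """Stop when converged (no improvement) or the step budget is spent."""
    converged = best_new_perf <= best_old_perf  # convergence condition
    budget_spent = step >= max_steps            # search termination condition
    return converged or budget_spent

print(should_stop(3, 10, 0.81, 0.83))  # True: no second vector improved
print(should_stop(3, 10, 0.90, 0.83))  # False: keep iterating
```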
  • If neither the convergence condition nor the search termination condition is met, the at least one second vector and the at least one first vector are used as the at least one first vector of the next iteration, and steps 201 to 205 are executed to predict enhancement strategies with higher performance indexes in the next round.
  • If the preset condition is met, the search is stopped, and vectors with higher performance indexes are selected from the at least one second vector and the at least one first vector as the target vectors.
  • The target enhancement strategy is applied in the preprocessing stage of the training process to train the target neural network model.
  • Step 207 Determine at least one target vector according to the performance index corresponding to the at least one first vector and the performance index corresponding to the at least one second vector.
  • For the explanation of step 207, reference may be made to step 103 of the embodiment shown in FIG. 4, which will not be repeated here. That is, through step 207, vectors with higher performance indexes are selected as target vectors from the second vectors and first vectors searched in the above steps, so as to expand the training data and improve the performance of the target neural network model.
  • In this embodiment, the at least one first vector is input into the fifth neural network model, which maps each first vector from the discrete parameter space to the continuous parameter space and outputs at least one third vector. The at least one third vector and the performance indexes are input into the third neural network model, which outputs the mapping relationship between the third vector and the performance index. At least one fourth vector is determined in the mapping relationship in a gradient-update manner and input into the fourth neural network model, which maps each fourth vector from the continuous parameter space to the discrete parameter space and outputs at least one second vector. It is then determined whether the preset condition is met: if not, the at least one second vector and the at least one first vector are used as the at least one first vector of the next iteration, and the step of acquiring at least one first vector and the performance index corresponding to each first vector is executed again; if the preset condition is met, the search is stopped.
  • The search space of the enhancement strategy is modeled, the first vectors in the discrete parameter space are mapped to a continuous parameter space, and an enhancement strategy with a higher predicted performance index is searched in the continuous parameter space. This enhancement strategy is applied to the preprocessing stage of the training process to expand the training data and improve the performance of the target neural network model.
  • Using a gradient update method to determine an enhancement strategy with a higher performance index in a continuous parameter space can improve the search efficiency of the enhancement strategy and the sampling efficiency of the enhancement strategy.
  • an embodiment of the present invention provides a data enhancement device to implement the aforementioned data enhancement method.
  • the data enhancement device includes a random strategy generation module 61, a strategy evaluation module 62, an encoder 63, a predictor 64, and a decoder 65.
  • the data enhancement device may be the execution subject of the data enhancement method of the embodiment of the present application, and the data enhancement device may be the training device 220 or the processor of the training device 220 as shown in FIG. 2A, or the server as shown in FIG. 2B 420 or the processor of the server 420.
  • the random strategy generation module 61 may randomly sample M groups of enhanced strategies in the search space of enhanced strategies, that is, obtain M first vectors.
  • The strategy evaluation module 62 sends neural network model configuration information to the test device; the neural network model configuration information is used to configure M first neural network models, which correspond to the M groups of first enhancement strategies represented by the M first vectors.
  • the test device restores the M first neural network models according to the neural network model configuration information, and the test device measures the performance indicators of the M neural network models, such as accuracy, delay, etc.
  • The performance indexes of the M first neural network models are sent to the strategy evaluation module 62, and the strategy evaluation module 62 thus obtains the M first vectors together with their performance indexes.
  • The encoder 63 maps the M first vectors from the discrete parameter space to the continuous parameter space to obtain M third vectors, that is, continuous representations of different enhancement strategies. As shown in the figure, the encoder 63 passes the M first vectors through the embedding layer, the LSTM, and the fully connected layer, and outputs M third vectors.
  • the predictor 64 learns the relationship between the continuous representation of the enhancement strategy and the performance index, and uses the gradient update method to predict the continuous representation of the K groups of enhancement strategies with better performance, that is, K fourth vectors are obtained.
  • Specifically, the output of the encoder (M third vectors) and the performance indexes are used as the input of the MLP; the MLP learns the relationship between the continuous representation of the enhancement strategy and the performance index, and the gradient update method is used to predict the continuous representations of K groups of enhancement strategies with better performance (K fourth vectors).
  • the decoder 65 decodes the K fourth vectors into discrete enhancement strategies, that is, the K above-mentioned second vectors. As shown in FIG. 6B, the decoder 65 outputs the predicted discrete enhancement vector through the embedding layer, the LSTM and the fully connected layer.
  • After the K second vectors are obtained, if it is determined through the above step 206 that neither the convergence condition nor the search termination condition is satisfied, the M first vectors and the K second vectors are used as the first vectors of the next iteration, and the loop iterates to search for enhancement strategies with higher performance indexes until the convergence condition or the search termination condition is met.
  • In this embodiment, an encoder is used to map the discrete enhancement strategies to a continuous parameter space, a predictor learns the relationship between the continuous representations of the strategies and the performance indexes, the enhancement strategy with the best current performance is selected, and, starting from its continuous representation, the gradient update method predicts continuous representations of enhancement strategies with better performance; a decoder then decodes the predicted continuous representations into discrete enhancement strategies. In this way, the search space of the enhancement strategy can be modeled effectively with only a small amount of enhancement-strategy sampling and mapped into a continuous parameter space, where an efficient gradient update method is used for the enhancement-strategy search. This improves the search efficiency and the sampling efficiency of the enhancement strategy, and reduces resource consumption, that is, the TPU or CPU resources required in the preprocessing process.
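Putting the modules together, the following self-contained toy loop imitates the sample, evaluate, encode, predict, decode, re-evaluate cycle. The performance function, encoder, and decoder here are trivial stand-ins chosen only so the loop is runnable; they are not the patent's neural network models:

```python
import numpy as np

rng = np.random.default_rng(3)

VOCAB, LENGTH = 8, 3  # toy strategy space (assumed)

def true_perf(v):
    """Stand-in for the strategy evaluation module; best strategy is [4, 4, 4]."""
    return -np.sum((np.asarray(v) - 4) ** 2)

def encode(v):   # encoder stand-in: discrete -> continuous
    return np.asarray(v, dtype=float)

def decode(z):   # decoder stand-in: continuous -> discrete
    return [int(np.clip(round(x), 0, VOCAB - 1)) for x in z]

pool = [list(rng.integers(0, VOCAB, LENGTH)) for _ in range(5)]  # M random strategies
for _ in range(10):
    best = max(pool, key=true_perf)       # best strategy found so far
    z = encode(best)
    for _ in range(20):
        z = z - 0.3 * 2.0 * (z - 4.0)     # gradient ascent on the toy surrogate
    candidate = decode(z)                 # predicted second vector
    if true_perf(candidate) <= true_perf(best):
        break                             # convergence condition (step 206)
    pool.append(candidate)                # reuse it in the next iteration

print([int(v) for v in max(pool, key=true_perf)])  # [4, 4, 4]
```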
  • The test device may be a server or an internal chip of a server, or a terminal device or an internal chip of a terminal device.
  • The terminal device may be a wireless communication device or an Internet of Things (IoT) device, a mobile terminal such as a wearable device or a vehicle-mounted device, or customer premise equipment (CPE).
  • FIG. 8 is a flowchart of another data enhancement method according to an embodiment of the application.
  • the embodiment of the application relates to a data enhancement device and a model application device.
  • the data enhancement apparatus may be the training device 220 or the processor of the training device 220 as shown in FIG. 2A, or the server 420 or the processor of the server 420 as shown in FIG. 2B.
  • the model application device may be the execution device 210 or the processor of the execution device 210 as shown in FIG. 2A, or the client device 410 or the processor of the client device 410 as shown in FIG. 2B.
  • the data enhancement method of the embodiment of the present application may further include:
  • Step 301 The data enhancement device uses at least one set of target enhancement strategies to perform enhancement processing on the first training data to obtain target training data.
  • the data enhancement device can determine at least one set of target enhancement strategies by the method of any of the above embodiments.
  • The data enhancement device can apply the at least one set of target enhancement strategies to the preprocessing process of training.
  • at least one set of target enhancement strategies is used to enhance the first training data to obtain the target training data.
  • The data set A is the first training data of the embodiment of the present application; enhancement processing is performed on data set A to obtain data set A'. The data set A may include various pictures collected by a mobile phone, such as the leftmost picture in FIG. 7; after enhancement processing, the picture may become the rightmost picture in FIG. 7.
  • Step 302 The data enhancement device uses the target training data to train the neural network model to obtain the target neural network model.
  • Using the target training data to train a neural network model includes, but is not limited to: neural network model structure search, neural network model parameter (for example, weight, bias) search, and so on.
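Step 301's enhancement processing can be illustrated with a toy image and a made-up operation set; flip and brighten are examples chosen for the sketch, and the real strategy space is defined elsewhere in the application:

```python
import numpy as np

# hypothetical operation set; a group of target strategies is (op, magnitude) pairs
def flip(img, _):
    return img[:, ::-1]                  # horizontal flip

def brighten(img, magnitude):
    return np.clip(img + magnitude, 0, 255)

OPS = [flip, brighten]

def augment(img, strategy):
    """Apply one group of target enhancement strategies to a sample."""
    for op_idx, magnitude in strategy:
        img = OPS[op_idx](img, magnitude)
    return img

image = np.arange(12, dtype=float).reshape(3, 4)  # stand-in for a photo
target_strategy = [(0, 0), (1, 10.0)]             # flip, then brighten by 10
augmented = augment(image, target_strategy)
# target training data = first training data plus its augmented copies
target_data = [image, augmented]
print(augmented[0, 0])  # 13.0
```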
  • Step 303 The data enhancement device sends target model configuration information to the model application device, where the target model configuration information is used to configure the target neural network model.
  • After the model application device receives the target model configuration information sent by the data enhancement device, the model application device can restore the target neural network model according to the target model configuration information and use the target neural network model to process the corresponding data, for example, using the target neural network model to process pictures in a mobile phone album and perform album classification.
  • In this embodiment, the first training data is enhanced by using at least one set of target enhancement strategies to obtain target training data, and the target training data is used to train the initial neural network model to obtain the target neural network model. This improves the performance of the target neural network model and therefore the processing performance of the model application device that uses the target neural network model.
  • the data enhancement device 900 includes an acquisition module 901, a prediction module 902, and an enhancement strategy determination module 903.
  • the data enhancement device 900 has the function of a training device or a server in the method embodiment.
  • the data enhancement device 900 may execute the method of the embodiment of FIG. 4 or FIG. 5, or execute the method of the data enhancement device of the embodiment of FIG. 8.
  • the units of the data enhancement device 900 are respectively used to perform the following operations and/or processing.
  • The obtaining module 901 is configured to obtain first training data, at least one first vector, and a performance index corresponding to each first vector. Each first vector is used to represent a set of first enhancement strategies, and the performance index corresponding to each first vector includes the performance index of the first neural network model. The first neural network model is obtained by training on the second training data, and the second training data is the training data obtained by using the first enhancement strategy to perform enhancement processing on the first training data.
  • the prediction module 902 is configured to determine at least one second vector according to the performance index corresponding to the at least one first vector and the at least one first vector, and each second vector is used to represent a group of second enhancement strategies.
  • The enhancement strategy determination module 903 is configured to determine at least one target vector according to the performance index corresponding to the at least one first vector and the performance index corresponding to the at least one second vector; the performance index corresponding to each second vector includes the performance index of the second neural network model. The performance index corresponding to the at least one target vector is higher than the performance indexes corresponding to the first vectors and second vectors other than the at least one target vector. Each target vector represents a group of target enhancement strategies; the at least one set of target enhancement strategies represented by the at least one target vector is used to perform enhancement processing on the first training data to obtain target training data, and the target training data is used for training to obtain the target neural network model.
  • the prediction module 902 is configured to: map the at least one first vector from a discrete parameter space to a continuous parameter space to obtain at least one third vector; according to the at least one third vector and the at least one first vector A performance index corresponding to a vector determines at least one second vector.
  • The prediction module 902 is configured to: determine the mapping relationship between the third vector and the performance index according to the at least one third vector and the performance index corresponding to the at least one first vector; and determine at least one second vector according to the mapping relationship.
  • The prediction module 902 is configured to: input the at least one third vector and the performance index corresponding to the at least one first vector into the third neural network model, and output the mapping relationship between the third vector and the performance index.
  • The prediction module 902 is configured to: determine at least one fourth vector according to the mapping relationship; and map the at least one fourth vector from the continuous parameter space to the discrete parameter space to obtain the at least one second vector.
  • the prediction module 902 is used to determine the at least one fourth vector in the mapping relationship in a gradient update manner.
  • The prediction module 902 is configured to: input the at least one fourth vector into a fourth neural network model and output the at least one second vector, where the fourth neural network model is used to map each fourth vector from the continuous parameter space to the discrete parameter space.
  • the acquiring module 901 is further configured to: determine whether a preset condition is met, and if the preset condition is not met, use the at least one second vector and the at least one first vector as at least one first vector, and execute The step of obtaining at least one first vector and the performance index corresponding to each first vector.
  • The prediction module 902 is configured to: input the at least one first vector into the fifth neural network model and output the at least one third vector, where the fifth neural network model is used to map each first vector from the discrete parameter space to the continuous parameter space.
  • the obtaining module 901 is configured to: randomly sample in the search space of the data enhancement strategy to obtain the at least one first vector.
  • the device further includes: a transceiver module 904.
  • the transceiver module 904 is used to send neural network model configuration information to the testing device, and the neural network model configuration information is used to configure the first neural network model.
  • the transceiver module 904 is further configured to receive the performance index of the first neural network model sent by the testing device.
  • the device further includes: a preprocessing module 905 and a training module 906.
  • the preprocessing module 905 is configured to use the at least one set of target enhancement strategies to perform enhancement processing on the first training data to obtain target training data.
  • the training module 906 is configured to use the target training data to train the initial neural network model, and obtain the target neural network model.
  • the transceiver module 904 is also used to send target model configuration information, and the target model configuration information is used to configure the target neural network model.
  • the data enhancement device 900 may also have other functions in the method embodiment at the same time.
  • the acquisition module 901, the prediction module 902, the enhancement strategy determination module 903, the preprocessing module 905, and the training module 906 may be processors, and the transceiver module 904 may be a transceiver.
  • the transceiver includes a receiver and a transmitter, and has both sending and receiving functions.
  • the acquisition module 901, the prediction module 902, the enhancement strategy determination module 903, the preprocessing module 905, and the training module 906 may be one processing device or multiple processing devices, and the functions of the processing devices may be partially or fully implemented by software.
  • the functions of the processing device may be partially or fully implemented by software.
  • the processing device may include a memory and a processor.
  • the memory is used to store a computer program, and the processor reads and executes the computer program stored in the memory to execute the steps in each method embodiment.
  • the processing device includes a processor.
  • the memory for storing the computer program is located outside the processing device, and the processor is connected to the memory through a circuit/wire to read and execute the computer program stored in the memory.
  • the data enhancement device 900 may be a chip.
  • the transceiver module 904 may specifically be a communication interface or a transceiver circuit.
  • FIG. 10 is a schematic structural diagram of an electronic device 1000 provided by this application.
  • the electronic device 1000 includes a processor 1001 and a transceiver 1002.
  • the electronic device 1000 further includes a memory 1003.
  • the processor 1001, the transceiver 1002, and the memory 1003 can communicate with each other through an internal connection path to transfer control signals and/or data signals.
  • the memory 1003 is used to store computer programs.
  • The processor 1001 is configured to execute the computer program stored in the memory 1003, so as to realize each function of the data enhancement device 900 in the foregoing device embodiment.
  • Specifically, the processor 1001 may be used to perform the operations and/or processing performed by the acquisition module 901, the prediction module 902, the enhancement strategy determination module 903, the preprocessing module 905, and the training module 906.
  • the transceiver 1002 is used to perform operations and/or processing performed by the transceiver module 904.
  • the memory 1003 may also be integrated in the processor 1001 or independent of the processor 1001.
  • the electronic device of this embodiment can execute the data enhancement method of the foregoing method embodiment, and its technical principles and technical effects can be referred to the explanation of the foregoing embodiment, which will not be repeated here.
  • the present application also provides a computer-readable storage medium with a computer program stored on the computer-readable storage medium.
  • When the computer program is executed by a computer, the computer executes the steps and/or processing in any of the above-mentioned method embodiments.
  • the computer program product includes computer program code.
  • the computer program code runs on a computer, the computer executes the steps and/or processing in any of the foregoing method embodiments.
  • the application also provides a chip including a processor.
  • the memory for storing the computer program is provided independently of the chip, and the processor is used to execute the computer program stored in the memory to execute the steps and/or processing in any method embodiment.
  • the chip may also include a memory and a communication interface.
  • the communication interface may be an input/output interface, a pin, an input/output circuit, or the like.
  • the processor mentioned in the above embodiments may be an integrated circuit chip with signal processing capability.
  • the steps of the foregoing method embodiments can be completed by hardware integrated logic circuits in the processor or instructions in the form of software.
  • The processor can be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware encoding processor, or executed and completed by a combination of hardware and software modules in the encoding processor.
  • the software module can be located in a storage medium mature in the field, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers.
  • the storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method in combination with its hardware.
  • the memory mentioned in the above embodiments may be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory.
  • the non-volatile memory can be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory.
  • the volatile memory may be random access memory (RAM), which is used as an external cache.
  • by way of example and not limitation, many forms of RAM are available, such as static random access memory (static RAM, SRAM), dynamic random access memory (dynamic RAM, DRAM), synchronous dynamic random access memory (synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (double data rate SDRAM, DDR SDRAM), enhanced synchronous dynamic random access memory (enhanced SDRAM, ESDRAM), synchlink dynamic random access memory (synchlink DRAM, SLDRAM), and direct rambus random access memory (direct rambus RAM, DR RAM).
  • the disclosed system, device, and method can be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • if the function is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • the technical solution of the present application essentially, or the part that contributes to the existing technology, or a part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions used to make a computer device (a personal computer, a server, a network device, etc.) execute all or part of the steps of the method described in each embodiment of the present application.
  • the aforementioned storage media include: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, and other media that can store program code.

Abstract

一种数据增强方法和装置。该方法包括:获取第一训练数据、至少一个第一向量和每个第一向量对应的性能指标(步骤101),根据至少一个第一向量和至少一个第一向量对应的性能指标,确定至少一个第二向量(步骤102),根据至少一个第一向量对应的性能指标和至少一个第二向量对应的性能指标,确定至少一个目标向量(步骤103),该至少一个目标向量对应的性能指标高于至少一个第一向量和至少一个第二向量中除至少一个目标向量之外的其他向量对应的性能指标,每个目标向量表示一组目标增强策略,至少一组目标增强策略用于对第一训练数据进行增强处理获取目标训练数据,该目标训练数据用于训练得到目标神经网络模型。该方法可以实现基于训练数据自动化确定相应的增强策略,以扩充训练数据,提升目标神经网络模型的性能。

Description

数据增强方法和装置
本申请要求于2020年2月25日提交中国专利局、申请号为202010117866.6、申请名称为“数据增强方法和装置”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及人工智能领域,特别涉及一种数据增强方法和装置。
背景技术
人工智能(Artificial Intelligence,AI)是利用数字计算机或者数字计算机控制的机器模拟、延伸和扩展人的智能,感知环境、获取知识并使用知识获得最佳结果的理论、方法、技术及应用系统。换句话说,人工智能是计算机科学的一个分支,它企图了解智能的实质,并生产出一种新的能以人类智能相似的方式作出反应的智能机器。人工智能也就是研究各种智能机器的设计原理与实现方法,使机器具有感知、推理与决策的功能。人工智能领域的研究包括机器人,自然语言处理,计算机视觉,决策与推理,人机交互,推荐与搜索,AI基础理论等。
近年来,机器学习技术在各个领域都取得了重大的突破,如金融,医疗和交通等。其中,通过神经网络模型实现的机器学习技术,被广泛应用,该神经网络模型可以支持处理图像、文本、语音以及序列等多种类型的数据,以实现分类、回归和预测等。通过提高训练数据的质量、多样性和数量,可以有效地提高神经网络模型的性能。
然而,为了提高训练数据的质量、多样性和数量,通常需要人类专家根据任务需求手动设计数据增强方法,以扩充训练数据,提升神经网络模型的性能。这样的方式存在设计成本高且数据增强方法的可迁移性差的问题。
发明内容
本申请提供一种数据增强方法和装置,以实现基于训练数据自动化确定相应的增强策略。
第一方面,本申请实施例提供一种数据增强方法,该方法可以包括:获取第一训练数据、至少一个第一向量和每个第一向量对应的性能指标,每个第一向量用于表示一组第一增强策略,该每个第一向量对应的性能指标包括第一神经网络模型的性能指标,该第一神经网络模型是由第二训练数据训练得到的,该第二训练数据为使用该第一增强策略对该第一训练数据进行增强处理后得到的训练数据。根据该至少一个第一向量和该至少一个第一向量对应的性能指标,确定至少一个第二向量,每个第二向量用于表示一组第二增强策略。根据该至少一个第一向量对应的性能指标和该至少一个第二向量对应的性能指标,确定至少一个目标向量,每个第二向量对应的性能指标包括第二神经网络模型的性能指标,该第二神经网络模型是由第三训练数据训练得到的,该第三训练数据为使用该第二增强策略对 该第一训练数据进行增强处理后得到的训练数据。其中,该至少一个目标向量对应的性能指标高于该至少一个第一向量和该至少一个第二向量中除该至少一个目标向量之外的其他向量对应的性能指标,每个目标向量表示一组目标增强策略,该至少一个目标向量表示的至少一组目标增强策略用于对第一训练数据进行增强处理获取目标训练数据,该目标训练数据用于训练得到目标神经网络模型。
本实现方式,根据至少一个第一向量和至少一个第一向量对应的性能指标,预测得到该至少一个第二向量,再结合该至少一个第二向量的实际性能指标,选取性能指标较高的向量作为目标向量,该实际性能指标为将该至少一个第二向量所表示的增强策略应用于第一训练数据所得到的神经网络模型的实际性能指标,可以实现基于训练数据自动化确定相应的增强策略,以扩充训练数据,提升目标神经网络模型的性能。
在一种可能的设计中,根据该至少一个第一向量和该至少一个第一向量对应的性能指标,确定至少一个第二向量,可以包括:将该至少一个第一向量从离散的参数空间映射至连续的参数空间,获取至少一个第三向量。根据该至少一个第三向量和该至少一个第一向量对应的性能指标,确定至少一个第二向量。
本实现方式,相较于在离散的增强策略的搜索空间中搜索指标性能更高的增强策略,通过将该至少一个第一向量从离散的参数空间映射至连续的参数空间,获取至少一个第三向量,根据该至少一个第三向量和该性能指标预测性能指标更高的至少一个第二向量,可以提升采样效率,即少量的采样便可以得到性能指标更高的增强策略,还可以降低资源消耗,即降低预处理过程中所需要的TPU或CPU的资源。
在一种可能的设计中,根据该至少一个第三向量和该至少一个第一向量对应的性能指标,确定至少一个第二向量,可以包括:根据该至少一个第三向量和该至少一个第一向量对应的性能指标,确定该第三向量与该至少一个第一向量对应的性能指标之间的映射关系。根据该映射关系,确定至少一个第二向量。
本实现方式,通过该第三向量与该至少一个第一向量对应的性能指标之间的映射关系,搜索确定性能指标更高的至少一个第二向量,可以提升搜索性能指标更高的增强策略的搜索效率,降低资源消耗。
在一种可能的设计中,根据该至少一个第三向量和该至少一个第一向量对应的性能指标,确定该第三向量与该至少一个第一向量对应的性能指标之间的映射关系,可以包括:将该至少一个第三向量和该至少一个第一向量对应的性能指标输入至第三神经网络模型,输出该第三向量与该性能指标之间的映射关系。
本实现方式,通过神经网络模型确定该第三向量与该性能指标之间的映射关系,可以提升映射关系的准确性,有利于搜索预测性能指标更高的该至少一个第二向量。
在一种可能的设计中,根据该映射关系,确定至少一个第二向量,可以包括:根据该映射关系,确定至少一个第四向量。将该至少一个第四向量从该连续的参数空间映射至该离散的参数空间,获取该至少一个第二向量。
在一种可能的设计中,根据该映射关系,确定至少一个第四向量,包括:采用梯度更新的方式,在该映射关系中,确定该至少一个第四向量。
本实现方式,采用梯度更新的方式在连续的参数空间内确定性能指标更高的增强策略,可以提升增强策略的搜索效率,提升增强策略的采样效率。
在一种可能的设计中,将该至少一个第四向量从该连续的参数空间映射至该离散的参数空间,获取该至少一个第二向量,可以包括:将该至少一个第四向量分别输入至第四神经网络模型,输出该至少一个第二向量,该第四神经网络模型用于将每个第四向量从连续的参数空间映射至离散的参数空间。
本实现方式,通过神经网络模型将第四向量从连续的参数空间映射至离散的参数空间,可以提升映射的效率和准确性。
在一种可能的设计中,该方法还可以包括:判断是否满足预设条件,若不满足预设条件,则将该至少一个第二向量和该至少一个第一向量作为至少一个第一向量,执行该获取至少一个第一向量和每个第一向量对应的性能指标的步骤。
在一种可能的设计中,将该至少一个第一向量从离散的参数空间映射至连续的参数空间,获取至少一个第三向量,可以包括:将该至少一个第一向量分别输入至第五神经网络模型,输出该至少一个第三向量,该第五神经网络模型用于将每个第一向量从离散的参数空间映射至连续的参数空间。
本实现方式,通过神经网络模型将第一向量从离散的参数空间映射至连续的参数空间,可以提升映射的效率和准确性,以准确确定该第三向量与该至少一个第一向量对应的性能指标之间的映射关系,进而基于该映射关系预测性能指标更高的至少一个第二向量。
在一种可能的设计中,获取至少一个第一向量,可以包括:在数据增强策略的搜索空间内,随机采样,获取该至少一个第一向量。
在一种可能的设计中,该方法还可以包括:向测试装置发送神经网络模型配置信息,该神经网络模型配置信息用于配置该第一神经网络模型。接收该测试装置发送的该第一神经网络模型的性能指标。
本实现方式,通过测试装置反馈第一神经网络模型的性能指标,该第一神经网络模型为由应用第一向量所表示的第一增强策略的训练数据训练得到的,有利于对增强策略的搜索空间进行准确建模,进而确定性能指标更高的增强策略,将性能指标更高的增强策略应用于训练流程中的预处理过程中,以扩充训练数据,提升目标神经网络模型的性能。
在一种可能的设计中,该方法还可以包括:使用该至少一组目标增强策略对该第一训练数据进行增强处理,获取目标训练数据。使用该目标训练数据对初始神经网络模型进行训练,获取该目标神经网络模型。发送目标模型配置信息,该目标模型配置信息用于配置该目标神经网络模型。
本实现方式,通过将该至少一组目标增强策略应用于获取目标神经网络模型的训练流程中的预处理操作中,可以扩充训练数据,提升目标神经网络模型的性能。将训练得到的目标神经网络模型配置给相应的模型应用装置,例如,服务器或终端设备等,可以提升模型应用装置的处理性能。
第二方面,本申请实施例提供一种数据增强装置,该数据增强装置用于执行上述第一方面或第一方面的任一可能的设计中的数据增强方法。具体地,该数据增强装置可以包括用于执行第一方面或第一方面的任一可能的设计中的数据增强方法的模块。例如,获取模块、预测模块、增强策略确定模块等。
第三方面,本申请实施例提供一种电子设备,该电子设备包括存储器和处理器,该存储器用于存储指令,该处理器用于执行所述存储器存储的指令,并且对该存储器中存储的 指令的执行使得该处理器执行上述第一方面或第一方面的任一可能的设计中的数据增强方法。
第四方面,本申请实施例提供一种计算机可读存储介质,其上存储有计算机程序,所述程序被处理器执行时实现第一方面或第一方面的任一可能的设计中的方法。
第五方面,本申请提供一种计算机程序产品,该计算机程序产品包括指令,在计算机上运行时,使得计算机执行上述第一方面中任一项所述的方法。
第六方面,本申请提供一种芯片,包括处理器和存储器,所述存储器用于存储计算机程序,所述处理器用于调用并运行所述存储器中存储的计算机程序,以执行如上述第一方面中任一项所述的方法。
本申请实施例的数据增强方法和装置,通过获取第一训练数据、至少一个第一向量和每个第一向量对应的性能指标,根据至少一个第一向量和至少一个向量对应的性能指标,确定至少一个第二向量,根据至少一个第一向量对应的性能指标和至少一个第二向量对应的性能指标,确定至少一个目标向量,该至少一个目标向量对应的性能指标高于至少一个第一向量和至少一个第二向量中除至少一个目标向量之外的其他向量对应的性能指标,每个目标向量表示一组目标增强,至少一组目标增强策略用于对第一训练数据进行增强处理获取目标训练数据,该目标训练数据用于训练得到目标神经网络模型,可以实现基于训练数据自动化确定相应的增强策略,以扩充训练数据,提升目标神经网络模型的性能。
附图说明
图1为本申请实施例提供的一种人工智能主体框架示意图;
图2A为本申请实施例提供的一种应用环境示意图;
图2B为本申请实施例提供的一种应用环境示意图;
图3为本申请实施例提供的一组增强策略的示意图;
图4为本申请实施例的一种数据增强方法的流程图;
图5为本申请实施例的另一种数据增强方法的流程图;
图6A为本申请实施例的一种数据增强装置的示意图;
图6B为本申请实施例的一种数据增强方法的示意图;
图7为本申请实施例的一种增强处理的示意图;
图8为本申请实施例的另一种数据增强方法的流程图;
图9为本申请实施例的一种数据增强装置的示意图;
图10为本申请实施例的一种电子设备的示意图。
具体实施方式
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行描述,显然,所描述的实施例仅仅是本申请一部分实施例,而不是全部的实施例。基于本申请中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。
图1示出一种人工智能主体框架示意图,该主体框架描述了人工智能系统总体工作流程,适用于通用的人工智能领域需求。
下面从“智能信息链”(水平轴)和“IT价值链”(垂直轴)两个维度对上述人工智能主题框架进行阐述。
“智能信息链”反映从数据的获取到处理的一列过程。举例来说,可以是智能信息感知、智能信息表示与形成、智能推理、智能决策、智能执行与输出的一般过程。在这个过程中,数据经历了“数据—信息—知识—智慧”的凝练过程。
“IT价值链”从人智能的底层基础设施、信息(提供和处理技术实现)到系统的产业生态过程,反映人工智能为信息技术产业带来的价值。
(1)基础设施:
基础设施为人工智能系统提供计算能力支持,实现与外部世界的沟通,并通过基础平台实现支撑。通过传感器与外部沟通;计算能力由智能芯片(CPU、NPU、GPU、ASIC、FPGA等硬件加速芯片)提供;基础平台包括分布式计算框架及网络等相关的平台保障和支持,可以包括云存储和计算、互联互通网络等。举例来说,传感器和外部沟通获取数据,这些数据提供给基础平台提供的分布式计算系统中的智能芯片进行计算。
(2)数据
基础设施的上一层的数据用于表示人工智能领域的数据来源。数据涉及到图形、图像、语音、文本,还涉及到传统设备的物联网数据,包括已有系统的业务数据以及力、位移、液位、温度、湿度等感知数据。
(3)数据处理
数据处理通常包括数据训练,机器学习,深度学习,搜索,推理,决策等方式。
其中,机器学习和深度学习可以对数据进行符号化和形式化的智能信息建模、抽取、预处理、训练等。
推理是指在计算机或智能系统中,模拟人类的智能推理方式,依据推理控制策略,利用形式化的信息进行机器思维和求解问题的过程,典型的功能是搜索与匹配。
决策是指智能信息经过推理后进行决策的过程,通常提供分类、排序、预测等功能。
(4)通用能力
对数据经过上面提到的数据处理后,进一步基于数据处理的结果可以形成一些通用的能力,比如可以是算法或者一个通用系统,例如,翻译,文本的分析,计算机视觉的处理,语音识别,图像的识别等等。
(5)智能产品及行业应用
智能产品及行业应用指人工智能系统在各领域的产品和应用,是对人工智能整体解决方案的封装,将智能信息决策产品化、实现落地应用,其应用领域主要包括:智能制造、智能交通、智能家居、智能医疗、智能安防、自动驾驶,平安城市,智能终端等。
参见附图2A,本申请实施例提供了一种系统架构200。数据采集设备260用于采集目标数据(下文也称之为训练数据)并存入数据库230,训练设备220基于数据库230中维护的目标数据生成目标模型/规则201。下面将详细地描述训练设备220如何基于目标数据得到目标模型/规则201,目标模型/规则201能够应用于计算机视觉(例如,图像分类)、语音识别、文本识别等。
训练设备220通过本申请实施例的数据增强方法对训练流程中的预处理过程进行优化, 以实现自动化的确定与不同训练数据相对应的增强策略,无需人工手动设计数据增强方法。训练设备220可以根据训练数据确定预处理过程中所使用的增强策略,使用该增强策略对训练数据进行增强处理,获取增强处理后的训练数据,使用增强处理后的训练数据对神经网络模型进行训练,得到目标模型/规则。例如,使用增强处理后的训练数据,在搜索空间内搜索神经网络模型结构和损失函数,得到目标模型/规则。该搜索空间包括多个神经网络模型结构、以及多个损失函数等。
对增强策略的搜索空间进行解释说明,该增强策略的搜索空间可以包括多组增强策略,每组增强策略可以包括N个子策略,每个子策略包括两步依次执行的处理操作(操作1和操作2),而每个操作又关联着两个参数:1)应用该操作的概率,2)该操作的强度值。以该处理操作为图像处理操作为例,该处理操作的类型可以包括:旋转、平移、亮度调整等16种。每种处理操作的可选强度值共有10种可能性。每种处理操作的可选概率值共有11种可能性。所以,训练数据为图像的增强策略的搜索空间内,共有(16×10×11)^(2N)种可供选择的增强策略。N为任意自然数。
以N=5为例,一组增强策略的示意图如图3所示,该增强策略包括5个子策略,每个子策略包括两步依次执行的处理操作,例如,如图3所示的第二个子策略包括的操作1和操作2,其他子策略所包括的处理操作未示出。训练数据为图像的增强策略的搜索空间内,共有(16×10×11)^10种可供选择的增强策略。通过本申请实施例的数据增强方法,可以对该增强策略的搜索空间进行搜索,以确定性能指标较好的增强策略,使用该性能指标较好的增强策略对训练数据进行增强处理,以扩充训练数据,提升神经网络模型的性能。
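上述搜索空间的结构(N个子策略、每个子策略两步依次执行的操作、每步操作关联概率档和强度档)可以用如下极简代码示意随机采样一组增强策略,对应文中随机采样获取"第一向量"的做法。其中的操作名称列表仅为假设示例,并非原文完整的16种图像处理操作:

```python
import random

# 假设的操作集合(示例,并非原文完整的16种图像处理操作)
OPS = ["rotate", "translate_x", "translate_y", "brightness",
       "contrast", "sharpness", "color", "shear_x"]
NUM_PROBS = 11       # 每种操作的可选概率值:11种可能性
NUM_MAGNITUDES = 10  # 每种操作的可选强度值:10种可能性

def sample_sub_policy(rng):
    """采样一个子策略:两步依次执行的操作,每步为(操作, 概率档, 强度档)。"""
    return [(rng.choice(OPS), rng.randrange(NUM_PROBS), rng.randrange(NUM_MAGNITUDES))
            for _ in range(2)]

def sample_policy(rng, n_sub=5):
    """采样一组增强策略(N=5个子策略),即一个离散的'第一向量'。"""
    return [sample_sub_policy(rng) for _ in range(n_sub)]

policy = sample_policy(random.Random(0))
```

这样采样得到的若干组策略即可作为后续搜索的起点。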
训练设备220得到的目标模型/规则可以应用不同的系统或设备中。在附图2A中,执行设备210配置有I/O接口212,与外部设备进行数据交互,“用户”可以通过客户设备240向I/O接口212输入数据。
执行设备210可以调用数据存储系统250中的数据、代码等,也可以将数据、指令等存入数据存储系统250中。
计算模块211使用目标模型/规则201对输入的数据进行处理,通过I/O接口212将处理结果返回给客户设备240,提供给用户。
更深层地,训练设备220可以针对不同的目标,基于不同的数据生成相应的目标模型/规则201,以给用户提供更佳的结果。
在附图2A中所示情况下,用户可以手动指定输入执行设备210中的数据,例如,在I/O接口212提供的界面中操作。另一种情况下,客户设备240可以自动地向I/O接口212输入数据并获得结果,如果客户设备240自动输入数据需要获得用户的授权,用户可以在客户设备240中设置相应权限。用户可以在客户设备240查看执行设备210输出的结果,具体的呈现形式可以是显示、声音、动作等具体方式。客户设备240也可以作为数据采集端将采集到目标数据存入数据库230。
值得注意的,附图2A仅是本申请实施例提供的一种系统架构的示意图,图中所示设备、器件、模块等之间的位置关系不构成任何限制,例如,在附图2A中,数据存储系统250相对执行设备210是外部存储器,在其它情况下,也可以将数据存储系统250置于执行设备210中。
再例如,参见附图2B,本申请实施例提供了另一种系统架构400,该系统架构400可以包括客户设备410和服务器420,该客户设备410可以与服务器420建立连接,服务器420可以通过本申请实施例的数据增强方法对训练数据进行预处理,进而根据数据增强后的训练数据生成目标模型/规则,将目标模型/规则提供给客户设备410。在一些实施例中,可以由客户设备410将目标模型/规则配置到相应的执行设备上,例如,嵌入式神经网络处理器(Neural-network Processing Unit,NPU)。
本申请可以通过如下所述的数据增强方法优化目标神经网络模型的训练流程中的预处理过程。根据训练数据确定预处理过程中所使用的增强策略,使用该增强策略对训练数据进行增强处理,获取增强处理后的训练数据,使用增强处理后的训练数据训练得到目标神经网络模型,该目标神经网络模型可以应用于场景识别、人体属性识别、自动化机器学习(AutoML)等场景中。应用于场景识别,例如,手机相册分类、手机识物等,即可以在终端设备(例如智能手机)中应用该目标神经网络模型。应用于人体属性识别,例如,智慧城市所涉及的行人属性识别、骑行属性识别等,即可以在智慧城市所涉及的终端设备(例如,摄像设备、计算中心、服务器等)中应用该目标神经网络模型。应用于自动化机器学习(AutoML),即可以在自动化机器学习(AutoML)所涉及的服务器中应用该目标神经网络模型,为用户提供定制化的数据增强服务。本申请实施例的数据增强方法的具体解释说明可以参见下述实施例。
图4为本申请实施例的一种数据增强方法的流程图,如图4所示,本实施例的方法可以由如图2A所示的训练设备220或训练设备220的处理器执行,或者,可以由如图2B所示的服务器420或服务器420的处理器执行,本实施例的方法可以包括:
步骤101、获取第一训练数据、至少一个第一向量和每个第一向量对应的性能指标。
该第一训练数据可以是原始的训练数据,也即未经过数据增强处理的训练数据。例如,该第一训练数据可以是如图2A所示的数据库230中维护的训练数据。再例如,该第一训练数据可以是如图2B所示的客户设备410发送给服务器420的训练数据,以使得服务器420基于该训练数据,将目标神经网络模型反馈给客户设备410。
该至少一个第一向量中的每个第一向量用于表示一组第一增强策略,例如如图3所示的一组增强策略。本申请实施例可以在增强策略的搜索空间中,选取一个或多个第一向量,例如随机采样选取一个或多个第一向量。每个第一向量对应的性能指标包括第一神经网络模型的性能指标。该性能指标可以包括正确率、召回率、时延等任意一项或其组合。该第一神经网络模型是由第二训练数据训练得到的,该第二训练数据为使用该第一增强策略对第一训练数据进行增强处理后得到的训练数据。
示例性的,以一个第一向量为例,可以使用该第一向量所表示的第一增强策略对第一训练数据进行增强处理,得到第二训练数据(即预处理之后的训练数据)。使用第二训练数据对神经网络模型进行训练,得到第一神经网络模型,使用测试数据确定该第一神经网络模型的性能指标,该第一神经网络模型的性能指标即为该第一向量对应的性能指标。
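上述评估流程(用第一增强策略增强第一训练数据得到第二训练数据→训练模型→测性能指标)可用如下玩具代码示意。其中"增强操作"简化为按概率对数值样本加偏移、"模型"简化为类均值最近邻分类器,均为便于演示的假设,并非原文的实际增强操作或训练流程:

```python
import random

def augment(sample, policy, rng):
    """示意:将一组增强策略应用到单个样本(此处样本为数值列表)。
    每步操作以其概率被应用;操作本身简化为假设的数值偏移。"""
    x = list(sample)
    for step_prob, step_mag in policy:       # 简化:每步为(概率, 偏移强度)
        if rng.random() < step_prob:
            x = [v + step_mag for v in x]
    return x

def evaluate_policy(policy, train_data, seed=0):
    """示意:用策略增强第一训练数据得到第二训练数据,训练一个
    最简单的'模型'(类均值最近邻分类器),返回其在原数据上的正确率。"""
    rng = random.Random(seed)
    augmented = [(augment(x, y_and := None or x, rng), y) if False else (augment(x, policy, rng), y)
                 for x, y in train_data]  # 对每个样本做增强
    sums, counts = {}, {}
    for x, y in train_data + augmented:   # '训练':按类别求均值
        sums.setdefault(y, [0.0] * len(x))
        counts[y] = counts.get(y, 0) + 1
        sums[y] = [a + b for a, b in zip(sums[y], x)]
    centroids = {y: [v / counts[y] for v in s] for y, s in sums.items()}
    def predict(x):                       # '测试':最近类均值分类
        return min(centroids,
                   key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])))
    correct = sum(predict(x) == y for x, y in train_data)
    return correct / len(train_data)

data = [([0.0, 0.0], 0), ([0.1, 0.2], 0), ([1.0, 1.0], 1), ([0.9, 1.1], 1)]
acc = evaluate_policy([(0.5, 0.1)], data)  # 该性能指标即对应该'第一向量'
```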
步骤102、根据至少一个第一向量和至少一个第一向量对应的性能指标,确定至少一个第二向量。
根据至少一个第一向量和至少一个第一向量对应的性能指标,预测得到该至少一个第二向量,每个第二向量用于表示一组第二增强策略。例如,根据至少一个第一向量和至少一个第一向量对应的性能指标,确定第一向量与性能指标的对应关系,优化性能指标,预测得到该至少一个第二向量。例如,可以在离散的参数空间优化预测得到该至少一个第二向量,也可以采用如下的一种可实现方式预测得到该至少一个第二向量。
一种可实现方式,将该至少一个第一向量从离散的参数空间映射至连续的参数空间,获取至少一个第三向量,根据该至少一个第三向量和该至少一个第一向量对应的性能指标,确定至少一个第二向量。由于增强策略的搜索空间内的各组增强策略是离散的,所以该至少一个第一向量所表示的增强策略为离散的参数空间内的增强策略。对于离散空间内的第一向量的性能指标的优化,可以先将离散的增强策略(第一向量)映射至连续的参数空间,从而获取至少一个第三向量,该至少一个第三向量即表示连续的参数空间内的增强策略,也可称为增强策略的连续表示。再根据连续的参数空间内的第三向量和对应的性能指标,预测该至少一个第二向量。本申请实施例根据连续的参数空间的至少一个第三向量和与其对应的性能指标,可以搜索预测出性能指标更高的至少一个第二向量。
在一些实施例中,可以根据至少一个第三向量和性能指标,确定第三向量与性能指标之间的映射关系,根据该映射关系,确定至少一个第二向量。第三向量与性能指标之间的映射关系可以是连续的参数空间内的映射函数,可以在该映射函数中搜索预测出性能更高的第二向量。
根据该映射关系,确定至少一个第二向量的一种可实现方式可以为:根据所述映射关系,确定至少一个第四向量,将该至少一个第四向量从连续的参数空间映射至离散的参数空间,获取至少一个第二向量。例如,可以采用梯度更新的方式,在该映射关系中,确定该至少一个第四向量。该第四向量为连续的参数空间内的增强策略,该第二向量为第四向量在离散的参数空间内的表示。
步骤103、根据至少一个第一向量对应的性能指标和至少一个第二向量对应的性能指标,确定至少一个目标向量。
每个第二向量对应的性能指标包括第二神经网络模型的性能指标,也即第二向量对应的实际性能指标,该第二神经网络模型是由第三训练数据训练得到的,该第三训练数据为使用第二增强策略对第一训练数据进行增强处理后得到的训练数据。
该至少一个目标向量对应的性能指标高于至少一个第一向量和至少一个第二向量中除所述至少一个目标向量之外的其他向量对应的性能指标。
示例性的,以一个第二向量为例,在通过步骤102得到一个第二向量后,可以使用该第二向量所表示的第二增强策略对第一训练数据进行增强处理,得到第三训练数据(即预处理之后的训练数据)。使用第三训练数据对神经网络模型进行训练,得到第二神经网络模型,使用测试数据确定该第二神经网络模型的性能指标,该第二神经网络模型的性能指标即为该第二向量对应的性能指标。
本申请实施例可以基于至少一个第一向量对应的性能指标和至少一个第二向量对应的性能指标,从中选取性能指标较好的M个向量作为该至少一个目标向量,M为正整数。将该至少一个目标向量所表示的目标增强策略作为用于获取目标神经网络模型的训练流程中的预处理操作。
本实施例,通过获取第一训练数据、至少一个第一向量和每个第一向量对应的性能指 标,根据至少一个第一向量和至少一个向量对应的性能指标,确定至少一个第二向量,根据至少一个第一向量对应的性能指标和至少一个第二向量对应的性能指标,确定至少一个目标向量,该至少一个目标向量对应的性能指标高于至少一个第一向量和至少一个第二向量中除至少一个目标向量之外的其他向量对应的性能指标,每个目标向量表示一组目标增强策略,至少一组目标增强策略用于对第一训练数据进行增强处理获取目标训练数据,该目标训练数据用于训练得到目标神经网络模型,可以实现基于训练数据自动化确定相应的增强策略,以扩充训练数据,提升目标神经网络模型的性能。
基于离散的参数空间内的第一向量和其对应的性能指标,对增强策略的搜索空间进行建模,将离散的参数空间内的第一向量映射至连续的参数空间,在连续的参数空间内预测性能指标更高的增强策略,将性能指标更高的增强策略应用于训练流程中的预处理过程中,以扩充训练数据,提升目标神经网络模型的性能。
相较于在离散的增强策略的搜索空间中搜索指标性能更高的增强策略,通过将该至少一个第一向量从离散的参数空间映射至连续的参数空间,获取至少一个第三向量,根据该至少一个第三向量和该性能指标预测性能指标更高的至少一个第二向量,可以提升采样效率,即少量的采样便可以得到性能指标更高的增强策略,还可以降低资源消耗,即降低预处理过程中所需要的TPU或CPU的资源。
图5为本申请实施例的另一种数据增强方法的流程图,如图5所示,本实施例的方法可以由如图2A所示的训练设备220或训练设备220的处理器执行,或者,可以由如图2B所示的服务器420或服务器420的处理器执行,本实施例可以通过两个神经网络模型将离散的参数空间内的增强策略映射至连续的参数空间,并学习得到连续的参数空间内的增强策略与性能指标之间的关系,进而基于增强策略与性能指标之间的关系搜索得到性能指标更高的增强策略,本实施例的方法可以包括:
步骤201、获取第一训练数据、至少一个第一向量和每个第一向量对应的性能指标。
其中,步骤201的解释说明可以参见图4所示实施例的步骤101,此处不再赘述。
步骤202、将该至少一个第一向量分别输入至第五神经网络模型,输出至少一个第三向量,该第五神经网络模型用于将每个第一向量从离散的参数空间映射至连续的参数空间。
该第五神经网络模型可以是任意的神经网络模型,该第五神经网络模型可以由离散的参数空间内的增强策略的向量表示,和与其对应的连续的参数空间的增强策略向量表示训练得到的。本申请实施例将至少一个第一向量分别输入至第五神经网络模型,输出至少一个第三向量,即输出离散的参数空间内的增强策略在连续的参数空间内的向量表示。
例如,该第五神经网络模型可以包括嵌入层(Embedding)、长短期记忆网络(Long Short-Term Memory,LSTM)和全连接层(Linear)。当然可以理解的,该第五神经网络模型也可以是其他具体的形式,本申请实施例不一一举例说明。
步骤203、将至少一个第三向量和该性能指标输入至第三神经网络模型,输出第三向量与性能指标之间的映射关系。
该第三神经网络模型可以是任意的神经网络模型,该第三神经网络模型可以由连续的参数空间的增强策略向量表示、对应的性能指标和二者之间的映射关系训练得到的。本申请实施例将至少一个第三向量和该至少一个第一向量对应的性能指标输入至第三神经网络模型,输出第三向量与性能指标之间的映射关系。
例如,该第三神经网络模型可以是多层感知器(Multi-Layer Perception,MLP)。
步骤204、采用梯度更新的方式,在映射关系中,确定至少一个第四向量。
基于该映射关系,利用梯度更新的方式,预测性能指标更高的至少一组增强策略在连续的参数空间的向量表示,也即该至少一个第四向量,该第四向量为连续的参数空间内的增强策略。
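步骤204的梯度更新可以如下示意:把映射关系视为连续参数空间上的可微函数f(z)(此处用一个假设的二次函数代替由第三神经网络模型学到的映射,梯度也用数值差分代替反向传播),沿梯度方向更新连续表示z,得到预测性能更高的"第四向量":

```python
def f(z):
    """假设的性能预测函数:在 z* = (0.6, 0.3) 处取最大(仅为示意)。"""
    return -((z[0] - 0.6) ** 2 + (z[1] - 0.3) ** 2)

def grad_f(z, eps=1e-5):
    """数值梯度(实际实现中可由神经网络反向传播给出)。"""
    g = []
    for i in range(len(z)):
        zp = list(z); zp[i] += eps
        zm = list(z); zm[i] -= eps
        g.append((f(zp) - f(zm)) / (2 * eps))
    return g

def gradient_update(z, lr=0.1, steps=50):
    """梯度上升:朝预测性能指标升高的方向更新连续向量。"""
    for _ in range(steps):
        z = [zi + lr * gi for zi, gi in zip(z, grad_f(z))]
    return z

z0 = [0.0, 0.0]            # 某个'第三向量'(连续表示)
z4 = gradient_update(z0)   # 更新后的'第四向量',预测性能更高
```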
步骤205、将至少一个第四向量分别输入至第四神经网络模型,输出至少一个第二向量,该第四神经网络模型用于将每个第四向量从连续的参数空间映射至离散的参数空间。
该第四神经网络模型可以是任意的神经网络模型,该第四神经网络模型可以由连续的参数空间内的增强策略的向量表示,和与其对应的离散的参数空间的增强策略向量表示训练得到的。本申请实施例将至少一个第四向量分别输入至第四神经网络模型,输出至少一个第二向量,即输出离散的参数空间内的增强策略的向量表示。
例如,该第四神经网络模型可以包括长短期记忆网络(LSTM)和全连接层(Linear)。当然可以理解的,该第四神经网络模型也可以是其他具体的形式,本申请实施例不一一举例说明。
需要说明的是,第五神经网络模型和第四神经网络模型还可以进行联合训练,即第五神经网络模型的输出作为第四神经网络模型的输入。该第五神经网络模型和第四神经网络模型可以由离散的参数空间内的增强策略的向量表示训练得到,训练所得到的第五神经网络模型的输入与第四神经网络模型的输出相同。
步骤206、判断是否满足预设条件,若不满足预设条件,则将至少一个第二向量和至少一个第一向量作为下一次迭代的至少一个第一向量,并执行步骤201,若满足预设条件,则停止搜索,执行步骤207。
该预设条件可以包括收敛条件和搜索终止条件。示例性的,该收敛条件可以是随着搜索的不断继续,无法继续搜索到性能指标更优的增强策略,例如,至少一个第二向量对应的性能指标中没有高于任意一个第一向量对应的性能指标。该搜索终止条件可以是搜索(迭代)步数达到设定的阈值。
不满足收敛条件和搜索终止条件,则将至少一个第二向量和至少一个第一向量作为下一次迭代的至少一个第一向量,并执行步骤201至步骤205,预测下一轮性能指标更高的增强策略。
满足收敛条件或搜索终止条件中任意一项,则停止搜索,在该至少一个第二向量和该至少一个第一向量中选取性能指标较高的向量,作为目标向量,将该目标向量所表示的增强策略应用于训练流程中的预处理过程中,以训练得到目标神经网络模型。
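上述迭代与终止判断可以用如下骨架代码示意(对应图5的流程):其中向量、评估函数和候选生成函数均为假设的玩具替身,实际实现中eval_fn对应训练并测试神经网络模型,propose_fn对应编码—预测—解码的向量生成:

```python
def search(eval_fn, propose_fn, init_vectors, max_steps=10):
    """示意:反复预测新向量并评估其实际性能指标,直至满足
    收敛条件(搜索不到更优策略)或搜索终止条件(迭代步数上限)。"""
    pool = {v: eval_fn(v) for v in init_vectors}   # 向量 -> 性能指标
    for _ in range(max_steps):                     # 搜索终止条件
        candidates = propose_fn(list(pool))        # 预测新的候选向量
        new_scores = {v: eval_fn(v) for v in candidates if v not in pool}
        best_old = max(pool.values())
        pool.update(new_scores)
        if not new_scores or max(new_scores.values()) <= best_old:
            break                                  # 收敛条件:无更优策略
    best = max(pool, key=pool.get)                 # 选取性能指标最高的向量
    return best, pool[best]

# 玩具示例:向量为整数,性能指标在7处最优,候选为邻域扩展(均为假设)
best, score = search(lambda v: -abs(v - 7),
                     lambda vs: [v + d for v in vs for d in (-1, 1)],
                     [0])
```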
步骤207、根据至少一个第一向量对应的性能指标和至少一个第二向量对应的性能指标,确定至少一个目标向量。
其中,步骤207的解释说明可以参见图4所示实施例的步骤103,此处不再赘述。即通过步骤207,在上述步骤搜索得到的第二向量和第一向量中选取性能指标较高的向量作为目标向量,以扩充训练数据,提升目标神经网络模型的性能。
本实施例,通过获取第一训练数据、至少一个第一向量和每个第一向量对应的性能指标,将该至少一个第一向量分别输入至第五神经网络模型,输出至少一个第三向量,该第五神经网络模型用于将每个第一向量从离散的参数空间映射至连续的参数空间,将至少一个第三向量和该性能指标输入至第三神经网络模型,输出第三向量与性能指标之间的映射关系,采用梯度更新的方式,在映射关系中,确定至少一个第四向量,将至少一个第四向量分别输入至第四神经网络模型,输出至少一个第二向量,该第四神经网络模型用于将每个第四向量从连续的参数空间映射至离散的参数空间,判断是否满足预设条件,若不满足预设条件,则将至少一个第二向量和至少一个第一向量作为下一次迭代的至少一个第一向量,并执行获取至少一个第一向量和每个第一向量对应的性能指标,若满足预设条件,则停止搜索。基于离散的参数空间内的第一向量和其对应的性能指标,对增强策略的搜索空间进行建模,将离散的参数空间内的第一向量映射至连续的参数空间,在连续的参数空间内预测性能指标更高的增强策略,将性能指标更高的增强策略应用于训练流程中的预处理过程中,以扩充训练数据,提升目标神经网络模型的性能。
采用梯度更新的方式在连续的参数空间内确定性能指标更高的增强策略,可以提升增强策略的搜索效率,提升增强策略的采样效率。
参考图6A和图6B,本申请实施例提供一种数据增强装置以实现前述数据增强方法。该数据增强装置包括随机策略生成模块61、策略评估模块62、编码器63、预测器64和解码器65。该数据增强装置可以是执行本申请实施例的数据增强方法的执行主体,该数据增强装置可以是如图2A所示的训练设备220或训练设备220的处理器,或者如图2B所示的服务器420或服务器420的处理器。如图6A所示,随机策略生成模块61可以在增强策略的搜索空间内,随机采样M组增强策略,即获取M个第一向量。策略评估模块62向测试装置发送神经网络模型配置信息,该神经网络模型配置信息用于配置M个第一神经网络模型,该M个第一神经网络模型是由使用该M个第一向量所表示的增强策略训练得到的,测试装置根据神经网络模型配置信息还原出该M个第一神经网络模型,测试装置测量该M个第一神经网络模型的性能指标,例如,正确率、时延等,并将M个第一神经网络模型的性能指标发送给策略评估模块62,策略评估模块62得到M个第一向量与性能指标的组合。编码器63将M个第一向量从离散的参数空间映射至连续的参数空间,得到M个第三向量,即不同增强策略的连续表示。结合图6B所示,编码器63将M个第一向量通过嵌入层、LSTM和全连接层,输出M个第三向量。预测器64学习增强策略的连续表示与性能指标之间的关系,并利用梯度更新的方式,预测性能更优的K组增强策略的连续表示,即得到K个第四向量。结合图6B所示,编码器的输出(M个第三向量)和性能指标作为MLP的输入,经过MLP学习增强策略的连续表示与性能指标之间的关系,并利用梯度更新的方式,预测性能更优的K组增强策略的连续表示(K个第四向量)。解码器65将该K个第四向量解码成离散的增强策略,即K个上述第二向量。结合图6B所示,解码器65通过LSTM和全连接层输出预测的离散的增强向量。
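编码器与解码器所实现的离散↔连续往返,可用如下极简代码示意:用一个假设的嵌入表代替原文的Embedding+LSTM+全连接结构,编码时逐位查表拼接得到连续表示,解码时按最近邻嵌入取回离散取值。这只是"离散/连续映射"这一思想的演示,并非原文网络结构的实现:

```python
# 假设的嵌入表:离散取值 -> 二维连续嵌入(实际中由神经网络学习得到)
EMB = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (0.0, 1.0), 3: (1.0, 1.0)}

def encode(discrete_policy):
    """离散'第一向量' -> 连续'第三向量':逐位查嵌入表并拼接。"""
    z = []
    for tok in discrete_policy:
        z.extend(EMB[tok])
    return z

def decode(z):
    """连续'第四向量' -> 离散'第二向量':每两维取最近邻嵌入对应的离散取值。"""
    toks = []
    for i in range(0, len(z), 2):
        pair = (z[i], z[i + 1])
        toks.append(min(EMB, key=lambda t: (EMB[t][0] - pair[0]) ** 2
                                           + (EMB[t][1] - pair[1]) ** 2))
    return toks

p = [2, 0, 3]
assert decode(encode(p)) == p  # 离散->连续->离散 往返一致
```

梯度更新得到的连续向量未必正好落在某个嵌入上,解码的最近邻取整即对应把"第四向量"映射回离散搜索空间这一步。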
在一些实施例中,得到该K个第二向量后,通过上述步骤206确定不满足收敛条件或搜索终止条件,则将M个第一向量和K个第二向量作为下一次迭代的第一向量,循环迭代,搜索性能指标更高的增强策略,直至满足收敛条件或搜索终止条件。
本实施例,采用编码器将离散的增强策略映射至连续的参数空间,利用预测器学习策略的连续表示和性能指标之间的关系,选择当前性能最优的增强策略,根据其连续表示, 采用梯度更新的方式预测性能更优的增强策略的连续表示,利用解码器将预测的连续表示解码成离散的增强策略,从而通过少量的增强策略采样对增强策略的搜索空间进行有效建模,将其映射成连续参数空间,采用高效的梯度更新方式进行增强策略搜索,可以提升增强策略的搜索效率,提升增强策略的采样效率,还可以降低资源消耗,即降低预处理过程中所需要的TPU或CPU的资源。
需要说明的是,该测试装置可以是服务器或服务器的内部芯片、也可以是终端设备或终端设备的内部芯片,例如,该终端设备可以是无线通信设备、物联网(Internet of Things,IoT)设备、可穿戴设备或车载设备、移动终端、客户终端设备(Customer Premise Equipment,CPE)等。
图8为本申请实施例的另一种数据增强方法的流程图,如图8所示,本申请实施例涉及数据增强装置和模型应用装置。该数据增强装置可以是如图2A所示的训练设备220或训练设备220的处理器,或者如图2B所示的服务器420或服务器420的处理器。该模型应用装置可以是如图2A所示的执行设备210或执行设备210的处理器,或者如图2B所示的客户设备410或客户设备410的处理器。在上述任一实施例的基础上,本申请实施例的数据增强方法还可以包括:
步骤301、数据增强装置使用至少一组目标增强策略对第一训练数据进行增强处理,获取目标训练数据。
在本实施例之前,数据增强装置可以通过上述任一实施例的方法确定至少一组目标增强策略,在本实施例中,数据增强装置可以将该至少一组目标增强策略应用至训练流程中的预处理过程中,即使用至少一组目标增强策略对第一训练数据进行增强处理,获取目标训练数据。
以训练得到的目标神经网络模型应用于手机相册分类的场景为例,如图7所示,数据集A即为本申请实施例的第一训练数据,使用至少一组目标增强策略对该数据集A进行增强处理,可以获取数据集A’。其中,数据集A可以包括手机所采集的各种图片,如图7中最左侧的一张图片,该图片经过增强处理后,可以如图7最右侧的一张图片。
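以图7的增强处理为例,对一张图像应用一个子策略(两步操作,各自按概率执行)可以示意如下。其中的亮度调整与水平翻转两种操作、概率和强度取值均为假设示例,图像也简化为嵌套列表形式的灰度矩阵:

```python
import random

def adjust_brightness(img, mag):
    """假设的亮度调整:每个像素加上强度值,并截断到255。"""
    return [[min(255, px + mag) for px in row] for row in img]

def flip_horizontal(img, _mag):
    """假设的水平翻转:每行像素逆序。"""
    return [row[::-1] for row in img]

# 一个假设的子策略:两步依次执行的操作,每步为(操作, 概率, 强度)
SUB_POLICY = [(adjust_brightness, 0.8, 30), (flip_horizontal, 0.5, 0)]

def apply_sub_policy(img, sub_policy, rng):
    """按概率依次执行子策略中的两步操作,得到增强后的图像。"""
    for op, prob, mag in sub_policy:
        if rng.random() < prob:
            img = op(img, mag)
    return img

img = [[10, 20], [30, 40]]                            # 2×2 的玩具'图像'
out = apply_sub_policy(img, SUB_POLICY, random.Random(0))
```

对数据集A中的每张图像做这样的处理,即可得到扩充后的数据集A'。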
步骤302、数据增强装置使用目标训练数据训练神经网络模型,获取目标神经网络模型。
使用目标训练数据训练神经网络模型包括但不限于:神经网络模型结构搜索、神经网络模型参数(例如,权重、偏置等)搜索等。
步骤303、数据增强装置向模型应用装置发送目标模型配置信息,该目标模型配置信息用于配置所述目标神经网络模型。
相应的,模型应用装置接收数据增强装置发送的目标模型配置信息,模型应用装置可以根据该神经网络模型配置信息还原出目标神经网络模型,并使用该目标神经网络模型处理相应的数据,例如,使用该目标神经网络模型处理手机相册中的图片,进行相册分类等。
本实施例,通过使用至少一组目标增强策略对第一训练数据进行增强处理,获取目标训练数据,使用目标训练数据对初始神经网络模型进行训练,获取目标神经网络模型,可以提升目标神经网络模型的性能,进而提升使用该目标神经网络模型的模型应用装置的处理性能。
参见图9,图9为本申请提供的数据增强装置900的示意性框图。数据增强装置900包括获取模块901、预测模块902和增强策略确定模块903。
在一个实施例中,数据增强装置900具有方法实施例中训练设备或服务器的功能。例如,数据增强装置900可以执行如图4或图5实施例的方法,或者执行如图8实施例的数据增强装置所执行的方法。此时,数据增强装置900的各单元分别用于执行如下操作和/或处理。
获取模块901,用于获取第一训练数据、至少一个第一向量和每个第一向量对应的性能指标,每个第一向量用于表示一组第一增强策略,该每个第一向量对应的性能指标包括第一神经网络模型的性能指标,该第一神经网络模型是由第二训练数据训练得到的,该第二训练数据为使用该第一增强策略对该第一训练数据进行增强处理后得到的训练数据。
预测模块902,用于根据该至少一个第一向量和该至少一个第一向量对应的性能指标,确定至少一个第二向量,每个第二向量用于表示一组第二增强策略。
增强策略确定模块903,用于根据该至少一个第一向量对应的性能指标和该至少一个第二向量对应的性能指标,确定至少一个目标向量,每个第二向量对应的性能指标包括第二神经网络模型的性能指标,该第二神经网络模型是由第三训练数据训练得到的,该第三训练数据为使用该第二增强策略对该第一训练数据进行增强处理后得到的训练数据。
其中,该至少一个目标向量对应的性能指标高于该至少一个第一向量和该至少一个第二向量中除该至少一个目标向量之外的其他向量对应的性能指标,每个目标向量表示一组目标增强策略,至少一个目标向量表示的至少一组目标增强策略用于对第一训练数据进行增强处理获取目标训练数据,该目标训练数据用于训练得到目标神经网络模型。
在一些实施例中,预测模块902用于:将该至少一个第一向量从离散的参数空间映射至连续的参数空间,获取至少一个第三向量;根据该至少一个第三向量和该至少一个第一向量对应的性能指标,确定至少一个第二向量。
在一些实施例中,预测模块902用于:根据该至少一个第三向量和该至少一个第一向量对应的性能指标,确定该第三向量与该性能指标之间的映射关系;根据该映射关系,确定至少一个第二向量。
在一些实施例中,该预测模块902用于:将该至少一个第三向量和该至少一个第一向量对应的性能指标输入至第三神经网络模型,输出该第三向量与该性能指标之间的映射关系。
在一些实施例中,预测模块902用于:根据该映射关系,确定至少一个第四向量;将该至少一个第四向量从该连续的参数空间映射至该离散的参数空间,获取该至少一个第二向量。
在一些实施例中,预测模块902用于:采用梯度更新的方式,在该映射关系中,确定该至少一个第四向量。
在一些实施例中,预测模块902用于:将该至少一个第四向量分别输入至第四神经网络模型,输出该至少一个第二向量,该第四神经网络模型用于将每个第四向量从连续的参数空间映射至离散的参数空间。
在一些实施例中,获取模块901还用于:判断是否满足预设条件,若不满足预设条件, 则将该至少一个第二向量和该至少一个第一向量作为至少一个第一向量,执行该获取至少一个第一向量和每个第一向量对应的性能指标的步骤。
在一些实施例中,预测模块902用于:将该至少一个第一向量分别输入至第五神经网络模型,输出该至少一个第三向量,该第五神经网络模型用于将每个第一向量从离散的参数空间映射至连续的参数空间。
在一些实施例中,获取模块901用于:在数据增强策略的搜索空间内,随机采样,获取该至少一个第一向量。
在一些实施例中,该装置还包括:收发模块904。该收发模块904用于向测试装置发送神经网络模型配置信息,该神经网络模型配置信息用于配置该第一神经网络模型。收发模块904还用于接收该测试装置发送的该第一神经网络模型的性能指标。
在一些实施例中,该装置还包括:预处理模块905和训练模块906。
预处理模块905,用于使用该至少一组目标增强策略对该第一训练数据进行增强处理,获取目标训练数据。训练模块906用于使用该目标训练数据对初始神经网络模型进行训练,获取该目标神经网络模型。收发模块904还用于发送目标模型配置信息,该目标模型配置信息用于配置该目标神经网络模型。
可选地,数据增强装置900也可以同时具有方法实施例中的其它功能。类似说明可以参考前述方法实施例的描述。为避免重复,这里不再赘述。
可选地,获取模块901、预测模块902、增强策略确定模块903、预处理模块905和训练模块906可以是处理器,收发模块904可以是收发器。收发器包括接收器和发射器,同时具有发送和接收的功能。
可选地,获取模块901、预测模块902、增强策略确定模块903、预处理模块905和训练模块906可以是一个处理装置或多个处理装置,处理装置的功能可以部分或全部通过软件实现。
在一种可能的实现方式中,处理装置的功能可以部分或全部通过软件实现。此时,处理装置可以包括存储器和处理器。其中,存储器用于存储计算机程序,处理器读取并执行存储器中存储的计算机程序,以执行各方法实施例中的步骤。
可选地,在一种可能的实现方式中,处理装置包括处理器。用于存储计算机程序的存储器位于处理装置之外,处理器通过电路/电线与存储器连接,以读取并执行存储器中存储的计算机程序。
在另一个实施例中,数据增强装置900可以为芯片。此时,收发模块904具体可以为通信接口或者收发电路。
参见图10,图10为本申请提供的电子设备1000的示意性结构图。如图10所示,电子设备1000包括处理器1001和收发器1002。
可选地,电子设备1000还包括存储器1003。其中,处理器1001、收发器1002和存储器1003之间可以通过内部连接通路互相通信,传递控制信号和/或数据信号。
其中,存储器1003用于存储计算机程序。处理器1001用于执行存储器1003中存储的计算机程序,从而实现上述装置实施例中数据增强装置900的各功能。
具体地,处理器1001可以用于执行装置实施例(例如,图9)中描述的由获取模块901、预测模块902、增强策略确定模块903、预处理模块905和训练模块906执行的操作和/或处理,而收发器1002用于执行由收发模块904执行的操作和/或处理。
可选地,存储器1003也可以集成在处理器1001中,或者独立于处理器1001。
本实施例的电子设备可以执行上述方法实施例的数据增强方法,其技术原理和技术效果可以参见上述实施例的解释说明,此处不再赘述。
本申请还提供一种计算机可读存储介质,所述计算机可读存储介质上存储有计算机程序,所述计算机程序被计算机执行时,使得计算机执行上述任一方法实施例中的步骤和/或处理。
本申请还提供一种计算机程序产品,所述计算机程序产品包括计算机程序代码,当所述计算机程序代码在计算机上运行时,使得计算机执行上述任一方法实施例中的步骤和/或处理。
本申请还提供一种芯片,所述芯片包括处理器。用于存储计算机程序的存储器独立于芯片而设置,处理器用于执行存储器中存储的计算机程序,以执行任一方法实施例中的步骤和/或处理。
进一步地,所述芯片还可以包括存储器和通信接口。所述通信接口可以是输入/输出接口、管脚或输入/输出电路等。
以上各实施例中提及的处理器可以是一种集成电路芯片,具有信号的处理能力。在实现过程中,上述方法实施例的各步骤可以通过处理器中的硬件的集成逻辑电路或者软件形式的指令完成。处理器可以是通用处理器、数字信号处理器(digital signal processor,DSP)、特定应用集成电路(application-specific integrated circuit,ASIC)、现场可编程门阵列(field programmable gate array,FPGA)或其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。本申请实施例公开的方法的步骤可以直接体现为硬件编码处理器执行完成,或者用编码处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器,闪存、只读存储器,可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器,处理器读取存储器中的信息,结合其硬件完成上述方法的步骤。
上述各实施例中提及的存储器可以是易失性存储器或非易失性存储器,或可包括易失性和非易失性存储器两者。其中,非易失性存储器可以是只读存储器(read-only memory,ROM)、可编程只读存储器(programmable ROM,PROM)、可擦除可编程只读存储器(erasable PROM,EPROM)、电可擦除可编程只读存储器(electrically EPROM,EEPROM)或闪存。易失性存储器可以是随机存取存储器(random access memory,RAM),其用作外部高速缓存。通过示例性但不是限制性说明,许多形式的RAM可用,例如静态随机存取存储器(static RAM,SRAM)、动态随机存取存储器(dynamic RAM,DRAM)、同步动态随机存取存储器(synchronous DRAM,SDRAM)、双倍数据速率同步动态随机存取存储器(double data rate SDRAM,DDR SDRAM)、增强型同步动态随机存取存储器(enhanced SDRAM,ESDRAM)、同步连接动态随机存取存储器(synchlink DRAM,SLDRAM)和直接内存总线随机存取存储器(direct rambus RAM,DR RAM)。应注意,本文描述的系统和方法的存储器旨在包括但不限于这些和任意其它适合类型的存储器。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以 硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。
所述功能如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(read-only memory,ROM)、随机存取存储器(random access memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。

Claims (28)

  1. 一种数据增强方法,其特征在于,包括:
    获取第一训练数据、至少一个第一向量和每个所述第一向量对应的性能指标,每个所述第一向量用于表示一组第一增强策略,所述每个第一向量对应的性能指标包括第一神经网络模型的性能指标,所述第一神经网络模型是由第二训练数据训练得到的,所述第二训练数据为使用所述第一增强策略对所述第一训练数据进行增强处理后得到的训练数据;
    根据所述至少一个第一向量和所述至少一个第一向量对应的性能指标,确定至少一个第二向量,每个所述第二向量用于表示一组第二增强策略;
    根据所述至少一个第一向量对应的性能指标和所述至少一个第二向量对应的性能指标,确定至少一个目标向量,每个第二向量对应的性能指标包括第二神经网络模型的性能指标,所述第二神经网络模型是由第三训练数据训练得到的,所述第三训练数据为使用所述第二增强策略对所述第一训练数据进行增强处理后得到的训练数据;
    其中,所述至少一个目标向量对应的性能指标高于所述至少一个第一向量和所述至少一个第二向量中除所述至少一个目标向量之外的其他向量对应的性能指标,每个所述目标向量表示一组目标增强策略,所述至少一个目标向量表示的至少一组目标增强策略用于对第一训练数据进行增强处理获取目标训练数据,所述目标训练数据用于训练得到目标神经网络模型。
  2. 根据权利要求1所述的方法,其特征在于,所述根据所述至少一个第一向量和所述至少一个第一向量对应的性能指标,确定至少一个第二向量,包括:
    将所述至少一个第一向量从离散的参数空间映射至连续的参数空间,获取至少一个第三向量;
    根据所述至少一个第三向量和所述至少一个第一向量对应的性能指标,确定至少一个第二向量。
  3. 根据权利要求2所述的方法,其特征在于,所述根据所述至少一个第三向量和所述至少一个第一向量对应的性能指标,确定至少一个第二向量,包括:
    根据所述至少一个第三向量和所述至少一个第一向量对应的性能指标,确定所述第三向量与所述至少一个第一向量对应的性能指标之间的映射关系;
    根据所述映射关系,确定至少一个第二向量。
  4. 根据权利要求3所述的方法,其特征在于,所述根据所述至少一个第三向量和所述至少一个第一向量对应的性能指标,确定所述第三向量与所述至少一个第一向量对应的性能指标之间的映射关系,包括:
    将所述至少一个第三向量和所述至少一个第一向量对应的性能指标输入至第三神经网络模型,输出所述第三向量与所述性能指标之间的映射关系。
  5. 根据权利要求3或4所述的方法,其特征在于,所述根据所述映射关系,确定至少一个第二向量,包括:
    根据所述映射关系,确定至少一个第四向量;
    将所述至少一个第四向量从所述连续的参数空间映射至所述离散的参数空间,获取所述至少一个第二向量。
  6. 根据权利要求5所述的方法,其特征在于,所述根据所述映射关系,确定至少一个第四向量,包括:
    采用梯度更新的方式,在所述映射关系中,确定所述至少一个第四向量。
  7. 根据权利要求5或6所述的方法,其特征在于,所述将所述至少一个第四向量从所述连续的参数空间映射至所述离散的参数空间,获取所述至少一个第二向量,包括:
    将所述至少一个第四向量分别输入至第四神经网络模型,输出所述至少一个第二向量,所述第四神经网络模型用于将每个第四向量从连续的参数空间映射至离散的参数空间。
  8. 根据权利要求1至7任一项所述的方法,其特征在于,所述方法还包括:
    判断是否满足预设条件,若不满足预设条件,则将所述至少一个第二向量和所述至少一个第一向量作为至少一个第一向量,执行所述获取至少一个第一向量和每个所述第一向量对应的性能指标的步骤。
  9. 根据权利要求2至7任一项所述的方法,其特征在于,所述将所述至少一个第一向量从离散的参数空间映射至连续的参数空间,获取至少一个第三向量,包括:
    将所述至少一个第一向量分别输入至第五神经网络模型,输出所述至少一个第三向量,所述第五神经网络模型用于将每个第一向量从离散的参数空间映射至连续的参数空间。
  10. 根据权利要求1至7任一项所述的方法,其特征在于,所述获取至少一个第一向量,包括:
    在数据增强策略的搜索空间内,随机采样,获取所述至少一个第一向量。
  11. 根据权利要求1至10任一项所述的方法,其特征在于,所述方法还包括:
    向测试装置发送神经网络模型配置信息,所述神经网络模型配置信息用于配置所述第一神经网络模型;
    接收所述测试装置发送的所述第一神经网络模型的性能指标。
  12. 根据权利要求1至11任一项所述的方法,其特征在于,所述方法还包括:
    使用所述至少一组目标增强策略对所述第一训练数据进行增强处理,获取目标训练数据;
    使用所述目标训练数据对初始神经网络模型进行训练,获取所述目标神经网络模型;
    发送目标模型配置信息,所述目标模型配置信息用于配置所述目标神经网络模型。
  13. 一种数据增强装置,其特征在于,包括:
    获取模块,用于获取第一训练数据、至少一个第一向量和每个所述第一向量对应的性能指标,每个第一向量用于表示一组第一增强策略,所述每个第一向量对应的性能指标包括第一神经网络模型的性能指标,所述第一神经网络模型是由第二训练数据训练得到的,所述第二训练数据为使用所述第一增强策略对所述第一训练数据进行增强处理后得到的训练数据;
    预测模块,用于根据所述至少一个第一向量和所述至少一个第一向量对应的性能指标,确定至少一个第二向量,每个所述第二向量用于表示一组第二增强策略;
    增强策略确定模块,用于根据所述至少一个第一向量对应的性能指标和所述至少一个第二向量对应的性能指标,确定至少一个目标向量,每个第二向量对应的性能指标包括第二神经网络模型的性能指标,所述第二神经网络模型是由第三训练数据训练得到的,所述第三训练数据为使用所述第二增强策略对所述第一训练数据进行增强处理后得到的训练 数据;
    其中,所述至少一个目标向量对应的性能指标高于所述至少一个第一向量和所述至少一个第二向量中除所述至少一个目标向量之外的其他向量对应的性能指标,每个所述目标向量表示一组目标增强策略,所述至少一个目标向量表示的至少一组目标增强策略用于对第一训练数据进行增强处理获取目标训练数据,所述目标训练数据用于训练得到目标神经网络模型。
  14. 根据权利要求13所述的装置,其特征在于,所述预测模块用于:将所述至少一个第一向量从离散的参数空间映射至连续的参数空间,获取至少一个第三向量;根据所述至少一个第三向量和所述至少一个第一向量对应的性能指标,确定至少一个第二向量。
  15. 根据权利要求14所述的装置,其特征在于,所述预测模块用于:根据所述至少一个第三向量和所述至少一个第一向量对应的性能指标,确定所述第三向量与所述至少一个第一向量对应的性能指标之间的映射关系;根据所述映射关系,确定至少一个第二向量。
  16. 根据权利要求15所述的装置,其特征在于,所述预测模块用于:将所述至少一个第三向量和所述至少一个第一向量对应的性能指标输入至第三神经网络模型,输出所述第三向量与所述性能指标之间的映射关系。
  17. 根据权利要求15或16所述的装置,其特征在于,所述预测模块用于:根据所述映射关系,确定至少一个第四向量;将所述至少一个第四向量从所述连续的参数空间映射至所述离散的参数空间,获取所述至少一个第二向量。
  18. 根据权利要求17所述的装置,其特征在于,所述预测模块用于:采用梯度更新的方式,在所述映射关系中,确定所述至少一个第四向量。
  19. 根据权利要求17或18所述的装置,其特征在于,所述预测模块用于:将所述至少一个第四向量分别输入至第四神经网络模型,输出所述至少一个第二向量,所述第四神经网络模型用于将每个第四向量从连续的参数空间映射至离散的参数空间。
  20. 根据权利要求13至19任一项所述的装置,其特征在于,所述获取模块还用于:判断是否满足预设条件,若不满足预设条件,则将所述至少一个第二向量和所述至少一个第一向量作为至少一个第一向量,执行所述获取至少一个第一向量和每个所述第一向量对应的性能指标的步骤。
  21. 根据权利要求14至19任一项所述的装置,其特征在于,所述预测模块用于:将所述至少一个第一向量分别输入至第五神经网络模型,输出所述至少一个第三向量,所述第五神经网络模型用于将每个第一向量从离散的参数空间映射至连续的参数空间。
  22. 根据权利要求13至19任一项所述的装置,其特征在于,所述获取模块用于:在数据增强策略的搜索空间内,随机采样,获取所述至少一个第一向量。
  23. 根据权利要求13至22任一项所述的装置,其特征在于,所述装置还包括:收发模块;
    所述收发模块用于向测试装置发送神经网络模型配置信息,所述神经网络模型配置信息用于配置所述第一神经网络模型;
    所述收发模块还用于接收所述测试装置发送的所述第一神经网络模型的性能指标。
  24. 根据权利要求13至23任一项所述的装置,其特征在于,所述装置还包括:预处理模块和训练模块;
    所述预处理模块,用于使用所述至少一组目标增强策略对所述第一训练数据进行增强处理,获取目标训练数据;
    所述训练模块用于使用所述目标训练数据对初始神经网络模型进行训练,获取所述目标神经网络模型;
    收发模块还用于发送目标模型配置信息,所述目标模型配置信息用于配置所述目标神经网络模型。
  25. 一种电子设备,其特征在于,包括:
    一个或多个处理器;
    存储器,用于存储一个或多个程序;
    当所述一个或多个程序被所述一个或多个处理器执行,使得所述一个或多个处理器实现如权利要求1-12中任一项所述的方法。
  26. 一种计算机可读存储介质,其特征在于,包括计算机程序,所述计算机程序在计算机上被执行时,使得所述计算机执行权利要求1-12中任一项所述的方法。
  27. 一种计算机程序产品,其特征在于,所述计算机程序产品包括指令,在计算机上运行时,使得计算机执行如权利要求1-12中任一项所述的方法。
  28. 一种芯片,其特征在于,包括处理器和存储器,所述存储器用于存储计算机程序,所述处理器用于调用并运行所述存储器中存储的计算机程序,以执行如权利要求1-12中任一项所述的方法。
PCT/CN2020/125338 2020-02-25 2020-10-30 数据增强方法和装置 WO2021169366A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010117866.6A CN113379045B (zh) 2020-02-25 2020-02-25 数据增强方法和装置
CN202010117866.6 2020-02-25

Publications (1)

Publication Number Publication Date
WO2021169366A1 true WO2021169366A1 (zh) 2021-09-02

Family

ID=77490628

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/125338 WO2021169366A1 (zh) 2020-02-25 2020-10-30 数据增强方法和装置

Country Status (2)

Country Link
CN (1) CN113379045B (zh)
WO (1) WO2021169366A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113837279A (zh) * 2021-09-24 2021-12-24 苏州浪潮智能科技有限公司 一种数据增强方法、系统、设备及计算机可读存储介质
CN117132978A (zh) * 2023-10-27 2023-11-28 深圳市敏视睿行智能科技有限公司 一种微生物图像识别系统及方法

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113836330A (zh) * 2021-09-13 2021-12-24 清华大学深圳国际研究生院 基于生成对抗性自动增强网络的图像检索方法及装置

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7389208B1 (en) * 2000-06-30 2008-06-17 Accord Solutions, Inc. System and method for dynamic knowledge construction
CN109639479A (zh) * 2018-12-07 2019-04-16 北京邮电大学 基于生成对抗网络的网络流量数据增强方法及装置
CN109902798A (zh) * 2018-05-31 2019-06-18 华为技术有限公司 深度神经网络的训练方法和装置
CN110222824A (zh) * 2019-06-05 2019-09-10 中国科学院自动化研究所 智能算法模型自主生成及进化方法、系统、装置
CN110782015A (zh) * 2019-10-25 2020-02-11 腾讯科技(深圳)有限公司 神经网络的网络结构优化器的训练方法、装置及存储介质
CN110807109A (zh) * 2019-11-08 2020-02-18 北京金山云网络技术有限公司 数据增强策略的生成方法、数据增强方法和装置

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7593906B2 (en) * 2006-07-31 2009-09-22 Microsoft Corporation Bayesian probability accuracy improvements for web traffic predictions
US20180247218A1 (en) * 2017-02-24 2018-08-30 Accenture Global Solutions Limited Machine learning for preventive assurance and recovery action optimization
JP7017640B2 (ja) * 2018-05-18 2022-02-08 Google LLC Learning data augmentation policies
CN110348509B (zh) * 2019-07-08 2021-12-14 Ruimo Intelligent Technology (Shenzhen) Co., Ltd. Method, apparatus, device, and storage medium for adjusting data augmentation parameters
CN110555526B (zh) * 2019-08-20 2022-07-29 Beijing Megvii Technology Co., Ltd. Neural network model training method, and image recognition method and apparatus
CN110796248A (zh) * 2019-08-27 2020-02-14 Tencent Technology (Shenzhen) Co., Ltd. Data augmentation method, apparatus, device, and storage medium

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113837279A (zh) * 2021-09-24 2021-12-24 Suzhou Inspur Intelligent Technology Co., Ltd. Data augmentation method, system, device, and computer-readable storage medium
CN113837279B (zh) * 2021-09-24 2023-08-08 Suzhou Inspur Intelligent Technology Co., Ltd. Data augmentation method, system, device, and computer-readable storage medium
CN117132978A (zh) * 2023-10-27 2023-11-28 Shenzhen Minshi Ruixing Intelligent Technology Co., Ltd. Microbial image recognition system and method
CN117132978B (zh) * 2023-10-27 2024-02-20 Shenzhen Minshi Ruixing Intelligent Technology Co., Ltd. Microbial image recognition system and method

Also Published As

Publication number Publication date
CN113379045A (zh) 2021-09-10
CN113379045B (zh) 2022-08-09

Similar Documents

Publication Publication Date Title
US20210012198A1 (en) Method for training deep neural network and apparatus
CN111797893B (zh) Neural network training method, image classification system, and related device
WO2021169366A1 (zh) Data augmentation method and apparatus
CN113011282A (zh) Graph data processing method and apparatus, electronic device, and computer storage medium
WO2022156561A1 (zh) Natural language processing method and apparatus
WO2024041479A1 (zh) Data processing method and apparatus
CN113792871A (zh) Neural network training method, target recognition method, apparatus, and electronic device
CN112419326B (zh) Image segmentation data processing method, apparatus, device, and storage medium
CN113095475A (zh) Neural network training method, image processing method, and related device
WO2022111387A1 (zh) Data processing method and related apparatus
WO2024001806A1 (zh) Federated-learning-based data value assessment method and related device
WO2023231753A1 (zh) Neural network training method, data processing method, and device
WO2021036397A1 (zh) Method and apparatus for generating a target neural network model
WO2024083121A1 (zh) Data processing method and apparatus
WO2022100607A1 (zh) Neural network architecture determination method and apparatus
WO2024046144A1 (zh) Video processing method and related device
CN113627421A (zh) Image processing method, model training method, and related device
WO2023185541A1 (zh) Model training method and related device
WO2023197910A1 (zh) User behavior prediction method and related device
CN116431827A (zh) Information processing method, apparatus, storage medium, and computer device
CN114281933A (zh) Text processing method, apparatus, computer device, and storage medium
CN114237861A (zh) Data processing method and device
WO2023231796A1 (zh) Visual task processing method and related device
WO2023051678A1 (zh) Recommendation method and related apparatus
WO2023236900A1 (zh) Item recommendation method and related device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 20921165; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 20921165; Country of ref document: EP; Kind code of ref document: A1)