CN113010762A - Data enhancement method and device, storage medium and electronic equipment - Google Patents

Data enhancement method and device, storage medium and electronic equipment

Info

Publication number
CN113010762A
Authority
CN
China
Prior art keywords
enhancement
target
strategy
determining
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110227674.5A
Other languages
Chinese (zh)
Inventor
赵元
侯峦轩
赫然
沈海峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Didi Infinity Technology and Development Co Ltd
Original Assignee
Beijing Didi Infinity Technology and Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Didi Infinity Technology and Development Co Ltd
Priority to CN202110227674.5A
Publication of CN113010762A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/953: Querying, e.g. by the use of web search engines
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods

Abstract

Embodiments of the invention disclose a data enhancement method, a data enhancement apparatus, a storage medium and an electronic device. Target enhancement strategies are found by searching an enhancement strategy search space that comprises a plurality of enhancement strategies, and the execution order of the target enhancement strategies is determined. For each target enhancement strategy, at least one enhancement operation having a corresponding attribute value is determined from a preset enhancement operation set. The target enhancement strategies are then executed in sequence, according to their execution order and the enhancement operations they include, to perform data enhancement on the original data. Embodiments of the invention can realize automatic data enhancement, reduce the computational cost and improve the processing speed of the data enhancement process, and avoid the noise introduced when data enhancement relies on manually set parameters.

Description

Data enhancement method and device, storage medium and electronic equipment
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a data enhancement method and apparatus, a storage medium, and an electronic device.
Background
At present, the rapid development of deep learning is accelerating the development of artificial intelligence technology, and the multilayer neural networks used in deep learning can extract relevant semantic features from large amounts of training data. However, when training data is insufficient, overfitting can occur, resulting in a model that fails to generalize to unseen examples. Insufficient training data is usually addressed by data enhancement, i.e., obtaining new data by applying transformations to the original data set. Most data enhancement schemes in the prior art are regulated by manually set parameters. Manually designing data enhancement operations and parameters usually requires a large amount of engineering experience for parameter tuning, so the computational cost is high and the processing speed is low. Moreover, improper parameter tuning can introduce noisy data into the training of the neural network, which degrades the performance of the model.
Disclosure of Invention
In view of this, embodiments of the present invention provide a data enhancement method, an apparatus, a storage medium, and an electronic device, which aim to reduce the computation overhead of a data enhancement process and improve the processing speed and the noise immunity.
In a first aspect, an embodiment of the present invention provides a data enhancement method, where the method includes:
determining an enhancement strategy search space, wherein the enhancement strategy search space comprises a plurality of enhancement strategies;
determining a preset number of target enhancement strategies in the enhancement strategy search space according to a preset search strategy;
determining the execution sequence of each target enhancement strategy;
determining at least one enhancement operation corresponding to each target enhancement strategy in a preset enhancement operation set, wherein each enhancement operation has a corresponding attribute value;
and sequentially executing each target enhancement strategy based on the execution sequence and the included enhancement operation with the corresponding attribute value so as to perform data enhancement on the original data.
In a second aspect, an embodiment of the present invention provides a data enhancement apparatus, where the apparatus includes:
a search space determination module, configured to determine an enhancement strategy search space, where the enhancement strategy search space includes a plurality of enhancement strategies;
a strategy determination module, configured to determine a preset number of target enhancement strategies in the enhancement strategy search space according to a preset search strategy;
a sequence determination module, configured to determine the execution sequence of each target enhancement strategy;
an operation determination module, configured to determine, in a preset enhancement operation set, at least one enhancement operation corresponding to each target enhancement strategy, where each enhancement operation has a corresponding attribute value;
and a data enhancement module, configured to sequentially execute each target enhancement strategy based on the execution sequence and the included enhancement operations with corresponding attribute values, so as to perform data enhancement on the original data.
In a third aspect, an embodiment of the present invention provides a computer-readable storage medium for storing computer program instructions, which when executed by a processor implement the method according to the first aspect.
In a fourth aspect, an embodiment of the present invention provides an electronic device, including a memory and a processor, the memory being configured to store one or more computer program instructions, wherein the one or more computer program instructions are executed by the processor to implement the method according to the first aspect.
Embodiments of the invention determine an enhancement strategy search space comprising a plurality of enhancement strategies, search it for target enhancement strategies, and determine the execution order of the target enhancement strategies obtained by the search. For each target enhancement strategy, at least one enhancement operation having a corresponding attribute value is determined from a preset enhancement operation set. The target enhancement strategies are then executed in sequence, according to their execution order and the enhancement operations they include, to perform data enhancement on the original data. Embodiments of the invention can realize automatic data enhancement, reduce the computational cost of the data enhancement process, improve the processing speed, and avoid introducing noise during data enhancement.
Drawings
The above and other objects, features and advantages of the present invention will become more apparent from the following description of the embodiments of the present invention with reference to the accompanying drawings, in which:
FIG. 1 is a flow chart of a data enhancement method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a process for determining a target enhancement policy according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating a data enhancement process according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a target enhancement strategy according to an embodiment of the present invention;
FIG. 5 is a diagram illustrating an application process of a data enhancement method according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a data enhancement apparatus according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of an electronic device according to an embodiment of the invention.
Detailed Description
The present invention will be described below based on examples, but the present invention is not limited to only these examples. In the following detailed description of the present invention, certain specific details are set forth. It will be apparent to one skilled in the art that the present invention may be practiced without these specific details. Well-known methods, procedures, components and circuits have not been described in detail so as not to obscure the present invention.
Further, those of ordinary skill in the art will appreciate that the drawings provided herein are for illustrative purposes and are not necessarily drawn to scale.
Unless the context clearly requires otherwise, throughout the description, the words "comprise", "comprising", and the like are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense; that is, what is meant is "including, but not limited to".
In the description of the present invention, it is to be understood that the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. In addition, in the description of the present invention, "a plurality" means two or more unless otherwise specified.
The data enhancement method of the embodiment of the invention can be realized by a server for deploying and training a deep learning model. The server determines original data used for training the deep learning model, and the original data is subjected to data enhancement based on the data enhancement method, so that the result of each data enhancement is used as a sample of the deep learning model to train the deep learning model. Further, after the data enhancement result is determined each time, the server optimizes parameters applied in the data enhancement process, so as to perform the next data enhancement according to the optimized parameters. The original data can be sent to the server through the terminal equipment. The terminal device may be a general data processing terminal capable of running a computer program and having a communication function, such as a smart phone, a tablet computer, or a notebook computer. The server may be a single server or a cluster of servers configured in a distributed manner.
Fig. 1 is a flowchart of a data enhancement method according to an embodiment of the present invention. As shown in fig. 1, the data enhancement method includes the steps of:
Step S100, determining an enhancement strategy search space.
Specifically, the enhancement strategy search space is determined by the server based on the target task, and comprises a plurality of enhancement strategies. An enhancement strategy is used for data enhancement of the original data. The enhancement strategy search space may be determined, for example, by the server retrieving a pre-stored set of enhancement strategies from a local storage space or a connected remote storage. The server selects the corresponding enhancement strategy set as the enhancement strategy search space according to the format of the original data that needs data enhancement. Optionally, each enhancement strategy may further include one or more enhancement operations, each of which has a corresponding attribute value. When data enhancement is performed on the original data through an enhancement strategy, the original data is processed by each enhancement operation in turn, based on its corresponding attribute value.
For example, when the original data is image data, the enhancement strategy search space includes a plurality of image enhancement strategies for performing data enhancement on the image. Each image enhancement strategy comprises a plurality of image enhancement operations and corresponding attribute values. The image enhancement operation may be, for example, an operation for image processing such as cropping, rotating, adjusting pixels, adjusting contrast, and the like.
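The relationship between a search space, its enhancement strategies, and their operations can be pictured with a small data structure. The following is only an illustrative sketch: the class names, field names, and example values are hypothetical and do not appear in the patent text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EnhancementOperation:
    """One enhancement operation with its attribute values (hypothetical layout)."""
    name: str           # e.g. "rotate" or "contrast"
    probability: float  # attribute probability value, in [0, 1]
    magnitude: float    # amplitude value, i.e. processing intensity


@dataclass
class EnhancementStrategy:
    """One enhancement strategy: a list of operations plus a preference parameter."""
    operations: List[EnhancementOperation] = field(default_factory=list)
    preference: float = 0.0  # used later when searching for target strategies


# A toy image search space with two strategies (all values are made up).
search_space = [
    EnhancementStrategy([
        EnhancementOperation("rotate", 0.5, 3.0),
        EnhancementOperation("contrast", 0.8, 6.0),
    ]),
    EnhancementStrategy([EnhancementOperation("crop", 0.3, 2.0)]),
]
```

Keeping the preference parameter on the strategy and the probability and magnitude on each operation mirrors how the description later uses them in separate steps.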
Step S200, determining a preset number of target enhancement strategies in the enhancement strategy search space according to a preset search strategy.
Specifically, after determining the enhancement strategy search space, the server searches it according to a preset search strategy to determine at least one enhancement strategy suitable for data enhancement as a target enhancement strategy. In one implementation of the embodiment of the present invention, the preset search strategy further specifies a predetermined number, that is, a predetermined number of enhancement strategies are selected from the enhancement strategy search space as target enhancement strategies. For example, when the predetermined number is 3, the server selects 3 target enhancement strategies from the enhancement strategy search space according to the preset search strategy.
In the embodiment of the present invention, the process of determining the target enhancement policy may include the following steps:
Step S210, determining a preference parameter and a first perturbation parameter corresponding to each of the enhancement strategies.
Specifically, each enhancement strategy in the enhancement strategy search space has a corresponding preference parameter. The preference parameters may be predetermined by the server. For example, when the server needs to perform the data enhancement process multiple times, the preference parameter of each enhancement strategy in the current round is obtained by optimization based on the result of the previous round of data enhancement. In addition, during the current data enhancement process, a first perturbation parameter is randomly generated for each enhancement strategy. Optionally, the first perturbation parameter is a random value between 0 and 1, and is used to add perturbation to the process of determining the target enhancement strategies.
Step S220, determining a first probability value corresponding to each enhancement strategy according to the preference parameter and the first disturbance parameter.
Specifically, after the preference parameter and the first perturbation parameter corresponding to each enhancement strategy are determined, they are input into a Gumbel-Softmax function, so that the first probability value corresponding to each enhancement strategy is determined according to a preset distribution rule. Within the Gumbel-Softmax function, a first intermediate value is determined from the preference parameters and first perturbation parameters of all enhancement strategies in the enhancement strategy search space. For the enhancement strategy whose first probability value is currently being determined, a second intermediate value is determined from its own preference parameter and first perturbation parameter. The first probability value of each enhancement strategy is the ratio of its second intermediate value to the first intermediate value.
In the embodiment of the present invention, the distribution rule is as follows:
$$c_s = \frac{\exp\left((\alpha_s + g_s)/\tau\right)}{\sum_{s' \in S} \exp\left((\alpha_{s'} + g_{s'})/\tau\right)}$$
where $c_s$ is the first probability value, $s$ is the current enhancement strategy, $S$ is the enhancement strategy search space, $\alpha$ is the preference parameter, $\tau$ is the temperature of the Gumbel-Softmax function, $g = -\log(-\log(u))$, and $u$ is the first perturbation parameter. Therefore, in the above distribution rule, $\sum_{s' \in S} \exp((\alpha_{s'} + g_{s'})/\tau)$ is the first intermediate value, determined from the preference parameters and first perturbation parameters of all enhancement strategies in the enhancement strategy search space, and $\exp((\alpha_s + g_s)/\tau)$ is the second intermediate value, determined from the preference parameter and first perturbation parameter of the current enhancement strategy. The ratio of the second intermediate value to the first intermediate value is the first probability value corresponding to the current enhancement strategy.
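As a rough illustration of the ratio described above, the sketch below computes one first probability value per strategy from a list of preference parameters. It assumes the standard Gumbel noise form g = -log(-log(u)) with u drawn uniformly from (0, 1); the function name, the clamping constant, and the example preference values are hypothetical.

```python
import math
import random


def gumbel_softmax_probs(preferences, tau=1.0, rng=random):
    """Return one first probability value per enhancement strategy."""
    eps = 1e-12
    logits = []
    for alpha in preferences:
        # First perturbation parameter u, clamped strictly inside (0, 1).
        u = min(max(rng.random(), eps), 1.0 - eps)
        g = -math.log(-math.log(u))  # assumed standard Gumbel noise
        logits.append((alpha + g) / tau)
    # Subtracting the maximum rescales numerator and denominator by the
    # same factor, so the ratio is unchanged but overflow is avoided.
    top = max(logits)
    second_values = [math.exp(z - top) for z in logits]  # per-strategy terms
    first_value = sum(second_values)                     # sum over the space
    return [v / first_value for v in second_values]


probs = gumbel_softmax_probs([0.2, -0.1, 0.7], tau=0.5, rng=random.Random(0))
```

A lower temperature pushes the probabilities toward a one-hot vector, while a higher one flattens them; the returned values always sum to 1.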
Step S230, determining a preset number of target enhancement strategies according to the corresponding first probability values.
Specifically, the search strategy specifies a predetermined number. After the first probability value corresponding to each enhancement strategy is calculated in step S220, the enhancement strategies are sorted by their first probability values, and the predetermined number of enhancement strategies with the largest first probability values are selected from the enhancement strategy search space as the target enhancement strategies. As an example, suppose the enhancement strategy search space includes enhancement strategy 1, enhancement strategy 2, enhancement strategy 3, enhancement strategy 4, and enhancement strategy 5, whose first probability values are 0.54, 0.39, 0.81, 0.63, and 0.17, respectively. When the predetermined number specified by the search strategy is 3, the server sorts the enhancement strategies by first probability value from large to small, obtaining enhancement strategies 3, 4, 1, 2 and 5, so that the 3 enhancement strategies with the largest first probability values (enhancement strategy 3, enhancement strategy 4, and enhancement strategy 1) are determined as the target enhancement strategies.
Furthermore, the search policy in the embodiment of the present invention may also set a search condition, that is, an enhancement policy that satisfies the search condition in the enhancement policy search space is used as a target enhancement policy. For example, a probability value threshold may be set, and an enhancement policy in the enhancement policy search space, in which a corresponding first probability value is greater than the probability value threshold, may be used as a target enhancement policy.
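The two selection rules described above (a fixed number of strategies, or a probability threshold) can be sketched as follows, using the first probability values from the worked example. The function names and the 0.5 threshold are hypothetical choices for illustration.

```python
def select_top_k(probabilities, k):
    """Return 1-based strategy numbers of the k largest values, largest first."""
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: probabilities[i], reverse=True)
    return [i + 1 for i in ranked[:k]]


def select_above_threshold(probabilities, threshold):
    """Return 1-based strategy numbers whose value exceeds the threshold."""
    return [i + 1 for i, p in enumerate(probabilities) if p > threshold]


first_probs = [0.54, 0.39, 0.81, 0.63, 0.17]
targets = select_top_k(first_probs, 3)             # [3, 4, 1]
above = select_above_threshold(first_probs, 0.5)   # [1, 3, 4]
```

The top-k result is already sorted from largest to smallest probability, whereas the threshold variant returns strategies in their original order.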
Fig. 2 is a schematic diagram of a process of determining a target enhancement policy according to an embodiment of the present invention. As shown in fig. 2, the enhancement strategy search space 20 includes K enhancement strategies 21. The server searches the enhancement strategy search space 20 according to a preset search strategy to determine a predetermined number k of enhancement strategies 21 as target enhancement strategies 22.
Step S300, determining the execution sequence of each target enhancement strategy.
Specifically, in order to improve the efficiency of each data enhancement process, when the server determines a plurality of target enhancement policies, it is necessary to further determine the execution order of each of the target enhancement policies. Optionally, the execution order may be determined by sorting the corresponding first probability values from large to small, that is, sorting the target enhancement policies from large to small according to the corresponding first probability values, and using the sorting order as the execution order of the target enhancement policies. Therefore, when the enhancement strategies in the enhancement strategy search space are sorted from large to small based on the corresponding first probability values, and a plurality of enhancement strategies with the maximum corresponding probability values are determined as the target enhancement strategies, the sorting order of the determined target enhancement strategies can be directly used as the execution order of the target enhancement strategies.
Continuing the example in which the enhancement strategy search space includes enhancement strategy 1, enhancement strategy 2, enhancement strategy 3, enhancement strategy 4 and enhancement strategy 5, with first probability values of 0.54, 0.39, 0.81, 0.63 and 0.17, respectively: when the predetermined number specified by the search strategy is 3, the server sorts the enhancement strategies by first probability value from large to small, obtaining enhancement strategies 3, 4, 1, 2 and 5, so that enhancement strategy 3, enhancement strategy 4, and enhancement strategy 1 are determined as the target enhancement strategies. The execution sequence of the target enhancement strategies is then enhancement strategy 3, enhancement strategy 4 and enhancement strategy 1, in that order.
Step S400, determining at least one enhancement operation corresponding to each of the target enhancement policies in a preset enhancement operation set.
Specifically, the server stores in advance an enhancement operation set including a plurality of enhancement operations, each of which has a corresponding attribute value. The enhancement operation is used for processing original data, and the attribute value is used for representing attributes such as execution probability, operation degree and the like of the corresponding enhancement operation. Optionally, the set of enhanced operations corresponds to a raw data format to be processed. For example, when the original data is text data, the enhancement operation set includes at least one text enhancement operation for performing data enhancement on the text data. When the original data is audio data, the enhancement operation set comprises at least one audio enhancement operation for performing data enhancement on the audio data. When the original data is picture data, the enhancement operation set comprises at least one image data enhancement operation for performing data enhancement on the image data.
Taking image data as the original data, for example, the enhancement operation set may include image processing operations such as horizontal cropping, vertical cropping, rotating, equalizing, inverting the pixel values of an image, adjusting the contrast of an image, adjusting the saturation of an image, and adjusting the brightness of an image. The attribute values may include an attribute probability value and an amplitude value for each enhancement operation. The attribute probability value represents the probability that the corresponding enhancement operation is applied to the original data, and the amplitude value represents the processing intensity of the corresponding enhancement operation when it is applied. Each enhancement operation adjusts a different parameter of the original data, for example the angle of the image when rotating it, or the pixel values of the image when inverting them. The amplitude value of each enhancement operation can therefore be normalized to a value between 0 and 10, which is used only to represent the processing intensity of the corresponding enhancement operation.
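One way to use a normalized amplitude value is to map it onto an operation-specific parameter range. The description does not say how this mapping is done, so the linear mapping, the function name, and the parameter ranges below are all assumptions made purely for illustration.

```python
def scale_magnitude(m, low, high, max_level=10.0):
    """Linearly map a normalized amplitude m in [0, max_level] onto [low, high]."""
    if not 0.0 <= m <= max_level:
        raise ValueError("amplitude value out of range")
    return low + (high - low) * (m / max_level)


# Hypothetical per-operation parameter ranges (not taken from the patent).
RANGES = {
    "rotate_degrees": (0.0, 30.0),
    "contrast_factor": (1.0, 1.9),
}

angle = scale_magnitude(5.0, *RANGES["rotate_degrees"])  # 15.0 degrees
```

With this convention, the same 0 to 10 scale can drive operations whose natural units differ, such as degrees for rotation and a multiplicative factor for contrast.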
In this embodiment of the present invention, the determining the enhancement operation corresponding to each of the target enhancement policies includes the following steps:
Step S410, determining the number of enhancement operations corresponding to each target enhancement strategy.
Specifically, each target enhancement strategy has at least one corresponding enhancement operation. In the embodiment of the present invention, different target enhancement strategies may include the same number of enhancement operations or different numbers of enhancement operations. After determining the target enhancement strategies, the server determines the number of enhancement operations corresponding to each target enhancement strategy, so that the corresponding number of enhancement operations can then be selected from the enhancement operation set.
Step S420, selecting a number of enhancement operations from the enhancement operation set according to a predetermined filtering rule, so as to determine the enhancement operation corresponding to each of the target enhancement policies.
Specifically, the server sets a screening rule in advance, and selects, for each of the target enhancement policies, a corresponding number of enhancement operations from the enhancement operation set according to the screening rule. In the embodiment of the present invention, the enhancement operations corresponding to the target enhancement policies may be sequentially selected according to the execution order determined in step S300 and the filtering rule. The screening rule may be configured to determine a second probability value corresponding to each of the enhancement operations, and sequentially select a plurality of enhancement operations with the highest corresponding second probability values as corresponding enhancement operations.
Therefore, the process of determining the enhancement operations corresponding to each of the target enhancement strategies includes the following steps:
Step S421, sequentially determining each target enhancement strategy as the current target enhancement strategy according to the execution sequence.
Specifically, the server takes each target enhancement strategy in turn as the current target enhancement strategy, following the execution sequence determined in step S300, and selects the corresponding number of enhancement operations for it from the enhancement operation set. The next target enhancement strategy is then taken as the current target enhancement strategy, and its corresponding number of enhancement operations is selected from the enhancement operations in the set that have not yet been selected. This continues until the enhancement operations corresponding to all target enhancement strategies have been determined.
Step S422, determining a second perturbation parameter corresponding to each enhancement operation.
Specifically, when determining the second probability value corresponding to each enhancement operation, the server first determines the second perturbation parameter corresponding to that enhancement operation. The second perturbation parameter can be a random value between 0 and 1, and is used to add perturbation to the determination process. It is determined by randomly generating a value between 0 and 1 for each enhancement operation.
Step S423, determining a second probability value corresponding to each of the enhancement operations according to the corresponding attribute value and the second perturbation parameter.
Specifically, the second probability value may be determined by calculating a Bernoulli distribution result from the attribute probability value in the attribute values of each enhancement operation and the corresponding second perturbation parameter. The formula for calculating the second probability value is as follows:
Figure BDA0002957145370000091
where σ is the logistic (sigmoid) function, b is the second probability value, β is the attribute probability value, u is the second perturbation parameter, and λ is a predetermined constant. Therefore, for each enhancement operation, the determined attribute probability value and second perturbation parameter are input into the above formula to determine the corresponding second probability value.
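The formula image itself is not reproduced in the text. One form that is consistent with the symbols listed (σ, β, u, λ) is the relaxed Bernoulli, or binary Concrete, relaxation, b = σ((log(β/(1-β)) + log(u/(1-u)))/λ). The sketch below implements that form as an assumption, not as the confirmed formula of the patent; the function names and clamping constant are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def relaxed_bernoulli(beta, u, lam):
    """Assumed relaxed Bernoulli form for the second probability value b."""
    eps = 1e-12
    # Clamp both inputs strictly inside (0, 1) so the logarithms are defined.
    beta = min(max(beta, eps), 1.0 - eps)
    u = min(max(u, eps), 1.0 - eps)
    logit = math.log(beta / (1.0 - beta)) + math.log(u / (1.0 - u))
    return sigmoid(logit / lam)


b_neutral = relaxed_bernoulli(0.5, 0.5, 1.0)  # noise and prior both neutral
b_likely = relaxed_bernoulli(0.9, 0.5, 1.0)   # higher attribute probability
```

Under this assumed form, a smaller λ pushes b toward 0 or 1, while a larger λ keeps it closer to 0.5.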
Step S424, selecting, according to the corresponding second probability values, the determined number of enhancement operations as the enhancement operations corresponding to the current target enhancement strategy.
Specifically, after the second probability value corresponding to each enhancement operation is determined, the enhancement operations may be ranked by second probability value from large to small, and the corresponding number of enhancement operations with the largest second probability values are selected as the enhancement operations of the current target enhancement strategy. After selecting the enhancement operations for the current target enhancement strategy, the server deletes them from the ranking, so that the enhancement operations for the next target enhancement strategy in the execution sequence are determined from the remaining ranking.
As an example, suppose the enhancement operation set includes enhancement operation 1, enhancement operation 2, enhancement operation 3, enhancement operation 4, enhancement operation 5, and enhancement operation 6, whose second probability values are 0.54, 0.39, 0.81, 0.63, 0.17, and 0.69, respectively. The server sorts the enhancement operations by second probability value from large to small, obtaining enhancement operations 3, 6, 4, 1, 2 and 5. Suppose further that each target enhancement strategy corresponds to 2 enhancement operations and that the execution sequence is target enhancement strategy 2, target enhancement strategy 3 and target enhancement strategy 1. According to the sorting result, the server determines that target enhancement strategy 2 includes enhancement operation 3 and enhancement operation 6, target enhancement strategy 3 includes enhancement operation 4 and enhancement operation 1, and target enhancement strategy 1 includes enhancement operation 2 and enhancement operation 5.
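The worked example can be reproduced with a short sketch that ranks the operations once and then hands them out in execution order. For simplicity it assumes every target enhancement strategy takes the same number of operations, although the description also allows different numbers; the function name is hypothetical.

```python
def assign_operations(second_probs, execution_order, ops_per_strategy):
    """Assign ranked operations (1-based numbers) to strategies in execution order."""
    ranked = sorted(range(len(second_probs)),
                    key=lambda i: second_probs[i], reverse=True)
    remaining = [i + 1 for i in ranked]
    assignment = {}
    for strategy in execution_order:
        # Take the highest-ranked operations that are still unselected,
        # then remove them from the ranking.
        assignment[strategy] = remaining[:ops_per_strategy]
        remaining = remaining[ops_per_strategy:]
    return assignment


second_probs = [0.54, 0.39, 0.81, 0.63, 0.17, 0.69]
assignment = assign_operations(second_probs, [2, 3, 1], 2)
# {2: [3, 6], 3: [4, 1], 1: [2, 5]}
```

Because selected operations are removed from the ranking, no operation is assigned to more than one target enhancement strategy.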
Step S500, sequentially executing each of the target enhancement policies based on the execution sequence and the included enhancement operations with corresponding attribute values, so as to perform data enhancement on the original data.
Specifically, after determining a plurality of sequentially executed target enhancement policies for processing the original data and at least one enhancement operation included in each of the target enhancement policies, the server sequentially executes each of the target enhancement policies according to the execution order, so as to perform data enhancement on the original data based on the enhancement operation included in each of the target enhancement policies and the corresponding attribute value. That is, after determining the original data, the server sequentially executes each of the target enhancement policies on the original data based on the target enhancement policy execution order determined in step S300 to perform data enhancement. The method for executing the target enhancement strategy on the original data is to process the original data according to at least one enhancement operation and a corresponding attribute value included in the target enhancement strategy so as to determine a corresponding target processing result.
When the original data is processed by an enhancement operation, the attribute probability value and the amplitude value in the attribute value corresponding to the current enhancement operation are determined first. The server determines a first processing result according to the attribute probability value, the amplitude value, and the original data of the current enhancement operation, where the first processing result represents the possibility that the original data is enhanced by the enhancement operation and the amplitude of that enhancement. The server determines a second processing result according to the attribute probability value and the original data, where the second processing result represents the possibility that the original data is not processed by the enhancement operation. Finally, the server calculates the sum of the first processing result and the second processing result corresponding to each enhancement operation to determine the corresponding target processing result. The formula for determining the target processing result is as follows:
s(x)=bO(x;m)+(1-b)x
wherein x is the original data, s(x) is the data processed by the current enhancement operation, b is the attribute probability value, m is the amplitude value, and O(x; m) is the data value obtained by applying the current enhancement operation to x with amplitude m. Therefore, the first processing result bO(x; m) is the product of the probability that the data is enhanced by the current enhancement operation and the data value processed by the current operation. The second processing result (1-b)x is the product of the probability that the data is not enhanced by the current enhancement operation and the original data. The sum of the first processing result and the second processing result corresponding to the current enhancement operation gives the target processing result s(x). In this embodiment of the present invention, the server executes the target enhancement strategies in sequence and uses the above formula to determine, one after another, the target processing result produced by each enhancement operation in each target enhancement strategy. The target processing result of each step is used as the input data of the next enhancement operation, and the output of the last step is the final data-enhanced result.
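The chained use of the formula can be sketched as follows. This is a minimal illustration under stated assumptions: the data is a single scalar (real data would typically be an image or tensor), and the operations `shift` and `scale` are hypothetical stand-ins for O(x; m).

```python
from typing import Callable, List, Tuple

Operation = Callable[[float, float], float]  # O(x; m)
Triple = Tuple[Operation, float, float]      # (operation, b, m)


def apply_operation(x: float, op: Operation, b: float, m: float) -> float:
    """s(x) = b * O(x; m) + (1 - b) * x."""
    first = b * op(x, m)      # first processing result
    second = (1.0 - b) * x    # second processing result
    return first + second     # target processing result


def run_policies(x: float, policies: List[List[Triple]]) -> float:
    """Execute the strategies in order; each result feeds the next operation."""
    for policy in policies:
        for op, b, m in policy:
            x = apply_operation(x, op, b, m)
    return x


def shift(x: float, m: float) -> float:   # hypothetical operation
    return x + m


def scale(x: float, m: float) -> float:   # hypothetical operation
    return x * m


policies = [[(shift, 0.5, 2.0)], [(scale, 0.25, 3.0)]]
result = run_policies(1.0, policies)
print(result)  # 3.0
```

With x = 1.0, the first operation gives 0.5·(1+2) + 0.5·1 = 2.0, and the second gives 0.25·(2·3) + 0.75·2 = 3.0.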
Fig. 3 is a schematic diagram of a data enhancement process according to an embodiment of the present invention. As shown in fig. 3, after determining the original data 30, the server searches for k target enhancement strategies 31 in the enhancement strategy search space, and performs data enhancement on the original data 30 through each target enhancement strategy 31 in turn, according to the execution order, to obtain the enhanced data 32. Each of the target enhancement strategies 31 further includes at least one enhancement operation with a corresponding attribute value, which is used to process the original data.
Fig. 4 is a schematic diagram of a target enhancement strategy according to an embodiment of the present invention. As shown in fig. 4, the target enhancement strategy includes at least one enhancement operation 40, each of the enhancement operations 40 having a corresponding attribute value 41. For example, when the attribute values 41 include attribute probability values and amplitude values, the enhancement operation 40 and its corresponding attribute values 41 may be stored in the format of a triple (enhancement operation, attribute probability value, amplitude value).
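One possible way to hold such triples is sketched below. The class name `EnhancementOp`, the field names, and the operation names "rotate" and "brightness" are illustrative assumptions; the embodiment only specifies the (enhancement operation, attribute probability value, amplitude value) layout.

```python
from typing import List, NamedTuple


class EnhancementOp(NamedTuple):
    """Triple (enhancement operation, attribute probability value, amplitude value)."""
    name: str           # identifier of the enhancement operation
    probability: float  # attribute probability value b
    magnitude: float    # amplitude value m


# A target enhancement strategy is a list of such triples.
policy: List[EnhancementOp] = [
    EnhancementOp("rotate", 0.8, 15.0),
    EnhancementOp("brightness", 0.3, 0.2),
]

for name, b, m in policy:  # a NamedTuple unpacks like a plain triple
    print(name, b, m)
```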
Fig. 5 is a schematic diagram of an application process of the data enhancement method according to the embodiment of the present invention. As shown in fig. 5, the data enhancement method according to the embodiment of the present invention may be applied to a model training scenario.
Specifically, during model training, the server determines the original data 50 used for training the model and performs data enhancement on the original data 50 using the data enhancement method 51 according to the embodiment of the present invention to obtain the enhanced data 52. The server performs model training, that is, trains the deep neural network in the model, according to the enhanced data 52. After the current training process is completed, the server optimizes, based on the loss in the current training process, the parameters 54 used in the data enhancement method 51, where the parameters include the preference parameters, the attribute values corresponding to each enhancement operation, and the like. The optimized parameters are then used as the parameters of the data enhancement method 51 in the next round. The server determines the enhanced data 52 again using the data enhancement method 51 with the updated parameters and repeats the training process of the neural network until model training is completed.
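The alternation between model training and parameter optimization can be illustrated with a toy loop. Everything here is an assumption made for illustration: a one-parameter model y ≈ w·s(x) replaces the deep neural network, the operation is a hypothetical O(x; m) = x + m, and finite-difference gradient descent replaces whatever optimizer an actual implementation would use.

```python
def augment(x, b, m):
    # s(x) = b*O(x; m) + (1 - b)*x with a hypothetical O(x; m) = x + m
    return b * (x + m) + (1.0 - b) * x


def loss(w, b, m, data):
    # Mean squared error of the toy model y ~ w * s(x)
    return sum((w * augment(x, b, m) - y) ** 2 for x, y in data) / len(data)


def grad(f, value, eps=1e-6):
    # Central finite-difference derivative of f at value
    return (f(value + eps) - f(value - eps)) / (2.0 * eps)


def train(data, epochs=200, lr=0.01):
    w, b, m = 0.0, 0.5, 1.0  # model weight, attribute probability, amplitude
    history = [loss(w, b, m, data)]
    for _ in range(epochs):
        # 1) Train the model on the enhanced data.
        w -= lr * grad(lambda v: loss(v, b, m, data), w)
        # 2) Optimize the enhancement parameters based on the loss.
        new_b = b - lr * grad(lambda v: loss(w, v, m, data), b)
        new_m = m - lr * grad(lambda v: loss(w, b, v, data), m)
        b = min(1.0, max(0.0, new_b))  # keep the probability in [0, 1]
        m = new_m
        history.append(loss(w, b, m, data))
    return w, b, m, history


data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w, b, m, history = train(data)
print(history[0], history[-1])
```

The point of the sketch is only the control flow: enhance, train, update the enhancement parameters from the loss, and repeat with the updated parameters.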
The data enhancement method provided by the embodiment of the present invention can automatically search for a preset number of target enhancement strategies in the enhancement strategy search space, and processes the original data after ordering the target enhancement strategies and determining the content of each. The method thus enhances the original data automatically. When the data enhancement process is performed multiple times, the parameters required for the current data enhancement can be optimized after the previous data enhancement, so the process can run without manually setting the parameters. Therefore, the computational overhead of the data enhancement process is reduced, the processing speed is improved, and the introduction of noise during data enhancement is avoided.
Fig. 6 is a schematic diagram of a data enhancement apparatus according to an embodiment of the present invention. As shown in fig. 6, the apparatus includes a search space determining module 60, a policy determining module 61, an order determining module 62, an operation determining module 63, and a data enhancement module 64.
In particular, the search space determination module 60 is configured to determine an enhancement strategy search space, which includes a plurality of enhancement strategies. The strategy determining module 61 is configured to determine a preset number of target enhancement strategies in the enhancement strategy search space according to a preset search strategy. The order determination module 62 is configured to determine an execution order of each of the target enhancement policies. The operation determining module 63 is configured to determine, in a preset enhancement operation set, at least one enhancement operation corresponding to each target enhancement policy, where each enhancement operation has a corresponding attribute value. The data enhancement module 64 is configured to sequentially execute each of the target enhancement policies based on the execution order and the included enhancement operations with corresponding attribute values to perform data enhancement on the original data.
Further, the policy determination module includes:
the parameter determining submodule is used for determining a preference parameter and a first disturbance parameter corresponding to each enhancement strategy;
the first probability determination submodule is used for determining a first probability value corresponding to each enhancement strategy according to the preference parameter and the first disturbance parameter;
and the strategy determining submodule is used for determining a preset number of target enhancement strategies according to the corresponding first probability value.
Further, the first probability determination submodule includes:
a first intermediate value determining unit, configured to determine a first intermediate value according to the preference parameters and the first perturbation parameters corresponding to all enhancement policies in the enhancement policy search space;
a second intermediate value determining unit, configured to determine, for each of the enhancement policies, a second intermediate value according to the corresponding preference parameter and the first perturbation parameter;
and the first probability determining unit is used for determining a first probability value corresponding to each enhancement strategy according to the ratio of the corresponding second intermediate value to the first intermediate value.
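The ratio described by these units has the shape of a softmax. The sketch below assumes an exponential form for the intermediate values and Gumbel noise for the first disturbance parameter; both are assumptions for illustration, since the text here only specifies that the first probability value is the ratio of the second intermediate value to the first intermediate value.

```python
import math
import random


def gumbel_noise(rng: random.Random) -> float:
    """Sample a Gumbel(0, 1) perturbation (assumed form of the disturbance)."""
    u = rng.random() or 1e-12  # guard against log(0)
    return -math.log(-math.log(u))


def first_probabilities(preferences, perturbations):
    # Second intermediate value: one term per enhancement strategy.
    second = [math.exp(a + g) for a, g in zip(preferences, perturbations)]
    # First intermediate value: aggregated over all strategies in the space.
    first = sum(second)
    # First probability value: ratio of the two intermediate values.
    return [s / first for s in second]


rng = random.Random(0)
preferences = [0.0, math.log(3.0), 0.5]
noise = [gumbel_noise(rng) for _ in preferences]
probs = first_probabilities(preferences, noise)
print(probs, sum(probs))
```

With zero perturbation and preferences `[0.0, log 3]`, the result is `[0.25, 0.75]`, which shows how a larger preference parameter yields a larger first probability value.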
Further, the policy determination sub-module includes:
a first ordering unit for ordering each of the enhancement policies according to a corresponding first probability value;
a policy determination unit for determining a predetermined number of enhancement policies with the largest first probability value as target enhancement policies.
Further, the order determination module includes:
the second sequencing submodule is used for sequencing the target enhancement strategies from large to small according to the corresponding first probability value;
and the sequence determining submodule is used for taking the sequencing sequence as the execution sequence of each target enhancement strategy.
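Selecting the target strategies and fixing their execution order both reduce to one descending sort. In this sketch the function name and the first probability values are hypothetical.

```python
def select_target_policies(first_probs, k):
    """Return the k strategies with the largest first probability value,
    in descending order; that order doubles as the execution order."""
    ranked = sorted(first_probs, key=first_probs.get, reverse=True)
    return ranked[:k]


# Hypothetical first probability values for four enhancement strategies.
first_probs = {1: 0.20, 2: 0.40, 3: 0.25, 4: 0.15}
execution_order = select_target_policies(first_probs, k=3)
print(execution_order)  # [2, 3, 1]
```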
Further, the operation determination module includes:
the quantity determining submodule is used for determining the quantity of the enhancement operation corresponding to each target enhancement strategy;
and the operation determining submodule is used for selecting, from the enhancement operation set and according to a preset screening rule, as many enhancement operations as the determined number of enhancement operations, so as to determine the enhancement operation corresponding to each target enhancement strategy.
Further, the operation determination submodule includes:
a current strategy determining unit, configured to sequentially determine, according to the execution order, a target enhancement strategy as a current target enhancement strategy;
a parameter determining unit, configured to determine a second disturbance parameter corresponding to each of the enhancement operations;
a second probability determining unit, configured to determine a second probability value corresponding to each enhancement operation according to the corresponding attribute value and a second disturbance parameter;
and the operation determining unit is used for selecting, according to the corresponding second probability values, as many enhancement operations as the determined number of enhancement operations as the enhancement operations corresponding to the current target enhancement strategy.
Further, the attribute values comprise attribute probability values;
the second probability determination unit specifically includes:
and the distribution determining subunit is used for calculating a Bernoulli distribution result according to the attribute probability value in the attribute value corresponding to each enhancement operation and the second disturbance parameter so as to determine a second probability value.
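One common way to combine a Bernoulli probability with a perturbation is the relaxed Bernoulli (binary concrete) form. The sketch below assumes that form, with a uniform sample `u` as the second disturbance parameter and a `temperature` argument; neither the exact formula nor the temperature is stated in this passage, so treat them as assumptions.

```python
import math


def second_probability(p: float, u: float, temperature: float = 1.0) -> float:
    """Relaxed Bernoulli sample from attribute probability p and noise u.

    Both p and u must lie strictly between 0 and 1.
    """
    logit = math.log(p) - math.log(1.0 - p)  # log-odds of the attribute probability
    noise = math.log(u) - math.log(1.0 - u)  # logistic noise derived from u
    return 1.0 / (1.0 + math.exp(-(logit + noise) / temperature))


print(second_probability(0.81, 0.5))  # no perturbation: returns about 0.81
print(second_probability(0.81, 0.9))  # positive perturbation: larger value
```

With `u = 0.5` the noise term vanishes and the second probability value equals the attribute probability value; other values of `u` move it up or down while keeping it inside (0, 1).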
Further, the data enhancement module comprises:
the original data determining submodule is used for determining original data;
the data enhancement submodule is used for sequentially executing each target enhancement strategy on the original data based on the execution sequence so as to enhance the data;
the executing the target enhancement strategy on the original data specifically comprises the following steps:
and processing the original data according to at least one enhancement operation and a corresponding attribute value included in the target enhancement strategy to determine a corresponding target processing result.
Further, the attribute values include attribute probability values and amplitude values;
the data enhancement submodule includes:
a first processing unit, configured to, for each enhancement operation, determine a first processing result according to the attribute probability value, the amplitude value, and the raw data, where the first processing result is used to represent a possibility that the raw data is enhanced by the enhancement operation and an amplitude of the data enhancement;
a second processing unit, configured to determine a second processing result according to the attribute probability value and the raw data, where the second processing result is used to represent a possibility that the raw data is not processed by the enhancement operation;
and the third processing unit is used for calculating the sum of the first processing result and the second processing result corresponding to each enhancement operation so as to determine a corresponding target processing result.
The data enhancement device provided by the embodiment of the present invention can automatically search for a preset number of target enhancement strategies in the enhancement strategy search space, and processes the original data after ordering the target enhancement strategies and determining the content of each. The device thus enhances the original data automatically. When the data enhancement process is performed multiple times, the parameters required for the current data enhancement can be optimized after the previous data enhancement, so the process can run without manually setting the parameters. Therefore, the computational overhead of the data enhancement process is reduced, the processing speed is improved, and the introduction of noise during data enhancement is avoided.
Fig. 7 is a schematic diagram of an electronic device according to an embodiment of the invention. As shown in fig. 7, the electronic device is a general-purpose data enhancement device with a general computer hardware structure that includes at least a processor 70 and a memory 71. The processor 70 and the memory 71 are connected by a bus 72. The memory 71 is adapted to store instructions or programs executable by the processor 70. The processor 70 may be a stand-alone microprocessor or a collection of one or more microprocessors. Thus, the processor 70 processes data and controls other devices by executing the instructions stored in the memory 71, so as to perform the method flows of the embodiments of the present invention described above. The bus 72 connects the above components together and also connects them to a display controller 73, a display device, and input/output (I/O) devices 74. The input/output (I/O) devices 74 may be a mouse, keyboard, modem, network interface, touch input device, motion sensing input device, printer, or other devices known in the art. Typically, the input/output devices 74 are connected to the system through input/output (I/O) controllers 75.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, apparatus (device) or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may employ a computer program product embodied on one or more computer-readable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations of methods, apparatus (devices) and computer program products according to embodiments of the application. It will be understood that each flow in the flow diagrams can be implemented by computer program instructions.
These computer program instructions may be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows.
These computer program instructions may also be provided to a processor of a general purpose computer, special purpose computer, embedded processor or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows.
Another embodiment of the invention is directed to a non-transitory storage medium storing a computer-readable program for causing a computer to perform some or all of the above-described method embodiments.
That is, as can be understood by those skilled in the art, all or part of the steps of the methods in the embodiments described above may be carried out by a program instructing the relevant hardware, where the program is stored in a storage medium and includes several instructions that enable a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and various other media capable of storing program code.
The embodiment of the invention discloses TS1, a data enhancement method, wherein the method comprises the following steps:
determining an enhancement strategy search space, wherein the enhancement strategy search space comprises a plurality of enhancement strategies;
determining a preset number of target enhancement strategies in the enhancement strategy search space according to a preset search strategy;
determining the execution sequence of each target enhancement strategy;
determining at least one enhancement operation corresponding to each target enhancement strategy in a preset enhancement operation set, wherein each enhancement operation has a corresponding attribute value;
and sequentially executing each target enhancement strategy based on the execution sequence and the included enhancement operation with the corresponding attribute value so as to perform data enhancement on the original data.
TS2, the method of TS1, wherein the determining a preset number of target enhancement strategies in the enhancement strategy search space according to a preset search strategy comprises:
determining a preference parameter and a first disturbance parameter corresponding to each enhancement strategy;
determining a first probability value corresponding to each enhancement strategy according to the preference parameter and the first disturbance parameter;
and determining a preset number of target enhancement strategies according to the corresponding first probability values.
TS3, the method of TS2, wherein the determining a first probability value for each of the enhancement policies based on a preference parameter and a first perturbation parameter comprises:
determining a first intermediate value according to preference parameters and first disturbance parameters corresponding to all enhancement strategies in the enhancement strategy search space;
for each enhancement strategy, determining a second intermediate value according to the corresponding preference parameter and the first disturbance parameter;
and determining a first probability value corresponding to each enhancement strategy according to the ratio of the corresponding second intermediate value to the first intermediate value.
TS4, the method of TS2, wherein the determining a preset number of target enhancement strategies according to corresponding first probability values comprises:
sorting the enhancement strategies according to the corresponding first probability values;
a predetermined number of enhancement strategies with the largest first probability value are determined as target enhancement strategies.
TS5, the method of TS2, wherein the determining an execution order of each of the target enhancement policies comprises:
sequencing the target enhancement strategies from large to small according to the corresponding first probability values;
and taking the sequencing sequence as the execution sequence of each target enhancement strategy.
TS6, according to the method in TS1, the determining, in a preset set of enhancement operations, at least one enhancement operation corresponding to each of the target enhancement policies includes:
determining the enhancement operation quantity corresponding to each target enhancement strategy;
and selecting a number of enhancement operations from the enhancement operation set according to a preset screening rule so as to determine the enhancement operation corresponding to each target enhancement strategy.
TS7, the method according to TS6, wherein the selecting, according to a predetermined filtering rule, a number of enhancement operations from the set of enhancement operations to determine the enhancement operation corresponding to each of the target enhancement policies includes:
sequentially determining the target enhancement strategy as the current target enhancement strategy according to the execution sequence;
determining a second disturbance parameter corresponding to each enhancement operation;
determining a second probability value corresponding to each enhancement operation according to the corresponding attribute value and a second disturbance parameter;
and selecting, according to the corresponding second probability values, as many enhancement operations as the determined number of enhancement operations as the enhancement operations corresponding to the current target enhancement strategy.
TS8, the method of TS7, the attribute values comprising attribute probability values;
the determining, according to the corresponding attribute value and the second disturbance parameter, a second probability value corresponding to each of the enhancement operations specifically includes:
and calculating a Bernoulli distribution result according to the attribute probability value in the attribute value corresponding to each enhancement operation and the second disturbance parameter so as to determine a second probability value.
TS9, the method of TS1, wherein the executing each of the target enhancement policies in turn based on the execution order and the included enhancement operations with corresponding attribute values to data enhance the original data comprises:
determining original data;
executing each target enhancement strategy on the original data in sequence based on the execution sequence so as to enhance the data;
the executing the target enhancement strategy on the original data specifically comprises the following steps:
and processing the original data according to at least one enhancement operation and a corresponding attribute value included in the target enhancement strategy to determine a corresponding target processing result.
TS10, the method of TS9, the attribute values comprising attribute probability values and amplitude values;
the processing the raw data according to at least one enhancement operation and a corresponding attribute value included in the target enhancement policy to determine a corresponding target processing result includes:
for each enhancement operation, determining a first processing result according to the attribute probability value, the amplitude value and the original data, wherein the first processing result is used for representing the possibility of data enhancement of the original data by the enhancement operation and the amplitude of the data enhancement;
determining a second processing result according to the attribute probability value and the original data, wherein the second processing result is used for representing the possibility that the original data is not processed by the enhancement operation;
and calculating the sum of the first processing result and the second processing result corresponding to each enhancement operation to determine a corresponding target processing result.
TS11, a data enhancement apparatus, the apparatus comprising:
a search space determination module, configured to determine an enhancement policy search space, where the enhancement policy search space includes a plurality of enhancement policies;
the strategy determining module is used for determining a preset number of target enhancement strategies in the enhancement strategy searching space according to a preset searching strategy;
the sequence determining module is used for determining the execution sequence of each target enhancement strategy;
an operation determining module, configured to determine, in a preset enhancement operation set, at least one enhancement operation corresponding to each target enhancement policy, where each enhancement operation has a corresponding attribute value;
and the data enhancement module is used for sequentially executing each target enhancement strategy based on the execution sequence and the included enhancement operation with the corresponding attribute value so as to perform data enhancement on the original data.
TS12, a computer readable storage medium storing computer program instructions which, when executed by a processor, implement a method as recited in any one of TS1-TS10.
TS13, an electronic device comprising a memory for storing one or more computer program instructions and a processor, wherein the one or more computer program instructions are executed by the processor to implement a method as recited in any one of TS1-TS10.
TS14, a computer program product comprising computer programs/instructions for execution by a processor to implement a method as described in any one of TS1-TS10.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A method of data enhancement, the method comprising:
determining an enhancement strategy search space, wherein the enhancement strategy search space comprises a plurality of enhancement strategies;
determining a preset number of target enhancement strategies in the enhancement strategy search space according to a preset search strategy;
determining the execution sequence of each target enhancement strategy;
determining at least one enhancement operation corresponding to each target enhancement strategy in a preset enhancement operation set, wherein each enhancement operation has a corresponding attribute value;
and sequentially executing each target enhancement strategy based on the execution sequence and the included enhancement operation with the corresponding attribute value so as to perform data enhancement on the original data.
2. The method of claim 1, wherein determining a preset number of target enhancement strategies in the enhancement strategy search space according to a preset search strategy comprises:
determining a preference parameter and a first disturbance parameter corresponding to each enhancement strategy;
determining a first probability value corresponding to each enhancement strategy according to the preference parameter and the first disturbance parameter;
and determining a preset number of target enhancement strategies according to the corresponding first probability values.
3. The method of claim 2, wherein the determining a first probability value for each of the enhancement strategies based on a preference parameter and a first disturbance parameter comprises:
determining a first intermediate value according to preference parameters and first disturbance parameters corresponding to all enhancement strategies in the enhancement strategy search space;
for each enhancement strategy, determining a second intermediate value according to the corresponding preference parameter and the first disturbance parameter;
and determining a first probability value corresponding to each enhancement strategy according to the ratio of the corresponding second intermediate value to the first intermediate value.
4. The method according to claim 2, wherein said determining a preset number of target enhancement strategies according to the corresponding first probability values comprises:
sorting the enhancement strategies according to the corresponding first probability values;
a predetermined number of enhancement strategies with the largest first probability value are determined as target enhancement strategies.
5. The method of claim 2, wherein determining the execution order of each of the target enhancement policies comprises:
sequencing the target enhancement strategies from large to small according to the corresponding first probability values;
and taking the sequencing sequence as the execution sequence of each target enhancement strategy.
6. The method according to claim 1, wherein the determining at least one enhancement operation corresponding to each of the target enhancement policies in a preset set of enhancement operations comprises:
determining the enhancement operation quantity corresponding to each target enhancement strategy;
and selecting a number of enhancement operations from the enhancement operation set according to a preset screening rule so as to determine the enhancement operation corresponding to each target enhancement strategy.
7. The method according to claim 6, wherein the selecting a number of enhancement operations from the set of enhancement operations according to a predetermined filtering rule to determine the enhancement operation corresponding to each of the target enhancement policies comprises:
sequentially determining the target enhancement strategy as the current target enhancement strategy according to the execution sequence;
determining a second disturbance parameter corresponding to each enhancement operation;
determining a second probability value corresponding to each enhancement operation according to the corresponding attribute value and a second disturbance parameter;
and selecting, according to the corresponding second probability values, as many enhancement operations as the determined number of enhancement operations as the enhancement operations corresponding to the current target enhancement strategy.
8. The method of claim 1, wherein the executing each of the target enhancement policies in turn based on the execution order and the included enhancement operations with corresponding attribute values to perform data enhancement on the original data comprises:
determining original data;
executing each target enhancement strategy on the original data in sequence based on the execution sequence so as to enhance the data;
the executing the target enhancement strategy on the original data specifically comprises the following steps:
and processing the original data according to at least one enhancement operation and a corresponding attribute value included in the target enhancement strategy to determine a corresponding target processing result.
9. A computer readable storage medium storing computer program instructions, which when executed by a processor implement the method of any one of claims 1-8.
10. An electronic device comprising a memory and a processor, wherein the memory is configured to store one or more computer program instructions, wherein the one or more computer program instructions are executed by the processor to implement the method of any of claims 1-8.
CN202110227674.5A 2021-03-01 2021-03-01 Data enhancement method and device, storage medium and electronic equipment Pending CN113010762A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110227674.5A CN113010762A (en) 2021-03-01 2021-03-01 Data enhancement method and device, storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110227674.5A CN113010762A (en) 2021-03-01 2021-03-01 Data enhancement method and device, storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN113010762A (en) 2021-06-22

Family

ID=76387222

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110227674.5A Pending CN113010762A (en) 2021-03-01 2021-03-01 Data enhancement method and device, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN113010762A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113537406A (en) * 2021-08-30 2021-10-22 重庆紫光华山智安科技有限公司 Method, system, medium and terminal for enhancing image automatic data

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190354895A1 (en) * 2018-05-18 2019-11-21 Google Llc Learning data augmentation policies
CN111582375A (en) * 2020-05-09 2020-08-25 北京百度网讯科技有限公司 Data enhancement strategy searching method, device, equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
任长宁, 马光胜, 王昊, 冯刚: "SOC design space search strategy based on a multi-objective evolutionary algorithm", Computer Engineering (计算机工程), no. 06 *

Similar Documents

Publication Publication Date Title
CN111031346B (en) Method and device for enhancing video image quality
US20220173987A1 (en) Distributed assignment of video analytics tasks in cloud computing environments to reduce bandwidth utilization
US20190279088A1 (en) Training method, apparatus, chip, and system for neural network model
US20230043174A1 (en) Method for pushing anchor information, computer device, and storage medium
CN109598307B (en) Data screening method and device, server and storage medium
CN113011337B (en) Chinese character library generation method and system based on deep meta learning
CN104854539A (en) Object searching method and device
CN110992365A (en) Loss function based on image semantic segmentation and design method thereof
CN114064242A (en) Method, device and storage medium for adjusting scheduling parameters
CN113010762A (en) Data enhancement method and device, storage medium and electronic equipment
CN112906800B (en) Image group self-adaptive collaborative saliency detection method
CN117271101B (en) Operator fusion method and device, electronic equipment and storage medium
CN110008215A (en) A kind of big data searching method based on improved KD tree parallel algorithm
CN112270384B (en) Loop detection method and device, electronic equipment and storage medium
US20200302278A1 (en) Method and device for determining a global memory size of a global memory size for a neural network
CN112861803A (en) Image identification method, device, server and computer readable storage medium
CN116737301A (en) Alignment method and device for layer elements
CN109165325B (en) Method, apparatus, device and computer-readable storage medium for segmenting graph data
CN114385876B (en) Model search space generation method, device and system
CN115470900A (en) Pruning method, device and equipment of neural network model
US11574118B2 (en) Template-based intelligent document processing method and apparatus
CN114358030A (en) Machine proofreading method and system after patent document translation
CN114003306A (en) Video memory optimization method, device, equipment and storage medium
CN114662568A (en) Data classification method, device, equipment and storage medium
CN108959237A (en) A kind of file classification method, device, medium and equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination