CN116912653A - Model training method and device and electronic equipment - Google Patents

Model training method and device and electronic equipment

Info

Publication number
CN116912653A
Authority
CN
China
Prior art keywords
training
model
target
loss value
noise
Prior art date
Legal status (assumed; not a legal conclusion)
Pending
Application number
CN202310822729.6A
Other languages
Chinese (zh)
Inventor
蔡鑫
胡晶
王兴科
Current Assignee (the listed assignees may be inaccurate)
China Telecom Technology Innovation Center
China Telecom Corp Ltd
Original Assignee
China Telecom Technology Innovation Center
China Telecom Corp Ltd
Priority date (assumed; not a legal conclusion)
Filing date
Publication date
Application filed by China Telecom Technology Innovation Center, China Telecom Corp Ltd filed Critical China Telecom Technology Innovation Center
Priority to CN202310822729.6A priority Critical patent/CN116912653A/en
Publication of CN116912653A publication Critical patent/CN116912653A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Abstract

A model training method, an apparatus, and an electronic device are provided. The method comprises the following steps: obtaining an initial picture containing preset information and a target Gaussian sample; performing forward noise-adding processing on the initial picture to obtain a first loss value corresponding to forward training; performing iterative training on a training model based on the first loss value and the target Gaussian sample, and determining a second loss value and a preset function corresponding to the training model; and stopping the iterative training when the preset function converges, and taking the training model corresponding to the preset function as the target model. In this method, forward training on the initial picture makes explicit the distribution difference between the output picture data and the initial picture, and the subsequent denoising of the target Gaussian sample makes explicit the distribution difference between the generated picture data and a real picture, thereby ensuring the stability of the determined target model and the diversity of the target pictures obtained based on the target model.

Description

Model training method and device and electronic equipment
Technical Field
The present application relates to the field of network security technologies, and in particular, to a model training method and apparatus, and an electronic device.
Background
In network traffic data there is picture data containing preset information. To amplify (augment) this picture data, a generative adversarial network (GAN) is usually employed, and the specific amplification process based on the GAN technique is as follows:
The initial picture data is input into a GAN model, picture features of the initial picture data are extracted, and feature dimension reduction is performed on the picture features. Because the initial picture data does not include Gaussian noise (statistical noise conforming to the normal distribution), Gaussian noise needs to be added to the initial picture and encoded so that the initial picture data better approximates an original picture; first picture data corresponding to the initial picture data is then synthesized, and the data is decoded by a picture generator to obtain an enhanced picture sample set.
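As a rough illustration only, the following minimal sketch traces the encode–add-noise–decode data flow described above. The architectures, layer sizes, and the use of PyTorch are assumptions made here for readability; the discriminator and adversarial loss of a full GAN are omitted, since only the generator-side flow is relevant to the size problem discussed next.

```python
# Minimal sketch of the GAN-style augmentation flow described above (PyTorch assumed).
# Architectures and sizes are illustrative; the adversarial discriminator is omitted.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Extracts picture features and reduces their dimension."""
    def __init__(self, latent_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),   # 64x64 -> 32x32
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),  # 32x32 -> 16x16
            nn.Flatten(),
            nn.Linear(64 * 16 * 16, latent_dim),                   # feature dimension reduction
        )

    def forward(self, x):
        return self.net(x)

class Generator(nn.Module):
    """Decodes features back to pictures; note the picture size grows from small to large."""
    def __init__(self, latent_dim: int = 64):
        super().__init__()
        self.fc = nn.Linear(latent_dim, 64 * 16 * 16)
        self.net = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),   # 16x16 -> 32x32
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(), # 32x32 -> 64x64
        )

    def forward(self, z):
        return self.net(self.fc(z).view(-1, 64, 16, 16))

encoder, generator = Encoder(), Generator()
x0 = torch.rand(8, 3, 64, 64)          # batch of initial pictures
z = encoder(x0)                        # extract and reduce picture features
z_noisy = z + torch.randn_like(z)      # add Gaussian noise so the data better approximates the original
augmented = generator(z_noisy)         # decode to obtain an enhanced picture sample set
```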
In this method, the picture size grows from small to large during the operation of the picture generator, and this change in size makes the GAN model unstable, which limits the diversity of pictures in the resulting picture sample set.
Disclosure of Invention
The application provides a model training method, a model training apparatus, and an electronic device. A stable target model is trained by adding Gaussian noise to an initial picture and removing Gaussian noise from a target sample, thereby ensuring the accuracy of the target pictures obtained based on the target model.
In a first aspect, the present application provides a model training method, the method comprising:
obtaining an initial picture containing preset information and a target Gaussian sample;
carrying out forward noise-adding processing on the initial picture to obtain a first loss value corresponding to forward training;
performing iterative training on a training model based on the first loss value and a target Gaussian sample, and determining a second loss value and a preset function corresponding to the training model;
and stopping iterative training when the preset function converges, and taking the training model corresponding to the preset function as a target model.
By the method, the training model is subjected to iterative training through the initial picture and the target Gaussian sample, so that the stability of the target model is ensured, and the diversity of the target picture obtained based on the target model is further ensured.
In one possible design, the forward noise-adding processing is performed on the initial picture to obtain a first loss value corresponding to forward training, including:
determining the noise-increasing time length of the initial picture;
and stopping increasing the Gaussian noise in response to the noise increasing time length reaching a preset time length, and determining a first loss value corresponding to the training model.
By the method, gaussian noise is added to the initial picture, and the distribution difference between the output picture data and the initial picture is determined, so that the stability of the training model can be ensured.
In one possible design, performing iterative training on a training model based on the first loss value and a target Gaussian sample, and determining a second loss value and a preset function corresponding to the training model, includes:
noise reduction processing is carried out on the target Gaussian sample, and the noise reduction duration of the target Gaussian sample is determined;
and determining a second loss value and a preset function corresponding to the training model in response to the noise reduction time length reaching the preset time length.
Through the method, the target sample is subjected to noise reduction processing, the distribution difference between the picture data and the real picture is determined, and therefore stability of the determined target model and diversity of the target picture obtained based on the target model can be ensured.
In one possible design, determining the second loss value corresponding to the training model includes:
determining a training loss value after training of the training model;
and adding the training loss value and the first loss value to obtain the second loss value.
By the method, the training loss value and the first loss value are added to obtain the second loss value, and the accuracy of the determined second loss value is ensured.
In one possible design, after taking the training model corresponding to the preset function as a target model, the method further includes:
obtaining initial picture data, wherein the initial picture data is a sample conforming to standard Gaussian distribution;
inputting the initial picture data into the target model for denoising processing, and determining denoising duration of the initial picture data;
and determining at least one target picture corresponding to the initial picture data generated by the target model in response to the denoising time length reaching the preset time length.
By the method, at least one target picture corresponding to the initial picture data is generated through the determined target model, and the size of the target picture is consistent with that of the initial picture data, so that the diversity of the target picture determined by the target model can be ensured.
In a second aspect, the present application provides a model training apparatus, the apparatus comprising:
the acquisition module is used for acquiring an initial picture containing preset information and a target Gaussian sample;
the noise-adding module is used for carrying out forward noise-adding processing on the initial picture to obtain a first loss value corresponding to forward training;
the iteration module is used for carrying out iterative training on the training model based on the first loss value and the target Gaussian sample, and determining a second loss value and a preset function corresponding to the training model;
and the target module is used for stopping iterative training when the preset function converges, and taking the training model corresponding to the preset function as a target model.
In one possible design, the noise-adding module is specifically configured to determine a noise-adding duration of the initial picture, stop adding gaussian noise in response to the noise-adding duration reaching a preset duration, and determine a first loss value corresponding to the training model.
In one possible design, the iteration module is specifically configured to perform noise reduction processing on the target gaussian sample, determine a noise reduction duration of the target gaussian sample, and determine a second loss value and a preset function corresponding to the training model in response to the noise reduction duration reaching the preset duration.
In one possible design, the iteration module is further configured to determine a training loss value after training of the training model, and add the training loss value to the first loss value to obtain the second loss value.
In one possible design, the target module is specifically configured to obtain initial picture data, input the initial picture data into the target model for denoising, determine denoising duration of the initial picture data, and determine at least one target picture corresponding to the initial picture data generated by the target model in response to the denoising duration reaching the preset duration.
In a third aspect, the present application provides an electronic device, comprising:
a memory for storing a computer program;
and the processor is used for realizing the steps of the model training method when executing the computer program stored in the memory.
In a fourth aspect, a computer readable storage medium has a computer program stored therein, which when executed by a processor, implements the steps of a model training method as described above.
For the technical effects that can be achieved by each of the second to fourth aspects and their possible designs, reference may be made to the description above of the technical effects achievable by the first aspect and its various possible designs; they are not repeated here.
Drawings
FIG. 1 is a flow chart of the steps of a model training method provided by the application;
FIG. 2 is a schematic diagram of a model training device according to the present application;
fig. 3 is a schematic structural diagram of an electronic device according to the present application.
Detailed Description
In order to make the objects, technical solutions, and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings. The specific operations in the method embodiments may also be applied to the apparatus embodiments or the system embodiments. In the description of the present application, "a plurality of" means "at least two". "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may indicate that A exists alone, that A and B exist together, or that B exists alone. "A is connected with B" may cover both the case in which A and B are directly connected and the case in which A and B are connected through C. In addition, in the description of the present application, the words "first", "second", and the like are used merely to distinguish between descriptions and are not to be construed as indicating or implying relative importance or order.
In the prior art, in order to amplify picture data, the initial picture data is input into a GAN model, the GAN model extracts picture features of the initial picture data, and feature dimension reduction is performed on the picture features. To make the initial picture data better approximate an original picture, Gaussian noise is added to the initial picture and encoded, first picture data corresponding to the initial picture data is synthesized, and the data is then decoded by a picture generator to obtain an enhanced picture sample set. However, during the operation of the picture generator the picture size grows from small to large, and this change in size makes the GAN model unstable, which limits the diversity of pictures in the picture sample set.
In order to solve the above-described problems, the present application provides a model training method for improving the stability of a target model, so that diversified target pictures can be obtained based on the target model. The method and the device in the embodiments of the present application are based on the same technical concept; because the principles by which they solve the problem are similar, the embodiments of the device and the method may be referred to each other, and repeated descriptions are omitted.
Embodiments of the present application will be described in detail below with reference to the accompanying drawings.
Referring to fig. 1, the application provides a model training method, which can improve the stability of a target model and the diversity of data enhancement based on the target model, and the implementation flow of the method is as follows:
step S1: and obtaining an initial picture containing preset information and a target Gaussian sample.
Because the data-enhancement models in the prior art are unstable, the diversity of pictures generated based on them is limited. The embodiment of the application therefore adopts a model training method to obtain a stable model. First, before the target model is obtained, an initial picture containing preset information and a target Gaussian sample need to be obtained; the preset information can be adjusted according to the actual situation, for example key information, address information, and the like.
Step S2: and carrying out forward noise-adding processing on the initial picture to obtain a first loss value corresponding to forward training.
After the initial picture and the target Gaussian sample are determined, in order to achieve data enhancement, Gaussian noise is added to the initial picture through a multi-step Markov process (a Markov process is a stochastic process). The noise-adding duration of the initial picture is recorded; when the noise-adding duration reaches a preset duration, the addition of Gaussian noise stops and a first loss value corresponding to the training model is determined. The first loss value represents the distribution difference between the picture data after forward training and standard Gaussian noise.
For example, the moment at which the initial picture is input into the training model is denoted as time 0, and the moment at which the addition of Gaussian noise stops is denoted as time T. To relate the initial picture x_0 to the noised picture at the different times, each noising step of the forward process q can be written as follows:

q(x_t | x_{t-1}) = N(x_t; √(1-β_t)·x_{t-1}, β_t·I)

In the above function, the parameter β_t is the variance adopted in each step and satisfies 0 ≤ β_t < 1. Writing α_t = 1-β_t and ᾱ_t = ∏_{s=1}^{t} α_s, the distribution at an arbitrary time t can be derived in closed form:

q(x_t | x_0) = N(x_t; √ᾱ_t·x_0, (1-ᾱ_t)·I)

where x_0 represents the data distribution of the picture at time 0, x_t represents the data distribution of the picture at time t, q(x_t | x_0) represents the Gaussian noise distribution superimposed on the picture from time 0 up to time t, and ᾱ_t reflects the cumulative effect of the noise added from time 0 to time t.
It should be noted that the distribution of the noised picture at time T is close to the standard normal distribution, so the training model can determine the first loss value at time T.
By this method, the first loss value corresponding to the forward training of the training model is determined, i.e., the distribution difference between the picture data output when the preset duration is reached and standard Gaussian noise, so that the stability of the training model can be improved.
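For concreteness, the closed-form forward noising and the first loss value can be sketched as follows. This is a minimal illustration under assumptions not fixed by the text: a linear β_t schedule, T = 1000 steps standing in for the preset duration, PyTorch tensors, and a simple moment-matching comparison against standard Gaussian noise standing in for the exact form of the first loss value.

```python
# Minimal sketch of forward noise-adding and the first loss value (assumptions: PyTorch,
# a linear beta schedule, T = 1000 steps as the "preset duration", and a simple
# moment-matching comparison against standard Gaussian noise as the loss form).
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)        # per-step variances beta_t with 0 < beta_t < 1
alphas = 1.0 - betas
alpha_bar = torch.cumprod(alphas, dim=0)     # cumulative effect of the noise from time 0 to t

def q_sample(x0: torch.Tensor, t: torch.Tensor, noise: torch.Tensor = None) -> torch.Tensor:
    """Sample x_t ~ q(x_t | x_0) in closed form (multi-step Markov noising collapsed to one step)."""
    if noise is None:
        noise = torch.randn_like(x0)
    a_bar = alpha_bar[t].view(-1, 1, 1, 1)
    return a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * noise

def first_loss(x0: torch.Tensor) -> torch.Tensor:
    """Distribution difference between the fully noised picture x_T and standard Gaussian noise."""
    t = torch.full((x0.shape[0],), T - 1, dtype=torch.long)
    x_T = q_sample(x0, t)
    return x_T.mean().abs() + (x_T.var() - 1.0).abs()   # crude moment-matching stand-in

x0 = torch.rand(8, 3, 64, 64) * 2 - 1        # initial pictures scaled to [-1, 1]
print(first_loss(x0))                        # close to 0 when x_T is close to N(0, I)
```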
Step S3: and performing iterative training on the training model based on the first loss value and the target Gaussian sample, and determining a second loss value and a preset function corresponding to the training model.
After the first loss value is determined by forward training of the training model, in order to improve the accuracy of the distribution difference between the output data and the initial picture, the training model is iteratively trained based on the first loss value and the target sample. The number of iterations can be adjusted according to the actual situation; since each iteration follows the same process, one iteration is taken as an example, and the specific iterative process is as follows:
the forward training is to increase Gaussian noise on the basis of an initial picture, reduce Gaussian noise on the basis of a target Gaussian sample in order to improve the stability of a training model in a complete process, reduce Gaussian noise through a multi-step Markov process in the iterative training, stop reducing Gaussian noise when the noise reduction duration reaches a preset duration, and determine a second loss value corresponding to the training model and a preset function.
In the noise-reduction process of the training model, in order to ensure the stability of the training model, the training loss value corresponding to the training model after each round of training needs to be determined, and the first loss value and the training loss value are added to obtain the second loss value. The second loss value represents the distribution difference between the picture data output by the training model and the real picture.
For example, in the process of reducing the Gaussian noise, the noise-reduction duration lies in the period corresponding to T down to 0. Because the Gaussian noise is removed in multiple steps, the moment at which the target Gaussian sample is input can be denoted as time T, and the picture sample is output when time 0 is reached. At time T the target Gaussian sample is a standard Gaussian distribution sample, and at time 0 the removal of Gaussian noise from the target Gaussian sample is finished. The Gaussian distribution of the target Gaussian sample at a given moment can be obtained from the following function:
p_θ(x_{t-1} | x_t) = N(x_{t-1}; μ_θ(x_t, t), Σ_θ(x_t, t))

In the above function, μ_θ(x_t, t) represents the mean of the Gaussian noise distribution at time t, and Σ_θ(x_t, t) represents the variance of the Gaussian noise distribution at time t.
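Assuming the common parameterization in which a network ε_θ predicts the noise contained in x_t and Σ_θ(x_t, t) is fixed to β_t·I (choices the text does not specify), a single noise-reduction step can be sketched as follows; eps_model is a hypothetical noise-prediction network.

```python
# One reverse (noise-reducing) step x_t -> x_{t-1}, assuming eps_model is a hypothetical
# network predicting the noise in x_t and Sigma_theta is fixed to beta_t * I.
import torch

@torch.no_grad()
def p_sample(eps_model, x_t: torch.Tensor, t: int,
             betas: torch.Tensor, alphas: torch.Tensor, alpha_bar: torch.Tensor) -> torch.Tensor:
    beta_t, alpha_t, a_bar_t = betas[t], alphas[t], alpha_bar[t]
    t_batch = torch.full((x_t.shape[0],), t, dtype=torch.long)
    eps = eps_model(x_t, t_batch)                                          # predicted noise in x_t
    mean = (x_t - beta_t / (1.0 - a_bar_t).sqrt() * eps) / alpha_t.sqrt()  # mu_theta(x_t, t)
    if t == 0:
        return mean                                                        # no noise added at the last step
    return mean + beta_t.sqrt() * torch.randn_like(x_t)                    # Sigma_theta = beta_t * I
```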
Through the method, the target sample is subjected to noise reduction, so that the picture data corresponding to the target sample is obtained, the picture data can be obtained based on the training model, and the stability of the training model is ensured.
Step S4: and stopping iterative training when the preset function converges, and taking the training model corresponding to the preset function as a target model.
After the preset function corresponding to the training is determined, in order to obtain a stable target model, it is necessary to detect whether the preset function has converged. When the preset function has not converged, the iterative training continues; when the preset function converges, the iterative training stops, and the training model corresponding to the current preset function is taken as the target model.
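The iterative training and convergence check of steps S2–S4 can be sketched as below. The choices here are assumptions for illustration: the per-iteration training loss is a noise-prediction mean-squared error, the first loss value is treated as a constant offset added to form the second loss value, and the preset function is taken to be a running average of the second loss value, considered converged once it stops changing by more than a tolerance.

```python
# Minimal sketch of iterative training with a convergence check (steps S2-S4).
import torch
import torch.nn.functional as F

def train(eps_model, optimizer, data_loader, first_loss_value: float,
          T: int = 1000, tol: float = 1e-4, max_iters: int = 100_000):
    betas = torch.linspace(1e-4, 0.02, T)
    alpha_bar = torch.cumprod(1.0 - betas, dim=0)
    prev_avg, avg = None, 0.0
    for step, x0 in enumerate(data_loader):
        t = torch.randint(0, T, (x0.shape[0],))
        noise = torch.randn_like(x0)
        a_bar = alpha_bar[t].view(-1, 1, 1, 1)
        x_t = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * noise      # forward noise-adding
        train_loss = F.mse_loss(eps_model(x_t, t), noise)           # training loss of this iteration
        second_loss = train_loss + first_loss_value                 # add the first loss value
        optimizer.zero_grad()
        second_loss.backward()
        optimizer.step()
        avg = 0.99 * avg + 0.01 * second_loss.item()                # "preset function"
        if prev_avg is not None and abs(prev_avg - avg) < tol:
            break                                                   # preset function has converged
        prev_avg = avg
        if step + 1 >= max_iters:
            break
    return eps_model                                                # target model
```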
After the target model is determined, in order to enhance the picture data, initial picture data is obtained, where the initial picture data is a sample conforming to the standard Gaussian distribution. The initial picture data is input into the target model for denoising, and the denoising duration of the initial picture data is determined. When the denoising duration reaches the preset duration, the denoising by the target model is complete, and the target model generates at least one target picture corresponding to the initial picture data.
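A minimal sketch of this generation stage, reusing the p_sample step sketched earlier and assuming a trained eps_model; the batch size and picture shape are placeholders.

```python
# Minimal sketch of generation with the target model, reusing p_sample from the sketch above;
# eps_model is the trained (hypothetical) noise-prediction network, shapes are placeholders.
import torch

@torch.no_grad()
def generate(eps_model, num_pictures: int = 4, shape=(3, 64, 64), T: int = 1000) -> torch.Tensor:
    betas = torch.linspace(1e-4, 0.02, T)
    alphas = 1.0 - betas
    alpha_bar = torch.cumprod(alphas, dim=0)
    x = torch.randn(num_pictures, *shape)        # initial picture data: standard Gaussian samples
    for t in reversed(range(T)):                 # denoise until the preset number of steps is reached
        x = p_sample(eps_model, x, t, betas, alphas, alpha_bar)
    return x                                     # at least one target picture, same size as the input
```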
Through this method, forward training of the training model determines the distribution difference between the output picture data and the initial picture, and the subsequent denoising of the target Gaussian sample in the training model makes explicit the distribution difference between the output picture data and the real picture, thereby ensuring the stability of the target model and the diversity of the at least one target picture obtained based on the target model.
Based on the same inventive concept, the embodiment of the present application further provides a model training device, where the model training device is configured to implement a function of a model training method, and referring to fig. 2, the device includes:
an obtaining module 201, configured to obtain an initial picture including preset information and a target gaussian sample;
the noise-adding module 202 is configured to perform forward noise-adding processing on the initial picture, so as to obtain a first loss value corresponding to forward training;
the iteration module 203 is configured to perform iterative training on a training model based on the first loss value and a target gaussian sample, and determine a second loss value and a preset function corresponding to the training model;
and the target module 204 is configured to stop iterative training when the preset function converges, and take the training model corresponding to the preset function as a target model.
In one possible design, the noise-adding module 202 is specifically configured to determine a noise-adding duration of the initial picture, stop adding gaussian noise in response to the noise-adding duration reaching a preset duration, and determine a first loss value corresponding to the training model.
In one possible design, the iteration module 203 is specifically configured to perform noise reduction processing on the target gaussian sample, determine a noise reduction duration of the target gaussian sample, and determine a second loss value and a preset function corresponding to the training model in response to the noise reduction duration reaching the preset duration.
In one possible design, the iteration module 203 is further configured to determine a training loss value after training the training model, and add the training loss value to the first loss value to obtain the second loss value.
In one possible design, the target module 204 is specifically configured to obtain initial picture data, input the initial picture data into the target model for denoising, determine denoising duration of the initial picture data, and determine at least one target picture corresponding to the initial picture data generated by the target model in response to the denoising duration reaching the preset duration.
Based on the same inventive concept, the embodiment of the present application further provides an electronic device, where the electronic device may implement the function of the foregoing model training apparatus, and referring to fig. 3, the electronic device includes:
at least one processor 301, and a memory 302 connected to the at least one processor 301, a specific connection medium between the processor 301 and the memory 302 is not limited in the embodiment of the present application, and in fig. 3, the connection between the processor 301 and the memory 302 through the bus 300 is taken as an example. Bus 300 is shown in bold lines in fig. 3, and the manner in which the other components are connected is illustrated schematically and not by way of limitation. The bus 300 may be divided into an address bus, a data bus, a control bus, etc., and is represented by only one thick line in fig. 3 for convenience of illustration, but does not represent only one bus or one type of bus. Alternatively, the processor 301 may be referred to as a controller, and the names are not limited.
In an embodiment of the present application, the memory 302 stores instructions executable by the at least one processor 301, and the at least one processor 301 may perform a model training method as discussed above by executing the instructions stored in the memory 302. Processor 301 may implement the functions of the various modules in the apparatus shown in fig. 2.
The processor 301 is the control center of the apparatus. It may connect various parts of the entire control device using various interfaces and lines, and, by running or executing the instructions stored in the memory 302 and invoking the data stored in the memory 302, performs the various functions of the apparatus and processes data, thereby monitoring the apparatus as a whole.
In one possible design, processor 301 may include one or more processing units, and processor 301 may integrate an application processor and a modem processor, where the application processor primarily processes operating systems, user interfaces, application programs, and the like, and the modem processor primarily processes wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 301. In some embodiments, processor 301 and memory 302 may be implemented on the same chip, and in some embodiments they may be implemented separately on separate chips.
The processor 301 may be a general purpose processor, such as a central processing unit (CPU), a digital signal processor, an application specific integrated circuit, a field programmable gate array or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components, and may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the application. The general purpose processor may be a microprocessor or any conventional processor. The steps of the model training method disclosed in connection with the embodiments of the present application may be executed directly by a hardware processor, or executed by a combination of hardware and software modules in the processor.
The memory 302, as a non-volatile computer-readable storage medium, can be used to store non-volatile software programs, non-volatile computer-executable programs, and modules. The memory 302 may include at least one type of storage medium, for example flash memory, hard disk, multimedia card, card memory, random access memory (RAM), static random access memory (SRAM), programmable read-only memory (PROM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), magnetic memory, magnetic disk, or optical disk. The memory 302 may also be any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The memory 302 in the embodiments of the present application may also be a circuit or any other device capable of performing a storage function, for storing program instructions and/or data.
By programming the processor 301, the code corresponding to a model training method described in the previous embodiments may be solidified into a chip, thereby enabling the chip to perform a model training step of the embodiment shown in fig. 1 at run-time. How to design and program the processor 301 is a technology well known to those skilled in the art, and will not be described in detail herein.
Based on the same inventive concept, the embodiments of the present application also provide a storage medium storing computer instructions that, when executed on a computer, cause the computer to perform a model training method as previously discussed.
In some possible embodiments, aspects of the model training method provided by the application may also be implemented in the form of a program product comprising program code; when the program product is run on a device, the program code causes the control apparatus to carry out the steps of the model training method according to the various exemplary embodiments of the application described in this specification.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present application without departing from the spirit or scope of the application. Thus, it is intended that the present application also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (10)

1. A method of model training, comprising:
obtaining an initial picture containing preset information and a target Gaussian sample;
carrying out forward noise-adding processing on the initial picture to obtain a first loss value corresponding to forward training;
performing iterative training on a training model based on the first loss value and a target Gaussian sample, and determining a second loss value and a preset function corresponding to the training model;
and stopping iterative training when the preset function converges, and taking the training model corresponding to the preset function as a target model.
2. The method of claim 1, wherein performing forward noise-adding processing on the initial picture to obtain a first loss value corresponding to forward training, comprises:
determining the noise-increasing time length of the initial picture;
and stopping increasing the Gaussian noise in response to the noise increasing time length reaching a preset time length, and determining a first loss value corresponding to the training model.
3. The method of claim 1, wherein iteratively training a training model based on the first loss value and a target gaussian sample, determining a second loss value and a preset function respectively corresponding to the training model, comprises:
noise reduction processing is carried out on the target Gaussian sample, and the noise reduction duration of the target Gaussian sample is determined;
and determining a second loss value and a preset function corresponding to the training model in response to the noise reduction time length reaching the preset time length.
4. The method of claim 3, wherein determining a second loss value for the training model comprises:
determining a training loss value after training of the training model;
and adding the training loss value and the first loss value to obtain the second loss value.
5. The method of claim 1, further comprising, after taking the training model corresponding to the preset function as a target model:
obtaining initial picture data, wherein the initial picture data is a sample conforming to standard Gaussian distribution;
inputting the initial picture data into the target model for denoising processing, and determining denoising duration of the initial picture data;
and determining at least one target picture corresponding to the initial picture data generated by the target model in response to the denoising time length reaching the preset time length.
6. A model training device, comprising:
the acquisition module is used for acquiring an initial picture containing preset information and a target Gaussian sample;
the noise-adding module is used for carrying out forward noise-adding processing on the initial picture to obtain a first loss value corresponding to forward training;
the iteration module is used for carrying out iterative training on the training model based on the first loss value and the target Gaussian sample, and determining a second loss value and a preset function corresponding to the training model;
and the target module is used for stopping iterative training when the preset function converges, and taking the training model corresponding to the preset function as a target model.
7. The apparatus of claim 6, wherein the noise-adding module is specifically configured to determine a noise-adding duration of the initial picture, stop adding gaussian noise in response to the noise-adding duration reaching a preset duration, and determine a first loss value corresponding to a training model.
8. The apparatus of claim 6, wherein the iteration module is specifically configured to perform noise reduction processing on the target gaussian sample, determine a noise reduction duration of the target gaussian sample, and determine a second loss value and a preset function corresponding to the training model in response to the noise reduction duration reaching the preset duration.
9. An electronic device, comprising:
a memory for storing a computer program;
a processor for carrying out the method steps of any one of claims 1-5 when executing a computer program stored on said memory.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored therein a computer program which, when executed by a processor, implements the method steps of any of claims 1-5.
CN202310822729.6A 2023-07-05 2023-07-05 Model training method and device and electronic equipment Pending CN116912653A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310822729.6A CN116912653A (en) 2023-07-05 2023-07-05 Model training method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310822729.6A CN116912653A (en) 2023-07-05 2023-07-05 Model training method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN116912653A true CN116912653A (en) 2023-10-20

Family

ID=88352326

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310822729.6A Pending CN116912653A (en) 2023-07-05 2023-07-05 Model training method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN116912653A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117350177A (en) * 2023-12-05 2024-01-05 西安热工研究院有限公司 Training method and device for ship unloader path generation model, electronic equipment and medium
CN117350177B (en) * 2023-12-05 2024-03-22 西安热工研究院有限公司 Training method and device for ship unloader path generation model, electronic equipment and medium

Similar Documents

Publication Publication Date Title
CN116912653A (en) Model training method and device and electronic equipment
WO2020063523A1 (en) Image detection method and device
CN113361567B (en) Image processing method, device, electronic equipment and storage medium
CN114037621A (en) Three-dimensional track smoothing method, device, equipment and storage medium
CN105740260B (en) The method and apparatus for extracting template file data structure
CN115935909A (en) File generation method and device and electronic equipment
CN114872449B (en) Label printing method and device and electronic equipment
CN112288032B (en) Method and device for quantitative model training based on generation of confrontation network
CN112149745B (en) Method, device, equipment and storage medium for determining difficult example sample
CN111967395A (en) Bank bill identification method and device
CN111915703B (en) Image generation method and device
CN109543557B (en) Video frame processing method, device, equipment and storage medium
CN112732252A (en) Method and device for dynamically generating UI (user interface) and electronic equipment
TW202129540A (en) Method and device for recognizing character and storage medium
CN111260663A (en) Nasopharyngeal carcinoma focus image segmentation device, equipment and computer readable storage medium
CN110738276A (en) Image material generation method and device, electronic device and computer-readable storage medium
CN110287437A (en) Webpage capture method, apparatus, storage medium and terminal
WO2019056857A1 (en) Method for realizing execution of different property animations for multiple views, apparatus and storage device
CN113487725B (en) Model node modification method and device, terminal equipment and storage medium
CN113792247B (en) Method, apparatus, device and medium for generating functional flow chart based on code characteristics
CN115344819B (en) Explicit Euler method symbolic network ordinary differential equation identification method based on state equation
CN116484856B (en) Keyword extraction method and device of text, electronic equipment and storage medium
WO2020243973A1 (en) Model-based signal inference method and apparatus
CN113449873A (en) Data processing method and device, electronic equipment and computer storage medium
CN117707400A (en) Method, apparatus, device and storage medium for image generation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination