CN114067131A - Data enhancement method based on multi-environment characteristics, terminal and storage medium


Info

Publication number
CN114067131A
Authority
CN
China
Prior art keywords
weather
illumination
time
target picture
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111340841.3A
Other languages
Chinese (zh)
Inventor
欧阳一村
王和平
罗富章
莫家源
徐波
陈余泉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Maxvision Technology Corp
Original Assignee
Maxvision Technology Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Maxvision Technology Corp filed Critical Maxvision Technology Corp
Priority to CN202111340841.3A
Publication of CN114067131A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Processing (AREA)

Abstract

The application provides a data enhancement method based on multi-environment features, a terminal, and a storage medium. The method fully exploits and extracts features along dimensions such as time, illumination, and weather in training pictures, and builds a feature library from them. By adopting the Encoder-Decoder technique, it avoids the core difficulty of labeling time and illumination in training pictures, greatly increases the diversity of pictures, and improves the accuracy of models trained on the generated data. Any input picture can be enhanced under different time, illumination, and weather conditions in a way that stays close to the actual situation.

Description

Data enhancement method based on multi-environment characteristics, terminal and storage medium
Technical Field
The present application belongs to the field of image processing technologies, and in particular, to a data enhancement method based on multi-environment features, a terminal, and a storage medium.
Background
With the continuous development of artificial intelligence, computer vision, and hardware technologies, video image processing has been widely applied in intelligent systems. In recent years, with the popularization of deep learning, the demand for data, especially labeled high-quality data, has increased sharply, and the cost of data acquisition and labeling greatly restricts both the quantity and the quality of available data.
The significance of data enhancement lies in reducing the cost of data acquisition and labeling as far as possible while letting models achieve the best possible effect on the basis of existing data, so that artificial intelligence can be applied more widely and the data bottleneck in current artificial-intelligence applications can be relieved.
Common data enhancement methods fall into supervised and unsupervised approaches. Supervised methods include geometric transformations and color transformations. Geometric transformations, including flipping, rotating, cropping, deforming, scaling, and similar operations, do not change the content of the image itself; they select part of the image or redistribute its pixels. Methods that do change the image content belong to the color-transformation class, which commonly includes adding noise, blurring, color shifts, erasing, and filling. Unsupervised data enhancement learns the distribution of the training data with a model and randomly generates pictures consistent with that distribution; the representative method is the GAN (generative adversarial network). All of these common methods, whether geometric transformation, color transformation, or GAN-based generation, perform poorly for weather, time, and illumination in real scenes: geometric and color transformations can hardly simulate or restore the influence of weather, time, and illumination on a target, while a GAN requires a large amount of labeled weather, time, and illumination data for training. For example, when sunlight shines directly on a face, or a face is backlit, the collected or enhanced data set contains little such data, so the model trains poorly and its accuracy drops sharply. Such common real-scene problems therefore need dedicated treatment.
Disclosure of Invention
An object of the embodiments of the present application is to provide a data enhancement method based on multi-environment features, a terminal, and a storage medium, so as to solve the prior-art problems of scarce data and low model accuracy when processing images of real scenes.
To achieve this purpose, the application adopts the following technical scheme: a data enhancement method based on multi-environment features, comprising the following steps:
collecting a number of first target pictures with time, illumination and weather characteristics;
inputting the first target picture into the time, illumination, and weather Encoders respectively, extracting the corresponding time, illumination, and weather features, and outputting first feature maps or vectors of time, illumination, and weather;
saving the first feature maps or vectors of time, illumination, and weather of the first target picture to a feature library;
inputting the first feature maps or vectors of time, illumination, and weather of the first target picture into the corresponding time, illumination, and weather Decoder, which outputs a second target picture;
calculating the distance between the first target picture and the second target picture as the loss, and back-propagating it to update the weights of the Encoders and the Decoder as well as the first feature maps or vectors of time, illumination, and weather in the feature library;
inputting a third target picture into the updated time, illumination, and weather Encoders, extracting the corresponding time, illumination, and weather features, and outputting second feature maps or vectors of time, illumination, and weather;
replacing the second feature maps or vectors of time, illumination, and weather with first feature maps or vectors from the updated feature library that represent a different time, illumination, or weather;
outputting picture data with different time, illumination, and weather.
Preferably, the first target picture is any picture without label data of time, illumination, and weather.
Preferably, the Encoders include a time Encoder for extracting the time features of the first target picture, an illumination Encoder for extracting the illumination features of the first target picture, and a weather Encoder for extracting the weather features of the first target picture.
Preferably, the feature library includes a time feature library for maintaining the first feature map or vector of time of the first target picture, an illumination feature library for maintaining the first feature map or vector of illumination of the first target picture, and a weather feature library for maintaining the first feature map or vector of weather of the first target picture.
Preferably, the method for enhancing data based on multi-environment features further comprises the steps of:
after the distance between the first target picture and the second target picture is calculated as the loss, when the loss falls below a predetermined threshold, input of first target pictures ends, training of the Encoders and the Decoder ends, and updating of the feature maps or vectors in the time, illumination, and weather feature libraries ends.
Preferably, the method for enhancing data based on multi-environment features further comprises the steps of:
after the second feature maps or vectors are extracted, saving the second feature maps or vectors of time, illumination, and weather of the third target picture to the feature library.
The present application also provides a terminal including a processor for executing the multi-environment feature based data enhancement method as described above.
The present application also provides a computer-readable storage medium storing a computer program which, when executed by a processor, carries out the steps of the data enhancement method based on multi-environment features as described above.
Compared with the prior art, the data enhancement method, terminal, and storage medium based on multi-environment features fully exploit and extract features along dimensions such as time, illumination, and weather in training pictures, and build a feature library from them. By adopting the Encoder-Decoder technique, the method avoids the core difficulty of labeling time and illumination in training pictures, greatly increases the diversity of pictures, and improves the accuracy of models trained on the generated data. Any input picture can be enhanced under different time, illumination, and weather conditions in a way that stays close to the actual situation.
Drawings
To illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed for the embodiments or the prior-art description are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a schematic flowchart of a data enhancement method based on multi-environment features according to an embodiment of the present disclosure;
fig. 2 is a schematic diagram of a terminal and a storage medium according to an embodiment of the present disclosure.
Detailed Description
In order to make the technical problems, technical solutions and advantageous effects to be solved by the present application clearer, the present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
It will be understood that when an element is referred to as being "secured to" or "disposed on" another element, it can be directly on the other element or be indirectly on the other element. When an element is referred to as being "connected to" another element, it can be directly connected to the other element or be indirectly connected to the other element.
It will be understood that the terms "length," "width," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," and the like, as used herein, refer to an orientation or positional relationship indicated in the drawings that is solely for the purpose of facilitating the description and simplifying the description, and do not indicate or imply that the device or element being referred to must have a particular orientation, be constructed and operated in a particular orientation, and thus should not be considered as limiting the present application.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present application, "a plurality" means two or more unless specifically limited otherwise.
Referring to fig. 1, a method for enhancing data based on multi-environment features according to an embodiment of the present application will now be described. The data enhancement method based on the multi-environment characteristics comprises the following steps:
step S1, a number of first target pictures with time, lighting, and weather characteristics are collected.
And step S2, respectively inputting the first target picture into an Encoder of time, illumination and weather, correspondingly extracting the characteristics of the time, the illumination and the weather, and correspondingly outputting a first characteristic diagram or vector of the time, the illumination and the weather.
Step S3, saving the first feature map or vector of the time, illumination and weather of the first target picture to the feature library.
Step S4, correspondingly inputting the first feature map or vector of the time, illumination and weather of the first target picture to the Decoder of the time, illumination and weather, and outputting the result as a second target picture.
And step S5, returning and updating the weights of the Encoder and the Decoder and the first feature map or vector of time, illumination and weather in the feature library by calculating the distance between the first target picture and the second target picture as loss.
And step S6, inputting the Encoder of the third target picture to the updated time, illumination and weather, correspondingly extracting the characteristics of the time, illumination and weather, and correspondingly outputting a second characteristic diagram or vector of the time, illumination and weather.
And step S7, replacing the second characteristic diagram or vector of the time, the illumination and the weather with the first characteristic diagram or vector representing different time, illumination and weather in the updated characteristic library.
In step S8, picture data with different times, lighting, and weather are output.
It is understood that the first target picture is any picture without label data of time, illumination, and weather. The main purpose of steps S1 to S5 is to train the Encoders and the Decoder while updating the feature library. Each time a first target picture is input, the Encoders and the Decoder are trained once through steps S1 to S5, and the feature library is updated once.
First, the Encoders extract the feature maps or vectors corresponding to time, illumination, and weather and store them in the feature library. Then, the feature maps or vectors output by the Encoders are fed into the Decoder to restore the picture, and the loss between the Decoder output and the initial input picture is used to update the weights. Finally, the desired time, illumination, and weather features are taken from the feature library to replace those of the target, generating more diversified picture data.
It should be noted that the Encoders extract the time, illumination, and weather features in the picture, and the Decoder takes a certain feature (for example, ten o'clock in the morning) as input and outputs a picture with the corresponding feature. In practice, each Encoder is simply a stack of ordinary convolutional (Conv) layers, and the Decoder a stack of ordinary deconvolution (DeConv) layers, which can be designed according to specific requirements.
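For illustration only, a minimal PyTorch sketch of one possible Encoder/Decoder pair follows. The layer counts, channel widths, and the choice to concatenate the three feature maps before decoding are assumptions, since the disclosure leaves the exact design to the implementer.

```python
import torch
import torch.nn as nn

class EnvEncoder(nn.Module):
    """One encoder per dimension (time, illumination or weather):
    a stack of ordinary Conv layers producing a feature map."""
    def __init__(self, in_ch=3, feat_ch=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, feat_ch, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.net(x)

class EnvDecoder(nn.Module):
    """A stack of ordinary DeConv (transposed-convolution) layers that
    restores a picture from the three concatenated feature maps."""
    def __init__(self, feat_ch=64, out_ch=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(3 * feat_ch, 32, kernel_size=4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(32, out_ch, kernel_size=4, stride=2, padding=1),
            nn.Sigmoid(),  # assumes picture pixels normalized to [0, 1]
        )

    def forward(self, t_feat, i_feat, w_feat):
        return self.net(torch.cat([t_feat, i_feat, w_feat], dim=1))
```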
Compared with the prior art, the data enhancement method based on multi-environment features fully exploits and extracts features along dimensions such as time, illumination, and weather in training pictures, and builds a feature library from them. By adopting the Encoder-Decoder technique, it avoids the core difficulty of labeling time and illumination in training pictures, greatly increases the diversity of pictures, and improves the accuracy of models trained on the generated data. Any input picture can be enhanced under different time, illumination, and weather conditions in a way that stays close to the actual situation.
In another embodiment of the present application, the first target picture is any picture without label data of time, illumination, and weather.
It can be understood that, to ensure the diversity of the feature maps or vectors in the feature library and to avoid data imbalance, the first target pictures should be acquired randomly, which helps improve model accuracy.
In another embodiment of the present application, the Encoders include a time Encoder for extracting the time features of the first target picture, an illumination Encoder for extracting the illumination features of the first target picture, and a weather Encoder for extracting the weather features of the first target picture.
It can be understood that the time, illumination, and weather in the first target picture can each be extracted as an independent feature by the time Encoder, the illumination Encoder, and the weather Encoder respectively; the three Encoders are independent of one another and can run simultaneously, which improves running speed.
In another embodiment of the present application, the feature library includes a time feature library for maintaining the first feature map or vector of time of the first target picture, an illumination feature library for maintaining the first feature map or vector of illumination of the first target picture, and a weather feature library for maintaining the first feature map or vector of weather of the first target picture.
It can be understood that the first feature maps or vectors of time, illumination, and weather of the first target picture are stored in three separate feature libraries, so the time, illumination, and weather feature libraries can each be updated independently.
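A minimal sketch of such a three-part feature library follows. The string tags used as keys (for example "10am", "backlight", "rain") are illustrative assumptions, as the disclosure does not specify how entries are indexed.

```python
# Three independent stores, one per dimension, so each can be updated on
# its own. The tags are illustrative placeholders.
feature_library = {
    "time": {},          # tag -> time feature map (torch.Tensor)
    "illumination": {},  # tag -> illumination feature map
    "weather": {},       # tag -> weather feature map
}

def save_features(tag, t_feat, i_feat, w_feat):
    """Save (or refresh) one entry in each of the three libraries."""
    feature_library["time"][tag] = t_feat.detach()
    feature_library["illumination"][tag] = i_feat.detach()
    feature_library["weather"][tag] = w_feat.detach()
```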
In another embodiment of the present application, the method for enhancing data based on multiple environmental characteristics further includes:
and S5.1, when the Loss is less than the rated value, ending the input of the first target picture, ending the Encoder and Decoder training, and ending the updating of the characteristic diagram or the vector in the time, illumination and weather characteristic library.
It can be understood why the Encoders and the Decoder need updating: initially their weights are random, so the Encoders extract the picture's time, illumination, and weather features poorly, and the Decoder restores those features to a picture poorly. Training, that is, updating the weights of the Encoders and the Decoder, is therefore needed to reach the best possible effect, namely the minimum loss.
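The following training-step sketch illustrates this update, reusing the modules and library sketched above. It assumes mean-squared error as the "distance", Adam as the optimizer, and a placeholder threshold value, none of which the disclosure fixes.

```python
import itertools
import torch
import torch.nn.functional as F

enc_t, enc_i, enc_w = EnvEncoder(), EnvEncoder(), EnvEncoder()
dec = EnvDecoder()
optimizer = torch.optim.Adam(
    itertools.chain(enc_t.parameters(), enc_i.parameters(),
                    enc_w.parameters(), dec.parameters()),
    lr=1e-4)

LOSS_THRESHOLD = 0.01  # the "predetermined threshold"; an assumed placeholder

def train_step(first_picture, tag):
    """One update on a (batch of) first target picture(s); returns the loss."""
    t, i, w = enc_t(first_picture), enc_i(first_picture), enc_w(first_picture)
    save_features(tag, t, i, w)                       # steps S3/S5: refresh library
    second_picture = dec(t, i, w)                     # step S4: restored picture
    loss = F.mse_loss(second_picture, first_picture)  # "distance" as loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Step S5.1: stop once the loss falls below the threshold, e.g.
# for tag, picture in pictures:
#     if train_step(picture, tag) < LOSS_THRESHOLD:
#         break
```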
In another embodiment of the present application, the method for enhancing data based on multiple environmental characteristics further includes:
and S6.1, storing the time, illumination and weather second feature map or vector of the third target picture to a feature library.
It will be appreciated that by continually updating the feature maps or vectors in the feature library, its diversity can be continually increased. Even after the Encoder and Decoder training is completed through steps S1 to S5, the feature library can keep being enriched during use.
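A sketch of this generation phase (steps S6 to S8) follows, reusing the modules and library above. The tag names are illustrative, and library feature maps are assumed to match the resolution of the picture's own feature maps.

```python
@torch.no_grad()
def enhance(third_picture, time_tag=None, illum_tag=None, weather_tag=None):
    """Steps S6-S8: encode the third target picture, optionally swap any of
    its second feature maps for library entries representing a different
    time, illumination or weather, then decode the enhanced picture."""
    t, i, w = enc_t(third_picture), enc_i(third_picture), enc_w(third_picture)
    save_features("third_picture", t, i, w)  # step S6.1: enrich the library
    if time_tag is not None:
        t = feature_library["time"][time_tag]           # swap time feature
    if illum_tag is not None:
        i = feature_library["illumination"][illum_tag]  # swap illumination
    if weather_tag is not None:
        w = feature_library["weather"][weather_tag]     # swap weather
    return dec(t, i, w)  # step S8: picture with the requested conditions

# e.g. a rainy, backlit variant of the same scene (tags are illustrative):
# augmented = enhance(picture, illum_tag="backlight", weather_tag="rain")
```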
Referring to fig. 2, the present application further provides a terminal including a processor for executing the multi-environment feature-based data enhancement method as described above.
It will be appreciated that the processor may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, and so on. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor; it is the control center of the terminal and connects the various parts of the whole terminal using various interfaces and lines.
If the terminal's integrated modules/units are implemented as software functional units and sold or used as separate products, they may be stored in a computer-readable storage medium. Based on such understanding, all or part of the flow of the methods of the above embodiments may also be implemented by a computer program, which may be stored in a computer-readable storage medium; when executed by a processor, the program implements the steps of the method embodiments. The computer program comprises computer program code, which may be in source-code form, object-code form, an executable file, some intermediate form, and so on. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), electrical carrier signals, telecommunication signals, a software distribution medium, and the like. It should be noted that the content of the computer-readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, computer-readable media exclude electrical carrier signals and telecommunication signals.
The above description is only exemplary of the present application and should not be taken as limiting the present application, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (8)

1. A data enhancement method based on multi-environment characteristics is characterized by comprising the following steps:
collecting a number of first target pictures with time, illumination and weather characteristics;
inputting the first target picture into the time, illumination, and weather Encoders respectively, extracting the corresponding time, illumination, and weather features, and outputting first feature maps or vectors of time, illumination, and weather;
saving the first feature maps or vectors of time, illumination, and weather of the first target picture to a feature library;
inputting the first feature maps or vectors of time, illumination, and weather of the first target picture into the corresponding time, illumination, and weather Decoder, which outputs a second target picture;
calculating the distance between the first target picture and the second target picture as the loss, and back-propagating it to update the weights of the Encoders and the Decoder as well as the first feature maps or vectors of time, illumination, and weather in the feature library;
inputting a third target picture into the updated time, illumination, and weather Encoders, extracting the corresponding time, illumination, and weather features, and outputting second feature maps or vectors of time, illumination, and weather;
replacing the second feature maps or vectors of time, illumination, and weather with first feature maps or vectors from the updated feature library that represent a different time, illumination, or weather;
outputting picture data with different time, illumination, and weather.
2. The multi-environment feature based data enhancement method of claim 1, wherein the first target picture is any picture without label data of time, illumination, and weather.
3. The multi-environment feature based data enhancement method according to claim 1, wherein the Encoders include a time Encoder for extracting the time features of the first target picture, an illumination Encoder for extracting the illumination features of the first target picture, and a weather Encoder for extracting the weather features of the first target picture.
4. The multi-environment feature based data enhancement method according to claim 3, wherein the feature library comprises a time feature library for maintaining the first feature map or vector of time of the first target picture, an illumination feature library for maintaining the first feature map or vector of illumination of the first target picture, and a weather feature library for maintaining the first feature map or vector of weather of the first target picture.
5. The multi-environment feature based data enhancement method of claim 1, further comprising the steps of:
after the distance between the first target picture and the second target picture is calculated as the loss, when the loss falls below a predetermined threshold, input of first target pictures ends, training of the Encoders and the Decoder ends, and updating of the feature maps or vectors in the time, illumination, and weather feature libraries ends.
6. The multi-environment feature based data enhancement method of claim 1, further comprising the steps of:
after the second feature maps or vectors are extracted, saving the second feature maps or vectors of time, illumination, and weather of the third target picture to the feature library.
7. A terminal, characterized in that the terminal comprises a processor for executing the multi-environment feature based data enhancement method according to any one of claims 1 to 6.
8. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the multi-environment feature based data enhancement method according to any one of claims 1 to 6.
CN202111340841.3A — priority date 2021-11-12, filing date 2021-11-12 — Data enhancement method based on multi-environment characteristics, terminal and storage medium — Pending — CN114067131A

Priority Applications (1)

CN202111340841.3A — priority date 2021-11-12, filing date 2021-11-12 — Data enhancement method based on multi-environment characteristics, terminal and storage medium

Applications Claiming Priority (1)

CN202111340841.3A — priority date 2021-11-12, filing date 2021-11-12 — Data enhancement method based on multi-environment characteristics, terminal and storage medium

Publications (1)

CN114067131A — published 2022-02-18

Family

ID=80271709

Family Applications (1)

CN202111340841.3A — priority date 2021-11-12, filing date 2021-11-12 — Pending — Data enhancement method based on multi-environment characteristics, terminal and storage medium

Country Status (1)

CN: CN114067131A


Legal Events

PB01 — Publication
SE01 — Entry into force of request for substantive examination