Iterative neural network batch normalization system
Technical Field
The invention relates to a neural network optimization technology, in particular to an iterative neural network batch normalization system.
Background
With the continuous advance of modern science and technology, intelligent devices are increasingly common in daily life and have gradually become an indispensable part of public life, greatly improving the convenience of residents' daily lives. Among these, the construction of artificial neural networks has had a particularly visible effect. Artificial neural networks exert their strength in many fields, whether in a management system for repeatability tests in a scientific research laboratory or in a license plate recognition instrument for tracking and snapshotting illegal vehicles at the roadside.
A good artificial neural network not only needs a huge database for data storage and learning, but also needs to continuously store newly identified picture information and update the database, thereby ensuring the integrity and sustainability of the network construction.
The artificial neural network training process is carried out in units of batches (mini-batches). The number of samples contained in a batch is fixed, but the samples in a batch are randomly selected from the data set and each sample belongs to at least one category; therefore, the proportion of sample categories differs from batch to batch.
The batch normalization method is a common component in artificial neural network training. In a typical batch normalization method, the mean and variance of each batch are calculated independently; the calculation for one batch has no connection with the calculation for the previous batch. If the sample distribution difference between two adjacent batches is large, convergence during training is slow.
To address this problem, the invention discloses an iterative neural network batch normalization method.
Disclosure of Invention
The invention aims to provide an iterative neural network batch normalization system to solve the above problems.
The technical scheme is as follows: an iterative neural network batch normalization system comprises a feature preprocessing unit, a neural network construction unit and a network batch normalization unit, and through the batch-normalized construction process of the neural network, can accelerate the convergence of the learning model and enhance the cost-effectiveness of neural network construction;
the feature preprocessing unit is used for applying an edge-detection preprocessing method to the picture to be identified, removing interference information in the background and retaining the main range to be detected in the picture;
the neural network construction unit is used for establishing an image database based on pictures from mainstream domestic websites, training a learning network for identifying image content, and further processing the picture information;
the method is characterized in that a network batch standardization unit improves a training mode of a neural network, and increases the connectivity between the mean value and the variance of two batches before and after training, so that the sample distribution difference between adjacent batches is reduced, the convergence rate in the training process is improved, and the method specifically comprises the following steps:
step 1, calculating the number distribution histogram H of each category sample in the first batch of samples 1 Calculating the mean value and variance of the class sample, and using the mean value and variance as the initial mean value of the neural network training modelA value and an initial variance value;
step 2, repeating the steps, and continuously calculating the quantity distribution histogram H of each category sample in the ith and (i + 1) th batch samples i+1 Calculating the mean value and the variance of the class sample, and calculating the histogram similarity of i and i + 1;
step 3, if the similarity between the category distribution histogram of the (i + 1) th batch of samples and the ith category distribution is greater than a threshold value theta, adopting the mean variance of the ith batch iteration as the mean value and the variance; if the similarity of the class distribution is smaller than theta, updating the mean variance;
and 4, step 4: and (5) carrying out correction batch normalization on the samples of the (i + 1) th batch.
According to an aspect of the present invention, the network batch normalization unit needs to define the corrected mean and corrected variance. By counting the number of samples of each category in the 1st batch, a sample-distribution histogram H_1 is obtained and used as the initial batch sample-distribution histogram; the mean μ_1 and variance σ_1² of the 1st batch of samples are calculated and used as the initial corrected mean μ̂_1 and corrected variance σ̂_1². The calculation formulas are:

μ̂_1 = μ_1 = (1/m)·Σ_{j=1..m} x_j,    σ̂_1² = σ_1² = (1/m)·Σ_{j=1..m} (x_j − μ_1)²,

wherein m is the number of samples in a batch and x_j is the j-th input value.
According to an aspect of the invention, the network batch normalization unit needs to compare the similarity of the sample-distribution histograms: the sample-distribution histogram H_{i+1} of the (i+1)-th batch (i > 0) is calculated, and the distribution similarity d(H_i, H_{i+1}) between H_{i+1} and the sample-distribution histogram H_i of the i-th batch is calculated, with the concrete formula:

d(H_i, H_{i+1}) = 1 − (1/N)·Σ_{b=1..N} |H_i(b) − H_{i+1}(b)| / max(H_i(b), H_{i+1}(b)),

wherein b denotes the b-th bar of the histogram and N is the total number of categories of all samples (a term is taken as 0 when both bars are empty); the larger the value of d(H_i, H_{i+1}), the higher the similarity, with a maximum of 1 and a minimum of 0.
According to one aspect of the invention, the network batch normalization unit presets a threshold θ for the category-distribution similarity; the threshold can be modified according to calculation requirements, within the range 0 < θ < 1. If the distribution similarity d_{i+1} of the (i+1)-th batch satisfies d_{i+1} > θ, the corrected mean μ̂_{i+1} and corrected variance σ̂²_{i+1} currently used for this batch are the same as those of the last batch, i.e.:

μ̂_{i+1} = μ̂_i,    σ̂²_{i+1} = σ̂²_i;

if d_{i+1} ≤ θ, the corrected mean and corrected variance are updated as:

μ̂_{i+1} = λ·μ̂_i + (1 − λ)·μ_{i+1},    σ̂²_{i+1} = λ·σ̂²_i + (1 − λ)·σ²_{i+1},

wherein λ is a set correction coefficient with value range 0 < λ < 1, and μ_{i+1} and σ²_{i+1} are the mean and variance of the (i+1)-th batch of samples.
According to one aspect of the invention, the mean and variance employed by the network batch normalization unit are the corrected mean and corrected variance; for an input value x_{i+1} at each layer of the neural network for the current batch of samples, the normalized input value obtained is:

x̂_{i+1} = (x_{i+1} − μ̂_{i+1}) / √(σ̂²_{i+1} + ε),

wherein ε is a constant floating-point number with the value 1.0 × 2⁻¹²⁶, introduced to keep the denominator nonzero.
According to an aspect of the present invention, the feature preprocessing unit performs edge preprocessing on the picture to be processed and directly ignores most detail information in the picture, such as shadows, color changes and texture changes, so as to ensure that the main scene in the picture is highlighted.
According to one aspect of the invention, the feature preprocessing unit can form its own feature distribution table by reclassifying the features of the preprocessed pictures in the database; the feature distribution table is consulted in advance when the main information of a picture is extracted, thereby improving the accuracy of picture feature identification.
Beneficial effects: when neural network training is carried out, the construction of the overall learning mode of each batch is taken into account, while the convergence rate of sample learning is enhanced by calculating the degree of difference between the mean and variance of adjacent batches of samples, effectively alleviating the problem of mean and variance jitter; the image preprocessing scheme reduces the burden of neural network construction and greatly improves the speed and quality of identifying the main features of an image; overall, the calculation method for the batch-normalized neural network is low in cost and highly practicable.
Drawings
FIG. 1 is a diagram illustrating the operation of the network batch normalization unit according to the present invention.
FIG. 2 is a schematic diagram of the data transfer between the units of the present invention.
FIG. 3 is a schematic flow diagram of the feature preprocessing unit of the present invention.
Detailed Description
As shown in fig. 2, in this embodiment, an iterative neural network batch normalization system includes a feature preprocessing unit, a neural network construction unit and a network batch normalization unit; through the batch-normalized construction process of the neural network, the convergence of the learning model can be accelerated and the cost-effectiveness of neural network construction enhanced;
the feature preprocessing unit is used for applying an edge-detection preprocessing method to the picture to be identified, removing interference information in the background and retaining the main range to be detected in the picture;
the neural network construction unit is used for establishing an image database based on pictures from mainstream domestic websites, training a learning network for identifying image content, and further processing the picture information;
the network batch normalization unit, as shown in fig. 1, improves the training mode of the neural network and strengthens the connection between the mean and variance of consecutive batches, thereby mitigating the effect of sample distribution differences between adjacent batches and improving the convergence rate during training; the specific steps are as follows:
step 1, calculating the number-distribution histogram H_1 of the category samples in the first batch, and calculating the mean and variance of the batch samples as the initial mean and initial variance of the neural network training model;
step 2, repeating the above step, continuously calculating the number-distribution histograms H_i and H_{i+1} of the category samples in the i-th and (i+1)-th batches, calculating the mean and variance of the batch samples, and calculating the similarity between histograms H_i and H_{i+1};
step 3, if the similarity between the category-distribution histogram of the (i+1)-th batch and that of the i-th batch is greater than a threshold θ, adopting the iterated mean and variance of the i-th batch as the mean and variance; if the similarity is not greater than θ, updating the corrected mean and variance;
step 4, carrying out corrected batch normalization on the samples of the (i+1)-th batch.
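For illustration only, the following minimal Python sketch walks through steps 1 to 4 for a stream of mini-batches; the exact similarity formula, the function names, and the values θ = 0.5 and λ = 0.9 are assumptions chosen for the example rather than values fixed by the embodiment:

```python
import numpy as np

def class_histogram(labels, num_classes):
    """Number-distribution histogram of the class labels in one batch."""
    counts = np.bincount(labels, minlength=num_classes).astype(np.float64)
    return counts / counts.sum()

def histogram_similarity(h_i, h_j):
    """Assumed form of d(H_i, H_{i+1}): 1 minus the mean per-bar relative
    difference; 1 for identical histograms, 0 for disjoint ones."""
    den = np.maximum(h_i, h_j)
    diff = np.divide(np.abs(h_i - h_j), den,
                     out=np.zeros_like(den), where=den > 0)
    return 1.0 - diff.mean()

def corrected_batch_norm(batches, num_classes, theta=0.5, lam=0.9):
    """Steps 1-4: yield each batch normalized with the corrected statistics."""
    eps = np.float32(2.0 ** -126)      # the constant epsilon from the text
    mu_hat = var_hat = h_prev = None
    for x, y in batches:               # x: feature array, y: integer labels
        h = class_histogram(y, num_classes)
        mu, var = x.mean(), x.var()
        if mu_hat is None:             # step 1: first batch initializes
            mu_hat, var_hat = mu, var
        elif histogram_similarity(h_prev, h) <= theta:
            # step 3: distributions differ -> blend in the new statistics
            mu_hat = lam * mu_hat + (1 - lam) * mu
            var_hat = lam * var_hat + (1 - lam) * var
        # step 3, other branch: distributions similar -> reuse previous values
        h_prev = h
        yield (x - mu_hat) / np.sqrt(var_hat + eps)   # step 4: normalize
```

In practice the generator would be fed mini-batches drawn at random from the image database, and θ and λ would be tuned to the construction requirements described below.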
In a further embodiment, the network batch normalization unit needs to define the corrected mean and corrected variance. By counting the number of samples of each category in the 1st batch, a sample-distribution histogram H_1 is obtained and used as the initial batch sample-distribution histogram; the mean μ_1 and variance σ_1² of the 1st batch of samples are calculated and used as the initial corrected mean μ̂_1 and corrected variance σ̂_1². The calculation formulas are:

μ̂_1 = μ_1 = (1/m)·Σ_{j=1..m} x_j,    σ̂_1² = σ_1² = (1/m)·Σ_{j=1..m} (x_j − μ_1)²,

wherein m is the number of samples in a batch and x_j is the j-th input value.
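A short sketch of this initialization (step 1), under the assumption of a synthetic first batch with 64 samples, 128 features and N = 10 categories:

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=(64, 128))               # hypothetical first batch of features
y1 = rng.integers(0, 10, size=64)             # hypothetical labels, N = 10 categories

h1 = np.bincount(y1, minlength=10) / len(y1)  # H_1: initial distribution histogram
mu_hat = x1.mean()                            # corrected mean  <- mu_1
var_hat = x1.var()                            # corrected variance <- sigma_1^2
```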
in a further embodiment, the network lot size normalization unit is required to compare the distribution histograms of the samples in the similarity maps by calculating the distribution histogram H of the sample of the (i + 1) th lot (i > 0) i+1 Is prepared from H i+1 Sample distribution histogram H of ith batch i Calculating distribution similarity d (H) i ,H i+1 ) The concrete formula is as follows:
wherein the content of the first and second substances,
b is the b-th bar of the histogram, N is the total number of classes of all samples; d (H) i ,H i+1 ) The larger the value of (b), the higher the similarity, the maximum value is 1, and the minimum value is 0.
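The following sketch implements the similarity measure in the assumed form above and checks its two extreme values:

```python
import numpy as np

def histogram_similarity(h_i, h_j):
    """Assumed form of d(H_i, H_{i+1}): 1 minus the mean per-bar relative
    difference; equals 1 for identical histograms, 0 for disjoint ones."""
    den = np.maximum(h_i, h_j)
    diff = np.divide(np.abs(h_i - h_j), den,
                     out=np.zeros_like(den), where=den > 0)
    return 1.0 - diff.mean()

print(histogram_similarity(np.array([0.5, 0.5]), np.array([0.5, 0.5])))  # 1.0
print(histogram_similarity(np.array([1.0, 0.0]), np.array([0.0, 1.0])))  # 0.0
```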
In a further embodiment, the network batch normalization unit presets a threshold θ for the category-distribution similarity; the threshold can be modified according to calculation requirements, within the range 0 < θ < 1. If the distribution similarity d_{i+1} of the (i+1)-th batch satisfies d_{i+1} > θ, the corrected mean μ̂_{i+1} and corrected variance σ̂²_{i+1} currently used for this batch are the same as those of the last batch, i.e.:

μ̂_{i+1} = μ̂_i,    σ̂²_{i+1} = σ̂²_i;

if d_{i+1} ≤ θ, the corrected mean and corrected variance are updated as:

μ̂_{i+1} = λ·μ̂_i + (1 − λ)·μ_{i+1},    σ̂²_{i+1} = λ·σ̂²_i + (1 − λ)·σ²_{i+1},

wherein λ is a set correction coefficient with value range 0 < λ < 1, and μ_{i+1} and σ²_{i+1} are the mean and variance of the (i+1)-th batch of samples.
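A minimal sketch of this decision rule; the function name is illustrative, and θ and λ are passed in as parameters since the embodiment leaves their exact values open:

```python
def update_corrected_stats(mu_hat, var_hat, mu_new, var_new, d, theta, lam):
    """Step 3: keep the running statistics if the class distributions are
    similar (d > theta); otherwise blend in the new batch's statistics."""
    if d > theta:
        return mu_hat, var_hat                      # reuse last batch's values
    return (lam * mu_hat + (1 - lam) * mu_new,      # corrected mean update
            lam * var_hat + (1 - lam) * var_new)    # corrected variance update
```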
In a further embodiment, the mean and variance employed by the network batch normalization unit are the corrected mean and corrected variance; for an input value x_{i+1} at each layer of the neural network for the current batch of samples, the normalized input value obtained is:

x̂_{i+1} = (x_{i+1} − μ̂_{i+1}) / √(σ̂²_{i+1} + ε),

wherein ε is a constant floating-point number with the value 1.0 × 2⁻¹²⁶, introduced to keep the denominator nonzero.
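A sketch of the normalization step itself, using the ε value given above (the array-based interface is an assumption of the example):

```python
import numpy as np

EPS = np.float32(2.0 ** -126)  # the constant 1.0 x 2^-126 from the text

def normalize_inputs(x, mu_hat, var_hat):
    """Normalized layer inputs for the current batch, computed from the
    corrected mean and corrected variance; EPS keeps the denominator nonzero."""
    return (x - mu_hat) / np.sqrt(var_hat + EPS)
```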
In a further embodiment, as shown in fig. 3, the feature preprocessing unit performs edge preprocessing on the picture to be processed and directly ignores most detail information in the picture, such as shadows, color changes and texture changes, so as to ensure that the main scene in the picture is highlighted.
In a further embodiment, when processing the background interference information of an input picture, the degree to which background information is retained can be changed by manually adjusting a threshold; if it is known before processing that color information or brightness-change information is indispensable for subsequent processing, the degree of preprocessing can be reduced, or the preprocessing step can be skipped entirely and the picture passed directly into the neural network for subsequent recognition.
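As one possible realization of this preprocessing, the sketch below uses the Canny edge detector from OpenCV; the two Canny thresholds play the role of the manually adjustable threshold described above, and the function name and default values are assumptions of the example:

```python
import cv2

def edge_preprocess(img_bgr, low=100, high=200):
    """One possible edge-detection preprocessing pass: suppress shadow,
    color and texture detail, keeping the main contours of the scene."""
    gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)   # drop color information
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)        # smooth fine texture
    return cv2.Canny(blurred, low, high)               # keep the main edges
```

Raising `low` and `high` discards more background detail; to skip preprocessing entirely, the original picture is passed to the network unchanged.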
In a further embodiment, the feature preprocessing unit may form its own feature distribution table by reclassifying the features of the preprocessed pictures in the database, and the feature distribution table is consulted in advance when the main information of a picture is extracted, thereby improving the accuracy of picture feature identification.
In a further embodiment, the feature distribution table may directly use the general classification of mainstream websites, or may be customized for the application field of the artificial neural network.
In summary, the present invention has the following advantages: by calculating the degree of difference between the mean and variance of two adjacent batches of samples in the neural network, and allowing the threshold setting to be changed according to construction requirements, the convergence of the neural network during learning is accelerated, achieving the effect of batch-normalized management of the neural network; the image preprocessing method avoids the interference of detail information in the image with subject-feature identification, further enhancing the accuracy and speed of image subject classification. In general, the training mode of traditional neural network construction is improved; by normalizing the training process in batches, the cost of neural network construction is reduced and the training speed of the neural network is increased.
It should be noted that the various features described in the above embodiments may be combined in any suitable manner without departing from the scope of the invention. To avoid unnecessary repetition, such possible combinations are not separately described.