CN111951189B - Data enhancement method for multi-scale texture randomization - Google Patents
Data enhancement method for multi-scale texture randomization
- Publication number
- CN111951189B (application CN202010813012.1A)
- Authority
- CN
- China
- Prior art keywords
- img
- sample
- image
- frame
- samples
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/90—Dynamic range modification of images or parts thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4038—Image mosaicing, e.g. composing plane images from plane sub-images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4084—Scaling of whole images or parts thereof, e.g. expanding or contracting in the transform domain, e.g. fast Fourier transform [FFT] domain scaling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/40—Analysis of texture
- G06T7/41—Analysis of texture based on statistical description of texture
- G06T7/46—Analysis of texture based on statistical description of texture using random fields
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2200/00—Indexing scheme for image data processing or generation, in general
- G06T2200/32—Indexing scheme for image data processing or generation, in general involving image mosaicing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30204—Marker
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Data Mining & Analysis (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Probability & Statistics with Applications (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a data enhancement method with multi-scale texture randomization. 4 random training samples are stitched into one output sample; the output sample retains the features of the 4 training samples, which enriches sample features and prevents overfitting during training. Texture mask boxes are then added at random, and the overlap between each texture mask box and the annotation boxes of the sample is checked: a mask box is kept if its overlap is smaller than a set threshold, so that annotation boxes lying in overlapping regions are processed accordingly. By applying multi-scale texture-randomization data enhancement to the training samples, the invention improves the preprocessing of training data for object detection tasks and thereby improves recognition and detection performance.
Description
Technical Field
The invention relates to the field of artificial intelligence, in particular to the training-data preprocessing stage of typical computer-vision object detection tasks, and specifically to a data enhancement method with multi-scale texture randomization.
Background
Real application scenes often contain a large number of occlusions. In the training samples of an object detection task, many annotated targets overlap one another, so occluded targets exist and carry partial features of other targets during training, which degrades recognition and detection performance.
The paper "Improved Regularization of Convolutional Neural Networks with Cutout" (https://arxiv.org/abs/1708.04552) proposes a data enhancement method that crops a region of a certain size at a random location of an image. The method adds occluded samples to training as far as possible, but cannot handle well the case where the training samples already contain a large number of occluded targets.
The paper "mixup: Beyond Empirical Risk Minimization" (https://arxiv.org/abs/1710.09412) proposes a data enhancement method that randomly selects two pictures and superimposes them. In an object detection task, it does not effectively distinguish samples by the overlap proportion of overlapping regions. The method aims to generate more samples by combination, and likewise cannot handle well the case where the training samples contain a large number of occluded targets.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a multi-scale texture randomization data enhancement method that enriches sample features, fills sample regions with random texture mask boxes, and improves the preprocessing of training data for object detection tasks.
In order to solve this technical problem, the technical scheme adopted by the invention is as follows:
A multi-scale texture randomization data enhancement method comprises the following steps:
S01) Select N training samples P = {P0, P1, …, PN-1} and the annotation-box information T = {[X0, Y0, W0, H0, L0], [X1, Y1, W1, H1, L1], …, [XN-1, YN-1, WN-1, HN-1, LN-1]} corresponding to the N training samples;
wherein P contains the picture information; for example, P0 contains (img0, img_w0, img_h0), where img0 is the image of P0, img_w0 is the width of P0 and img_h0 is the height of P0, and P1, …, PN-1 are defined likewise;
T contains the annotation information of the images; [X, Y, W, H, L] denotes one set of annotation-box information, where (X, Y) is the upper-left corner of the box, W its width, H its height and L the category of the box;
S02) Randomly select 4 samples and their corresponding annotation-box information from the training sample set; denote the 4 samples as Ptl, Ptr, Pbl, Pbr and the corresponding annotation-box information as [Xtl, Ytl, Wtl, Htl, Ltl], [Xtr, Ytr, Wtr, Htr, Ltr], [Xbl, Ybl, Wbl, Hbl, Lbl], [Xbr, Ybr, Wbr, Hbr, Lbr];
S03) Randomly generate 4 scaling factors, i.e. generate S = [s0, s1, s2, s3], where each s lies in the range [0.5, 1.0];
S04) Let the sample after data enhancement be Pout, with image information (img, img_w, img_h), where img is the data-enhanced image of Pout, img_w is the width of Pout and img_h is the height of Pout;
S05) Let the center-point coordinates of the data-enhanced sample Pout be (xc, yc); then
xc = img_w / 2 + b (1),
yc = img_h / 2 + b (2),
wherein b ∈ [-(img_w + img_h)/16, (img_w + img_h)/16];
S06) For input sample Ptl, multiply the image imgtl by the scaling factor s0 to change the image scale and denote the output image Ptl0; similarly, scale the other three input samples to obtain output images Ptr1, Pbl2, Pbr3; the output sample Pout is the multi-scale stitching of Ptl0, Ptr1, Pbl2, Pbr3, and the correspondingly converted annotation-box information is denoted Tout;
S07) Randomly generate n texture mask boxes of different sizes, shapes and colors, denoted Mask = [m0, m1, …, mn-1];
S08) Compute the overlap between each generated mask box and the annotation boxes of the output sample, denoted Overlap = [o0, o1, …, on-1];
S09) Suppose a randomly generated mask box mi overlaps an annotation box tj of the output sample, with overlap area areai; then oi = areai / (wj * hj), where the position information of annotation box tj is [xj, yj, wj, hj];
S10) When the overlap value oi is greater than the threshold ost, delete the mask box mi.
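For illustration, steps S01)-S06) can be sketched in Python roughly as follows. This is a minimal sketch assuming NumPy and OpenCV; the function name mosaic_inputs and the data layout (a list of (image, boxes) pairs) are illustrative assumptions, not part of the method's definition:

```python
# Minimal sketch of steps S01)-S06); assumes NumPy and OpenCV are installed.
import random
import numpy as np
import cv2

def mosaic_inputs(samples, img_w, img_h):
    """samples: list of (img, boxes); boxes is a float array of rows [X, Y, W, H, L]."""
    picks = random.sample(samples, 4)                      # S02) Ptl, Ptr, Pbl, Pbr
    scales = [random.uniform(0.5, 1.0) for _ in range(4)]  # S03) s0..s3 in [0.5, 1.0]
    scaled = []
    for (im, boxes), s in zip(picks, scales):              # S06) rescale images and boxes
        im_s = cv2.resize(im, None, fx=s, fy=s)
        boxes_s = boxes * np.array([s, s, s, s, 1.0])      # category column L is untouched
        scaled.append((im_s, boxes_s))
    canvas = np.zeros((img_h, img_w, 3), np.uint8)         # S04) empty Pout canvas
    r = (img_w + img_h) / 16.0                             # S05) jitter range for b
    b = random.uniform(-r, r)
    xc, yc = int(img_w / 2 + b), int(img_h / 2 + b)
    return canvas, scaled, (xc, yc)
```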
Further, stitching Ptl0, Ptr1, Pbl2, Pbr3 at different scales to form the output sample Pout comprises the following steps:
S61) Place a partial region of the Ptl0 sample at the upper-left corner of Pout; the conversion formulas are:
x1a = max(xc - imgtl0_w, 0),
y1a = max(yc - imgtl0_h, 0),
x2a = xc,
y2a = yc,
x1b = imgtl0_w - (x2a - x1a),
y1b = imgtl0_h - (y2a - y1a),
x2b = imgtl0_w,
y2b = imgtl0_h,
img[y1a:y2a, x1a:x2a] = imgtl0[y1b:y2b, x1b:x2b];
S62) Place a partial region of the Ptr1 sample at the upper-right corner of Pout; the conversion formulas are:
x1a = xc,
y1a = max(yc - imgtr1_h, 0),
x2a = min(xc + imgtr1_w, img_w),
y2a = yc,
x1b = 0,
y1b = imgtr1_h - (y2a - y1a),
x2b = min(imgtr1_w, x2a - x1a),
y2b = imgtr1_h,
img[y1a:y2a, x1a:x2a] = imgtr1[y1b:y2b, x1b:x2b];
S63) Place a partial region of the Pbl2 sample at the lower-left corner of Pout; the conversion formulas are:
x1a = max(xc - imgbl2_w, 0),
y1a = yc,
x2a = xc,
y2a = min(img_h, yc + imgbl2_h),
x1b = imgbl2_w - (x2a - x1a),
y1b = 0,
x2b = max(xc, imgbl2_w),
y2b = min(y2a - y1a, imgbl2_h),
img[y1a:y2a, x1a:x2a] = imgbl2[y1b:y2b, x1b:x2b];
S64) Place a partial region of the Pbr3 sample at the lower-right corner of Pout; the conversion formulas are:
x1a = xc,
y1a = yc,
x2a = min(xc + imgbr3_w, img_w),
y2a = min(img_h, yc + imgbr3_h),
x1b = 0,
y1b = 0,
x2b = min(imgbr3_w, x2a - x1a),
y2b = min(y2a - y1a, imgbr3_h),
img[y1a:y2a, x1a:x2a] = imgbr3[y1b:y2b, x1b:x2b].
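For readers implementing the method, the four placement steps translate directly into NumPy slice assignments. The sketch below transcribes the formulas of S61)-S64) verbatim; the function name paste_quadrants and the quads dictionary are illustrative assumptions. It relies on NumPy clipping slice stops at array bounds, which keeps the source and target regions the same size:

```python
# Direct transcription of the placement formulas S61)-S64) into NumPy slicing;
# `canvas` is Pout's pixel array, `quads` maps quadrant names to scaled images.
import numpy as np

def paste_quadrants(canvas, quads, xc, yc):
    """quads: dict with keys 'tl', 'tr', 'bl', 'br' mapping to HxWx3 uint8 arrays."""
    img_h, img_w = canvas.shape[:2]
    for key, im in quads.items():
        h, w = im.shape[:2]
        if key == 'tl':    # S61) upper-left quadrant
            x1a, y1a, x2a, y2a = max(xc - w, 0), max(yc - h, 0), xc, yc
            x1b, y1b, x2b, y2b = w - (x2a - x1a), h - (y2a - y1a), w, h
        elif key == 'tr':  # S62) upper-right quadrant
            x1a, y1a, x2a, y2a = xc, max(yc - h, 0), min(xc + w, img_w), yc
            x1b, y1b, x2b, y2b = 0, h - (y2a - y1a), min(w, x2a - x1a), h
        elif key == 'bl':  # S63) lower-left quadrant
            x1a, y1a, x2a, y2a = max(xc - w, 0), yc, xc, min(img_h, yc + h)
            x1b, y1b, x2b, y2b = w - (x2a - x1a), 0, max(xc, w), min(y2a - y1a, h)
        else:              # S64) lower-right quadrant
            x1a, y1a, x2a, y2a = xc, yc, min(xc + w, img_w), min(img_h, yc + h)
            x1b, y1b, x2b, y2b = 0, 0, min(w, x2a - x1a), min(y2a - y1a, h)
        canvas[y1a:y2a, x1a:x2a] = im[y1b:y2b, x1b:x2b]  # paste the cropped region
    return canvas
```

Combined with the previous sketch, paste_quadrants(canvas, {'tl': scaled[0][0], 'tr': scaled[1][0], 'bl': scaled[2][0], 'br': scaled[3][0]}, xc, yc) yields the stitched image of Pout; the annotation boxes in Tout are shifted by the same offsets.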
Further, the value of ost is chosen to be 0.5.
The beneficial effects of the invention are as follows: 4 random training samples are stitched into one output sample that retains the features of all 4 training samples, which enriches sample features and prevents overfitting during training; texture mask boxes are added at random, the overlap between each texture mask box and the annotation boxes of the sample is checked, and a mask box is kept if its overlap is smaller than the set threshold, so that annotation boxes lying in overlapping regions are processed accordingly. By applying multi-scale texture-randomization data enhancement to the training samples, the invention improves the preprocessing of training data for object detection tasks and thereby improves recognition and detection performance.
Drawings
Fig. 1 is a schematic diagram of the output sample formed by stitching 4 training samples in Embodiment 1.
Detailed Description
The invention is further described below with reference to the drawings and a specific embodiment.
Embodiment 1
This embodiment discloses a data enhancement method with multi-scale texture randomization, comprising the following steps:
S01) Select N training samples P = {P0, P1, …, PN-1} and the annotation-box information T = {[X0, Y0, W0, H0, L0], [X1, Y1, W1, H1, L1], …, [XN-1, YN-1, WN-1, HN-1, LN-1]} corresponding to the N training samples;
wherein P contains the picture information; for example, P0 contains (img0, img_w0, img_h0), where img0 is the image of P0, img_w0 is the width of P0 and img_h0 is the height of P0, and P1, …, PN-1 are defined likewise;
T contains the annotation information of the images; as shown in FIG. 1, one training sample, i.e. one picture, may contain several annotation boxes, so [X, Y, W, H, L] denotes one set of annotation-box information, where (X, Y) is the upper-left corner of the box, W its width, H its height and L the category of the box;
S02) Randomly select 4 samples and their corresponding annotation-box information from the training sample set; denote the 4 samples as Ptl, Ptr, Pbl, Pbr and the corresponding annotation-box information as [Xtl, Ytl, Wtl, Htl, Ltl], [Xtr, Ytr, Wtr, Htr, Ltr], [Xbl, Ybl, Wbl, Hbl, Lbl], [Xbr, Ybr, Wbr, Hbr, Lbr];
S03) Randomly generate 4 scaling factors, i.e. generate S = [s0, s1, s2, s3], where each s lies in the range [0.5, 1.0], i.e. s ∈ [0.5, 1.0];
S04) Let the sample after data enhancement be Pout, with image information (img, img_w, img_h), where img is the data-enhanced image of Pout, img_w is the width of Pout and img_h is the height of Pout;
S05) Let the center-point coordinates of the data-enhanced sample Pout be (xc, yc); then
xc = img_w / 2 + b (1),
yc = img_h / 2 + b (2),
wherein b ∈ [-(img_w + img_h)/16, (img_w + img_h)/16];
S06) For input sample Ptl, multiply the image imgtl by the scaling factor s0 to change the image scale and denote the output image Ptl0; similarly, scale the other three input samples to obtain output images Ptr1, Pbl2, Pbr3; the output sample Pout is the multi-scale stitching of Ptl0, Ptr1, Pbl2, Pbr3, and the correspondingly converted annotation-box information is denoted Tout;
S07) Randomly generate n texture mask boxes of different sizes, shapes and colors, denoted Mask = [m0, m1, …, mn-1];
S08) Compute the overlap between each generated mask box and the annotation boxes of the output sample, denoted Overlap = [o0, o1, …, on-1];
S09) Suppose a randomly generated mask box mi overlaps an annotation box tj of the output sample, with overlap area areai; then oi = areai / (wj * hj), where the position information of annotation box tj is [xj, yj, wj, hj];
S10) When the overlap value oi is greater than the threshold ost, delete the mask box mi; the value of ost is generally chosen to be 0.5.
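Steps S07)-S10) amount to rejection sampling of mask boxes against the annotation boxes. The following is a hedged sketch using rectangular masks only (the method also allows different shapes); the size bounds and helper names are illustrative assumptions:

```python
# Sketch of steps S07)-S10): generate random solid-color mask boxes and keep a
# box only while its overlap ratio with every annotation box stays <= ost.
# Assumes img_w, img_h >= 32 so the illustrative size bounds are valid.
import random

def overlap_ratio(mask_box, ann_box):
    """mask_box/ann_box: (x, y, w, h). Returns o_i = area_i / (w_j * h_j)."""
    mx, my, mw, mh = mask_box
    ax, ay, aw, ah = ann_box
    ix = max(0, min(mx + mw, ax + aw) - max(mx, ax))  # intersection width
    iy = max(0, min(my + mh, ay + ah) - max(my, ay))  # intersection height
    return (ix * iy) / float(aw * ah)

def random_masks(img_w, img_h, ann_boxes, n, ost=0.5):
    kept = []
    for _ in range(n):
        # S07) random size, position and color (rectangles only in this sketch)
        mw, mh = random.randint(8, img_w // 4), random.randint(8, img_h // 4)
        mx, my = random.randint(0, img_w - mw), random.randint(0, img_h - mh)
        box = (mx, my, mw, mh)
        color = tuple(random.randint(0, 255) for _ in range(3))
        # S08)-S10) delete the mask box if any overlap ratio exceeds ost
        if all(overlap_ratio(box, ab) <= ost for ab in ann_boxes):
            kept.append((box, color))
    return kept
```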
As shown in FIG. 1, stitching Ptl0, Ptr1, Pbl2, Pbr3 at different scales to form the output sample Pout comprises the following steps:
S61) Place a partial region of the Ptl0 sample at the upper-left corner of Pout; the conversion formulas are:
x1a = max(xc - imgtl0_w, 0),
y1a = max(yc - imgtl0_h, 0),
x2a = xc,
y2a = yc,
x1b = imgtl0_w - (x2a - x1a),
y1b = imgtl0_h - (y2a - y1a),
x2b = imgtl0_w,
y2b = imgtl0_h,
img[y1a:y2a, x1a:x2a] = imgtl0[y1b:y2b, x1b:x2b];
where (x1a, y1a) and (x2a, y2a) are the coordinates of the upper-left and lower-right corners of part A1 of image Pout, and (x1b, y1b) and (x2b, y2b) are the coordinates of the upper-left and lower-right corners of part A2 of image Ptl0; the last formula maps part A2 of image Ptl0 onto part A1 of image Pout.
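As an illustrative numerical check (the values are chosen for this example, not prescribed by the method): with img_w = img_h = 640 and b = 0, formulas (1) and (2) give (xc, yc) = (320, 320); for a scaled image imgtl0 of width 480 and height 360, S61) yields x1a = max(320 - 480, 0) = 0, y1a = max(320 - 360, 0) = 0, (x2a, y2a) = (320, 320), x1b = 480 - 320 = 160, y1b = 360 - 320 = 40 and (x2b, y2b) = (480, 360), so the bottom-right 320x320 region imgtl0[40:360, 160:480] fills the upper-left quadrant img[0:320, 0:320] of Pout.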
S62) Place a partial region of the Ptr1 sample at the upper-right corner of Pout; the conversion formulas are:
x1a = xc,
y1a = max(yc - imgtr1_h, 0),
x2a = min(xc + imgtr1_w, img_w),
y2a = yc,
x1b = 0,
y1b = imgtr1_h - (y2a - y1a),
x2b = min(imgtr1_w, x2a - x1a),
y2b = imgtr1_h,
img[y1a:y2a, x1a:x2a] = imgtr1[y1b:y2b, x1b:x2b];
where (x1a, y1a) and (x2a, y2a) are the coordinates of the upper-left and lower-right corners of part B1 of image Pout, and (x1b, y1b) and (x2b, y2b) are the coordinates of the upper-left and lower-right corners of part B2 of image Ptr1; the last formula maps part B2 of image Ptr1 onto part B1 of image Pout.
S63) Place a partial region of the Pbl2 sample at the lower-left corner of Pout; the conversion formulas are:
x1a = max(xc - imgbl2_w, 0),
y1a = yc,
x2a = xc,
y2a = min(img_h, yc + imgbl2_h),
x1b = imgbl2_w - (x2a - x1a),
y1b = 0,
x2b = max(xc, imgbl2_w),
y2b = min(y2a - y1a, imgbl2_h),
img[y1a:y2a, x1a:x2a] = imgbl2[y1b:y2b, x1b:x2b];
where (x1a, y1a) and (x2a, y2a) are the coordinates of the upper-left and lower-right corners of part C1 of image Pout, and (x1b, y1b) and (x2b, y2b) are the coordinates of the upper-left and lower-right corners of part C2 of image Pbl2; the last formula maps part C2 of image Pbl2 onto part C1 of image Pout.
S64) Place a partial region of the Pbr3 sample at the lower-right corner of Pout; the conversion formulas are:
x1a = xc,
y1a = yc,
x2a = min(xc + imgbr3_w, img_w),
y2a = min(img_h, yc + imgbr3_h),
x1b = 0,
y1b = 0,
x2b = min(imgbr3_w, x2a - x1a),
y2b = min(y2a - y1a, imgbr3_h),
img[y1a:y2a, x1a:x2a] = imgbr3[y1b:y2b, x1b:x2b];
where (x1a, y1a) and (x2a, y2a) are the coordinates of the upper-left and lower-right corners of part D1 of image Pout, and (x1b, y1b) and (x2b, y2b) are the coordinates of the upper-left and lower-right corners of part D2 of image Pbr3; the last formula maps part D2 of image Pbr3 onto part D1 of image Pout.
According to the invention, 4 random training samples are stitched into one output sample that retains the features of all 4 training samples, which enriches sample features and prevents overfitting during training; texture mask boxes are added at random, the overlap between each texture mask box and the annotation boxes of the sample is checked, and a mask box is kept if its overlap is smaller than the set threshold, so that annotation boxes lying in overlapping regions are processed accordingly. By applying multi-scale texture-randomization data enhancement to the training samples, the invention improves the preprocessing of training data for object detection tasks and thereby improves recognition and detection performance.
The foregoing description sets out only the general principles and preferred embodiments of the present invention; modifications and substitutions made by those skilled in the art within the spirit of the invention fall within its scope of protection.
Claims (3)
1. A multi-scale texture randomization data enhancement method, characterized in that it comprises the following steps:
S01) Select N training samples P = {P0, P1, …, PN-1} and the annotation-box information T = {[X0, Y0, W0, H0, L0], [X1, Y1, W1, H1, L1], …, [XN-1, YN-1, WN-1, HN-1, LN-1]} corresponding to the N training samples;
wherein P contains the picture information; for example, P0 contains (img0, img_w0, img_h0), where img0 is the image of P0, img_w0 is the width of P0 and img_h0 is the height of P0, and P1, …, PN-1 are defined likewise;
T contains the annotation information of the images; [X, Y, W, H, L] denotes one set of annotation-box information, where (X, Y) is the upper-left corner of the box, W its width, H its height and L the category of the box;
S02) Randomly select 4 samples and their corresponding annotation-box information from the training sample set; denote the 4 samples as Ptl, Ptr, Pbl, Pbr and the corresponding annotation-box information as [Xtl, Ytl, Wtl, Htl, Ltl], [Xtr, Ytr, Wtr, Htr, Ltr], [Xbl, Ybl, Wbl, Hbl, Lbl], [Xbr, Ybr, Wbr, Hbr, Lbr];
S03) Randomly generate 4 scaling factors, i.e. generate S = [s0, s1, s2, s3], where each s lies in the range [0.5, 1.0];
S04) Let the sample after data enhancement be Pout, with image information (img, img_w, img_h), where img is the data-enhanced image of Pout, img_w is the width of Pout and img_h is the height of Pout;
S05) Let the center-point coordinates of the data-enhanced sample Pout be (xc, yc); then
xc = img_w / 2 + b (1),
yc = img_h / 2 + b (2),
wherein b ∈ [-(img_w + img_h)/16, (img_w + img_h)/16];
S06) For input sample Ptl, multiply the image imgtl by the scaling factor s0 to change the image scale and denote the output image Ptl0; similarly, scale the other three input samples to obtain output images Ptr1, Pbl2, Pbr3; the output sample Pout is the multi-scale stitching of Ptl0, Ptr1, Pbl2, Pbr3, and the correspondingly converted annotation-box information is denoted Tout;
S07) Randomly generate n texture mask boxes of different sizes, shapes and colors, denoted Mask = [m0, m1, …, mn-1];
S08) Compute the overlap between each generated mask box and the annotation boxes of the output sample, denoted Overlap = [o0, o1, …, on-1];
S09) Suppose a randomly generated mask box mi overlaps an annotation box tj of the output sample, with overlap area areai; then oi = areai / (wj * hj), where the position information of annotation box tj is [xj, yj, wj, hj];
S10) When the overlap value oi is greater than the threshold ost, delete the mask box mi.
2. The multi-scale texture randomization data enhancement method according to claim 1, characterized in that stitching Ptl0, Ptr1, Pbl2, Pbr3 at different scales to form the output sample Pout comprises the following steps:
S61) Place a partial region of the Ptl0 sample at the upper-left corner of Pout; the conversion formulas are:
x1a = max(xc - imgtl0_w, 0),
y1a = max(yc - imgtl0_h, 0),
x2a = xc,
y2a = yc,
x1b = imgtl0_w - (x2a - x1a),
y1b = imgtl0_h - (y2a - y1a),
x2b = imgtl0_w,
y2b = imgtl0_h,
img[y1a:y2a, x1a:x2a] = imgtl0[y1b:y2b, x1b:x2b];
S62) Place a partial region of the Ptr1 sample at the upper-right corner of Pout; the conversion formulas are:
x1a = xc,
y1a = max(yc - imgtr1_h, 0),
x2a = min(xc + imgtr1_w, img_w),
y2a = yc,
x1b = 0,
y1b = imgtr1_h - (y2a - y1a),
x2b = min(imgtr1_w, x2a - x1a),
y2b = imgtr1_h,
img[y1a:y2a, x1a:x2a] = imgtr1[y1b:y2b, x1b:x2b];
S63) Place a partial region of the Pbl2 sample at the lower-left corner of Pout; the conversion formulas are:
x1a = max(xc - imgbl2_w, 0),
y1a = yc,
x2a = xc,
y2a = min(img_h, yc + imgbl2_h),
x1b = imgbl2_w - (x2a - x1a),
y1b = 0,
x2b = max(xc, imgbl2_w),
y2b = min(y2a - y1a, imgbl2_h),
img[y1a:y2a, x1a:x2a] = imgbl2[y1b:y2b, x1b:x2b];
S64) Place a partial region of the Pbr3 sample at the lower-right corner of Pout; the conversion formulas are:
x1a = xc,
y1a = yc,
x2a = min(xc + imgbr3_w, img_w),
y2a = min(img_h, yc + imgbr3_h),
x1b = 0,
y1b = 0,
x2b = min(imgbr3_w, x2a - x1a),
y2b = min(y2a - y1a, imgbr3_h),
img[y1a:y2a, x1a:x2a] = imgbr3[y1b:y2b, x1b:x2b].
3. The multi-scale texture randomization data enhancement method according to claim 1, characterized in that the value of ost is chosen to be 0.5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010813012.1A CN111951189B (en) | 2020-08-13 | 2020-08-13 | Data enhancement method for multi-scale texture randomization |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010813012.1A CN111951189B (en) | 2020-08-13 | 2020-08-13 | Data enhancement method for multi-scale texture randomization |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111951189A CN111951189A (en) | 2020-11-17 |
CN111951189B true CN111951189B (en) | 2022-05-06 |
Family
ID=73341855
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010813012.1A Active CN111951189B (en) | 2020-08-13 | 2020-08-13 | Data enhancement method for multi-scale texture randomization |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111951189B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112348744B (en) * | 2020-11-24 | 2022-07-01 | 电子科技大学 | Data enhancement method based on thumbnail |
CN112967187B (en) * | 2021-02-25 | 2024-05-31 | 深圳海翼智新科技有限公司 | Method and apparatus for target detection |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107085609A (en) * | 2017-04-24 | 2017-08-22 | 国网湖北省电力公司荆州供电公司 | A pedestrian retrieval method performing multi-feature fusion based on a neural network
CN108921817A (en) * | 2018-05-24 | 2018-11-30 | 浙江工业大学 | A data enhancement method for skin disease images
GB201918431D0 (en) * | 2019-12-13 | 2020-01-29 | British Broadcasting Corp | Video encoding and video decoding |
CN111179253A (en) * | 2019-12-30 | 2020-05-19 | 歌尔股份有限公司 | Product defect detection method, device and system |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107085609A (en) * | 2017-04-24 | 2017-08-22 | 国网湖北省电力公司荆州供电公司 | A pedestrian retrieval method performing multi-feature fusion based on a neural network
CN108921817A (en) * | 2018-05-24 | 2018-11-30 | 浙江工业大学 | A data enhancement method for skin disease images
GB201918431D0 (en) * | 2019-12-13 | 2020-01-29 | British Broadcasting Corp | Video encoding and video decoding |
CN111179253A (en) * | 2019-12-30 | 2020-05-19 | 歌尔股份有限公司 | Product defect detection method, device and system |
Non-Patent Citations (1)
Title |
---|
FMix: Enhancing Mixed Sample Data Augmentation; Ethan Harris et al.; https://arxiv.org/abs/2002.12047; 2020-02-27; pp. 1-9 *
Also Published As
Publication number | Publication date |
---|---|
CN111951189A (en) | 2020-11-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7254270B2 (en) | System and method for bounding and classifying regions within a graphical image | |
CN111951189B (en) | Data enhancement method for multi-scale texture randomization | |
CN109284757A (en) | A kind of licence plate recognition method, device, computer installation and computer readable storage medium | |
CN109784227B (en) | image detection and identification method and device | |
CN112541922A (en) | Test paper layout segmentation method based on digital image, electronic equipment and storage medium | |
CN112699885A (en) | Semantic segmentation training data augmentation method and system based on antagonism generation network GAN | |
CN110569379A (en) | Method for manufacturing picture data set of automobile parts | |
CN111833362A (en) | Unstructured road segmentation method and system based on superpixel and region growing | |
Huang et al. | Obmo: One bounding box multiple objects for monocular 3d object detection | |
CN115861733A (en) | Point cloud data labeling method, model training method, electronic device and storage medium | |
EP3591620A1 (en) | Image processing device and two-dimensional image generation program | |
CN115270184A (en) | Video desensitization method, vehicle video desensitization method and vehicle-mounted processing system | |
CN117830537B (en) | Weak supervision 3D scene graph generation method, device, equipment and medium | |
CN113808142A (en) | Ground identifier identification method and device and electronic equipment | |
CN108230227A (en) | A kind of recognition methods of distorted image, device and electronic equipment | |
CN112434581A (en) | Outdoor target color identification method and system, electronic device and storage medium | |
CN116168393B (en) | Automatic semantic annotation data generation method and device based on point cloud neural radiation field | |
CN116091784A (en) | Target tracking method, device and storage medium | |
AU2021240228A1 (en) | Method, apparatus and device for recognizing stacked objects, and computer storage medium | |
CN114066933A (en) | Multi-target tracking method, system and related equipment | |
TWI807904B (en) | Method for training depth identification model, method for identifying depth of images and related devices | |
WO2024197829A1 (en) | Single-character detection method and apparatus, model training method and apparatus, and device and medium | |
CN115082409B (en) | Automatic change system of discernment nuclide image diagnosis myocardial ischemia | |
CN111047604B (en) | Transparency mask extraction method and device for high-definition image and storage medium | |
TWI715184B (en) | Method and device for positioning target object in image, computer device and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |