CN117521760A - PyTorch-based zero interpolation upsampling method

Info

Publication number: CN117521760A
Application number: CN202210875352.6A
Authority: CN
Inventors: 余慧
Original assignee: Beijing Ingenic Semiconductor Co Ltd
Current assignee: Beijing Ingenic Semiconductor Co Ltd
Priority date: 2022-07-25
Filing date: 2022-07-25
Publication date: 2024-02-06

Abstract

The invention provides a PyTorch-based zero interpolation up-sampling method, comprising the following steps: S1, a class is defined based on PyTorch, and __init__ is set up for parameter initialization; the parameters passed to __init__ are kernel_h, the expansion multiple in the height direction, kernel_w, the expansion multiple in the width direction, and mode, the up-sampling interpolation mode, which defaults to zero; S2, a check is added in __init__ to determine the mode type: S2.1, if the mode is not zero, torch.nn.Upsample is used directly; S2.2, if the mode is zero, the specific implementation is in forward; S2.2.1, an all-zero tensor of the target size ([n, c, h × self.kernel_h, w × self.kernel_w], where h and w are the height and width of the input) is first defined as the target tensor; S2.2.2, the zero padding adopted by the method pads to the right and bottom while the top-left corner retains the original value, so the input is traversed and the value at position (i, j) of the original input, with i the column index and j the row index, is assigned to position (i × self.kernel_w, j × self.kernel_h).

Description

PyTorch-based zero interpolation upsampling method

Technical Field

The invention belongs to the technical field of neural networks, and particularly relates to a zero interpolation up-sampling method based on PyTorch.

Background

PyTorch is an open-source Python machine learning library based on Torch, used for applications such as natural language processing.

The Upsample class in the PyTorch library is defined as follows:

wherein:

size is the output size; its data type is tuple: ([optional D_out], [optional H_out], W_out);

scale_factor is the magnification applied to height, width, and depth. Its data type can be int, meaning that height, width, and depth are all enlarged by the same multiple, or tuple, specifying a separate multiple for each of height, width, and depth;

mode is the up-sampling method: nearest neighbor (nearest), linear interpolation (linear), bilinear interpolation (bilinear), or trilinear interpolation (trilinear); the default is nearest neighbor (nearest).

If align_corners is set to True, the corner pixels of the input image and the output image are aligned. This is valid only when mode is linear, bilinear, or trilinear, and defaults to False.
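The parameters above can be exercised directly. The following is a minimal sketch, assuming a recent PyTorch install and a 4-D (N, C, H, W) input; the variable names are chosen here for illustration only.

```python
import torch
from torch import nn

# A 1x1x2x2 input holding the values 1..4.
x = torch.arange(1, 5, dtype=torch.float32).view(1, 1, 2, 2)

# scale_factor as an int: height and width are enlarged by the same multiple.
by_scale = nn.Upsample(scale_factor=2, mode="nearest")

# scale_factor as a tuple: one multiple per spatial dimension (height, width).
by_tuple = nn.Upsample(scale_factor=(2, 3), mode="nearest")

# size: the output height and width are given directly instead of a multiple.
by_size = nn.Upsample(size=(4, 4), mode="nearest")

# align_corners is only meaningful for interpolating modes such as bilinear.
aligned = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=True)

print(by_scale(x).shape)  # torch.Size([1, 1, 4, 4])
print(by_tuple(x).shape)  # torch.Size([1, 1, 4, 6])
print(by_size(x).shape)   # torch.Size([1, 1, 4, 4])
print(aligned(x).shape)   # torch.Size([1, 1, 4, 4])
```

With align_corners=True the four corner values of the output coincide with the four corner values of the input, which is the alignment the parameter description refers to.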

Taking the up-sampling of PyTorch as an example, the pseudo code shown in Fig. 1 implements two up-sampling methods with PyTorch on the command line of an Ubuntu terminal, applying nearest neighbor interpolation and bilinear interpolation respectively. The code is explained in detail below with reference to the drawing:

(1) torch is used to generate a four-dimensional tensor of shape (1, 1, 2, 2) holding the values of torch.arange(1, 5), i.e. 1 to 4, and the tensor is assigned to input;

(2) The tensor is printed; as seen in Fig. 1, it is tensor([[[[1., 2.], [3., 4.]]]]);

(3) The up-sampling class Upsample of torch is instantiated, with the up-sampling multiple scale_factor set to 2 and the mode set to nearest neighbor ('nearest');

(4) The Upsample instance above is called on the input defined in step (1), and the output is shown in Fig. 1;

(5) The up-sampling class Upsample of torch is instantiated, with the up-sampling multiple scale_factor set to 2 and the mode set to bilinear ('bilinear');

(6) The Upsample instance above is called on the input defined in step (1), and the output is shown in Fig. 1.

The defects in the prior art are that:

Native PyTorch supports only the nearest (nearest neighbor), linear, bilinear, bicubic, and trilinear up-sampling interpolation modes, and does not support other up-sampling algorithms such as zero interpolation.

Furthermore, the common terminology in the prior art is as follows:

1. Open-source deep learning frameworks: these include TensorFlow, Keras, MXNet, PyTorch, CNTK, Theano, Caffe, Deeplearning4j, Lasagne, Neon, etc. The most popular frameworks at present are TensorFlow, Keras, MXNet, and PyTorch.

2. Upsampling: in a deep learning framework, it can be simply understood as any technique that brings an image to a higher resolution. The simplest way is resampling and interpolation: the input image is rescaled to the desired size, the pixel value at each known point is calculated, and the remaining points are filled in by an interpolation method such as bilinear interpolation.

3. Nearest neighbor interpolation algorithm: the simplest interpolation algorithm. When a picture is enlarged, each missing pixel is generated directly from the original pixel nearest to it, that is, the pixel beside it.

4. Ubuntu: one of the distributions of the Linux system. Ubuntu is a Linux operating system oriented primarily toward desktop applications.
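The nearest neighbor interpolation of item 3 can be illustrated without any framework. The following is a minimal sketch in plain Python, assuming integer scale factors and an image stored as a list of rows; the function name nearest_upsample is hypothetical.

```python
def nearest_upsample(image, scale_h, scale_w):
    """Enlarge a 2-D list by repeating each source pixel (nearest neighbor)."""
    out = []
    for row in image:
        # Repeat every pixel scale_w times along the width...
        wide_row = [pixel for pixel in row for _ in range(scale_w)]
        # ...then repeat the widened row scale_h times along the height.
        for _ in range(scale_h):
            out.append(list(wide_row))
    return out


result = nearest_upsample([[1.0, 2.0], [3.0, 4.0]], 2, 2)
for row in result:
    print(row)
# [1.0, 1.0, 2.0, 2.0]
# [1.0, 1.0, 2.0, 2.0]
# [3.0, 3.0, 4.0, 4.0]
# [3.0, 3.0, 4.0, 4.0]
```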

Disclosure of Invention

In order to solve the above problems, the object of the present application is as follows: an interpolation mode of zero is sometimes needed, either for operator requirements (such as low-bit quantization) or for unification across multiple platforms, and a PyTorch-based zero interpolation up-sampling method is therefore provided.

Specifically, the invention provides a zero interpolation up-sampling method based on PyTorch, which comprises the following steps:

S1, a class is defined based on PyTorch, and __init__ is set up for parameter initialization; the parameters passed to __init__ are kernel_h, the expansion multiple in the height direction, kernel_w, the expansion multiple in the width direction, and mode, the up-sampling interpolation mode, which defaults to zero; expressed as:

S2, a check is added in __init__ to determine the mode type:

S2.1, if the mode is not the zero mode, torch.nn.Upsample is used directly; expressed as:

if mode != "zero":
    self.upsample = torch.nn.Upsample(scale_factor=(kernel_h, kernel_w), mode=mode)

S2.2, if the mode is zero, the specific implementation is in forward; forward represents forward propagation and holds the sequence of operations that make up the network layer. When using PyTorch, forward does not need to be called explicitly during model training, because passing the corresponding arguments to the instantiated object calls it automatically.

S2.2.1, an all-zero tensor of the target size ([n, c, h × self.kernel_h, w × self.kernel_w], where h and w are the height and width of the input) is first defined as the target tensor; since kernel_h is the expansion multiple in the height direction and kernel_w is the expansion multiple in the width direction, each dimension is simply multiplied by the corresponding multiple;

S2.2.2, the zero padding adopted by the method pads to the right and bottom while the top-left corner keeps the original value, so the input is traversed and the value at position (i, j) of the original input, with i the column index and j the row index, is assigned to position (i × self.kernel_w, j × self.kernel_h);

expressed as:

In the method, when a 2×3 tensor undergoes zero mode up-sampling with scale = (2, 2), the size after up-sampling becomes 4×6; the value at position (j × self.kernel_h, i × self.kernel_w) equals the value at position (j, i) of the original input, and the remaining positions are filled with 0, i.e. Table 1 changes to Table 2, as follows:

The default value of the parameter kernel_h is 2, and the default value of the parameter kernel_w is 2.
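The index mapping for the 2×3 case can be checked with a few lines of plain Python. This is a minimal sketch only: the input values 1 to 6 are illustrative and are not taken from Table 1, and the function name zero_upsample_2d is hypothetical.

```python
def zero_upsample_2d(grid, kernel_h=2, kernel_w=2):
    """Zero-interpolation up-sampling of a 2-D list.

    The value at (j, i) lands at (j * kernel_h, i * kernel_w);
    every other position stays 0 (right and bottom padding).
    """
    h, w = len(grid), len(grid[0])
    # All-zero target of size (h * kernel_h) x (w * kernel_w).
    out = [[0] * (w * kernel_w) for _ in range(h * kernel_h)]
    for j in range(h):          # row index
        for i in range(w):      # column index
            out[j * kernel_h][i * kernel_w] = grid[j][i]
    return out


example_in = [[1, 2, 3],
              [4, 5, 6]]        # illustrative 2x3 input
example_out = zero_upsample_2d(example_in, 2, 2)
for row in example_out:
    print(row)
# [1, 0, 2, 0, 3, 0]
# [0, 0, 0, 0, 0, 0]
# [4, 0, 5, 0, 6, 0]
# [0, 0, 0, 0, 0, 0]
```

Each original value sits in the top-left corner of its own 2×2 cell, which is the right and bottom padding behavior described in S2.2.2.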

The method is a newly added, extended up-sampling method "zero", implemented in a Python script on the Ubuntu terminal; the default torch methods are also supported, and the choice is made through the parameter mode.

The mode parameter represents the up-sampling interpolation mode and defaults to "zero". The method supports the "zero", "nearest", "linear", "bilinear", and "trilinear" modes; all modes other than zero are supported by torch itself.
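Steps S1 to S2.2.2 together with the mode selection can be put into one module. The following is a minimal sketch under stated assumptions rather than the listing of Fig. 2: the class name ZeroUpsample is hypothetical, the input is assumed to be 4-D (N, C, H, W), and the traversal is written as a plain Python loop for clarity rather than speed.

```python
import torch
from torch import nn


class ZeroUpsample(nn.Module):
    """Sketch of steps S1 to S2.2.2 (class name is hypothetical)."""

    def __init__(self, kernel_h=2, kernel_w=2, mode="zero"):
        super().__init__()
        # S1: store the expansion multiples and the interpolation mode.
        self.kernel_h = kernel_h
        self.kernel_w = kernel_w
        self.mode = mode
        # S2 / S2.1: any mode other than "zero" is delegated to torch.
        if mode != "zero":
            self.upsample = nn.Upsample(
                scale_factor=(kernel_h, kernel_w), mode=mode
            )

    def forward(self, x):
        if self.mode != "zero":
            return self.upsample(x)
        n, c, h, w = x.shape
        # S2.2.1: all-zero target tensor of the enlarged size.
        out = torch.zeros(
            n, c, h * self.kernel_h, w * self.kernel_w,
            dtype=x.dtype, device=x.device,
        )
        # S2.2.2: copy each input value to the top-left corner of its cell.
        for j in range(h):
            for i in range(w):
                out[:, :, j * self.kernel_h, i * self.kernel_w] = x[:, :, j, i]
        return out


x = torch.arange(1, 7, dtype=torch.float32).view(1, 1, 2, 3)
y = ZeroUpsample()(x)   # calling the object invokes forward
print(y.shape)          # torch.Size([1, 1, 4, 6])
```

Because scale_factor is passed as a two-element tuple, the delegated branch in this sketch expects a 4-D input, so it applies to modes that torch accepts for such input, for example "nearest" and "bilinear".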

Assume that the other mode is nearest:

Step (1): torch is used to generate a four-dimensional tensor of shape (1, 1, 2, 2) holding the values of torch.arange(1, 5), i.e. 1 to 4, and the tensor is assigned to input; expressed as:

input = torch.arange(1, 5, dtype=torch.float32).view(1, 1, 2, 2)    (1)

Step (2): the tensor is printed, giving tensor([[[[1., 2.], [3., 4.]]]]); expressed as:

input    (2)

tensor([[[[1., 2.],
          [3., 4.]]]])

Step (3): the up-sampling class Upsample of torch is instantiated, with the up-sampling multiple scale_factor set to 2 and the mode set to nearest neighbor 'nearest'; expressed as:

m = nn.Upsample(scale_factor=2, mode='nearest')    (3)

Step (4): the Upsample instance above is called on the input defined in step (1), and the output is as follows; expressed as:

m(input)    (4)

tensor([[[[1., 1., 2., 2.],
          [1., 1., 2., 2.],
          [3., 3., 4., 4.],
          [3., 3., 4., 4.]]]])

Assume that the other mode is bilinear:

input = torch.arange(1, 5, dtype=torch.float32).view(1, 1, 2, 2)    (1)

input    (2)

tensor([[[[1., 2.],
          [3., 4.]]]])

Step (5): the up-sampling class Upsample of torch is instantiated, with the up-sampling multiple scale_factor set to 2 and the mode set to 'bilinear'; expressed as:

m = nn.Upsample(scale_factor=2, mode='bilinear')    (5)

Step (6): the Upsample instance above is called on the input defined in step (1), and the output is as follows; expressed as:

m(input)    (6)

tensor([[[[1.0000, 1.2500, 1.7500, 2.0000],
          [1.5000, 1.7500, 2.2500, 2.5000],
          [2.5000, 2.7500, 3.2500, 3.5000],
          [3.0000, 3.2500, 3.7500, 4.0000]]]])
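The bilinear values of step (6) can be reproduced by hand. The sketch below uses plain Python rather than PyTorch and assumes the half-pixel-center sampling convention that applies when align_corners is False, with source coordinates clamped at zero; the helper names linear_1d and bilinear_upsample are hypothetical.

```python
def linear_1d(values, scale):
    """Linear interpolation along one axis with half-pixel centers."""
    n = len(values)
    out = []
    for o in range(n * scale):
        # Map the output index back to a (possibly fractional) source index.
        src = max((o + 0.5) / scale - 0.5, 0.0)
        lo = int(src)
        hi = min(lo + 1, n - 1)   # clamp at the last sample
        t = src - lo              # weight of the right-hand neighbor
        out.append(values[lo] * (1 - t) + values[hi] * t)
    return out


def bilinear_upsample(image, scale):
    """Bilinear = linear along the width, then linear along the height."""
    rows = [linear_1d(row, scale) for row in image]
    cols = [linear_1d(list(col), scale) for col in zip(*rows)]
    return [list(r) for r in zip(*cols)]


result = bilinear_upsample([[1.0, 2.0], [3.0, 4.0]], 2)
for row in result:
    print(row)
# [1.0, 1.25, 1.75, 2.0]
# [1.5, 1.75, 2.25, 2.5]
# [2.5, 2.75, 3.25, 3.5]
# [3.0, 3.25, 3.75, 4.0]
```

For example, output column 1 maps to source index (1 + 0.5) / 2 - 0.5 = 0.25, giving 1 × 0.75 + 2 × 0.25 = 1.25, the second value of the first row.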

Thus, the present application has the following advantages: the PyTorch-based up-sampling method is general, simple, and beneficial to quantization, and it overcomes the defect that prior-art PyTorch does not support a zero interpolation up-sampling algorithm.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate and together with the description serve to explain the invention.

Fig. 1 is a pseudo-code schematic of up-sampling in prior-art PyTorch.

Fig. 2 is a pseudo-code schematic of the method of the present invention.

FIG. 3 is a schematic flow chart of the method of the present invention.

Detailed Description

In order that the technical content and advantages of the present invention may be more clearly understood, a further detailed description of the present invention will now be made with reference to the accompanying drawings.

As shown in Fig. 3, the present invention relates to a PyTorch-based zero interpolation up-sampling method, the method comprising the following steps:

S2, a check is added in __init__ to determine the mode type:

if mode != "zero":
    self.upsample = torch.nn.Upsample(scale_factor=(kernel_h, kernel_w), mode=mode)

S2.2.2, the zero padding adopted by the method pads to the right and bottom while the top-left corner keeps the original value, so the input is traversed and the value at position (i, j) of the original input, with i the column index and j the row index, is assigned to position (i × self.kernel_w, j × self.kernel_h); expressed as:

The pseudo code is shown in Fig. 2; the concrete implementation is as follows:

The method is a newly added, extended up-sampling method "zero", implemented in a Python script on the Ubuntu terminal; the default torch methods are also supported, and the choice is made through the parameter mode. The code is explained in detail below:

First, regarding the structure of the method: __init__ is mainly used for parameter initialization, and forward represents forward propagation and holds the sequence of operations that make up the network layer. When using PyTorch, forward does not need to be called explicitly during model training; passing the corresponding arguments to the instantiated object calls it automatically.

The parameters passed to __init__ are kernel_h, the expansion multiple in the height direction, with a default value of 2; kernel_w, the expansion multiple in the width direction, with a default value of 2; and mode, the up-sampling interpolation mode, which defaults to "zero". The supported modes are "zero", "nearest", "linear", "bilinear", and "trilinear"; all except the first are supported by torch itself. Therefore, a check in __init__ uses torch.nn.Upsample directly when the mode is not "zero"; for the "zero" mode, the specific implementation is in forward:

First, an all-zero tensor of the target size ([n, c, h × self.kernel_h, w × self.kernel_w], where h and w are the height and width of the input) is defined as the target tensor (since kernel_h is the expansion multiple in the height direction and kernel_w is the expansion multiple in the width direction, each dimension is directly multiplied by the corresponding multiple). The idea is that the zero padding adopted by this scheme pads to the right and bottom while the top-left corner keeps the original value; the input is therefore traversed, and the value at position (i, j) of the original input is assigned to position (i × self.kernel_w, j × self.kernel_h). The specific description is as follows:

Taking zero mode up-sampling of a 2×3 tensor with scale = (2, 2) as an example, the size after up-sampling becomes 4×6; the value at position (j × self.kernel_h, i × self.kernel_w) equals the value at position (j, i) of the original input, and the remaining positions are filled with 0, i.e. Table 1 changes to Table 2, as follows:

The method is a newly added, extended up-sampling method "zero", implemented in a Python script on the Ubuntu terminal; the default torch methods are also supported, and the choice is made through the parameter mode. The mode parameter represents the up-sampling interpolation mode and defaults to "zero". The method supports the "zero", "nearest", "linear", "bilinear", and "trilinear" modes; all modes other than zero are supported by torch itself.
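The traversal used for the zero mode can also be written without explicit loops. The following is a minimal sketch, assuming a 4-D (N, C, H, W) input; strided slice assignment is a substitute technique chosen here for illustration and is not stated in the source, and both function names are hypothetical.

```python
import torch


def zero_upsample_loop(x, kernel_h=2, kernel_w=2):
    """Element-by-element traversal, as described for the zero mode."""
    n, c, h, w = x.shape
    out = x.new_zeros((n, c, h * kernel_h, w * kernel_w))
    for j in range(h):
        for i in range(w):
            out[:, :, j * kernel_h, i * kernel_w] = x[:, :, j, i]
    return out


def zero_upsample_strided(x, kernel_h=2, kernel_w=2):
    """Same result via one strided slice assignment (assumed alternative)."""
    n, c, h, w = x.shape
    out = x.new_zeros((n, c, h * kernel_h, w * kernel_w))
    # Every kernel_h-th row and kernel_w-th column receives an input value.
    out[:, :, ::kernel_h, ::kernel_w] = x
    return out


x = torch.arange(24, dtype=torch.float32).view(2, 2, 2, 3)
same = torch.equal(zero_upsample_loop(x), zero_upsample_strided(x))
print(same)  # True
```

The strided form writes to the same positions as the loop, so the two functions return identical tensors; it avoids one Python-level iteration per input element.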

As shown in Fig. 1, assume that the other mode is nearest:

input = torch.arange(1, 5, dtype=torch.float32).view(1, 1, 2, 2)    (1)

input    (2)

tensor([[[[1., 2.],
          [3., 4.]]]])

m = nn.Upsample(scale_factor=2, mode='nearest')    (3)

Step (4): the Upsample instance above is called on the input defined in step (1), and the output is as follows; expressed as:

m(input)    (4)

tensor([[[[1., 1., 2., 2.],
          [1., 1., 2., 2.],
          [3., 3., 4., 4.],
          [3., 3., 4., 4.]]]])

Assume that the other mode is bilinear:

m = nn.Upsample(scale_factor=2, mode='bilinear')    (5)

m(input)    (6)

The above description covers only the preferred embodiments of the present invention and is not intended to limit the present invention; those skilled in the art can make various modifications and variations to the embodiments of the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. A PyTorch-based zero interpolation up-sampling method, the method comprising the following steps:

S2, a check is added in __init__ to determine the mode type:

if mode != "zero":
    self.upsample = torch.nn.Upsample(scale_factor=(kernel_h, kernel_w), mode=mode)

S2.2.2, the zero padding adopted by the method pads to the right and bottom while the top-left corner keeps the original value, so the input is traversed in a loop and the value at position (i, j) of the original input is assigned to position (i × self.kernel_w, j × self.kernel_h);

expressed as:

2. The PyTorch-based zero interpolation up-sampling method according to claim 1, wherein, when a 2×3 tensor undergoes zero mode up-sampling with scale = (2, 2), the size after up-sampling becomes 4×6; the value at position (j × self.kernel_h, i × self.kernel_w) equals the value at position (j, i) of the original input, and the remaining positions are filled with 0, i.e. Table 1 changes to Table 2, as follows:

3. The method of claim 1, wherein the parameter kernel_h is 2 and the parameter kernel_w is 2.

4. The method of claim 1, wherein the method is a newly added, extended up-sampling method "zero" implemented in a Python script on the Ubuntu terminal, wherein the default torch methods are also supported and the choice is made through the parameter mode.

5. The method of claim 4, wherein mode represents the up-sampling interpolation mode and defaults to "zero"; the method supports the "zero", "nearest", "linear", "bilinear", and "trilinear" modes, and all modes other than zero are supported by torch itself.

6. The method of claim 5, wherein, assuming that the other mode is nearest:

input = torch.arange(1, 5, dtype=torch.float32).view(1, 1, 2, 2)    (1)

input    (2)

tensor([[[[1., 2.],
          [3., 4.]]]])

m = nn.Upsample(scale_factor=2, mode='nearest')    (3)

7. The method of claim 5, wherein, assuming that the other mode is bilinear:

input = torch.arange(1, 5, dtype=torch.float32).view(1, 1, 2, 2)    (1)

The tensor is printed, giving tensor([[[[1., 2.], [3., 4.]]]]); expressed as:

input    (2)

tensor([[[[1., 2.],
          [3., 4.]]]])

m = nn.Upsample(scale_factor=2, mode='bilinear')    (5)