WO2022255561A1 - High-efficiency pooling method and apparatus therefor - Google Patents
- Publication number
- WO2022255561A1 (PCT/KR2021/014771)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- pooling
- data
- row
- window
- column
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/544—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
Definitions
- the present invention relates to a method for performing pooling in a computing device and a hardware accelerator to which the method is applied.
- a CNN performs a plurality of operation steps including a pooling operation.
- US registered patent US10713816 proposes an object detection method using a deep CNN pooling layer as a feature.
- FIG. 1 illustrates a computational structure of a CNN according to an embodiment. It will be described with reference to FIG. 1 below.
- convolution layers 52 may be generated by performing a convolution operation using a plurality of kernels on the input image data 51 stored in the internal memory. Generating the convolution layers 52 may include performing a nonlinear operation (e.g., ReLU, sigmoid, or tanh) on a plurality of feature maps obtained as a result of the convolution operation.
- pooling of the convolutional layers 52 may be performed to generate pooling layers 53 .
- Each convolutional layer 52 may include data that can be expressed in the form of an M*N matrix.
- a pooling window, which is a window having smaller dimensions than the convolution layer 52, may be defined to perform pooling.
- the pooling window may have sizes of M_p and N_p in the row and column directions, respectively.
- the pooling is an operation of generating a smaller number of data, for example, one datum, from the M_p*N_p data selected by overlapping the pooling window on the convolution layer.
- MAX pooling is an operation of selecting and outputting the largest value among the M_p*N_p pieces of data.
- average pooling is an operation of outputting the average value of the M_p*N_p pieces of data.
- Other pooling rules may be defined. The number of cases in which the pooling window may overlap the convolutional layer varies. According to embodiments, rules for moving the pooling window on the convolution layer may be restricted.
- when the pooling window is moved over the convolution layer, the step in the row direction is referred to as the row-direction stride S_M of the pooling operation, and the step in the column direction is referred to as the column-direction stride S_N (S_M and S_N are natural numbers).
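The window, rule, and stride definitions above can be sketched in a few lines of Python. This is a minimal illustration rather than the method of the invention; the names `pool2d`, `reduce_fn`, and `average` are hypothetical, and the sketch assumes that the first stride component is the step between successive window rows.

```python
def pool2d(data, window, stride=(1, 1), reduce_fn=max):
    """Pool a 2D list with an (M_p, N_p) window moved by (S_M, S_N) strides."""
    m_p, n_p = window
    s_m, s_n = stride
    rows, cols = len(data), len(data[0])
    out = []
    for top in range(0, rows - m_p + 1, s_m):
        out_row = []
        for left in range(0, cols - n_p + 1, s_n):
            # Collect the M_p * N_p values covered by the window.
            covered = [data[top + a][left + b]
                       for a in range(m_p) for b in range(n_p)]
            out_row.append(reduce_fn(covered))
        out.append(out_row)
    return out


def average(values):
    """Average pooling rule."""
    return sum(values) / len(values)


layer = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]

max_pooled = pool2d(layer, window=(2, 2), stride=(2, 2))
avg_pooled = pool2d(layer, window=(2, 2), stride=(2, 2), reduce_fn=average)
```

Passing a different `reduce_fn` is one way to express the "other pooling rules" mentioned above.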
- an array to be input to the neural network 54 may be generated by performing flattening on the pooling layers 53 .
- the array can then be input to the neural network 54 to generate an output from the neural network 54 .
- FIG. 1 shows one embodiment of a CNN
- various other examples of implementing CNNs exist.
- although the pooling operation is used to implement the CNN in FIG. 1, the pooling operation may also be used in computing technology fields other than CNNs.
- the amount of computation of the pooling operation tends to increase as the size of the pooling window increases, and tends to increase as the size of the stride decreases. Also, the smaller the size of the stride, the higher the tendency for the same operation to be repeated during the pooling operation.
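The trend described above can be made concrete with a rough count. The following sketch assumes a naive implementation that reads every covered element at every window position; the function name `direct_pooling_cost` and the 8x8 example sizes are illustrative assumptions, not figures from the source.

```python
def direct_pooling_cost(rows, cols, window, stride):
    """Return (window positions, element reads) for naive 2D pooling."""
    r_p, c_p = window
    s_r, s_c = stride
    positions = ((rows - r_p) // s_r + 1) * ((cols - c_p) // s_c + 1)
    # Each position reads every element under the window once.
    return positions, positions * r_p * c_p


# 8x8 input: a smaller stride or a larger window both raise the read count.
small_window_big_stride = direct_pooling_cost(8, 8, (2, 2), (2, 2))
small_window_unit_stride = direct_pooling_cost(8, 8, (2, 2), (1, 1))
large_window_unit_stride = direct_pooling_cost(8, 8, (4, 4), (1, 1))
```

With a stride of 1, neighbouring windows share most of their elements, which is the repeated work the text refers to.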
- An object of the present invention is to provide a pooling method that reduces the amount of calculation for pooling operation.
- a pooling method of pooling input data 100, which can be expressed as a matrix, with a pooling window 10 having sizes of R_p and C_p in the row direction and the column direction, respectively, may be provided.
- the pooling method may include generating, by a computing device, temporary data 110 by pooling the input data 100 using a first pooling window 20 having a size of C_p; and generating, by the computing device, pooling data 200 by pooling the temporary data 110 using a second pooling window 30 having a size of R_p.
- the first pooling window 20 may be a window having sizes of 1 and C_p in the row and column directions, respectively, and the second pooling windows 30 and 31 may be a window having sizes of R_p and 1 in the row and column directions, respectively.
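The two generating steps above can be sketched as follows, assuming MAX pooling and a stride of 1 in both directions. The helper names (`pool_rows`, `pool_cols`, `pool_direct`) and the sample matrix are hypothetical; `pool_direct` is included only as a reference for comparison.

```python
def pool_rows(data, c_p, reduce_fn=max):
    """First pass: slide a 1 x C_p window along every row."""
    return [[reduce_fn(row[j:j + c_p]) for j in range(len(row) - c_p + 1)]
            for row in data]


def pool_cols(data, r_p, reduce_fn=max):
    """Second pass: slide an R_p x 1 window down every column."""
    n_cols = len(data[0])
    return [[reduce_fn([data[i + a][j] for a in range(r_p)])
             for j in range(n_cols)]
            for i in range(len(data) - r_p + 1)]


def pool_direct(data, r_p, c_p, reduce_fn=max):
    """Reference: pool with the full R_p x C_p window in one pass."""
    return [[reduce_fn([data[i + a][j + b]
                        for a in range(r_p) for b in range(c_p)])
             for j in range(len(data[0]) - c_p + 1)]
            for i in range(len(data) - r_p + 1)]


input_data = [[3, 1, 4, 1, 5],
              [9, 2, 6, 5, 3],
              [5, 8, 9, 7, 9],
              [3, 2, 3, 8, 4],
              [6, 2, 6, 4, 3]]

temporary = pool_rows(input_data, c_p=3)   # R x (C - C_p + 1)
pooled = pool_cols(temporary, r_p=2)       # (R - R_p + 1) x (C - C_p + 1)
```

For this sample, `pooled` matches `pool_direct(input_data, 2, 3)`.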
- the generating of the pooling data 200 may include generating, by the computing device, transpose data 120 by transposing the temporary data 110; generating, by the computing device, second temporary data 130 by pooling the transpose data 120 using the second pooling windows 30 and 32; and generating, by the computing device, the pooling data 200 by transposing the second temporary data 130.
- the first pooling window 20 may be a window having sizes of 1 and C_p in the row and column directions, respectively, and the second pooling windows 30 and 32 may be a window having sizes of 1 and R_p in the row and column directions, respectively.
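The transpose-based variant can be sketched in the same way, again assuming MAX pooling and a stride of 1. Both passes then use a window that extends along a row, so a single hypothetical helper `pool_rowwise` serves for both; `transpose` and `pool_direct` are likewise illustrative names.

```python
def transpose(matrix):
    """Swap rows and columns of a 2D list."""
    return [list(col) for col in zip(*matrix)]


def pool_rowwise(data, size, reduce_fn=max):
    """Slide a 1 x size window along every row."""
    return [[reduce_fn(row[j:j + size]) for j in range(len(row) - size + 1)]
            for row in data]


def pool_with_transpose(input_data, r_p, c_p, reduce_fn=max):
    temporary = pool_rowwise(input_data, c_p, reduce_fn)          # R x (C-C_p+1)
    transposed = transpose(temporary)                             # (C-C_p+1) x R
    second_temporary = pool_rowwise(transposed, r_p, reduce_fn)   # (C-C_p+1) x (R-R_p+1)
    return transpose(second_temporary)                            # (R-R_p+1) x (C-C_p+1)


def pool_direct(data, r_p, c_p, reduce_fn=max):
    """Reference: pool with the full R_p x C_p window in one pass."""
    return [[reduce_fn([data[i + a][j + b]
                        for a in range(r_p) for b in range(c_p)])
             for j in range(len(data[0]) - c_p + 1)]
            for i in range(len(data) - r_p + 1)]


input_data = [[3, 1, 4, 1, 5],
              [9, 2, 6, 5, 3],
              [5, 8, 9, 7, 9],
              [3, 2, 3, 8, 4],
              [6, 2, 6, 4, 3]]

pooled = pool_with_transpose(input_data, r_p=2, c_p=3)
```

Reusing one row-wise routine for both passes is one plausible motivation for the two transposes, at the cost of the extra data movement.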
- the input data 100 has sizes of R and C in the row and column directions, respectively, and the temporary data 110 has sizes of C-C_p+1 and R in the row and column directions, respectively.
- the pooling data 200 has sizes of R-R_p+1 and C-C_p+1 in the row direction and the column direction, respectively. The data obtained by pooling the element group {(i, j), (i, j+1), ..., (i, j+C_p-1)} of the input data 100 overlapped by the first pooling window 20 is stored in element (j, i) of the temporary data 110 (i is a row index and j is a column index), and the data obtained by pooling the element group {(i, j), (i, j+1), ..., (i, j+R_p-1)} of the temporary data 110 overlapped by the second pooling windows 30 and 32 may be stored in element (j, i) of the pooling data 200 (i is a row index and j is a column index).
- the first pooling window 20 may be a window having sizes of 1 and C_p in the row and column directions, respectively, and the second pooling windows 30 and 32 may be a window having sizes of 1 and R_p in the row and column directions, respectively.
- the row-direction stride and the column-direction stride of the first pooling window may be 1 and 1, respectively.
- the row-direction stride and the column-direction stride of the second pooling window may be 1 and 1, respectively.
- the predetermined pooling may be any one of MAX pooling, MIN pooling, and Average pooling.
- the present invention is not limited by specific input/output characteristics of the pooling.
- the row-direction stride of the first pooling window is less than or equal to R_p/2
- the column-direction stride of the first pooling window is less than or equal to C_p/2
- the row-direction stride of the second pooling window is less than or equal to R_p/2
- the column-direction stride of the third pooling window may be less than or equal to C_p/2.
- the pooling method may be executed by the computing device by software including command codes.
- the input data, the temporary data, the pooling data, the transpose data, and the second temporary data may be written to a memory space defined in a volatile memory of the computing device by the command code. Some of these may be stored in non-volatile memory of the computing device, if desired.
- a hardware accelerator performing a pooling method of pooling input data 100, which can be expressed as a matrix, with a pooling window 10 having sizes of R_p and C_p in the row and column directions, respectively, may be provided.
- the hardware accelerator includes a control unit 40; internal memory 30; and a data operation unit 610.
- the control unit is adapted to cause the data operation unit to generate temporary data 110 by pooling the input data 100 using a first pooling window 20 having a size of C_p in a first time period, and to cause the data operation unit to generate pooling data 200 by pooling the temporary data 110 using a second pooling window 30 having a size of R_p in a second time period after the first time period.
- the input data 100 has sizes of R and C in row and column directions, respectively
- the temporary data 110 has sizes of C-C_p+1 and R in the row and column directions, respectively.
- the pooling data 200 has sizes of R-R_p+1 and C-C_p+1 in the row direction and the column direction, respectively. The data obtained by pooling the element group {(i, j), (i, j+1), ..., (i, j+C_p-1)} of the input data 100 overlapped by the first pooling window 20 is stored in element (j, i) of the temporary data 110 (i is a row index and j is a column index), and the data obtained by pooling the element group {(i, j), (i, j+1), ..., (i, j+R_p-1)} of the temporary data 110 overlapped by the second pooling windows 30 and 32 may be stored in element (j, i) of the pooling data 200 (i is a row index and j is a column index).
- the first pooling window 20 may be a window having sizes of 1 and C_p in the row and column directions, respectively, and the second pooling windows 30 and 32 may be a window having sizes of 1 and R_p in the row and column directions, respectively.
- a computing device provided according to another aspect of the present invention includes the hardware accelerator; memory 11; and a bus 700 that is a data exchange passage between the memory and the hardware accelerator.
- FIG. 1 illustrates a computational structure of a CNN according to an embodiment.
- FIGS. 2A to 2C illustrate a pooling method provided according to an embodiment.
- FIGS. 3A to 3F show a pooling method provided according to an embodiment of the present invention.
- FIGS. 4A to 4H show a pooling method provided according to another embodiment of the present invention.
- FIGS. 5A to 5F show a pooling method provided according to another embodiment of the present invention.
- FIG. 6 illustrates an example of a configuration of hardware to which a pooling method provided according to an embodiment of the present invention can be applied.
- FIG. 7 shows a method of implementing the pooling method shown in FIG. 3 or 5 with the hardware shown in FIG. 6 .
- FIG. 8 illustrates an example of implementing the pooling method presented in FIG. 5 using a data operation unit having a pipeline operation structure.
- FIGS. 9A to 9D show a method of implementing the pooling method shown in FIG. 4 with the hardware shown in FIG. 6.
- FIGS. 2A to 2C illustrate a pooling method provided according to an embodiment.
- FIGS. 2A to 2C may be collectively referred to as FIG. 2 .
- FIG. 2A shows a pooling window 10 having sizes of R_p and C_p in the row and column directions, respectively.
- a box drawn with a thick line in FIG. 2 represents the pooling window 10 .
- FIG. 2B shows a state in which the pooling window 10 of FIG. 2A overlaps the input data 100 to be pooled by the pooling window 10.
- the input data 100 has sizes R and C in a row direction and a column direction, respectively.
- FIG. 2C shows the pooling data 200 obtained by pooling the input data 100 through the pooling window 10.
- the pooling data 200 has sizes of R-R_p+1 and C-C_p+1 in the row and column directions, respectively.
- FIG. 2B shows a state in which the pooling window 10 covers the elements {(1,1), (1,2), (1,3), (2,1), (2,2), (2,3)} of the input data 100.
- element (1,1) of the pooling data 200 may be equal to the largest value among the elements {(1,1), (1,2), (1,3), (2,1), (2,2), (2,3)} of the input data 100. That is, element (x,y) of the pooling data 200 may be equal to the largest value among the elements {(x,y), (x,y+1), (x,y+2), (x+1,y), (x+1,y+1), (x+1,y+2)} of the input data 100.
- FIGS. 3A to 3F show a pooling method provided according to an embodiment of the present invention.
- FIGS. 3A to 3F may be collectively referred to as FIG. 3 .
- the pooling method of FIG. 3 is intended to provide the same results as the pooling method of FIG. 2 and may provide the same results. It will be described with reference to FIG. 3 below.
- FIG. 3A shows a first pooling window 20 having sizes of 1 and C_p in the row direction and the column direction, respectively.
- the column-direction size (C_p) of the first pooling window 20 is selected as the same value as the column-direction size (C_p) of the pooling window 10.
- boxes drawn with bold lines in FIGS. 3A and 3B represent the first pooling window 20
- boxes drawn with thick lines in FIGS. 3D and 3E represent the second pooling windows 30 and 31 .
- FIG. 3B shows a state in which the first pooling window 20 overlaps the input data 100 .
- the input data 100 of FIG. 3B is the same as the input data 100 of FIG. 2B.
- FIG. 3C shows temporary data 110 obtained by pooling input data 100 through the first pooling window 20 .
- the temporary data 110 has sizes of R and C-C_p+1 in the row and column directions, respectively.
- FIG. 3B shows a state in which the first pooling window 20 covers the elements {(1,1), (1,2), (1,3)} of the input data 100.
- the pooling method by the first pooling window 20 and the pooling method by the pooling window 10 are the same.
- pooling by the pooling window 10 is MAX pooling
- pooling by the first pooling window 20 is also MAX pooling.
- the element (1,1) of the temporary data 110 will be equal to the largest value among the elements {(1,1), (1,2), (1,3)} of the input data 100.
- the element (x,y) of the temporary data 110 may be equal to the largest value among the elements {(x,y), (x,y+1), (x,y+2)} of the input data 100.
- FIG. 3D shows the second pooling windows 30 and 31 having sizes of R_p and 1 in the row and column directions, respectively.
- the row-direction size R_p of the second pooling windows 30 and 31 is selected as the same value as the row-direction size R_p of the pooling window 10.
- FIG. 3E shows a state in which the second pooling windows 30 and 31 overlap the temporary data 110.
- the row-direction stride is 1.
- FIG. 3F shows the pooling data 200 obtained by pooling the temporary data 110 through the second pooling windows 30 and 31 .
- the pooling data 200 has sizes of R-R_p+1 and C-C_p+1 in the row and column directions, respectively.
- the pooling method by the second pooling windows 30 and 31 and the pooling method by the pooling window 10 are the same.
- the pooling by the pooling window 10 is MAX pooling
- the pooling by the second pooling windows 30 and 31 is also MAX pooling.
- element (1,1) of the pooling data 200 may be equal to the largest value among the elements {(1,1), (2,1)} of the temporary data 110. That is, element (x, y) of the pooling data 200 may be equal to the largest value among the elements {(x, y), (x+1, y)} of the temporary data 110.
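The reason the two-step procedure of FIG. 3 reproduces the result of FIG. 2 can be written as a short identity for MAX pooling with a stride of 1. The symbols X (input data 100), T (temporary data 110), and P (pooling data 200) are notation assumed here for illustration and do not appear in the source.

```latex
\begin{aligned}
T(x,y) &= \max_{0 \le b < C_p} X(x,\, y+b), \\
P(x,y) &= \max_{0 \le a < R_p} T(x+a,\, y)
        = \max_{0 \le a < R_p}\ \max_{0 \le b < C_p} X(x+a,\, y+b).
\end{aligned}
```

With R_p = 2 and C_p = 3, the right-hand side is the maximum over the six elements {(x,y), (x,y+1), (x,y+2), (x+1,y), (x+1,y+1), (x+1,y+2)} covered by the pooling window 10. The same nesting holds for MIN pooling, and for Average pooling when every window is fully inside the data.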
- FIGS. 4A to 4H show a pooling method provided according to another embodiment of the present invention.
- FIGS. 4A to 4H may be collectively referred to as FIG. 4 .
- the pooling method of FIG. 4 is intended to provide the same results as the pooling method of FIG. 2 and may provide the same results. It will be described with reference to FIG. 4 below.
- FIG. 4A shows a first pooling window 20 having sizes of 1 and C_p in the row direction and the column direction, respectively.
- the column-direction size (C_p) of the first pooling window 20 is selected as the same value as the column-direction size (C_p) of the pooling window 10.
- the first pooling window 20 shown in FIG. 4A is the same as the first pooling window 20 shown in FIG. 3A.
- boxes drawn with bold lines in FIGS. 4A and 4B represent the first pooling window 20
- boxes drawn with bold lines in FIGS. 4E and 4F represent the second pooling windows 30 and 32 .
- FIG. 4B shows a state in which the first pooling window 20 overlaps the input data 100 .
- the input data 100 of FIG. 4B is the same as the input data 100 of FIG. 2B.
- FIG. 4C illustrates temporary data 110 obtained by pooling input data 100 through the first pooling window 20 .
- the temporary data 110 has sizes of R and C-C_p+1 in the row and column directions, respectively.
- FIG. 4B shows a state in which the first pooling window 20 covers the elements {(1,1), (1,2), (1,3)} of the input data 100.
- FIGS. 4A, 4B, and 4C are identical to FIGS. 3A, 3B, and 3C, respectively.
- the pooling method by the first pooling window 20 and the pooling method by the pooling window 10 are the same.
- the transpose data 120 has sizes of C-C_p+1 and R in the row and column directions, respectively.
- FIG. 4E shows second pooling windows 30 and 32 having sizes of 1 and R_p in the row direction and the column direction, respectively.
- the column-direction size (R_p) of the second pooling windows 30 and 32 is selected as the same value as the row-direction size (R_p) of the pooling window 10.
- the second pooling windows 30 and 32 shown in FIG. 4E and the second pooling windows 30 and 31 shown in FIG. 3D have the following similarity and difference. They have in common that each is a 1D array whose size equals the row-direction size R_p of the pooling window 10.
- they differ in that the second pooling windows 30 and 32 shown in FIG. 4E extend along the column direction, whereas the second pooling windows 30 and 31 shown in FIG. 3D extend along the row direction.
- FIG. 4F shows a state in which the second pooling windows 30 and 32 overlap the transpose data 120.
- FIG. 4G shows second temporary data 130 obtained by pooling the transpose data 120 with the second pooling windows 30 and 32.
- the second temporary data 130 has sizes of C-C_p+1 and R-R_p+1 in the row direction and the column direction, respectively.
- the pooling method by the second pooling windows 30 and 32 and the pooling method by the pooling window 10 are the same.
- the pooling by the pooling window 10 is MAX pooling
- the pooling by the second pooling windows 30 and 32 is also MAX pooling.
- element (1,1) of the second temporary data 130 may be equal to the largest value among the elements {(1,1), (1,2)} of the transpose data 120. That is, element (x, y) of the second temporary data 130 may be equal to the largest value among the elements {(x, y), (x, y+1)} of the transpose data 120.
- FIG. 4H shows pooling data 200 generated by transposing the second temporary data 130. That is, element (x,y) of the second temporary data 130 has the same value as element (y,x) of the pooling data 200.
- the pooling data 200 has sizes of R-R_p+1 and C-C_p+1 in the row and column directions, respectively.
- FIGS. 5A to 5F show a pooling method provided according to another embodiment of the present invention.
- FIGS. 5A to 5F may be collectively referred to as FIG. 5 .
- the pooling method of FIG. 5 is intended to provide the same results as the pooling method of FIG. 2 and may provide the same results. It will be described with reference to FIG. 5 below.
- FIG. 5A shows a first pooling window 20 having sizes of 1 and C_p in the row direction and the column direction, respectively.
- the column-direction size (C_p) of the first pooling window 20 is selected as the same value as the column-direction size (C_p) of the pooling window 10.
- the first pooling window 20 shown in FIG. 5A is the same as the first pooling window 20 shown in FIG. 3A.
- boxes drawn with bold lines in FIGS. 5A and 5B represent the first pooling window 20
- boxes drawn with thick lines in FIGS. 5D and 5E represent the second pooling windows 30 and 32 .
- FIG. 5B shows a state in which the first pooling window 20 overlaps the input data 100.
- the input data 100 of FIG. 5B is the same as the input data 100 of FIG. 2B.
- FIG. 5B shows a state in which the first pooling window 20 overlaps the elements {(1,1), (1,2), (1,3)} of the input data 100.
- pooling using the first pooling window 20 may be performed while increasing the row number (row index) in preference to the column number (column index) of the input data 100 overlapped by the first pooling window 20. That is, pooling is performed while increasing the row number with the column numbers of the input data 100 overlapped by the first pooling window 20 held fixed.
- for example, the column numbers of the input data 100 overlapped by the first pooling window 20 are first fixed to {1, 2, 3}. Pooling using the first pooling window 20 may be performed a total of 5 times while increasing the row number 1 → 2 → 3 → 4 → 5. Next, the column numbers of the input data 100 overlapped by the first pooling window 20 are increased by a stride of 1 to {2, 3, 4}, and pooling using the first pooling window 20 may be performed an additional 5 times while increasing the row number 1 → 2 → 3 → 4 → 5. This process can be repeated while increasing the column numbers.
- FIG. 5C illustrates temporary data 110 obtained by pooling input data 100 through the first pooling window 20 .
- a memory space for storing the temporary data 110 may be prepared in advance.
- the total number of rows of the memory space may be equal to the total number of states in which the first pooling window 20 may overlap a specific row of the input data 100.
- in other words, the total number of rows in the memory space may be equal to the total number of positions at which the first pooling window 20 can be placed along the length direction of a row while overlapping the input data 100.
- for example, when the first pooling window 20 overlaps the first row of the input data 100, the first pooling window 20 may overlap the (1,1) element group {(1,1), (1,2), (1,3)}, the (1,2) element group {(1,2), (1,3), (1,4)}, or the (1,3) element group {(1,3), (1,4), (1,5)} of the input data 100. That is, since the total number of states in which the first pooling window 20 can overlap a specific row of the input data 100 is 3, the total number of rows in the memory space is 3. That is, the total number of rows of the temporary data 110 is three.
- the total number of columns of the memory space, that is, the total number of columns of the temporary data 110, may be equal to the total number of states in which the first pooling window 20 may overlap a specific set of columns of the input data 100. In other words, the total number of columns in the memory space may be equal to the total number of positions at which the first pooling window 20 can be placed along the length direction of a column while overlapping the input data 100.
- for example, when the first pooling window 20 overlaps the first column, the second column, and the third column of the input data 100, the first pooling window 20 may overlap the (1,1) element group {(1,1), (1,2), (1,3)}, the (2,1) element group {(2,1), (2,2), (2,3)}, the (3,1) element group {(3,1), (3,2), (3,3)}, the (4,1) element group {(4,1), (4,2), (4,3)}, or the (5,1) element group {(5,1), (5,2), (5,3)} of the input data 100. That is, since the total number of states in which the first pooling window 20 can overlap a specific set of columns of the input data 100 is 5, the total number of columns in the memory space is 5. That is, the total number of columns of the temporary data 110 is five.
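The counting argument above fixes the size of the memory space reserved for the temporary data 110. A small sketch, assuming a stride of 1 and using the hypothetical helper names `window_positions` and `temporary_shape`:

```python
def window_positions(length, window, stride=1):
    """Number of positions a 1D window can take along an axis of given length."""
    return (length - window) // stride + 1


def temporary_shape(rows, cols, c_p):
    """(rows, columns) of the temporary data when results are stored transposed."""
    # Rows: positions of the 1 x C_p window along one row of the input.
    # Columns: one result per input row for a fixed set of columns.
    return window_positions(cols, c_p), rows


# 5 x 5 input and a first pooling window of size 1 x 3, as in the example.
shape = temporary_shape(rows=5, cols=5, c_p=3)
```

For the 5 x 5 example this yields 3 rows and 5 columns, that is, C-C_p+1 rows and R columns.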
- the single value obtained by pooling the element group {(i, j), (i, j+1), (i, j+2)} of the input data 100 overlapped by the first pooling window 20 is stored in element (j, i) of the temporary data 110.
- i is the row index and j is the column index.
- the values of each element of the temporary data 110 may be determined while increasing the column number prior to the row number of the temporary data 110 .
- the pooling method by the pooling window 10 of FIG. 2 is the same as the pooling method by the first pooling window 20 of FIG. 5 .
- the pooling method by the first pooling window 20 is also MAX pooling.
- the pooling method by the first pooling window 20 is also average pooling.
- the dimensions and values of the temporary data 110 shown in FIG. 5C may be the same as the dimensions and values of the transpose data 120 shown in FIG. 4D.
- a plurality of elements included in one specific row of the temporary data 110 shown in FIG. 5C may be stored in a burst manner in a memory such as a buffer, register, SRAM, or DRAM. That is, a plurality of elements included in one specific row of the temporary data 110 may be stored in one word of the memory.
- FIG. 5D shows second pooling windows 30 and 32 having sizes of 1 and R_p in the row direction and the column direction, respectively.
- the column-direction size (R_p) of the second pooling windows 30 and 32 is selected as the same value as the row-direction size (R_p) of the pooling window 10 of FIG. 2.
- the second pooling windows 30 and 32 shown in FIG. 5D and the second pooling windows 30 and 31 shown in FIG. 3D have the following similarity and difference. They have in common that each is a 1D array whose size equals the row-direction size R_p of the pooling window 10.
- they differ in that the second pooling windows 30 and 32 shown in FIG. 5D extend along the column direction, whereas the second pooling windows 30 and 31 shown in FIG. 3D extend along the row direction.
- FIG. 5E shows a state in which the second pooling windows 30 and 32 overlap the (1,1) element group {(1,1), (1,2)} of the temporary data 110.
- in order to pool the temporary data 110 using the second pooling windows 30 and 32, the computing device may read the temporary data 110 stored in the memory. In a preferred embodiment, another row may be read only after a specific row of the temporary data 110 has been read completely. This is because the temporary data 110 can be read from the memory in word units.
- pooling using the second pooling windows 30 and 32 may be performed while increasing the column number (column index) in preference to the row number (row index) of the temporary data 110 overlapped by the second pooling windows 30 and 32. That is, pooling is performed while increasing the column number with the row number of the temporary data 110 overlapped by the second pooling windows 30 and 32 held fixed. In an embodiment in which another row of the temporary data 110 is read from the memory only after a specific row has been read completely, it is easy to see why moving the second pooling windows 30 and 32 with columns given priority over rows in this way is reasonable.
- pooling can be performed a total of 3 times while moving the second pooling windows 30 and 32 along the length direction of a row.
- pooling can then be performed an additional 3 times while moving the second pooling windows 30 and 32 along the length direction of the next row. This process can be repeated while increasing the row number.
- FIG. 5F shows the pooling data 200 obtained by pooling the temporary data 110 through the second pooling windows 30 and 32 .
- a memory space for storing the pulling data 200 may be prepared in advance.
- the total number of rows of the memory space is the total number of states in which the second pooling windows 30 and 32 may overlap a specific row of the temporary data 110. may be equal to the number of In other words, the total number of rows in the memory space may be equal to the total number of cases in which the second pooling windows 30 and 32 overlap the temporary data 110 and may be disposed along the length direction of rows. have. For example, in FIG. 5E , when the second pooling windows 30 and 32 overlap the first row of the temporary data 110, the second pulling windows 30 and 32 are the first rows of the temporary data 110.
- the total number of columns of the memory space is determined by the second pooling windows 30 and 32 overlapping a specific pair of columns of the temporary data 110. may be equal to the total number of possible states. In other words, the total number of columns in the memory space may be equal to the total number of cases in which the second pooling windows 30 and 32 overlap the temporary data 110 and may be disposed along the length direction of the column. For example, in FIG. 5E, when the second pooling windows 30 and 32 overlap the first and second columns of the temporary data 110, the second pooling windows 30 and 32 are temporary data 110.
- 11th element pair ⁇ (1,1), (1,2) ⁇ , 21st element pair ⁇ (2,1), (2,2) ⁇ , or 31st element pair ⁇ (3,1), ( 3,2) ⁇ can overlap. That is, since the total number of states in which the second pooling windows 30 and 32 can overlap a specific column pair of the temporary data 110 is 3, the total number of columns in the memory space is 3. That is, the total number of columns of the pooling data 200 is three.
- the pooling data 200 has sizes of $R - R_p + 1$ and $C - C_p + 1$ in the row direction and the column direction, respectively.
- the data obtained by pooling the element pair {(i, j), (i, j+1)} of the temporary data 110 overlapped by the second pooling windows 30 and 32 is stored in element (j, i) of the pooling data 200.
- i is the row index
- j is the column index.
- the values of each element of the pooling data 200 may be determined while increasing the column number prior to the row number of the temporary data 110 .
- First condition: a condition in which pooling is performed using the second pooling windows 30 and 32 while the column number of the temporary data 110 overlapped by the second pooling windows 30 and 32 is increased before the row number.
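The index mapping and traversal order described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the function name `second_pass_transposed`, the use of Python lists, the 0-based indices, and the sample values are all hypothetical.

```python
def second_pass_transposed(temp, rp, op=max):
    """Pool each row of `temp` with a 1 x rp window (stride 1) and store the
    result of window position (i, j) at element (j, i) of the output."""
    n_rows = len(temp)            # rows of the temporary data
    n_cols = len(temp[0])         # columns of the temporary data
    n_pos = n_cols - rp + 1       # window positions along one row
    # One output row per window position, one output column per temp row.
    out = [[None] * n_rows for _ in range(n_pos)]
    for i in range(n_rows):       # row index, advanced last
        for j in range(n_pos):    # column index, advanced first
            out[j][i] = op(temp[i][j:j + rp])
    return out


# Hypothetical 3 x 5 temporary data pooled with a window of length 2.
temp = [
    [7, 8, 9, 4, 6],
    [5, 3, 2, 8, 1],
    [9, 1, 4, 4, 7],
]
pooled = second_pass_transposed(temp, rp=2)
```

With these sizes the output has 4 rows and 3 columns, which matches the row and column counts of the memory space discussed for FIG. 5.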
- the pooling method by the pooling window 10 of FIG. 2 is the same as the pooling method by the second pooling windows 30 and 32 of FIG. 5 .
- the pooling method by the second pooling windows 30 and 32 is also MAX pooling.
- the pooling method by the second pooling windows 30 and 32 is also average pooling.
- the data presented in FIG. 5F may be the same as the data presented in FIG. 4H.
- the pooling data 200 shown in FIG. 2C, the pooling data 200 shown in FIG. 3F, and the pooling data 200 shown in FIGS. 4H and 5F are identical to each other. That is, the same pooling result can be obtained by the existing method (FIG. 2) and by the first embodiment (FIG. 3), the second embodiment (FIG. 4), and the third embodiment (FIG. 5) of the present invention.
- MAX pooling is shown for convenience of description, but other types of pooling such as MIN pooling or Average pooling may be used.
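The equivalence between direct pooling and two-pass pooling can be checked with a short sketch. Everything below is illustrative: the function names, the sample 5 x 5 input, and the (2, 3) window size are assumptions, and stride 1 is used throughout. Average pooling is computed with exact fractions so that the comparison is not affected by floating-point rounding.

```python
from fractions import Fraction


def mean(values):
    """Exact average, so two-pass and direct results compare equal."""
    return Fraction(sum(values), len(values))


def pool_rows(data, width, op):
    """Slide a 1 x width window (stride 1) along every row."""
    return [[op(row[j:j + width]) for j in range(len(row) - width + 1)]
            for row in data]


def pool_cols(data, height, op):
    """Slide a height x 1 window (stride 1) down every column."""
    rows, cols = len(data), len(data[0])
    return [[op([data[i + k][j] for k in range(height)]) for j in range(cols)]
            for i in range(rows - height + 1)]


def pool_direct(data, rp, cp, op):
    """Reference: slide a single rp x cp window (stride 1) over the data."""
    rows, cols = len(data), len(data[0])
    return [[op([data[i + di][j + dj] for di in range(rp) for dj in range(cp)])
             for j in range(cols - cp + 1)]
            for i in range(rows - rp + 1)]


def pool_two_pass(data, rp, cp, op):
    """Pool with a 1 x cp window, then pool the result with an rp x 1 window."""
    return pool_cols(pool_rows(data, cp, op), rp, op)


data = [
    [1, 6, 2, 9, 4],
    [7, 3, 8, 5, 0],
    [4, 9, 1, 2, 6],
    [8, 2, 7, 3, 5],
    [0, 5, 6, 1, 9],
]
for op in (max, min, mean):
    assert pool_two_pass(data, 2, 3, op) == pool_direct(data, 2, 3, op)
```

The check passes for average pooling because every window in a pass has the same number of elements, so the mean of the row means equals the mean over the whole window.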
- the amounts of calculation for performing the pooling operation once with the pooling window 10 having a size of $(R_p, C_p)$, the first pooling window 20 having a size of $(1, C_p)$, and the second pooling window 30 having a size of $(1, R_p)$ or $(R_p, 1)$ can be defined as $\mathrm{CalA}(R_p, C_p)$, $\mathrm{CalA}(C_p)$, and $\mathrm{CalA}(R_p)$, respectively.
- the total amount of pooling operations to obtain the pooling data 200 with the pooling window 10 is given by Equation 1.
- [Equation 1] $\mathrm{TCalA1} = \mathrm{CalA}(R_p, C_p) \cdot (R - R_p + 1) \cdot (C - C_p + 1)$
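- the total amount of pooling operations to obtain the pooling data 200 with the first pooling window 20 and the second pooling window 30 is given by Equation 2.
- [Equation 2] $\mathrm{TCalA2} = \mathrm{CalA}(C_p) \cdot R \cdot (C - C_p + 1) + \mathrm{CalA}(R_p) \cdot (R - R_p + 1) \cdot (C - C_p + 1)$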
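To make Equations 1 and 2 concrete, the sketch below evaluates both totals. The source does not define CalA numerically, so the cost model used here is an assumption: one pooling operation over n elements is taken to cost n - 1 comparisons, as for MAX pooling. The function names and the sample sizes are hypothetical.

```python
def cal_a(n):
    """Assumed cost of one pooling operation over n elements: n - 1 comparisons."""
    return n - 1


def total_direct(r, c, rp, cp):
    """Equation 1: one rp x cp window per element of the pooling data."""
    return cal_a(rp * cp) * (r - rp + 1) * (c - cp + 1)


def total_two_pass(r, c, rp, cp):
    """Equation 2: a cp-element pass over all r rows, then an rp-element pass."""
    first = cal_a(cp) * r * (c - cp + 1)
    second = cal_a(rp) * (r - rp + 1) * (c - cp + 1)
    return first + second


# Hypothetical sizes: 5 x 5 input and a (2, 3) pooling window.
direct = total_direct(5, 5, 2, 3)
two_pass = total_two_pass(5, 5, 2, 3)
print(direct, two_pass)
```

Under this assumed cost model the 5 x 5 example needs 60 comparisons with the single window and 42 with the two-pass method. The gap grows with the window size, because the direct cost scales with the product of the window dimensions while the two-pass cost scales with their sum.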
- FIG. 6 illustrates an example of a configuration of hardware to which a pooling method provided according to an embodiment of the present invention can be applied.
- FIG. 6 shows the main structure of part of a computing device that implements the pooling method in hardware.
- the computing device 1 includes a dynamic random access memory (DRAM) 130, a hardware accelerator 110, a bus 700 connecting the DRAM 130 and the hardware accelerator 110, and other devices connected to the bus 700.
- the computing device 1 may further include a power supply unit, a communication unit, a main processor, a user interface, a storage unit, and peripheral units, which are not shown.
- the bus 700 may be shared by the hardware accelerator 110 and other hardware 99 .
- the hardware accelerator 110 may include a DMA (Direct Memory Access) unit 20, a control unit 40, an internal memory 30, an input buffer 650, a data operation unit 610, and an output buffer 640.
- Some or all of data temporarily stored in the internal memory 30 may be provided from the DRAM 130 through the bus 700 .
- the control unit 40 and the DMA unit 20 may control the internal memory 30 and the DRAM 130 to move data stored in the DRAM 130 to the internal memory 30 .
- Data stored in the internal memory 30 may be provided to the data operation unit 610 through the input buffer 650 .
- Output values generated by the operation of the data operation unit 610 may be stored in the internal memory 30 via the output buffer 640 .
- the output values stored in the internal memory 30 may be written to the DRAM 130 under the control of the control unit 40 and the DMA unit 20 .
- the control unit 40 may collectively control the operations of the DMA unit 20, the internal memory 30, and the data operation unit 610.
- the data operation unit 610 may perform a first operation function during a first time period and a second operation function during a second time period.
- a plurality of the data operation units 610 shown in FIG. 6 may be provided in the hardware accelerator 110 to perform operations requested by the control unit 40 in parallel.
- the data operation unit 610 may output the output data sequentially over time in a given order rather than outputting them all at once.
- FIG. 7 shows a method of implementing the pooling method shown in FIG. 3 or 5 with the hardware shown in FIG. 6 .
- Block diagram 410 and block diagram 420 each describe the operations performed in the internal memory 30, the input buffer 650, the data operation unit 610, and the output buffer 640.
- the input data 100 stored in the internal memory 30 may be provided to the input buffer 650 .
- the input data 100 stored in the input buffer 650 may be provided to the data operation unit 610 .
- the data operation unit 610 is configured to generate the temporary data 110 shown in FIG. 3C or FIG. 5C.
- the temporary data 110 output by the data operation unit 610 may be provided to the output buffer 640 .
- the temporary data 110 stored in the output buffer 640 may be provided to the internal memory 30 .
- the temporary data 110 stored in the output buffer 640 may be transferred to the external DRAM 130 and then loaded from the DRAM 130 to the output buffer 640 again.
- the temporary data 110 stored in the internal memory 30 may be provided to the input buffer 650 .
- the temporary data 110 stored in the input buffer 650 may be provided to the data operation unit 610 .
- the data operation unit 610 is configured to generate the pooling data 200 shown in FIG. 3F or FIG. 5F.
- the pooling data 200 output by the data operation unit 610 may be provided to the output buffer 640 .
- the pooling data 200 stored in the output buffer 640 may be provided to the internal memory 30 .
- the pooling data 200 stored in the output buffer 640 may be transferred to the external DRAM 130 and then loaded from the DRAM 130 to the output buffer 640 again.
- FIG. 8 illustrates an example of implementing the pooling method presented in FIG. 5 using a data operation unit having a pipeline operation structure.
- the data operation unit 610 shown in FIG. 6 may include the first data operation unit 611 and the second data operation unit 612 shown in FIG. 8 .
- the first data operation unit 611 receives the input data 100 of FIG. 5 and outputs the temporary data 110 .
- the second data operation unit 612 receives the temporary data 110 output from the first data operation unit 611 and outputs the pooling data 200 of FIG. 5 .
- the first data operation unit 611 may generate and output element (1, 1) first and element (3, 5) last among the temporary data 110 shown in FIG. 5C.
- the second data operation unit 612 may generate and output element (1, 1) first and element (4, 3) last among the pooling data 200 shown in FIG. 5F.
- the first time period for calculating the temporary data 110 and the second time period for calculating the pooling data 200 may overlap at least in part.
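The overlap of the two time periods can be modelled in software with two chained generators. This is a rough sketch under stated assumptions rather than a description of the hardware: the names `first_unit` and `second_unit`, the sample input, and the (2, 3) window are hypothetical, indices are 0-based, and Python generators interleave the two stages instead of running them in true parallel.

```python
def first_unit(data, cp, op=max):
    """Yield temporary-data elements in row-major order of the transposed
    temporary data: element (j, i) pools input row i at column offset j."""
    n_rows, n_cols = len(data), len(data[0])
    for j in range(n_cols - cp + 1):      # row of the temporary data
        for i in range(n_rows):           # column of the temporary data
            yield j, i, op(data[i][j:j + cp])


def second_unit(stream, rp, op=max):
    """Consume temporary-data elements as they arrive and emit a pooled
    element as soon as the last rp elements of the current row are in."""
    recent = []
    for t_row, t_col, value in stream:
        if t_col == 0:
            recent = []                   # a new temporary-data row starts
        recent.append(value)
        recent = recent[-rp:]
        if len(recent) == rp:
            # The window covers columns t_col-rp+1 .. t_col of row t_row and
            # its result belongs to element (t_col-rp+1, t_row) of the output.
            yield t_col - rp + 1, t_row, op(recent)


data = [
    [1, 6, 2, 9, 4],
    [7, 3, 8, 5, 0],
    [4, 9, 1, 2, 6],
    [8, 2, 7, 3, 5],
    [0, 5, 6, 1, 9],
]
rp, cp = 2, 3
pooled = [[None] * (len(data[0]) - cp + 1) for _ in range(len(data) - rp + 1)]
order = []
for r, c, value in second_unit(first_unit(data, cp), rp):
    pooled[r][c] = value
    order.append((r, c))
```

Because the temporary data is produced in transposed order, each second-stage window needs only consecutive elements of the stream, so the second stage can start long before the first stage has finished. In this sketch the first pooled element emitted is (0, 0) and the last is (3, 2), which correspond to elements (1, 1) and (4, 3) in the 1-based numbering of the text.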
- FIGS. 9A, 9B, 9C, and 9D included in this specification may be collectively referred to as FIG. 9 .
- FIG. 9 shows a method of implementing the pooling method shown in FIG. 4 with the hardware shown in FIG. 6 .
- FIGS. 9A, 9B, 9C, and 9D describe the operations performed by the internal memory 30, the input buffer 650, the data operation unit 610, and the output buffer 640 shown in FIG. 6 during the first time period, the second time period, the third time period, and the fourth time period, respectively.
- the input data 100 stored in the internal memory 30 may be provided to the input buffer 650 .
- the input data 100 stored in the input buffer 650 may be provided to the data operation unit 610 .
- the data operation unit 610 is configured to generate the temporary data 110 shown in FIG. 4C.
- the temporary data 110 output by the data operation unit 610 may be provided to the output buffer 640 .
- the temporary data 110 stored in the output buffer 640 may be provided to the internal memory 30 .
- the temporary data 110 stored in the output buffer 640 may be transferred to the external DRAM 130 and then loaded from the DRAM 130 into the output buffer 640 again.
- the temporary data 110 stored in the internal memory 30 may be provided to the input buffer 650 .
- the temporary data 110 stored in the input buffer 650 may be provided to the data operation unit 610 .
- the data operation unit 610 is configured to generate the transpose data 120 shown in FIG. 4D.
- the transpose data 120 output by the data operation unit 610 may be provided to the output buffer 640 .
- the transpose data 120 stored in the output buffer 640 may be provided to the internal memory 30 .
- the transpose data 120 stored in the output buffer 640 may be transferred to the external DRAM 130 and then loaded from the DRAM 130 to the output buffer 640 again.
- the transpose data 120 stored in the internal memory 30 may be provided to the input buffer 650 .
- the transpose data 120 stored in the input buffer 650 may be provided to the data operation unit 610 .
- the data operation unit 610 is configured to generate the second temporary data 130 shown in FIG. 4G.
- the second temporary data 130 output by the data operation unit 610 may be provided to the output buffer 640 .
- the second temporary data 130 stored in the output buffer 640 may be provided to the internal memory 30 .
- the second temporary data 130 stored in the internal memory 30 may be provided to the input buffer 650 .
- the second temporary data 130 stored in the input buffer 650 may be provided to the data operation unit 610 .
- the data operation unit 610 is configured to generate the pooling data 200 shown in FIG. 4H.
- the pooling data 200 output by the data operation unit 610 may be provided to the output buffer 640 .
- the pooling data 200 stored in the output buffer 640 may be provided to the internal memory 30 .
- the pooling data 200 stored in the output buffer 640 may be transferred to the external DRAM 130 and then loaded from the DRAM 130 to the output buffer 640 again.
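The four time periods of FIG. 9 can be summarised as pool, transpose, pool, transpose. The sketch below follows that sequence in plain Python. It is an illustration only: the function names, the sample input, and the (2, 3) window size are assumptions, and the buffers and memories of FIG. 6 are not modelled.

```python
def pool_rows(data, width, op=max):
    """Slide a 1 x width window (stride 1) along every row."""
    return [[op(row[j:j + width]) for j in range(len(row) - width + 1)]
            for row in data]


def transpose(data):
    """Swap rows and columns."""
    return [list(col) for col in zip(*data)]


def pool_with_transposes(data, rp, cp, op=max):
    temporary = pool_rows(data, cp, op)             # first time period
    transposed = transpose(temporary)               # second time period
    second_temporary = pool_rows(transposed, rp, op)  # third time period
    return transpose(second_temporary)              # fourth time period


data = [
    [1, 6, 2, 9, 4],
    [7, 3, 8, 5, 0],
    [4, 9, 1, 2, 6],
    [8, 2, 7, 3, 5],
    [0, 5, 6, 1, 9],
]
pooled = pool_with_transposes(data, rp=2, cp=3)
```

In this sketch both pooling passes reuse the same row-wise routine, and only the window length differs between them.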
- the present invention was developed by Open Edge Technology Co., Ltd. (project performing organization) in the course of carrying out the research project "Development of a sensory-based context-predictive mobile artificial intelligence processor" (task number 2020001310, task number 2020-0-01310, research period 2020.04.01 ~ 2024.12.31), which belongs to the next-generation intelligent semiconductor technology development (design) - artificial intelligence processor program, a research program supported by the Ministry of Science and ICT and the Information and Communication Planning and Evaluation Institute of the National Research Foundation of Korea.
Claims (14)
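- A pooling method for pooling input data (100) expressible as a matrix with a pooling window (10) whose sizes in the row direction and the column direction are Rp and Cp, respectively, the pooling method comprising: generating, by a computing device, temporary data (110) by pooling the input data (100) using a first pooling window (20) of size Cp; and generating, by the computing device, pooling data (200) by pooling the temporary data (110) using a second pooling window (30) of size Rp.
- The pooling method of claim 1, wherein the first pooling window (20) is a window whose sizes in the row direction and the column direction are 1 and Cp, respectively, and the second pooling window (30, 31) is a window whose sizes in the row direction and the column direction are Rp and 1, respectively.
- The pooling method of claim 1, wherein generating the pooling data (200) comprises: generating, by the computing device, transpose data (120) by transposing the temporary data (110); generating, by the computing device, second temporary data (130) by pooling the transpose data (120) using the second pooling window (30, 32); and generating, by the computing device, the pooling data (200) by transposing the second temporary data (130).
- The pooling method of claim 3, wherein the first pooling window (20) is a window whose sizes in the row direction and the column direction are 1 and Cp, respectively, and the second pooling window (30, 32) is a window whose sizes in the row direction and the column direction are 1 and Rp, respectively.
- The pooling method of claim 1, wherein the input data (100) has sizes of R and C in the row direction and the column direction, respectively, the temporary data (110) has sizes of C-Cp+1 and R in the row direction and the column direction, respectively, the pooling data (200) has sizes of R-Rp+1 and C-Cp+1 in the row direction and the column direction, respectively, data obtained by pooling the element pair {(i, j), (i, j+1), ..., (i, j+Cp-1)} of the input data (100) overlapped by the first pooling window (20) is stored in element (j, i) of the temporary data (110) (i being a row index and j being a column index), and data obtained by pooling the element pair {(i, j), (i, j+1), ..., (i, j+Rp-1)} of the temporary data (110) overlapped by the second pooling window (30, 32) is stored in element (j, i) of the pooling data (200) (i being a row index and j being a column index).
- The pooling method of claim 5, wherein the first pooling window (20) is a window whose sizes in the row direction and the column direction are 1 and Cp, respectively, and the second pooling window (30, 32) is a window whose sizes in the row direction and the column direction are 1 and Rp, respectively.
- The pooling method of claim 1, wherein the row-direction stride and the column-direction stride of the first pooling window are 1 and 1, respectively.
- The pooling method of claim 1, wherein the row-direction stride and the column-direction stride of the second pooling window are 1 and 1, respectively.
- The pooling method of claim 1, wherein the predetermined pooling is any one of MAX pooling, MIN pooling, and Average pooling.
- The pooling method of claim 1, wherein the row-direction stride of the first pooling window is Rp/2 or less, the column-direction stride of the first pooling window is Cp/2 or less, the row-direction stride of the second pooling window is Rp/2 or less, and the column-direction stride of the third pooling window is Cp/2 or less.
- A hardware accelerator that performs a pooling method for pooling input data (100) expressible as a matrix with a pooling window (10) whose sizes in the row direction and the column direction are Rp and Cp, respectively, the hardware accelerator comprising: a control unit (40); an internal memory (30); and a data operation unit (610), wherein the control unit is configured to cause the data operation unit to perform, in a first time period, a step of generating temporary data (110) by pooling the input data (100) using a first pooling window (20) of size Cp, and to cause the data operation unit to perform, in a second time period after the first time period, a step of generating pooling data (200) by pooling the temporary data (110) using a second pooling window (30) of size Rp.
- The hardware accelerator of claim 11, wherein the input data (100) has sizes of R and C in the row direction and the column direction, respectively, the temporary data (110) has sizes of C-Cp+1 and R in the row direction and the column direction, respectively, the pooling data (200) has sizes of R-Rp+1 and C-Cp+1 in the row direction and the column direction, respectively, data obtained by pooling the element pair {(i, j), (i, j+1), ..., (i, j+Cp-1)} of the input data (100) overlapped by the first pooling window (20) is stored in element (j, i) of the temporary data (110) (i being a row index and j being a column index), and data obtained by pooling the element pair {(i, j), (i, j+1), ..., (i, j+Rp-1)} of the temporary data (110) overlapped by the second pooling window (30, 32) is stored in element (j, i) of the pooling data (200) (i being a row index and j being a column index).
- The hardware accelerator of claim 12, wherein the first pooling window (20) is a window whose sizes in the row direction and the column direction are 1 and Cp, respectively, and the second pooling window (30, 32) is a window whose sizes in the row direction and the column direction are 1 and Rp, respectively.
- A computing device comprising: the hardware accelerator of any one of claims 11 to 13; a memory (11); and a bus (700) that is a data exchange path between the memory and the hardware accelerator.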
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202180100809.XA CN117730328A (zh) | 2021-06-04 | 2021-10-21 | High-efficiency pooling method and device therefor |
EP21944314.0A EP4350581A1 (en) | 2021-06-04 | 2021-10-21 | High-efficiency pooling method and device therefor |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020210072676A KR102368075B1 (ko) | 2021-06-04 | 2021-06-04 | High-efficiency pooling method and device therefor |
KR10-2021-0072676 | 2021-06-04 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022255561A1 (ko) | 2022-12-08 |
Family
ID=80490198
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/KR2021/014771 WO2022255561A1 (ko) | High-efficiency pooling method and device therefor | 2021-06-04 | 2021-10-21 |
Country Status (4)
Country | Link |
---|---|
EP (1) | EP4350581A1 (ko) |
KR (1) | KR102368075B1 (ko) |
CN (1) | CN117730328A (ko) |
WO (1) | WO2022255561A1 (ko) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20190038318A (ko) * | 2017-09-29 | 2019-04-08 | Infineon Technologies AG | Accelerating convolutional neural network computation throughput |
KR20190090858A (ko) * | 2016-12-09 | 2019-08-02 | Beijing Horizon Information Technology Co., Ltd. | Systems and methods for data management |
US10713816B2 (en) | 2017-07-14 | 2020-07-14 | Microsoft Technology Licensing, Llc | Fully convolutional color constancy with confidence weighted pooling |
KR20210004229A (ko) * | 2019-07-03 | 2021-01-13 | Samsung Electronics Co., Ltd. | Image processing device including neural network processor and operating method thereof |
US20210073569A1 (en) * | 2018-05-30 | 2021-03-11 | SZ DJI Technology Co., Ltd. | Pooling device and pooling method |
KR20210036715A (ko) * | 2019-09-26 | 2021-04-05 | Samsung Electronics Co., Ltd. | Neural processing apparatus and method for processing pooling of neural network in neural processing apparatus |
-
2021
- 2021-06-04 KR KR1020210072676A patent/KR102368075B1/ko active IP Right Grant
- 2021-10-21 EP EP21944314.0A patent/EP4350581A1/en active Pending
- 2021-10-21 CN CN202180100809.XA patent/CN117730328A/zh active Pending
- 2021-10-21 WO PCT/KR2021/014771 patent/WO2022255561A1/ko active Application Filing
Also Published As
Publication number | Publication date |
---|---|
EP4350581A1 (en) | 2024-04-10 |
CN117730328A (zh) | 2024-03-19 |
KR102368075B1 (ko) | 2022-02-25 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21944314; Country of ref document: EP; Kind code of ref document: A1 |
| WWE | Wipo information: entry into national phase | Ref document number: 2021944314; Country of ref document: EP |
| NENP | Non-entry into the national phase | Ref country code: DE |
| ENP | Entry into the national phase | Ref document number: 2021944314; Country of ref document: EP; Effective date: 20240104 |
| WWE | Wipo information: entry into national phase | Ref document number: 202180100809.X; Country of ref document: CN |