CN113052760A

CN113052760A - Pooling method, chip, equipment and storage medium

Info

Publication number: CN113052760A
Application number: CN202110518017.6A
Authority: CN
Inventors: 周军; 周亮; 常亮; 王文强; 杨雨桐; 徐宁仪
Original assignee: University of Electronic Science and Technology of China; Chengdu Sensetime Technology Co Ltd
Current assignee: University of Electronic Science and Technology of China; Chengdu Sensetime Technology Co Ltd
Priority date: 2021-01-29
Filing date: 2021-05-12
Publication date: 2021-06-29
Also published as: CN112862667A; WO2022160703A1

Abstract

The application provides a pooling method, a chip, a device and a storage medium. The method may include obtaining a target feature map and splitting the target feature map to obtain a plurality of sub-feature maps. At least some of the pixel values within the same pooling window of the target feature map are placed in different sub-feature maps, and the pixel values at the same position within each pooling window are placed in the same sub-feature map. The pixels belonging to different pooling windows in each sub-feature map are then processed in parallel to obtain a pooling result corresponding to the target feature map.

Description

Pooling method, chip, equipment and storage medium

Technical Field

The present application relates to computer technology, and in particular, to a pooling method, chip, device, and storage medium.

Background

Pooling is a process of downsampling an input feature map. It reduces the number of features and the computational complexity of a convolutional network while preserving the invariance of the features in certain respects (such as rotation, translation and scaling). Pooling may be performed in two ways: first, average pooling, i.e., averaging the feature values within a pooling window; second, max pooling, i.e., taking the maximum of the feature values within the pooling window.

Pooling typically relies on an artificial intelligence chip (hereinafter AI chip). There is currently a need for a method of accelerating pooling.

Disclosure of Invention

In view of the above, the present application discloses at least one pooling method, which may include: acquiring a target feature map; splitting the target feature map to obtain a plurality of sub-feature maps, wherein at least some of the pixel values within the same pooling window of the target feature map are in different sub-feature maps, and the pixel values at the same position within each pooling window are in the same sub-feature map; and processing in parallel the pixels belonging to different pooling windows in each sub-feature map to obtain a pooling result corresponding to the target feature map.

In some illustrated embodiments, processing in parallel the pixels belonging to different pooling windows in each sub-feature map to obtain the pooling result corresponding to the target feature map includes: loading the pixel values included in each sub-feature map into a shift register array, and performing parallel pooling on the pixel values at the same position in the sub-feature maps according to a pooling instruction to obtain the pooling result corresponding to the target feature map.

In some illustrated embodiments, processing in parallel the pixel values belonging to different pooling windows in each sub-feature map to obtain the pooling result corresponding to the target feature map includes: determining, for each sub-feature map, a shift operation mode of the shift register array according to the positions, in that sub-feature map, of the pixel values belonging to the same pooling window; loading each sub-feature map into the shift register array, performing shift operations on the pixel values stored in the shift registers of the array according to the shift operation mode determined for that sub-feature map, and obtaining in parallel, according to a pooling instruction, partial pooling results corresponding to the different pooling windows in the sub-feature map; and determining the pooling result corresponding to the target feature map according to the partial pooling results of the different pooling windows for each sub-feature map.

In some illustrated embodiments, splitting the target feature map to obtain a plurality of sub-feature maps includes: determining the pixel values at odd-numbered rows and odd-numbered columns of the target feature map as a first sub-feature map; determining the pixel values at odd-numbered rows and even-numbered columns of the target feature map as a second sub-feature map; determining the pixel values at even-numbered rows and odd-numbered columns of the target feature map as a third sub-feature map; and determining the pixel values at even-numbered rows and even-numbered columns of the target feature map as a fourth sub-feature map.
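The odd/even split described above can be sketched in a few lines of Python. This is only an illustration: the function name `split_feature_map` and the list-of-rows representation are assumptions, and rows and columns are counted from 1 as in the text, so the odd-numbered rows are Python indices 0, 2, 4 and so on.

```python
def split_feature_map(fmap):
    """Split a 2-D feature map (a list of rows) into four sub-feature maps.

    Rows and columns are numbered from 1, as in the text, so the
    odd-numbered rows/columns are the Python indices 0, 2, 4, ...
    """
    odd_rows, even_rows = fmap[0::2], fmap[1::2]
    first = [row[0::2] for row in odd_rows]    # odd row, odd column
    second = [row[1::2] for row in odd_rows]   # odd row, even column
    third = [row[0::2] for row in even_rows]   # even row, odd column
    fourth = [row[1::2] for row in even_rows]  # even row, even column
    return first, second, third, fourth


fmap = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16]]
first, second, third, fourth = split_feature_map(fmap)
print(first)   # [[1, 3], [9, 11]]
print(fourth)  # [[6, 8], [14, 16]]
```

With a 2 × 2 pooling window and a step size of 2, each window contributes exactly one pixel to each of the four sub-feature maps, and the pixels at the same in-window position all land in the same sub-feature map.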

In some illustrated embodiments, loading the pixel values included in each sub-feature map into the shift register array and performing parallel pooling on the pixel values at the same position in the sub-feature maps according to a pooling instruction to obtain the pooling result corresponding to the target feature map includes: moving the pixel values included in the first sub-feature map to at least some of the shift registers included in the shift register array; moving the pixel values included in the second sub-feature map to those shift registers, so that the computation core corresponding to each shift register pools the two received pixel values according to the pooling instruction to obtain a first pooling result; moving the pixel values included in the third sub-feature map to those shift registers, so that each computation core pools the first pooling result and the received pixel value according to the pooling instruction to obtain a second pooling result; moving the pixel values included in the fourth sub-feature map to those shift registers, so that each computation core pools the second pooling result and the received pixel value according to the pooling instruction to obtain a third pooling result; and outputting the third pooling results obtained by the computation cores to obtain the pooling result corresponding to the target feature map.

In some illustrated embodiments, the pooling includes max pooling, and the pooling instruction includes taking the larger of two values. Moving the pixel values included in the first sub-feature map to at least some of the shift registers included in the shift register array includes: moving the pixel values included in the first sub-feature map to the first registers of at least some of the shift registers included in the shift register array. Moving the pixel values included in the second sub-feature map to those shift registers, so that each computation core pools the two received pixel values according to the pooling instruction to obtain a first pooling result, includes: moving the pixel values included in the second sub-feature map to the second registers of those shift registers, so that each computation core obtains, according to the pooling instruction, the larger of the values stored in its first register and second register and stores it in the first register as the first pooling result.

In some embodiments, moving the pixel values included in the third sub-feature map to those shift registers, so that each computation core pools the first pooling result and the received pixel value according to the pooling instruction to obtain a second pooling result, includes: moving the pixel values included in the third sub-feature map to the second registers of those shift registers, so that each computation core obtains, according to the pooling instruction, the larger of the values stored in its first register and second register and stores it in the first register as the second pooling result. Moving the pixel values included in the fourth sub-feature map to those shift registers, so that each computation core pools the second pooling result and the received pixel value according to the pooling instruction to obtain a third pooling result, includes: moving the pixel values included in the fourth sub-feature map to the second registers of those shift registers, so that each computation core obtains, according to the pooling instruction, the larger of the values stored in its first register and second register and stores it in the first register as the third pooling result.

In some illustrated embodiments, outputting the third pooling results obtained by the computation cores to obtain the pooling result corresponding to the target feature map includes: outputting the values stored in the first registers of those shift registers to obtain the pooling result corresponding to the target feature map.
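The two-register max pooling scheme above can be illustrated with a minimal Python sketch. It is a software model only, not the chip's instruction set: the function name `pool_with_two_registers` is hypothetical, each list element stands for one computation core, and the nested comprehension stands in for the step that all cores would execute in parallel.

```python
def pool_with_two_registers(sub_maps):
    """Max-pool by streaming sub-feature maps through two registers per core."""
    first_map, *remaining = sub_maps
    # The first sub-feature map is loaded into every core's first register.
    reg1 = [row[:] for row in first_map]
    for sub_map in remaining:
        # The next sub-feature map is loaded into every core's second register.
        reg2 = [row[:] for row in sub_map]
        # Every core keeps the larger of its two registers in the first register.
        reg1 = [[max(a, b) for a, b in zip(r1, r2)]
                for r1, r2 in zip(reg1, reg2)]
    # The first registers now hold one pooled value per pooling window.
    return reg1


# Sub-feature maps of a 4 x 4 map holding 1..16 in row-major order
# (odd/even split, 2 x 2 window, step size 2).
sub_maps = [
    [[1, 3], [9, 11]],    # first sub-feature map
    [[2, 4], [10, 12]],   # second sub-feature map
    [[5, 7], [13, 15]],   # third sub-feature map
    [[6, 8], [14, 16]],   # fourth sub-feature map
]
print(pool_with_two_registers(sub_maps))  # [[6, 8], [14, 16]]
```

Three compare steps suffice regardless of how many pooling windows the feature map contains, because every window is handled by its own core.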

In some illustrated embodiments, processing in parallel the pixels belonging to different pooling windows in each sub-feature map to obtain the pooling result corresponding to the target feature map includes: moving the pixel values included in the first sub-feature map to at least some of the shift registers included in the shift register array; performing, according to the pooling instruction, a pooling operation on the pixel values in any four shift registers of the array that are adjacent to one another vertically and horizontally to obtain a first partial pooling result, and storing the first partial pooling result in a target shift register at a preset position among those four shift registers; moving the pixel values included in the second sub-feature map to those shift registers; performing, according to the pooling instruction, a pooling operation on the pixel values in any two vertically adjacent shift registers of the array to obtain a second partial pooling result, and storing the second partial pooling result in the target shift register; moving the pixel values included in the third sub-feature map to those shift registers; performing, according to the pooling instruction, a pooling operation on the pixel values in any two horizontally adjacent shift registers of the array to obtain a third partial pooling result, and storing the third partial pooling result in the target shift register; moving the pixel values included in the fourth sub-feature map to those shift registers, and moving the pixel value in each of those shift registers to the target shift register; and obtaining the pooling result corresponding to the target feature map according to the first partial pooling result, the second partial pooling result, the third partial pooling result and the pixel value of the fourth sub-feature map in each target shift register included in the shift register array.

In some illustrated embodiments, the pooling includes max pooling, and the pooling instruction includes taking the larger of two values. The preset position is the lower-left position among any four shift registers that are adjacent to one another vertically and horizontally, and the target shift register is the shift register at the lower-left corner of those four shift registers. Moving the pixel values included in the first sub-feature map to at least some of the shift registers included in the shift register array includes: moving the pixel values included in the first sub-feature map to the first registers of at least some of the shift registers included in the shift register array. Performing the pooling operation on the pixel values in any four adjacent shift registers to obtain the first partial pooling result and storing it in the target shift register includes: moving the value in the first register of each first shift register among those shift registers to the second register of the second shift register to the right of that first shift register; storing the larger of the values in the first register and the second register of each second shift register in the first register of that second shift register; moving the value in the first register of each second shift register to the second register of the third shift register below that second shift register; storing the larger of the values in the first register and the second register of each third shift register in the first register of that third shift register; moving the value in the first register of each third shift register to the second register of the target shift register to the left of that third shift register; and storing the larger of the values in the first register and the second register of each target shift register in the first register of that target shift register as the first partial pooling result.

In some embodiments, moving the pixel values included in the second sub-feature map to those shift registers includes: moving the pixel values included in the second sub-feature map to the second registers of those shift registers. Performing the pooling operation on the pixel values in any two vertically adjacent shift registers to obtain the second partial pooling result and storing it in the target shift register includes: moving the value in the second register of each first shift register among those shift registers to the third register of the target shift register below that first shift register; and storing the larger of the values in the second register and the third register of each target shift register in the second register of that target shift register as the second partial pooling result.

In some embodiments, moving the pixel values included in the third sub-feature map to those shift registers includes: moving the pixel values included in the third sub-feature map to the third registers of those shift registers. Performing the pooling operation on the pixel values in any two horizontally adjacent shift registers to obtain the third partial pooling result and storing it in the target shift register includes: moving the value in the third register of each first shift register among those shift registers to the fourth register of the second shift register to the right of that first shift register; storing the larger of the values in the third register and the fourth register of each second shift register in the third register of that second shift register as the third partial pooling result; moving the third partial pooling result in the third register of the second shift register to the third register of the first shift register to its left; and moving the third partial pooling result in the third register of the first shift register to the third register of the target shift register below that first shift register.

In some embodiments, moving the pixel values included in the fourth sub-feature map to those shift registers includes: moving the pixel values included in the fourth sub-feature map to the fourth registers of those shift registers. Moving the pixel value in each of those shift registers to the target shift register includes: moving the value in the fourth register of each first shift register among those shift registers to the fourth register of the target shift register below that first shift register.

In some embodiments, obtaining the pooling result corresponding to the target feature map according to the first partial pooling result, the second partial pooling result, the third partial pooling result and the pixel value of the fourth sub-feature map in each target shift register included in the shift register array includes: storing the larger of the values in the first register and the second register of each target shift register in the first register of that target shift register; storing the larger of the values in the first register and the third register of each target shift register in the first register of that target shift register; storing the larger of the values in the first register and the fourth register of each target shift register in the first register of that target shift register; and outputting the values stored in the first registers of the target shift registers to obtain the pooling result corresponding to the target feature map.
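The way the four partial results combine can be checked with a small Python sketch. It works at the level of partial results only and does not model the individual register moves. One reading consistent with the pattern above (four adjacent values from the first sub-feature map, two vertically adjacent values from the second, two horizontally adjacent values from the third, one value from the fourth) is a 3 × 3 pooling window with a step size of 2; that window size is an inference made for this illustration, not something stated in the passage, and all function names are hypothetical.

```python
def split_feature_map(fmap):
    """Odd/even split into four sub-feature maps (rows/columns counted from 1)."""
    odd_rows, even_rows = fmap[0::2], fmap[1::2]
    return ([r[0::2] for r in odd_rows], [r[1::2] for r in odd_rows],
            [r[0::2] for r in even_rows], [r[1::2] for r in even_rows])


def pool_by_partial_results(fmap):
    """Combine the four partial results for every group of adjacent cores."""
    s1, s2, s3, s4 = split_feature_map(fmap)
    pooled = []
    for i in range(len(s1) - 1):
        row = []
        for j in range(len(s1[0]) - 1):
            # First part: four mutually adjacent values of the first sub-map.
            part1 = max(s1[i][j], s1[i][j + 1], s1[i + 1][j], s1[i + 1][j + 1])
            # Second part: two vertically adjacent values of the second sub-map.
            part2 = max(s2[i][j], s2[i + 1][j])
            # Third part: two horizontally adjacent values of the third sub-map.
            part3 = max(s3[i][j], s3[i][j + 1])
            # Fourth part: a single value of the fourth sub-map.
            part4 = s4[i][j]
            row.append(max(part1, part2, part3, part4))
        pooled.append(row)
    return pooled


def max_pool_3x3_stride2(fmap):
    """Reference: direct 3 x 3 max pooling with step size 2 (assumed window)."""
    rows = (len(fmap) - 3) // 2 + 1
    cols = (len(fmap[0]) - 3) // 2 + 1
    return [[max(fmap[2 * i + di][2 * j + dj]
                 for di in range(3) for dj in range(3))
             for j in range(cols)] for i in range(rows)]


# A 6 x 6 test map with non-monotonic values.
fmap = [[(7 * r + 3 * c) % 11 for c in range(6)] for r in range(6)]
assert pool_by_partial_results(fmap) == max_pool_3x3_stride2(fmap)
```

Under that assumption, the partial-result scheme and the direct sliding-window computation agree on every window of the test map.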

In some embodiments, a plurality of temporary registers are connected around the periphery of the shift register array; the temporary registers are used to store pixel values that overflow the shift register array during a value shift operation.

In some embodiments, at least some of the sub-feature maps include the same number of pixels as the number of shift registers included in the shift register array.

The present application also provides a pooling method, which may include: acquiring an original feature map; dividing the original feature map into a plurality of target feature maps; pooling each target feature map according to the pooling method shown in any one of the foregoing embodiments to obtain a pooling result corresponding to each target feature map; and outputting the pooling results corresponding to the target feature maps to obtain the pooling result corresponding to the original feature map.
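The divide-then-pool flow above can be sketched as follows. The sketch assumes 2 × 2 max pooling with a step size of 2 and tiles whose height and width are even and divide the original map exactly, so that no pooling window straddles two tiles; the names `max_pool_2x2` and `pool_original` and the tile sizes are hypothetical.

```python
def max_pool_2x2(fmap):
    """2 x 2 max pooling with step size 2 on a list-of-rows feature map."""
    return [[max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
             for j in range(0, len(fmap[0]), 2)]
            for i in range(0, len(fmap), 2)]


def pool_original(original, tile_h, tile_w):
    """Divide the original map into tiles, pool each tile, and stitch the results."""
    height, width = len(original), len(original[0])
    pooled = [[None] * (width // 2) for _ in range(height // 2)]
    for r0 in range(0, height, tile_h):
        for c0 in range(0, width, tile_w):
            # One target feature map: a tile small enough for the array.
            tile = [row[c0:c0 + tile_w] for row in original[r0:r0 + tile_h]]
            for i, row in enumerate(max_pool_2x2(tile)):
                for j, value in enumerate(row):
                    pooled[r0 // 2 + i][c0 // 2 + j] = value
    return pooled


original = [[(5 * r + 3 * c) % 13 for c in range(8)] for r in range(8)]
assert pool_original(original, 4, 4) == max_pool_2x2(original)
```

Because the tiles are aligned with the pooling windows, pooling tile by tile gives the same result as pooling the original feature map in one pass.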

The application also provides a chip, which may include a controller. The controller is configured to: acquire a target feature map; split the target feature map to obtain a plurality of sub-feature maps, wherein at least some of the pixel values within the same pooling window of the target feature map are in different sub-feature maps, and the pixel values at the same position within each pooling window are in the same sub-feature map; and process in parallel the pixels belonging to different pooling windows in each sub-feature map to obtain a pooling result corresponding to the target feature map.

In some illustrated embodiments, the controller is configured to: load the pixel values included in each sub-feature map into a shift register array, and perform parallel pooling on the pixel values at the same position in the sub-feature maps according to a pooling instruction to obtain the pooling result corresponding to the target feature map.

In some illustrated embodiments, the controller is configured to: determine, for each sub-feature map, a shift operation mode of the shift register array according to the positions, in that sub-feature map, of the pixel values belonging to the same pooling window; load each sub-feature map into the shift register array, perform shift operations on the pixel values stored in the shift registers of the array according to the shift operation mode determined for that sub-feature map, and obtain in parallel, according to a pooling instruction, partial pooling results corresponding to the different pooling windows in the sub-feature map; and determine the pooling result corresponding to the target feature map according to the partial pooling results of the different pooling windows for each sub-feature map.

In some illustrated embodiments, the controller is configured to: determine the pixel values at odd-numbered rows and odd-numbered columns of the target feature map as a first sub-feature map; determine the pixel values at odd-numbered rows and even-numbered columns of the target feature map as a second sub-feature map; determine the pixel values at even-numbered rows and odd-numbered columns of the target feature map as a third sub-feature map; and determine the pixel values at even-numbered rows and even-numbered columns of the target feature map as a fourth sub-feature map.

In some illustrated embodiments, the controller is configured to: move the pixel values included in the first sub-feature map to at least some of the shift registers included in the shift register array; move the pixel values included in the second sub-feature map to those shift registers, so that the computation core corresponding to each shift register pools the two received pixel values according to the pooling instruction to obtain a first pooling result; move the pixel values included in the third sub-feature map to those shift registers, so that each computation core pools the first pooling result and the received pixel value according to the pooling instruction to obtain a second pooling result; move the pixel values included in the fourth sub-feature map to those shift registers, so that each computation core pools the second pooling result and the received pixel value according to the pooling instruction to obtain a third pooling result; and output the third pooling results obtained by the computation cores to obtain the pooling result corresponding to the target feature map.

In some illustrated embodiments, the pooling includes max pooling, and the pooling instruction includes taking the larger of two values. The controller is configured to: move the pixel values included in the first sub-feature map to the first registers of at least some of the shift registers included in the shift register array; and move the pixel values included in the second sub-feature map to the second registers of those shift registers, so that each computation core obtains, according to the pooling instruction, the larger of the values stored in its first register and second register and stores it in the first register as the first pooling result.

In some illustrated embodiments, the controller is configured to: move the pixel values included in the third sub-feature map to the second registers of those shift registers, so that each computation core obtains, according to the pooling instruction, the larger of the values stored in its first register and second register and stores it in the first register as the second pooling result; and move the pixel values included in the fourth sub-feature map to the second registers of those shift registers, so that each computation core obtains, according to the pooling instruction, the larger of the values stored in its first register and second register and stores it in the first register as the third pooling result.

In some illustrated embodiments, the controller is configured to: output the values stored in the first registers of those shift registers to obtain the pooling result corresponding to the target feature map.

The application also provides a chip, which may include a controller. The controller is configured to: acquire an original feature map; divide the original feature map into a plurality of target feature maps; pool each target feature map according to the pooling method shown in any one of the foregoing embodiments to obtain a pooling result corresponding to each target feature map; and output the pooling results corresponding to the target feature maps to obtain the pooling result corresponding to the original feature map.

The present application also proposes an electronic device comprising a chip as shown in any of the previous embodiments.

The present application also provides a computer-readable storage medium storing a computer program which, when executed by a controller, implements any of the pooling methods described above.

In the above scheme, at least some of the pixel values within the same pooling window of the target feature map are split into different sub-feature maps, and the pixels belonging to different pooling windows in each sub-feature map are processed in parallel to obtain the pooling result corresponding to the target feature map, which improves the pooling efficiency of the chip.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.

Drawings

To illustrate one or more embodiments of the present application or the technical solutions in the related art more clearly, the drawings needed for describing the embodiments or the related art are briefly introduced below. The drawings described below show only some of the embodiments of the present application; those skilled in the art can obtain other drawings from them without inventive effort.

FIG. 1 is a schematic diagram of a shift register array according to the present application;

FIG. 2 is a schematic diagram of a PE shown in the present application;

FIG. 3 is a method flow diagram of one pooling method shown herein;

FIG. 4 is a schematic diagram illustrating a target feature map split according to the present application;

FIG. 5 is a schematic diagram illustrating a target feature map split according to the present application;

FIG. 6 is a schematic view of a pooling window shown in the present application;

FIG. 7 is a diagram illustrating shifting of pixel values for a first sub-feature image according to the present application;

FIG. 8 is a diagram illustrating shifting of pixel values for a second sub-feature image according to the present application;

FIG. 9 is a diagram illustrating shifting pixel values for a third sub-feature image according to the present application;

FIG. 10 is a method flow diagram of one pooling method shown herein.

Detailed Description

Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. It should also be understood that the word "if" as used herein may be interpreted as "when" or "upon" or "in response to a determination," depending on the context.

The shift register array used by the AI chip is described first below.

Referring to fig. 1, fig. 1 is a schematic structural diagram of a shift register array according to the present application.

The shift register array may include a plurality of shift registers arranged in rows and columns. Each shift register may uniquely correspond to a computation core (hereinafter referred to as PE), and each PE is configured to perform operations on the values in its shift register. As shown in fig. 1, the shift register array may be regarded as a plurality of PEs arranged in rows and columns. Data can be moved between any two adjacent PEs (that is, between the shift registers corresponding to those PEs), and each row of PEs may obtain data from a corresponding RAM (Random Access Memory).

Assume that the size of the shift register array is 8 × 8 and that the size of the feature map to be input is also 8 × 8. To input the feature map into the shift register array, a controller (e.g., an array controller) may divide the feature map into 8 rows of pixel values and write each row into the RAM corresponding to the matching row of PEs. The controller can then use a data transfer instruction to move the pixel values in each RAM to the corresponding shift registers in the order of their positions in the feature map, thereby completing the operation of inputting the feature map into the shift register array.

The PE may perform data operations on data in the shift register in response to an instruction.

Referring to fig. 2, fig. 2 is a schematic structural diagram of a PE according to the present application.

As shown in fig. 2, the PE corresponding to a shift register may include registers and an ALU (arithmetic and logic unit). The registers may be obtained by dividing the storage space of the shift register. In some examples, the shift register may be configured, according to actual requirements, as several registers (e.g., register 1 and register 2 shown in fig. 2) between which data can be moved. The PE may perform arithmetic processing on the data in these registers in accordance with an operation instruction. The ALU is used to perform logical operations.

For example, when the PE receives an operation instruction such as adding or subtracting values or taking the larger or smaller of two values, the ALU may perform the corresponding operation on the values stored in the registers.

In some examples, the shift register array in the embodiment of the present application includes a plurality of shift registers, where the shift registers correspond to the respective PEs, and each shift register (or each PE) includes a plurality of registers (e.g., a first register, a second register, a third register, a fourth register, etc., where the number of registers per PE is not limited).
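A PE of this kind can be modelled in software with a few lines of Python. The sketch below is a toy model for illustration: the class name `PE`, the instruction names in `ALU_OPS` and the three-operand `execute` form are assumptions, not the chip's actual instruction format.

```python
import operator

# Operations the text mentions: add, subtract, take the larger or smaller value.
ALU_OPS = {
    "add": operator.add,
    "sub": operator.sub,
    "max": max,
    "min": min,
}


class PE:
    """Toy model of a computation core: a few registers plus an ALU."""

    def __init__(self, num_registers=4):
        # Registers obtained by dividing the shift register's storage space.
        self.regs = [0] * num_registers

    def execute(self, op, dst, src_a, src_b):
        """Apply one ALU operation to two registers and store the result."""
        self.regs[dst] = ALU_OPS[op](self.regs[src_a], self.regs[src_b])


pe = PE()
pe.regs[0], pe.regs[1] = 5, 8
pe.execute("max", dst=0, src_a=0, src_b=1)  # keep the larger value in register 0
print(pe.regs[0])  # 8
```

Max pooling as described in this application only needs the "take the larger value" operation, applied repeatedly with the running result kept in one register.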

The present application proposes a pooling method. The method splits at least some of the pixel values within the same pooling window of the target feature map into different sub-feature maps, and processes in parallel the pixels belonging to different pooling windows in each sub-feature map to obtain the pooling result corresponding to the target feature map, thereby improving the pooling efficiency of the chip.

Referring to fig. 3, fig. 3 is a flow chart illustrating a method of pooling according to the present application.

As shown in fig. 3, the method may include:

S302, acquiring a target feature map.

The target feature map may be a feature map requiring pooling. In some examples, the target feature map may be a feature map that needs to be pooled after convolution processing. In some examples, the target feature map may be obtained by convolution processing performed by the PEs in the shift register array. It is understood that the target feature map may be stored in the RAM corresponding to each row of PEs.

S304, splitting the target feature map to obtain a plurality of sub-feature maps; at least part of the pixel values in the same pooling window in the target feature map are respectively in different sub-feature maps, and the pixel values in the same position in each pooling window are in the same sub-feature map.

The pooling process typically uses a pooling window of a preset size and a preset step size, both set according to service requirements.

Taking a 2 × 2 pooling window and a step size of 2 as an example, the pooling operation on a feature map can be understood as follows. A 2 × 2 pooling window is first formed with the first pixel value of the feature map as its upper left corner element. The pooling operation within this window is completed by taking the maximum of the pixel values it contains. The window then slides to the right by two pixel values (a step size of 2) to form the next 2 × 2 pooling window, and the pooling operation is performed again on the pixel values within the current window. By analogy, after the pooling operation has been performed on all pooling windows, the maximum pixel values output by the pooling windows are combined to obtain the pooling result corresponding to the feature map.
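The sliding-window procedure described above can be sketched in a few lines of Python. This is only a reference illustration, not the chip implementation; the function name `max_pool` and the 4 × 4 example values are assumptions introduced here.

```python
def max_pool(feature_map, window=2, stride=2):
    """Slide a window x window pooling window over a 2-D list and keep each maximum."""
    rows = len(feature_map)
    cols = len(feature_map[0])
    pooled = []
    for top in range(0, rows - window + 1, stride):
        pooled_row = []
        for left in range(0, cols - window + 1, stride):
            # Collect every pixel value inside the current pooling window.
            values = [feature_map[top + dr][left + dc]
                      for dr in range(window) for dc in range(window)]
            pooled_row.append(max(values))
        pooled.append(pooled_row)
    return pooled


# Hypothetical 4 x 4 feature map (values chosen only for illustration).
example = [
    [1, 5, 2, 0],
    [3, 4, 7, 1],
    [9, 2, 6, 8],
    [0, 1, 3, 4],
]
result = max_pool(example)  # [[5, 7], [9, 8]]
```

Each output element depends only on its own window, which is what makes the parallel processing described in the following steps possible.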

The pooling window may comprise a number of pixel values. The pixel values may be at different locations of the pooling window. Take the pooling window as 2 x 2 as an example. The 4 pixel values within the pooling window may be at the top left corner, bottom left corner, top right corner and bottom right corner of the pooling window, respectively.

In some examples, in S304, a splitting method may be determined according to the size of the preset pooling window and the position distribution rule of the pixel values, such that at least some pixel values in the same pooling window in the target feature map are in different sub-feature maps and the pixel values in the same position in each pooling window are in the same sub-feature map. The target feature map is then split according to the determined splitting method to obtain a plurality of sub-feature maps.

For example, when the pooling window is 2 x 2 (with a step size of 2, as in the example above), it may be determined that the four pixel values within each pooling window are located, respectively, at: an odd row and an odd column; an odd row and an even column; an even row and an odd column; and an even row and an even column. Therefore, the pixel values at odd-row, odd-column positions in the target feature map can be determined as a first sub-feature map;

the pixel values at odd-row, even-column positions in the target feature map can be determined as a second sub-feature map;

the pixel values at even-row, odd-column positions in the target feature map can be determined as a third sub-feature map;

and the pixel values at even-row, even-column positions in the target feature map can be determined as a fourth sub-feature map.

Thus, the target feature map can be split to obtain sub-feature maps satisfying the condition that at least some of the pixel values in the same pooling window in the target feature map are in different sub-feature maps, and the pixel values in the same position in each pooling window are in the same sub-feature map, so that 4 sub-feature maps are obtained.
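The odd/even split described above can be illustrated with a minimal Python sketch. The function name `split_by_parity` is hypothetical; rows and columns are numbered from 1 in the text, so "odd" rows correspond to Python indices 0, 2, 4, and so on.

```python
def split_by_parity(feature_map):
    """Split a 2-D list into four sub-feature maps by row/column parity (1-indexed)."""
    odd_rows = feature_map[0::2]   # rows 1, 3, 5, ... in the text's numbering
    even_rows = feature_map[1::2]  # rows 2, 4, 6, ...
    first = [row[0::2] for row in odd_rows]    # odd rows, odd columns
    second = [row[1::2] for row in odd_rows]   # odd rows, even columns
    third = [row[0::2] for row in even_rows]   # even rows, odd columns
    fourth = [row[1::2] for row in even_rows]  # even rows, even columns
    return first, second, third, fourth


# Hypothetical 4 x 4 feature map whose values encode their position.
feature_map = [[r * 4 + c for c in range(4)] for r in range(4)]
first, second, third, fourth = split_by_parity(feature_map)
# first  == [[0, 2], [8, 10]]
# fourth == [[5, 7], [13, 15]]
```

With a 2 x 2 window and a step size of 2, element (i, j) of each of the four sub-feature maps belongs to the same pooling window, which is the property the later steps rely on.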

In some examples, a correspondence between pooling processes and splitting schemes may be maintained in advance. When the target feature map is to be split, parameters of the pooling process, such as the pooling window and the step size, can be determined. The correspondence is then looked up according to the determined parameters to obtain the corresponding splitting scheme, and the target feature map is split according to that scheme.

After the target feature map is split, S306 may be executed to perform parallel processing on the pixels belonging to different pooling windows in each sub-feature map, so as to obtain a pooling result corresponding to the target feature map.

In some examples, pixel values included in each of the sub-feature maps may be loaded into a shift register array, and the pooling result corresponding to the target feature map may be obtained by performing parallel pooling on pixel values at the same position in the sub-feature maps according to a pooling instruction.

The pooling instruction may include an instruction corresponding to the current pooling process, and may be generated in advance. When the pooling process is max pooling, the pooling instruction may be an instruction to compare two values and keep the larger one. When the pooling process is average pooling, the pooling instruction may be a summation or averaging instruction.

The pooling result may include a pooling result obtained by pooling the target feature map.

In some examples, when S306 is executed, the pixel values included in each sub feature map may be sequentially shifted to the shift registers included in the shift register array according to a preset shifting method, so that the pixel values at the same position in each sub feature map are shifted to the same shift register.

The preset shifting method may be to shift each pixel value to the shift register array according to the sequence of each pixel value in the target feature map. The transfer method is not particularly limited in the present application. It can be understood that the pixel value shifting of each sub-feature graph according to the same shifting method can ensure that the pixel values at the same position in each sub-feature graph are shifted to the same shift register.

Then, the PE corresponding to each shift register may perform parallel pooling on the received pixel values according to the pooling instruction to obtain a pooling result corresponding to the target feature map.

For example, take max pooling. When each shift register receives a new pixel value, its corresponding PE can compare the newly received pixel value with the stored pixel value, obtain the larger of the two, and overwrite the value stored in the shift register with it. Therefore, once all the sub-feature maps split from the target feature map have been input, each shift register holds the maximum pixel value of one pooling window. The maximum pixel values stored in the shift register array are then output, thereby obtaining the pooling result for the target feature map.

In the above example, the split is such that all pixel values included in the same pooling window are in different sub-feature maps. In this case, the pixel values included in the sub-feature maps may be loaded into the shift register array and, according to the pooling instruction, the pixel values at the same position in the sub-feature maps may be pooled in parallel to obtain the pooling result corresponding to the target feature map. The plurality of PEs corresponding to the shift register array can thus perform the pooling operations of the pooling windows in parallel, thereby improving the pooling efficiency.
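The running comparison described above can be sketched as follows. This is a software illustration under stated assumptions, not the hardware behaviour: the function name `stream_max` is hypothetical, the register array is modelled as a plain 2-D list, and the loop over positions stands in for what the PEs would do in parallel.

```python
def stream_max(sub_maps):
    """Feed sub-feature maps one after another; each position keeps the larger value."""
    # The first sub-feature map initialises the (modelled) shift register array.
    registers = [row[:] for row in sub_maps[0]]
    for sub_map in sub_maps[1:]:
        for r, row in enumerate(sub_map):
            for c, value in enumerate(row):
                # Compare the newly received value with the stored one and overwrite.
                registers[r][c] = max(registers[r][c], value)
    return registers


# Four hypothetical 2 x 2 sub-feature maps (a parity split of a 4 x 4 feature map).
sub_maps = [
    [[1, 2], [9, 6]],  # odd rows, odd columns
    [[5, 0], [2, 8]],  # odd rows, even columns
    [[3, 7], [0, 3]],  # even rows, odd columns
    [[4, 1], [1, 4]],  # even rows, even columns
]
pooled = stream_max(sub_maps)  # [[5, 7], [9, 8]]
```

Because max is commutative and associative, the order in which the sub-feature maps are fed does not change the result.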

In some examples, in step S306, the shift operation mode of the shift register array corresponding to each sub-feature map may be determined according to the positions of the pixel values in the same pooling window in each sub-feature map.

After the shift operation mode is determined, the sub-feature maps may be loaded into the shift register array. For each sub-feature map, a shift operation may be performed on the pixel values stored in the shift registers of the array according to the shift operation mode determined for that sub-feature map, and partial pooling results corresponding to different pooling windows in that sub-feature map may then be obtained in parallel according to the pooling instruction. The pooling result corresponding to the target feature map is determined from the partial pooling results of the sub-feature maps for the different pooling windows.

In the above example, the split is such that some pixel values in the same pooling window are in the same sub-feature map. In this case, the pixel values of each sub-feature map that belong to the same pooling window may be pooled first, giving the partial pooling results for the different pooling windows in each sub-feature map. The partial pooling results that correspond to the same pooling window across the sub-feature maps are then pooled again to obtain the final pooling result. The plurality of PEs corresponding to the shift register array can thus perform the pooling operations of the pooling windows in parallel, thereby improving the pooling efficiency.

In some examples, in order to maximize the utilization of the PEs corresponding to the shift register array, and further improve the pooling efficiency, at least some of the sub-feature maps include the same number of pixels as the number of shift registers included in the shift register array.

In some examples, when the splitting strategy is determined, the splitting strategy may be determined according to the number of the shift registers included in the shift register array, so as to ensure that the number of pixels included in part of or all of the sub feature maps obtained by splitting the target feature map is consistent with the number of the shift registers.

Therefore, all PEs corresponding to the shift register can be subjected to pooling operation in parallel, so that the PEs corresponding to the shift register array are utilized to the maximum extent, and the pooling efficiency is further improved.

The following embodiments are described with reference to specific scenarios.

Scene one:

The target feature map size is 16 x 16, the pooling window size is 2 x 2, the step size is 2, the pooling process is max pooling, and the pooling instruction is the above-mentioned instruction that compares two values and keeps the larger one. The AI chip that performs the pooling includes a shift register array of size 8 x 8.

When the pooling operation is performed through the AI chip, the target feature map can be split to obtain four sub-feature maps.

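In some examples, the pixel values at odd-row, odd-column positions in the target feature map may be determined as a first sub-feature map, and the second to fourth sub-feature maps may be determined in the same manner as in the splitting method described above.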

Referring to fig. 4, fig. 4 is a schematic diagram illustrating a target feature diagram according to the present application.

As shown in fig. 4, the target feature map is 16 × 16. The black squares indicate the pixels in odd rows and odd columns of the target feature map; the dark gray squares indicate the pixels in odd rows and even columns; the white squares indicate the pixels in even rows and odd columns; and the light gray squares indicate the pixels in even rows and even columns.

The target feature map is split according to the above splitting method to obtain the first to fourth sub-feature maps. Each sub-feature map has a size of 8 x 8, which is consistent with the size of the shift register array.

After the splitting is completed, the pixel values included in the first sub-feature map can be respectively moved to at least some of the shift registers of the shift register array according to a preset moving method.

In some examples, the pixel values included in the first sub-feature map may be respectively moved to the first registers of at least some of the shift registers included in the shift register array according to the preset moving method. In some embodiments, the pixel values need to be moved as a whole among the shift registers of the array. In that case, idle shift registers may be reserved at preset positions in the array, according to the moving direction and the moving step, to store the moved pixel values. For example, if the pixel values need to be moved as a whole one step to the right, at least the idle shift registers immediately to the right of the rightmost column of stored pixel values need to be reserved. Other moving manners are handled in the same way and are not described again here.

Therefore, all pixel values included in the first sub-feature map can be moved into the shift registers included in the shift register array.

Then, according to the preset moving method, the pixel values included in the second sub-feature map may be respectively moved to those shift registers, so that the compute core corresponding to each shift register may pool the two received pixel values according to the pooling instruction, thereby obtaining a first pooling result.

In some examples, according to the preset moving method, the pixel values included in the second sub-feature map may be respectively moved to the second registers of those shift registers, so that each compute core may obtain the larger of the values stored in the first register and the second register according to the pooling instruction, and store it in the first register as the first pooling result.

Therefore, the pixel values of the first and second sub-feature maps that lie in the same pooling window can be compared to obtain their maximum, which is stored in the shift register.

Then, according to the preset moving method, the pixel values included in the third sub-feature map are respectively moved to those shift registers, so that each compute core pools the first pooling result and the received pixel value according to the pooling instruction, thereby obtaining a second pooling result.

In some examples, according to the preset moving method, the pixel values included in the third sub-feature map are respectively moved to the second registers of those shift registers, so that each compute core can obtain the larger of the values stored in the first register and the second register according to the pooling instruction, and store it in the first register as the second pooling result.

Therefore, the pixel values of the first, second and third sub-feature maps that lie in the same pooling window can be compared to obtain their maximum, which is stored in the shift register.

Then, according to the preset moving method, the pixel values included in the fourth sub-feature map are respectively moved to those shift registers, so that each compute core pools the second pooling result and the received pixel value according to the pooling instruction, thereby obtaining a third pooling result.

In some examples, according to the preset moving method, the pixel values included in the fourth sub-feature map are respectively moved to the second registers of those shift registers, so that each compute core can obtain the larger of the values stored in the first register and the second register according to the pooling instruction, and store it in the first register as the third pooling result.

Therefore, the pixel values of the first, second, third and fourth sub-feature maps that lie in the same pooling window can be compared to obtain their maximum, which is stored in the shift register.

Finally, the third pooling result obtained by each compute core may be output to obtain the pooling result corresponding to the target feature map.

In some examples, the value stored in the first register of the partial shift register may be output to obtain a pooled result corresponding to the target feature map.

In this way, the maximum pooling process for the target feature map can be completed, and a corresponding pooling result can be obtained.
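Scene one can be traced end to end with a small software model. The sketch below is an assumption-laden illustration rather than the chip's behaviour: the class `PE`, its `keep_larger` method and the function `scene_one_max_pool` are hypothetical names, the feature map contents are arbitrary, and the Python loops stand in for operations that the 8 x 8 array would carry out in parallel.

```python
class PE:
    """Models one shift register with a first and a second register."""

    def __init__(self):
        self.first = None
        self.second = None

    def keep_larger(self):
        # Max-pooling instruction: compare both registers, keep the larger in `first`.
        self.first = max(self.first, self.second)


def split_by_parity(feature_map):
    odd_rows, even_rows = feature_map[0::2], feature_map[1::2]
    return [
        [row[0::2] for row in odd_rows],   # first sub-feature map
        [row[1::2] for row in odd_rows],   # second sub-feature map
        [row[0::2] for row in even_rows],  # third sub-feature map
        [row[1::2] for row in even_rows],  # fourth sub-feature map
    ]


def scene_one_max_pool(feature_map, array_size=8):
    array = [[PE() for _ in range(array_size)] for _ in range(array_size)]
    subs = split_by_parity(feature_map)
    # Move the first sub-feature map into the first registers.
    for r in range(array_size):
        for c in range(array_size):
            array[r][c].first = subs[0][r][c]
    # Move each remaining sub-feature map into the second registers, then compare.
    for sub in subs[1:]:
        for r in range(array_size):
            for c in range(array_size):
                array[r][c].second = sub[r][c]
                array[r][c].keep_larger()
    # Output the first registers as the pooling result.
    return [[pe.first for pe in row] for row in array]


# Arbitrary deterministic 16 x 16 feature map used only for illustration.
feature_map = [[(r * 37 + c * 91) % 101 for c in range(16)] for r in range(16)]
pooled = scene_one_max_pool(feature_map)
```

The model needs only three compare operations per PE regardless of how many pooling windows the feature map contains, since all 64 windows are handled at once.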

The order in which the sub feature maps are input to the shift register is only schematically described. Any input order may be used in a practical application.

When the pooling process is an average pooling, the specific process can refer to the above embodiment, and only the pooling instruction is different and will not be described in detail.
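For average pooling in this scene, the compare step can be replaced by accumulation followed by a division. The sketch below assumes, as in scene one, that every pixel of a pooling window lies in a different sub-feature map; the function name `average_pool_from_sub_maps` and the sample values are hypothetical.

```python
def average_pool_from_sub_maps(sub_maps):
    """Sum the values at the same position across sub-feature maps, then average."""
    sums = [row[:] for row in sub_maps[0]]
    for sub_map in sub_maps[1:]:
        for r, row in enumerate(sub_map):
            for c, value in enumerate(row):
                sums[r][c] += value  # summation instruction in place of comparison
    count = len(sub_maps)  # number of pixels per pooling window (4 for 2 x 2)
    return [[total / count for total in row] for row in sums]


# Four hypothetical 2 x 2 sub-feature maps (a parity split of a 4 x 4 feature map).
sub_maps = [
    [[1, 2], [9, 6]],
    [[5, 0], [2, 8]],
    [[3, 7], [0, 3]],
    [[4, 1], [1, 4]],
]
averages = average_pool_from_sub_maps(sub_maps)  # [[3.25, 2.5], [3.0, 5.25]]
```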

Scene two:

The target feature map size is 17 x 17, the pooling window size is 3 x 3, the step size is 2, the pooling process is max pooling, and the pooling instruction is the above-mentioned instruction that compares two values and keeps the larger one. The AI chip that performs the pooling includes a shift register array of size 9 x 9.

Referring to fig. 5, fig. 5 is a schematic diagram illustrating a target feature map according to the present application.

As shown in fig. 5, the target feature map is 17 × 17. The black squares indicate the pixels in odd rows and odd columns of the target feature map; the dark gray squares indicate the pixels in odd rows and even columns; the white squares indicate the pixels in even rows and odd columns; and the light gray squares indicate the pixels in even rows and even columns.

The target feature map is split according to the above splitting method to obtain the first to fourth sub-feature maps. The first sub-feature map is 9 × 9, the second sub-feature map is 9 × 8, the third sub-feature map is 8 × 9, and the fourth sub-feature map is 8 × 8. The size of the first sub-feature map is consistent with the size of the shift register array.
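The sub-feature map sizes stated above follow from counting odd and even rows and columns. A short sketch, with the hypothetical helper name `split_shapes`:

```python
def split_shapes(height, width):
    """Return (rows, cols) of the four parity sub-feature maps, in the text's order."""
    odd_rows, even_rows = (height + 1) // 2, height // 2
    odd_cols, even_cols = (width + 1) // 2, width // 2
    return [
        (odd_rows, odd_cols),    # first: odd rows, odd columns
        (odd_rows, even_cols),   # second: odd rows, even columns
        (even_rows, odd_cols),   # third: even rows, odd columns
        (even_rows, even_cols),  # fourth: even rows, even columns
    ]


shapes_17 = split_shapes(17, 17)  # [(9, 9), (9, 8), (8, 9), (8, 8)]
shapes_16 = split_shapes(16, 16)  # [(8, 8), (8, 8), (8, 8), (8, 8)]
```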

Referring to fig. 6, fig. 6 is a schematic view of a pooling window shown in the present application. The pooling windows shown in fig. 6 include pooling windows when the target feature map is pooled with a pooling kernel size of 3 × 3 and a step size of 2.

As shown in fig. 6, the dashed box indicates one pooling window in the target feature map. The pooling window may include 4 black squares, 2 dark gray squares, 2 white squares, and 1 light gray square. When the pooling operation is performed, each pooling window may be processed according to the positions of its pixel values in the respective sub-feature maps. First, S61-S62 may be executed to determine the maximum of the four pixel values at the upper left, upper right, lower left and lower right corners of the pooling window, that is, the first maximum value among the four mutually adjacent (up, down, left and right) pixel values in the first sub-feature map that correspond to the pooling window. Then, S63-S64 are executed to determine the maximum of the two pixel values at the first row, second column and the third row, second column of the pooling window, that is, the second maximum value of the two vertically adjacent pixel values in the second sub-feature map that correspond to the pooling window. Then, S65-S66 are executed to determine the maximum of the two pixel values at the second row, first column and the second row, third column of the pooling window, that is, the third maximum value of the two horizontally adjacent pixel values in the third sub-feature map that correspond to the pooling window. Finally, the maximum of the first maximum value, the second maximum value, the third maximum value and the pixel value at the middle position of the pooling window (namely, the pixel value in the fourth sub-feature map corresponding to the pooling window) is determined and taken as the max pooling result for the pooling window.
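The decomposition into a first, second and third maximum plus the middle pixel can be checked against direct 3 x 3 pooling with a step size of 2. The following is a minimal sketch assuming an odd-sized feature map such as 17 x 17; the function names and the feature map contents are hypothetical, and the register-level data movement is not modelled here.

```python
def split_by_parity(feature_map):
    odd_rows, even_rows = feature_map[0::2], feature_map[1::2]
    return ([row[0::2] for row in odd_rows], [row[1::2] for row in odd_rows],
            [row[0::2] for row in even_rows], [row[1::2] for row in even_rows])


def decomposed_max_pool(feature_map):
    """3 x 3, step-2 max pooling computed from the four parity sub-feature maps."""
    first, second, third, fourth = split_by_parity(feature_map)
    pooled = []
    for i in range(len(fourth)):
        pooled_row = []
        for j in range(len(fourth[0])):
            # Four corner pixels: a 2 x 2 neighbourhood in the first sub-feature map.
            first_max = max(first[i][j], first[i][j + 1],
                            first[i + 1][j], first[i + 1][j + 1])
            # Top-middle and bottom-middle pixels: vertical neighbours in the second.
            second_max = max(second[i][j], second[i + 1][j])
            # Left-middle and right-middle pixels: horizontal neighbours in the third.
            third_max = max(third[i][j], third[i][j + 1])
            # Centre pixel comes from the fourth sub-feature map.
            pooled_row.append(max(first_max, second_max, third_max, fourth[i][j]))
        pooled.append(pooled_row)
    return pooled


def direct_max_pool(feature_map, window=3, stride=2):
    size = (len(feature_map) - window) // stride + 1
    return [[max(feature_map[stride * i + dr][stride * j + dc]
                 for dr in range(window) for dc in range(window))
             for j in range(size)] for i in range(size)]


# Arbitrary deterministic 17 x 17 feature map used only for illustration.
feature_map = [[(r * 53 + c * 29) % 97 for c in range(17)] for r in range(17)]
decomposed = decomposed_max_pool(feature_map)
direct = direct_max_pool(feature_map)
```

For the 17 x 17 input both routes give the same 8 x 8 result, which is why the partial maxima can be computed per sub-feature map and combined afterwards.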

In some examples, S61 may be executed to shift the pixel values included in the first sub-feature map to at least some shift registers included in the shift register array according to a predetermined shifting method.

In some examples, the pixel values included in the first sub-feature map may be respectively shifted to the first registers of at least some of the shift registers included in the shift register array according to a predetermined shifting method.

Then, S62 may be executed, where the computation core corresponding to the at least part of the shift registers may perform a pooling operation on pixel values in any four shift registers adjacent to each other, up, down, left, and right, included in the shift register array according to the pooling instruction to obtain a first part of pooling processing results, and store the first part of pooling processing results into a target shift register at a preset position in the four adjacent shift registers, up, down, left, and right.

In some examples, the preset position includes a lower left corner position in any four shift registers adjacent to each other up, down, left and right. It is understood that the above-mentioned schemes with other preset positions can refer to the present embodiment, and are not described in detail herein.

The target shift register comprises the shift register at the lower left corner of the four adjacent shift registers. Referring to fig. 7, fig. 7 is a schematic diagram illustrating the moving of pixel values of the first sub-feature map according to the present application. It should be noted that any four shift registers that are adjacent up, down, left and right in the shift register array may be regarded as a group of shift registers as shown in fig. 7. The groups may overlap: for example, the second shift register of the group shown in fig. 7 may be the first shift register of another group, and the third shift register or the target shift register of yet another group.

Fig. 7 only schematically illustrates the moving direction of pixel values in one set of shift registers, and the moving direction of pixel values in other sets of shift registers can be illustrated with reference to fig. 7 and will not be described in detail in this application.

As shown in fig. 7 (PE is not illustrated in fig. 7), in a group of shift registers, the shift register at the upper left corner can be regarded as a first shift register, the shift register at the right side of the first shift register can be regarded as a second shift register, the shift register at the lower side of the second shift register can be regarded as a third shift register, and the shift register at the left side of the third shift register can be regarded as the target shift register.

In fig. 7, S71 may move the value in the first register of each first shift register among those shift registers to the second register of the second shift register to its right.

Then, each compute core may store the larger of the values in the first register and the second register of each second shift register into the first register of that second shift register.

S72 may move the value in the first register of each second shift register to the second register of the third shift register below it.

Then, each compute core may store the larger of the values in the first register and the second register of each third shift register into the first register of that third shift register.

S73 may move the value in the first register of each third shift register to the second register of the target shift register to its left.

Then, each compute core may store the larger of the values in the first register and the second register of each target shift register into the first register of that target shift register as the first partial pooling result.

Therefore, the maximum value of the four pixel values at the upper left corner, the upper right corner, the lower left corner and the lower right corner in the same pooling window, namely the first maximum value of the four pixel values adjacent to the upper, lower, left and right in the first sub-feature map, can be stored in the target shift register.
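Steps S71 to S73 can be traced for a single group of four shift registers, as fig. 7 does. The sketch below follows one group in isolation only; how overlapping groups are scheduled on the array is not modelled, and the names `ShiftRegister`, `keep_larger` and `pool_group` are hypothetical.

```python
class ShiftRegister:
    """Models a shift register holding a first and a second register."""

    def __init__(self, value):
        self.first = value   # pixel value loaded from the first sub-feature map
        self.second = None


def keep_larger(register):
    # Compare the first and second registers and keep the larger in the first.
    register.first = max(register.first, register.second)


def pool_group(upper_left, upper_right, lower_right, lower_left):
    first_sr = ShiftRegister(upper_left)
    second_sr = ShiftRegister(upper_right)
    third_sr = ShiftRegister(lower_right)
    target_sr = ShiftRegister(lower_left)

    # S71: move right, then compare in the second shift register.
    second_sr.second = first_sr.first
    keep_larger(second_sr)

    # S72: move down, then compare in the third shift register.
    third_sr.second = second_sr.first
    keep_larger(third_sr)

    # S73: move left, then compare in the target shift register.
    target_sr.second = third_sr.first
    keep_larger(target_sr)

    return target_sr.first  # first partial pooling result


group_max = pool_group(upper_left=3, upper_right=9, lower_right=4, lower_left=1)  # 9
```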

Then, S63 is executed to shift the pixel values included in the second sub-feature map to the partial shift registers according to the predetermined shifting method.

In some examples, the pixel values included in the second sub-feature map may be respectively shifted to the second registers of the partial shift registers according to the predetermined shifting method.

Then, each compute core may execute S64 to perform, according to the pooling instruction, a pooling operation on the pixel values in any two vertically adjacent shift registers included in the shift register array to obtain a second partial pooling result, and store the second partial pooling result in the target shift register.

Referring to fig. 8, fig. 8 is a schematic diagram illustrating shifting of pixel values of a second sub-feature image according to the present application. It should be noted that any two adjacent shift registers in the shift register array up and down can be regarded as a group of shift registers shown in fig. 8. Fig. 8 only schematically illustrates the moving direction of pixel values in one set of shift registers, and the moving direction of pixel values in other sets of shift registers can be illustrated with reference to fig. 8 and will not be described in detail in this application.

As shown in fig. 8 (the registers are not shown in fig. 8), the shift register in the upper position in the group of shift registers may be regarded as the first shift register, and the shift register below the first shift register may be regarded as the target shift register.

S81, the value in the second register of each first shift register among those shift registers is moved to the third register of the target shift register below the first shift register.

S82, each compute core may store the larger of the values in the second register and the third register of each target shift register into the second register of that target shift register as the second partial pooling result.

Therefore, the maximum of the two pixel values located at the first row, second column and the third row, second column of the same pooling window, that is, the second maximum value of the two vertically adjacent pixel values in the second sub-feature map, can be stored in the target shift register.

Then, S65 may be executed to shift the pixel values included in the third sub-feature map to the partial shift registers according to the predetermined shifting method.

In some examples, the pixel values included in the third sub-feature map are respectively shifted to the third registers of the partial shift registers according to the predetermined shifting method.

Then, each compute core may execute S66 to perform, according to the pooling instruction, a pooling operation on the pixel values in any two horizontally adjacent shift registers included in the shift register array to obtain a third partial pooling result, and store the third partial pooling result in the target shift register.

Referring to fig. 9, fig. 9 is a schematic diagram illustrating shifting of pixel values of a third sub-feature image according to the present application. It should be noted that any two adjacent left and right shift registers in the shift register array can be regarded as a group of shift registers shown in fig. 9. Fig. 9 only schematically illustrates the moving direction of pixel values in one set of shift registers, and the moving direction of pixel values in other sets of shift registers can be illustrated with reference to fig. 9 and will not be described in detail in this application.

As shown in fig. 9 (registers are not shown in fig. 9), the shift register in the upper left corner of the shift registers in the group may be regarded as the first shift register, the shift register in the right side of the first shift register may be regarded as the second shift register, the shift register in the lower side of the second shift register may be regarded as the third shift register, and the shift register in the left side of the third shift register may be regarded as the target shift register.

S91, the value in the third register of each first shift register in the partial shift registers is moved to the fourth register of the second shift register at the right side of the first shift register.

S92, each compute core may store the larger of the values in the third register and the fourth register of each second shift register into the third register of that second shift register as the third partial pooling result.

At this point, the maximum value of the two pixel values in the first column of the second row and the third column of the second row in the same pooling window, i.e. the third maximum value of the two pixel values adjacent to each other in the left and right in the third sub-feature map, may be stored in the second shift register.

The third maximum value (the third partial pooling result) may then be moved to the target shift register. S93 is executed to move the third partial pooling result from the third register of the second shift register to the third register of the first shift register to the left of the second shift register. S94, the third partial pooling result in the third register of the first shift register is moved to the third register of the target shift register below the first shift register.

In this way, the maximum value of the two pixel values in the first column in the second row and the third column in the second row in the same pooling window, that is, the third maximum value of the two pixel values adjacent to each other in the left and right in the third sub-feature map, can be stored in the target shift register.

In some examples, when data is moved from the second shift register to the target shift register, the data may be moved to a third shift register and then moved to the target shift register.

Then, S67 may be executed to shift the pixel values included in the fourth sub-feature map to the partial shift registers according to the predetermined shifting method.

In some examples, the pixel values included in the fourth sub-feature map may be respectively shifted to the fourth registers of the partial shift registers according to the preset shifting method.

Then, S68 may be executed to move the pixel values in the shift registers of the partial shift registers to the target shift register.

In some examples, the value in the fourth register of each first shift register in the partial shift registers may be moved to the fourth register of the target shift register below the first shift register.

At this point, the target shift register holds the first partial pooling result (the first maximum value), the second partial pooling result (the second maximum value), the third partial pooling result (the third maximum value), and the pixel value in the middle of the pooling window (the pixel value from the fourth sub-feature map).

Finally, the first partial pooling result, the second partial pooling result, the third partial pooling result, and the pixel value corresponding to the fourth sub-feature map in each target shift register included in the shift register array may be compared, and a maximum value thereof may be output, so as to obtain a pooling result corresponding to the target feature map.

In some examples, each compute core may store the larger of the values in the first and second registers of each target shift register into the first register of that target shift register.

Then, the larger value of the first register and the third register of each target shift register may be stored in the first register of each target shift register.

Then, the larger value of the first register and the fourth register of each target shift register can be stored into the first register of each target shift register.

And finally, outputting the numerical value stored in the first register of each target shift register to obtain a pooling result corresponding to the target characteristic diagram.

In this way, the maximum value obtained by performing the maximum pooling for each pooling window may be stored in the target shift register, and the pooling result corresponding to the target feature map may be obtained by outputting the maximum value in each target shift register.
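The decomposition described above can be checked with a short sketch. The Python below assumes a 3x3 pooling window with stride 2 and a feature map with odd height and width, which is what the corner, edge and center roles of the four sub-feature maps suggest. The function names are illustrative and not part of the application, and the sketch works on indices rather than modelling individual shift operations.

```python
def pool_by_parts(x):
    """3x3, stride-2 max pooling assembled from the four sub-feature maps."""
    f1 = [row[0::2] for row in x[0::2]]  # odd rows, odd columns: window corners
    f2 = [row[1::2] for row in x[0::2]]  # odd rows, even columns: top and bottom middles
    f3 = [row[0::2] for row in x[1::2]]  # even rows, odd columns: left and right middles
    f4 = [row[1::2] for row in x[1::2]]  # even rows, even columns: window centers
    pooled = []
    for i in range(len(f4)):
        pooled_row = []
        for j in range(len(f4[0])):
            # First partial result: the four corners of window (i, j).
            first = max(f1[i][j], f1[i][j + 1], f1[i + 1][j], f1[i + 1][j + 1])
            # Second partial result: two vertically adjacent values.
            second = max(f2[i][j], f2[i + 1][j])
            # Third partial result: two horizontally adjacent values.
            third = max(f3[i][j], f3[i][j + 1])
            pooled_row.append(max(first, second, third, f4[i][j]))
        pooled.append(pooled_row)
    return pooled


def pool_direct(x):
    """Reference: maximum over every 3x3 window taken with stride 2."""
    n, m = (len(x) - 1) // 2, (len(x[0]) - 1) // 2
    return [
        [max(x[r][c] for r in range(2 * i, 2 * i + 3) for c in range(2 * j, 2 * j + 3))
         for j in range(m)]
        for i in range(n)
    ]
```

Both functions return the same grid for any odd-sized input, which is the property the register-level procedure relies on.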

In some examples, to further increase pooling efficiency, a plurality of temporary registers are coupled to the periphery of the shift register array. The temporary registers store pixel values that overflow the shift register array during a value-shifting operation.

Therefore, in the data moving process, the pixel values overflowing the shift register array do not need to be stored in the RAM, and only the overflowing pixel values need to be stored in the temporary register, so that the data moving efficiency is improved, and the pooling efficiency is further improved.
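As a rough illustration of the idea, the following Python sketch shifts a small array one position to the right and keeps the column that falls off the edge in a list standing in for the temporary registers. The function names are hypothetical, and the restore step (shifting back and refilling from the saved values) is an assumption of the sketch; the text only states that overflowing values are held in temporary registers instead of RAM.

```python
def shift_right_with_overflow(array):
    """Shift every row one position to the right.

    Returns the shifted array and the column that fell off the right edge.
    The overflow list plays the role of the temporary registers; vacated
    cells on the left are filled with None in this sketch.
    """
    shifted = [[None] + row[:-1] for row in array]
    overflow = [row[-1] for row in array]
    return shifted, overflow


def shift_left_restoring(array, overflow):
    """Shift every row one position to the left, refilling the right edge
    from the values saved in the temporary registers."""
    return [row[1:] + [saved] for row, saved in zip(array, overflow)]
```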

In some examples, the size of the feature map to be pooled may be larger than the size of the shift register array. For this case, the present application provides a further pooling method. Referring to fig. 10, fig. 10 is a flow chart illustrating a pooling method according to the present application.

As shown in fig. 10, the method may include:

S1002, acquiring an original feature map.

S1004, dividing the original feature map into a plurality of target feature maps.

S1006, performing pooling processing on each target feature map according to the pooling method shown in any of the above embodiments to obtain a pooling result corresponding to each target feature map.

S1008, outputting the pooling results corresponding to the target feature maps to obtain the pooling result corresponding to the original feature map.

In the above scheme, the original feature map may be divided to obtain a plurality of target feature maps, and each target feature map is then pooled according to the pooling method shown in any of the above embodiments to obtain a pooling result corresponding to each target feature map. Finally, the pooling results corresponding to the target feature maps are output to obtain the pooling result corresponding to the original feature map. In this way, an original feature map larger than the shift register array can be pooled efficiently.
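The four steps can be sketched in Python as follows. To keep the example short it assumes 2x2 max pooling with stride 2 as the per-tile pooling method, even tile dimensions, and an original feature map with even height and width; `max_pool_2x2` and `pool_original` are illustrative names. Pooling windows that straddle tile borders (for example 3x3 windows with stride 2) would need overlapping tiles, which this sketch does not handle.

```python
def max_pool_2x2(tile):
    """Stand-in for the per-tile pooling method: 2x2 max pooling, stride 2."""
    return [
        [max(tile[r][c], tile[r][c + 1], tile[r + 1][c], tile[r + 1][c + 1])
         for c in range(0, len(tile[0]), 2)]
        for r in range(0, len(tile), 2)
    ]


def pool_original(original, tile_rows, tile_cols):
    """Divide the original map into tiles, pool each tile, stitch the results."""
    rows, cols = len(original), len(original[0])
    pooled = [[None] * (cols // 2) for _ in range(rows // 2)]
    for r0 in range(0, rows, tile_rows):                     # S1004: divide
        for c0 in range(0, cols, tile_cols):
            tile = [row[c0:c0 + tile_cols] for row in original[r0:r0 + tile_rows]]
            result = max_pool_2x2(tile)                      # S1006: pool one target map
            for dr, out_row in enumerate(result):            # S1008: place into the output
                for dc, value in enumerate(out_row):
                    pooled[r0 // 2 + dr][c0 // 2 + dc] = value
    return pooled
```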

The application also provides a chip. The chip may include a controller;

the controller is used for acquiring a target feature map;

splitting the target feature map to obtain a plurality of sub-feature maps; wherein at least part of the pixel values in the same pooling window in the target feature map are respectively in different sub-feature maps, and the pixel values at the same position in each pooling window are in the same sub-feature map;

and performing parallel processing on the pixels belonging to different pooling windows in each sub-feature map to obtain a pooling result corresponding to the target feature map.

In some illustrated embodiments, the controller is specifically configured to:

and loading pixel values respectively included by the sub-feature maps into a shift register array, and performing parallel pooling processing on the pixel values at the same position in the sub-feature maps according to a pooling instruction to obtain a pooling result corresponding to the target feature map.

In some illustrated embodiments, the controller is specifically configured to:

determining the shift operation mode of the shift register array corresponding to each sub-feature map according to the position of the pixel value in the same pooling window in each sub-feature map;

respectively loading the sub-feature maps into the shift register array, executing a shift operation on the pixel values stored by the shift registers in the shift register array according to the shift operation mode determined for each sub-feature map, and obtaining in parallel, according to a pooling instruction, partial pooling results corresponding to different pooling windows in the sub-feature map;

and determining a pooling result corresponding to the target feature map according to the partial pooling results of different pooling windows corresponding to each sub-feature map.

In some illustrated embodiments, the controller is specifically configured to: determining pixel values at odd-numbered row and odd-numbered column positions in the target feature map as a first sub-feature map; determining pixel values at odd-numbered row and even-numbered column positions in the target feature map as a second sub-feature map; determining pixel values at even-numbered rows and odd-numbered columns in the target feature map as a third sub-feature map; and determining pixel values at even-numbered rows and even-numbered columns in the target feature map as a fourth sub-feature map.
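A minimal Python sketch of this parity split is given below. The text numbers rows and columns from 1, so "odd" rows correspond to Python indices 0, 2, 4 and so on; the function name `split_by_parity` is illustrative only.

```python
def split_by_parity(feature_map):
    """Split a feature map into four sub-feature maps by row/column parity.

    Rows and columns are numbered from 1 in the text, so "odd" rows are
    Python indices 0, 2, 4, ... and "even" rows are indices 1, 3, 5, ...
    """
    first = [row[0::2] for row in feature_map[0::2]]   # odd row, odd column
    second = [row[1::2] for row in feature_map[0::2]]  # odd row, even column
    third = [row[0::2] for row in feature_map[1::2]]   # even row, odd column
    fourth = [row[1::2] for row in feature_map[1::2]]  # even row, even column
    return first, second, third, fourth
```

For a single 3x3 window, the first sub-feature map receives the four corners, the second the top and bottom middle values, the third the left and right middle values, and the fourth the center.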

In some illustrated embodiments, the controller is specifically configured to:

shifting each pixel value included in the first sub-feature map to at least a part of shift registers included in the shift register array;

respectively moving each pixel value included in the second sub-feature map to the partial shift registers, so that the computation core corresponding to each shift register performs pooling processing on the two received pixel values according to the pooling instruction, and a first pooling processing result is obtained;

respectively moving each pixel value included in the third sub-feature map to the partial shift registers, so that each computation core performs pooling processing on the first pooling processing result and the received pixel value according to the pooling instruction, and a second pooling processing result is obtained;

respectively moving each pixel value included in the fourth sub-feature map to the partial shift registers, so that each computation core performs pooling processing on the second pooling processing result and the received pixel value according to the pooling instruction, and a third pooling processing result is obtained;

and outputting the third pooling processing result obtained by each computation core to obtain a pooling result corresponding to the target feature map.

In some illustrated embodiments, the pooling includes maximum pooling; the pooling instruction includes taking the maximum of two values; the controller is specifically configured to:

shifting each pixel value included in the first sub-feature map to a first register of at least a portion of the shift registers included in the shift register array;

and moving each pixel value included in the second sub-feature map to a second register of the partial shift register, so that each calculation unit obtains a maximum value of the numerical values stored in the first register and the second register according to the pooling instruction, and stores the maximum value in the first register as the first pooling result.

In some illustrated embodiments, the controller is specifically configured to:

moving each pixel value included in the third sub-feature map to a second register of the partial shift registers, so that each calculation unit obtains a maximum value of the numerical values stored in the first register and the second register according to the pooling instruction, and stores the maximum value in the first register as the second pooling result;

and moving each pixel value included in the fourth sub-feature map to a second register of the partial shift register, so that each calculation unit acquires a maximum value of the numerical values stored in the first register and the second register according to the pooling instruction, and stores the maximum value in the first register as the third pooling result.

In some illustrated embodiments, the controller is specifically configured to:

and outputting the numerical value stored in the first register of the partial shift register to obtain a pooling result corresponding to the target characteristic diagram.
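In this variant each sub-feature map contributes exactly one pixel per pooling window, which corresponds to a 2x2 window with stride 2. That window size is inferred from the four-way parity split and is an assumption of the sketch, as are the function name and the use of nested lists for the register planes; the feature map is assumed to have even height and width.

```python
def max_pool_2x2_running(feature_map):
    """2x2, stride-2 max pooling done as three element-wise compare steps."""
    odd_rows, even_rows = feature_map[0::2], feature_map[1::2]
    sub_maps = [
        [row[0::2] for row in odd_rows],    # first sub-feature map
        [row[1::2] for row in odd_rows],    # second sub-feature map
        [row[0::2] for row in even_rows],   # third sub-feature map
        [row[1::2] for row in even_rows],   # fourth sub-feature map
    ]
    first_reg = [row[:] for row in sub_maps[0]]   # load the first registers
    for sub_map in sub_maps[1:]:
        second_reg = sub_map                      # load the second registers
        first_reg = [                             # keep the larger value in the first register
            [max(a, b) for a, b in zip(r1, r2)]
            for r1, r2 in zip(first_reg, second_reg)
        ]
    return first_reg                              # third pooling result, one value per window
```

Every compare step touches all windows at once, which is where the parallelism of the scheme comes from.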

In some illustrated embodiments, the controller is specifically configured to:

performing pooling operation on pixel values in any four shift registers adjacent to each other up, down, left and right in the shift register array according to the pooling instruction to obtain a first part of pooling processing result, and storing the first part of pooling processing result into a target shift register at a preset position in the four shift registers adjacent to each other up, down, left and right;

moving each pixel value included in the second sub-feature map to the partial shift register;

performing pooling operation on pixel values in any two vertically adjacent shift registers included in the shift register array according to the pooling instruction to obtain a second part of pooling processing result, and storing the second part of pooling processing result in the target shift register;

moving each pixel value included in the third sub-feature map to the partial shift register;

performing pooling operation on pixel values in any two left and right adjacent shift registers included in the shift register array according to the pooling instruction to obtain a third part of pooling processing result, and storing the third part of pooling processing result in the target shift register;

and respectively moving each pixel value included in the fourth sub-feature map to the partial shift register, and moving the pixel value in each shift register in the partial shift register to the target shift register.

In some illustrated embodiments, the pooling includes maximum pooling; the pooling instruction includes taking the maximum of two values; the preset position includes the lower left corner position among any four shift registers that are adjacent up, down, left and right; the target shift register includes the shift register at the lower left corner of the four adjacent shift registers; the controller is specifically configured to:

shifting each pixel value included in the first sub-feature map to a first register of at least a portion of shift registers included in the shift register array;

the controller is specifically configured to:

moving the value in the first register of each first shift register in the partial shift registers to the second register of the second shift register at the right of the first shift register;

storing the larger value in the first register and the second register in each second shift register into the first register of each second shift register;

moving the value in the first register in each second shift register to a second register of a third shift register below the second shift register;

storing a larger numerical value in the first register and the second register in each third shift register into the first register of each third shift register;

moving the value in the first register in each third shift register to the second register of the target shift register at the left of the third shift register;

and storing the larger numerical value in the first register and the second register in each target shift register as the first part pooling processing result into the first register of each target shift register.
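The right, down, left sequence can be simulated on a small grid. In the Python sketch below, `loaded` models the first registers as loaded from the first sub-feature map and `travelling` models the value handed from one shift register to the next. Keeping the loaded values in a separate, untouched plane is an assumption of the sketch so that overlapping groups of four do not see each other's intermediate results; the text does not spell out this bookkeeping. Function names are illustrative.

```python
NEG_INF = float("-inf")


def shift(plane, d_row, d_col):
    """Move every value by (d_row, d_col); vacated cells get -inf, overflow is dropped."""
    rows, cols = len(plane), len(plane[0])
    out = [[NEG_INF] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            nr, nc = r + d_row, c + d_col
            if 0 <= nr < rows and 0 <= nc < cols:
                out[nr][nc] = plane[r][c]
    return out


def elementwise_max(a, b):
    return [[max(x, y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def corner_max(first_sub_map):
    """First partial pooling result, left in the lower-left register of each group of four."""
    loaded = [row[:] for row in first_sub_map]
    travelling = loaded
    for d_row, d_col in ((0, 1), (1, 0), (0, -1)):   # right, then down, then left
        travelling = elementwise_max(shift(travelling, d_row, d_col), loaded)
    return travelling
```

For a group whose upper-left register is at row i, column j, the result is read at row i + 1, column j.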

In some illustrated embodiments, the controller is specifically configured to:

moving each pixel value included in the second sub-feature map to a second register of the partial shift register;

the controller is specifically configured to:

moving the value in the second register of each first shift register in the partial shift registers to the third register of the target shift register below the first shift register;

and storing the larger numerical value in the second register and the third register in each target shift register as the second part pooling processing result into the second register of each target shift register.
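A minimal Python sketch of this vertical step follows; the second and third registers are modelled as two nested lists, and the function name is illustrative. Cells with no shift register above them receive negative infinity in the sketch, so the top row simply keeps its loaded value.

```python
def vertical_pair_max(second_sub_map):
    """Second partial pooling result: larger of each value and the one directly above it."""
    reg2 = [row[:] for row in second_sub_map]                  # second registers as loaded
    top_fill = [float("-inf")] * len(reg2[0])
    reg3 = [top_fill] + [row[:] for row in reg2[:-1]]          # values moved down into third registers
    return [[max(a, b) for a, b in zip(r2, r3)] for r2, r3 in zip(reg2, reg3)]
```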

In some illustrated embodiments, the controller is specifically configured to:

moving each pixel value included in the third sub-feature map to a third register of the partial shift register;

the controller is specifically configured to:

moving the value in the third register of each first shift register in the partial shift registers to the fourth register of the second shift register at the right of the first shift register;

taking a larger numerical value in a third register and a fourth register in each second shift register as the third part pooling processing result, and storing the result into the third register of each second shift register;

transferring the third part of the pooling processing result in the third register of the second shift register to the third register of the first shift register at the left side of the second shift register;

and shifting the third part of pooling processing result in the third register of the first shift register to the third register of the target shift register below the first shift register.
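The round trip (right, compare, back left, then down) can be sketched in Python as follows. The sketch assumes the shift register array has one more row than the third sub-feature map, so that the target row below the last loaded row exists; empty cells hold negative infinity. `shift` and `horizontal_pair_to_target` are illustrative names.

```python
NEG_INF = float("-inf")


def shift(plane, d_row, d_col):
    """Move every value by (d_row, d_col); vacated cells get -inf, overflow is dropped."""
    rows, cols = len(plane), len(plane[0])
    out = [[NEG_INF] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            nr, nc = r + d_row, c + d_col
            if 0 <= nr < rows and 0 <= nc < cols:
                out[nr][nc] = plane[r][c]
    return out


def horizontal_pair_to_target(third_sub_map):
    """Third partial pooling result, delivered to the third register of each target."""
    cols = len(third_sub_map[0])
    reg3 = [row[:] for row in third_sub_map] + [[NEG_INF] * cols]   # extra row for the targets
    reg4 = shift(reg3, 0, 1)                                        # right, into the fourth registers
    reg3 = [[max(a, b) for a, b in zip(r3, r4)] for r3, r4 in zip(reg3, reg4)]
    reg3 = shift(reg3, 0, -1)                                       # back to the first shift register
    return shift(reg3, 1, 0)                                        # down to the target shift register
```

For a pair whose left value was loaded at row i, column j, the result is read at row i + 1, column j.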

In some illustrated embodiments, the controller is specifically configured to:

moving each pixel value included in the fourth sub-feature map to a fourth register of the partial shift registers; the controller is specifically configured to:

and moving the value in the fourth register of each first shift register in the partial shift registers to the fourth register of the target shift register below the first shift register.

In some illustrated embodiments, the controller is specifically configured to:

storing the larger numerical value in the first register and the second register of each target shift register into the first register of each target shift register;

storing the larger numerical value in the first register and the third register of each target shift register into the first register of each target shift register;

storing the larger numerical value in the first register and the fourth register of each target shift register into the first register of each target shift register;

and outputting the numerical value stored in the first register of each target shift register to obtain a pooling result corresponding to the target characteristic diagram.
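These three compare-and-store steps amount to a left fold over the four registers of one target shift register. A minimal Python sketch, with the list representation and the function name as illustrative assumptions:

```python
def fold_target_register(registers):
    """Reduce the four registers of one target shift register to a single maximum.

    registers[0..2] hold the first, second and third partial pooling results;
    registers[3] holds the pixel value from the fourth sub-feature map.
    """
    regs = list(registers)        # work on a copy; index 0 models the first register
    for other in (1, 2, 3):       # compare with the second, third, then fourth register
        regs[0] = max(regs[0], regs[other])
    return regs[0]                # value output as the pooling result for this window
```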

The application also provides a chip. The chip may include a controller; the controller is used for acquiring an original feature map;

dividing the original feature map into a plurality of target feature maps;

pooling each target feature map according to the pooling method shown in any one of the foregoing embodiments to obtain a pooling result corresponding to each target feature map;

and outputting the pooling results corresponding to the target feature maps to obtain a pooling result corresponding to the original feature map.

The application also provides an electronic device, which comprises the chip shown in any one of the embodiments.

For example, the electronic device may be a smart terminal such as a mobile phone, or may be another device that has a camera and can perform image processing. Illustratively, when the electronic device performs pooling processing on the acquired image, the chip of the embodiment of the present application may be used to perform the pooling task.

Because the chip pools more efficiently and offers higher performance, it can help improve the processing efficiency of the pooling task and thereby the performance of the electronic device.

One skilled in the art will recognize that one or more embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, one or more embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, one or more embodiments of the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

"And/or" as used herein means at least one of the two; for example, "A and/or B" covers three cases: A alone, B alone, and both A and B.

The embodiments in the present application are described in a progressive manner, and the same and similar parts among the embodiments can be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the data processing apparatus embodiment, since it is substantially similar to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to part of the description of the method embodiment.

The foregoing description of specific embodiments of the present application has been presented. Other embodiments are within the scope of the following claims. In some cases, the acts or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.

Embodiments of the subject matter and functional operations described in this application may be implemented in the following: digital electronic circuitry, tangibly embodied computer software or firmware, computer hardware including the structures disclosed in this application and their structural equivalents, or a combination of one or more of them. Embodiments of the subject matter described in this application can be implemented as one or more computer programs, i.e., one or more units of computer program instructions, encoded on a tangible, non-transitory program carrier for execution by, or to control the operation of, a data processing chip. Alternatively or additionally, the program instructions may be encoded on an artificially generated propagated signal, e.g., a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode and transmit information to a suitable receiver chip for execution by the data processing chip. The computer storage medium may be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of them.

The processes and logic flows described in this application can be performed by one or more programmable computers executing one or more computer programs to perform corresponding functions by operating on input data and generating output. The processes and logic flows described above can also be performed by, and a chip can be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit).

Computers suitable for executing computer programs include, for example, general and/or special purpose microprocessors, or any other type of central processing unit. Generally, a central processing unit will receive instructions and data from a read-only memory and/or a random access memory. The basic components of a computer include a central processing unit for implementing or executing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto-optical disks, or optical disks. However, a computer does not necessarily have such a device. Moreover, a computer may be embedded in another device, e.g., a mobile telephone, a Personal Digital Assistant (PDA), a mobile audio or video player, a game console, a Global Positioning System (GPS) receiver, or a portable storage device such as a Universal Serial Bus (USB) flash drive, to name a few.

Computer-readable media suitable for storing computer program instructions and data include all forms of non-volatile memory, media and memory devices, including by way of example semiconductor memory devices (e.g., EPROM, EEPROM, and flash memory devices), magnetic disks (e.g., an internal hard disk or a removable disk), magneto-optical disks, and CD-ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.

Although this application contains many specific implementation details, these should not be construed as limiting the scope of any disclosure or of what may be claimed, but rather as merely describing features of particular disclosed embodiments. Certain features that are described in this application in the context of separate embodiments can also be implemented in combination in a single embodiment. In other instances, features described in connection with one embodiment may be implemented as discrete components or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.

Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In some cases, multitasking and parallel processing may be advantageous. Moreover, the separation of various system elements and components in the embodiments described above should not be understood as requiring such separation in all embodiments, and it should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.

Thus, particular embodiments of the subject matter have been described. Other embodiments are within the scope of the following claims. In some cases, the actions recited in the claims can be performed in a different order and still achieve desirable results. Further, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some implementations, multitasking and parallel processing may be advantageous.

The foregoing is merely a preferred embodiment of one or more embodiments of the present application and is not intended to limit the scope of the one or more embodiments of the present application, such that any modifications, equivalents, improvements and the like which come within the spirit and principle of one or more embodiments of the present application are included within the scope of the one or more embodiments of the present application.

Claims

1. A pooling method, characterized in that the method comprises:

acquiring a target feature map;

splitting the target feature map to obtain a plurality of sub-feature maps; wherein at least part of the pixel values in the target feature map within the same pooling window are respectively in different sub-feature maps, and the pixel values at the same position in each pooling window are in the same sub-feature map;

and performing parallel processing on the pixels belonging to different pooling windows in each sub-feature map to obtain a pooling result corresponding to the target feature map.

2. The method according to claim 1, wherein the parallel processing of the pixels belonging to different pooling windows in each sub-feature map to obtain the pooling result corresponding to the target feature map comprises:

and loading the pixel values respectively included by the sub-feature maps into a shift register array, and performing parallel pooling processing on the pixel values at the same position in the sub-feature maps according to a pooling instruction to obtain a pooling result corresponding to the target feature map.

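3. The method according to claim 1, wherein the parallel processing of the pixel values belonging to different pooling windows in each sub-feature map to obtain the pooling result corresponding to the target feature map comprises:

determining the shift operation mode of the shift register array corresponding to each sub-feature map according to the position, in the same pooling window, of the pixel values in each sub-feature map;

respectively loading the sub-feature maps into the shift register array, executing a shift operation on the pixel values stored by the shift registers in the shift register array according to the shift operation mode determined for each sub-feature map, and obtaining in parallel, according to a pooling instruction, partial pooling results corresponding to different pooling windows in the sub-feature map;

and determining the pooling result corresponding to the target feature map according to the partial pooling results of the different pooling windows corresponding to each sub-feature map.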

4. The method according to any one of claims 1 to 3, wherein the splitting the target feature map to obtain a plurality of sub-feature maps comprises:

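determining pixel values at odd-numbered row and odd-numbered column positions in the target feature map as a first sub-feature map;

determining pixel values at odd-numbered row and even-numbered column positions in the target feature map as a second sub-feature map;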

determining pixel values at even-numbered row and odd-numbered column positions in the target feature map as a third sub-feature map;

and determining pixel values at even-numbered row and even-numbered column positions in the target feature map as a fourth sub-feature map.

5. The method according to claim 4, wherein the loading pixel values included in the sub-feature maps into a shift register array, and performing parallel pooling processing on the pixel values in the same position in the sub-feature maps according to a pooling instruction to obtain a pooling result corresponding to the target feature map comprises:

respectively moving each pixel value included in the first sub-feature map to at least part of the shift registers included in the shift register array;

respectively moving each pixel value included in the second sub-feature map to the partial shift registers, so that the computation core corresponding to each shift register performs pooling processing on the two received pixel values according to the pooling instruction, and a first pooling processing result is obtained;

respectively moving each pixel value included in the third sub-feature map to the partial shift registers, so that each computation core performs pooling processing on the first pooling processing result and the received pixel value according to the pooling instruction, and a second pooling processing result is obtained;

respectively moving each pixel value included in the fourth sub-feature map to the partial shift registers, so that each computation core performs pooling processing on the second pooling processing result and the received pixel value according to the pooling instruction, and a third pooling processing result is obtained;

and outputting the third pooling processing result obtained by each computation core to obtain a pooling result corresponding to the target feature map.

6. The method of claim 5, wherein the pooling processing comprises maximum pooling processing; the pooling instruction includes taking the maximum of two values;

the moving the pixel values included in the first sub-feature map to at least a part of the shift registers included in the shift register array includes:

respectively moving each pixel value included in the first sub-feature map to a first register of at least part of shift registers included in the shift register array;

the moving each pixel value included in the second sub-feature map to the partial shift register respectively so that each computing unit performs pooling processing on the two received pixel values according to the pooling instruction to obtain a first pooling processing result includes:

and respectively moving each pixel value included in the second sub-feature map to a second register of the partial shift register, so that each computing unit obtains the maximum value of the numerical values stored in the first register and the second register according to the pooling instruction, and stores the maximum value as the first pooling processing result in the first register.

7. The method according to claim 6, wherein the moving the pixel values included in the third sub-feature map to the partial shift registers respectively, so that each computing unit performs pooling on the first pooling result and the received pixel values according to the pooling instruction to obtain a second pooling result comprises:

respectively moving each pixel value included in the third sub-feature map to a second register of the partial shift register, so that each computing unit obtains a maximum value of the numerical values stored in the first register and the second register according to the pooling instruction, and stores the maximum value as the second pooling processing result in the first register;

the moving each pixel value included in the fourth sub-feature map to the partial shift register, so that each computing unit performs pooling processing on the second pooling processing result and the received pixel value according to the pooling instruction, to obtain a third pooling processing result, includes:

and respectively moving each pixel value included in the fourth sub-feature map to a second register of the partial shift register, so that each computing unit obtains the maximum value of the numerical values stored in the first register and the second register according to the pooling instruction, and stores the maximum value in the first register as the third pooling processing result.

8. The method according to claim 7, wherein the outputting a third pooling result obtained by pooling in the partial shift register of each computing unit to obtain a pooling result corresponding to the target feature map comprises:

and outputting the numerical value stored in the first register of the partial shift registers to obtain a pooling result corresponding to the target feature map.

9. The method according to claim 4, wherein the parallel processing of the pixels belonging to different pooling windows in each sub-feature map to obtain the pooling result corresponding to the target feature map comprises:

respectively moving each pixel value included in the second sub-feature map to the partial shift register;

performing pooling operation on pixel values in any two vertically adjacent shift registers included in the shift register array according to the pooling instruction to obtain a second part of pooling processing result, and storing the second part of pooling processing result to the target shift register;

respectively moving each pixel value included in the third sub-feature map to the partial shift register;

performing pooling operation on pixel values in any two left and right adjacent shift registers included in the shift register array according to the pooling instruction to obtain a third part of pooling processing result, and storing the third part of pooling processing result to the target shift register;

respectively moving each pixel value included in the fourth sub-feature graph to the partial shift register, and moving the pixel value in each shift register in the partial shift register to the target shift register;

and obtaining a pooling result corresponding to the target feature map according to the first part of pooling processing results, the second part of pooling processing results, the third part of pooling processing results and the fourth sub-feature map in each target shift register included in the shift register array.

10. The method of claim 9, wherein the pooling process comprises a maximum pooling process; the pooling instruction comprises taking the larger of two values; the preset position comprises the lower-left corner of any four shift registers adjacent to each other up, down, left and right; and the target shift register comprises the shift register at the lower-left corner of the four adjacent shift registers;

the performing, according to the pooling instruction, a pooling operation on the pixel values in any four shift registers adjacent to each other up, down, left and right in the shift register array to obtain a first part of pooling processing result, and storing the first part of pooling processing result to the target shift register at the preset position among the four adjacent shift registers, comprises:

moving the value in the first register of each first shift register in the partial shift registers to the second register of the second shift register at the right side of the first shift register;

and storing the larger of the values in the first register and the second register of each target shift register into the first register of that target shift register as the first part of pooling processing result.

11. The method of claim 10, wherein the moving the pixel values included in the second sub-feature map into the partial shift registers respectively comprises:

respectively moving each pixel value included in the second sub-feature map to a second register of the partial shift registers;

the performing, according to the pooling instruction, a pooling operation on the pixel values in any two vertically adjacent shift registers included in the shift register array to obtain a second part of pooling processing result, and storing the second part of pooling processing result to the target shift register, comprises:

moving the value in the second register of each first shift register in the partial shift registers to a third register of a target shift register below the first shift register;

and storing the larger of the values in the second register and the third register of each target shift register into the second register of that target shift register as the second part of pooling processing result.

12. The method of claim 11, wherein the moving the pixel values included in the third sub-feature map into the partial shift registers respectively comprises:

respectively moving each pixel value included in the third sub-feature map to a third register of the partial shift register;

the performing, according to the pooling instruction, a pooling operation on the pixel values in any two left-right adjacent shift registers included in the shift register array to obtain a third part of pooling processing result, and storing the third part of pooling processing result to the target shift register, comprises:

moving the numerical value in the third register of each first shift register in the partial shift registers to the fourth register of the second shift register at the right side of the first shift register;

storing the larger of the values in the third register and the fourth register of each second shift register into the third register of that second shift register as the third part of pooling processing result;

transferring a third part of pooling processing results in a third register of the second shift register to a third register of the first shift register at the left of the second shift register;

and transferring the third part of pooling processing result in the third register of the first shift register to the third register of the target shift register below the first shift register.

13. The method of claim 12, wherein the moving the pixel values included in the fourth sub-feature map into the partial shift registers respectively comprises:

respectively moving each pixel value included in the fourth sub-feature map to a fourth register of the partial shift register;

the moving the pixel values in the shift registers of the partial shift registers to the target shift register includes:

14. The method according to claim 13, wherein obtaining the pooling result corresponding to the target feature map according to the first partial pooling result, the second partial pooling result, the third partial pooling result and the pixel value corresponding to the fourth sub-feature map in each target shift register included in the shift register array comprises:

outputting the value stored in the first register of each target shift register to obtain the pooling result corresponding to the target feature map.

15. The method of any of claims 9-14, wherein a plurality of temporary registers are coupled to the periphery of the shift register array; the temporary registers are used for storing pixel values that overflow the shift register array when a value shifting operation is carried out.
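The role of the peripheral temporary registers can be illustrated with a small sketch. It assumes a one-position shift to the right and models the temporary registers as a separate column; `shift_right` is a hypothetical name, and `None` stands for a cell that has not yet received a value.

```python
def shift_right(grid):
    """Shift every row of the grid one position to the right.

    Returns the shifted grid and the column of values pushed past the
    right edge, which the temporary registers would hold.
    """
    overflow = [row[-1] for row in grid]
    shifted = [[None] + row[:-1] for row in grid]
    return shifted, overflow


grid = [
    [1, 2, 3],
    [4, 5, 6],
]
shifted, overflow = shift_right(grid)
print(shifted)   # [[None, 1, 2], [None, 4, 5]]
print(overflow)  # [3, 6]
```

Keeping the overflowed column means no pixel value is lost when the array contents are shifted toward an edge.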

16. The method according to any of claims 1-15, wherein at least some of the sub-feature maps comprise a number of pixels corresponding to a number of shift registers of the shift register array.

17. A pooling method, characterized in that the method comprises:

acquiring an original feature map;

dividing the original feature map into a plurality of target feature maps;

pooling processing of each target feature map according to the pooling method of any of claims 1-16 to obtain pooling results corresponding to each target feature map;
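The division of an original feature map into target feature maps can be sketched as simple tiling. The tile size, the 2x2 stride-2 max pooling used per tile, and the function names are all assumptions made for illustration. The sketch stops at the per-tile results and does not show how they are combined.

```python
def split_into_targets(original, tile_h, tile_w):
    """Cut the original feature map into tile_h x tile_w target feature maps.

    Returns a dict keyed by (tile_row, tile_col). Assumes the original
    dimensions are exact multiples of the tile size.
    """
    targets = {}
    for r in range(0, len(original), tile_h):
        for c in range(0, len(original[0]), tile_w):
            targets[(r // tile_h, c // tile_w)] = [
                row[c:c + tile_w] for row in original[r:r + tile_h]
            ]
    return targets


def max_pool_2x2(target):
    """Stand-in pooling step: non-overlapping 2x2 max pooling of one tile."""
    return [
        [max(target[r][c], target[r][c + 1],
             target[r + 1][c], target[r + 1][c + 1])
         for c in range(0, len(target[0]), 2)]
        for r in range(0, len(target), 2)
    ]


original = [
    [1, 5, 2, 0],
    [3, 4, 7, 6],
    [9, 8, 1, 2],
    [0, 3, 4, 5],
]
targets = split_into_targets(original, tile_h=2, tile_w=2)
per_target = {key: max_pool_2x2(tile) for key, tile in targets.items()}
print(per_target)
# {(0, 0): [[5]], (0, 1): [[7]], (1, 0): [[9]], (1, 1): [[5]]}
```

In practice the tile size would be chosen to fit the shift register array, since each target feature map must be processed within the available registers.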

18. A chip, wherein the chip comprises a controller;

the controller is used for acquiring a target feature map;

19. The chip of claim 18, wherein the controller is configured to:

20. The chip of claim 18, wherein the controller is configured to:

21. The chip of any one of claims 18 to 20, wherein the controller is configured to:

22. The chip of claim 21, wherein the controller is configured to:

23. The chip of claim 22, wherein the pooling process comprises a maximum pooling process; the pooling instruction comprises taking the larger of two values; the controller is configured to:

respectively moving each pixel value included in the first sub-feature map to a first register of at least part of the shift registers included in the shift register array; and,

24. The chip of claim 23, wherein the controller is configured to:

respectively moving each pixel value included in the third sub-feature map to a second register of the partial shift register, so that each computing unit obtains the maximum of the values stored in the first register and the second register according to the pooling instruction, and stores the maximum value as the second pooling processing result in the first register; and,

25. The chip of claim 24, wherein the controller is configured to:

26. A chip, wherein the chip comprises a controller;

the controller is used for acquiring an original feature map;

dividing the original feature map into a plurality of target feature maps;

27. An electronic device comprising a chip according to any one of claims 18-25 or 26.

28. A computer-readable storage medium, on which a computer program is stored, which, when executed by a controller, implements the method of any one of claims 11 to 16 or 17.