WO2022160703A1 - Pooling method, and chip, device, and storage medium - Google Patents

Pooling method, and chip, device, and storage medium

Info

Publication number
WO2022160703A1
Authority
WO
WIPO (PCT)
Prior art keywords
pooling
shift register
feature map
register
sub
Prior art date
Application number
PCT/CN2021/115667
Other languages
English (en)
Chinese (zh)
Inventor
周军
常亮
周亮
杨雨桐
Original Assignee
成都商汤科技有限公司
电子科技大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 成都商汤科技有限公司, 电子科技大学
Publication of WO2022160703A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00: Geometric image transformations in the plane of the image
    • G06T3/04: Context-preserving transformations, e.g. by using an importance map

Definitions

  • The present application relates to computer technology, and in particular to a pooling method, chip, device and storage medium.
  • Pooling refers to down-sampling an input feature map. It reduces the number of features and the computational complexity of a convolutional network while preserving the invariance of the features under certain transformations (e.g., rotation, translation, scaling).
  • Pooling processing usually relies on artificial intelligence chips (hereinafter referred to as AI chips).
  • The present application discloses at least one pooling method, which may include: acquiring a target feature map; splitting the target feature map to obtain several sub-feature maps, wherein at least some of the pixel values in the same pooling window of the target feature map are in different sub-feature maps, and the pixel values at the same position in each pooling window are in the same sub-feature map; and processing, in parallel, the pixels belonging to different pooling windows in each sub-feature map to obtain the pooling result corresponding to the target feature map.
  • The parallel processing of pixels belonging to different pooling windows in each sub-feature map to obtain the pooling result corresponding to the target feature map includes: loading the pixel values respectively included in the sub-feature maps into the shift register array, and, according to the pooling instruction, pooling in parallel the pixel values at the same position in the sub-feature maps to obtain the pooling result corresponding to the target feature map.
  • The parallel processing of pixels belonging to different pooling windows in each sub-feature map to obtain the pooling result corresponding to the target feature map includes: determining the shift operation mode of the shift register array corresponding to each sub-feature map according to the position, within the same pooling window, of the pixel values of that sub-feature map; loading the sub-feature maps into the shift register array respectively and, for each sub-feature map, performing a shift operation on the pixel values stored in the shift registers of the array according to the shift operation mode determined for that sub-feature map, and performing parallel pooling according to the pooling instruction to obtain the partial pooling results of that sub-feature map for the different pooling windows; and determining the pooling result corresponding to the target feature map according to the partial pooling results of each sub-feature map for the different pooling windows.
  • Splitting the target feature map to obtain several sub-feature maps includes: determining the pixel values at odd rows and odd columns of the target feature map as the first sub-feature map; determining the pixel values at odd rows and even columns as the second sub-feature map; determining the pixel values at even rows and odd columns as the third sub-feature map; and determining the pixel values at even rows and even columns as the fourth sub-feature map.
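  • The parity-based split above can be illustrated with a minimal Python sketch. It assumes the feature map is a list of equal-length rows and that rows and columns are counted from 1, as in the text; the function name `split_by_window_position` and the sample data are hypothetical and do not come from the application.

```python
from typing import List

Matrix = List[List[int]]


def split_by_window_position(feature_map: Matrix) -> List[Matrix]:
    """Split a feature map into four sub-feature maps by row/column parity.

    Rows and columns are counted from 1 in the text, so an "odd row"
    corresponds to Python indices 0, 2, 4, ...
    """
    odd_rows = feature_map[0::2]    # 1st, 3rd, 5th, ... rows
    even_rows = feature_map[1::2]   # 2nd, 4th, 6th, ... rows
    first = [row[0::2] for row in odd_rows]     # odd row, odd column
    second = [row[1::2] for row in odd_rows]    # odd row, even column
    third = [row[0::2] for row in even_rows]    # even row, odd column
    fourth = [row[1::2] for row in even_rows]   # even row, even column
    return [first, second, third, fourth]


# Hypothetical 4*4 target feature map.
target = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
subs = split_by_window_position(target)
```

  • With a 2*2 window and a step size of 2, each sub-feature map then holds exactly one pixel of every pooling window, and that pixel sits at the same index in all four sub-feature maps.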
  • Loading the pixel values respectively included in the sub-feature maps into the shift register array, and pooling in parallel the pixel values at the same position in the sub-feature maps according to the pooling instruction to obtain the pooling result corresponding to the target feature map, includes: moving each pixel value of the first sub-feature map to at least part of the shift registers included in the shift register array; moving each pixel value of the second sub-feature map to those shift registers, so that the computing kernel corresponding to each of those shift registers pools the two received pixel values according to the pooling instruction to obtain a first pooling processing result; moving each pixel value of the third sub-feature map to those shift registers, so that the computing kernel corresponding to each of those shift registers pools the first pooling processing result and the received pixel value according to the pooling instruction to obtain a second pooling processing result; moving each pixel value of the fourth sub-feature map to those shift registers, so that the computing kernel corresponding to each of those shift registers pools the second pooling processing result and the received pixel value according to the pooling instruction to obtain a third pooling processing result; and outputting the third pooling processing results obtained by the computing kernels corresponding to those shift registers to obtain the pooling result corresponding to the target feature map.
  • The pooling processing includes max pooling, and the pooling instruction includes taking the maximum of two values. Moving each pixel value of the first sub-feature map to at least part of the shift registers included in the shift register array includes: moving each pixel value of the first sub-feature map to the first register of each of those shift registers. Moving each pixel value of the second sub-feature map to those shift registers to obtain the first pooling processing result includes: moving each pixel value of the second sub-feature map to the second register of each of those shift registers, so that the computing kernel corresponding to each of those shift registers obtains, according to the pooling instruction, the maximum of the values stored in the first register and the second register, and stores that maximum in the first register as the first pooling processing result.
  • Moving each pixel value of the third sub-feature map to those shift registers to obtain the second pooling processing result includes: moving each pixel value of the third sub-feature map to the second register of each of those shift registers, so that the computing kernel corresponding to each of those shift registers obtains, according to the pooling instruction, the maximum of the values stored in the first register and the second register, and stores that maximum in the first register as the second pooling processing result.
  • Moving each pixel value of the fourth sub-feature map to those shift registers to obtain the third pooling processing result includes: moving each pixel value of the fourth sub-feature map to the second register of each of those shift registers, so that the computing kernel corresponding to each of those shift registers obtains, according to the pooling instruction, the maximum of the values stored in the first register and the second register, and stores that maximum in the first register as the third pooling processing result.
  • Outputting the third pooling processing results obtained by the computing kernels corresponding to those shift registers to obtain the pooling result corresponding to the target feature map includes: outputting the value stored in the first register of each of those shift registers to obtain the pooling result corresponding to the target feature map.
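  • The first-register/second-register sequence described above can be sketched in Python as follows. This is only a software model under stated assumptions: the class `TwoRegisterPE`, its method names and the sample sub-feature maps are hypothetical, and the loop over positions stands in for what the hardware would do in parallel across the array.

```python
from typing import List

Matrix = List[List[int]]


class TwoRegisterPE:
    """Hypothetical processing element with a first and a second register."""

    def __init__(self) -> None:
        self.first = 0
        self.second = 0

    def load_first(self, value: int) -> None:
        # Pixel of the first sub-feature map goes to the first register.
        self.first = value

    def load_second_and_max(self, value: int) -> None:
        # Later pixels go to the second register; the larger value is kept
        # in the first register as the running pooling result.
        self.second = value
        self.first = max(self.first, self.second)


def max_pool_from_sub_maps(subs: List[Matrix]) -> Matrix:
    rows, cols = len(subs[0]), len(subs[0][0])
    array = [[TwoRegisterPE() for _ in range(cols)] for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            array[r][c].load_first(subs[0][r][c])
    for sub in subs[1:]:  # second, third and fourth sub-feature maps
        for r in range(rows):
            for c in range(cols):
                array[r][c].load_second_and_max(sub[r][c])
    # The pooling result is read out of the first registers.
    return [[pe.first for pe in row] for row in array]


# Sub-feature maps of a hypothetical 4*4 map holding the values 1..16 row by row.
sub_maps = [
    [[1, 3], [9, 11]],
    [[2, 4], [10, 12]],
    [[5, 7], [13, 15]],
    [[6, 8], [14, 16]],
]
pooled = max_pool_from_sub_maps(sub_maps)
```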
  • The parallel processing of pixels belonging to different pooling windows in each sub-feature map to obtain the pooling result corresponding to the target feature map includes: moving each pixel value of the first sub-feature map to at least part of the shift registers included in the shift register array; according to the pooling instruction, performing a pooling operation on the pixel values in any four up, down, left and right adjacent shift registers of the array to obtain a first partial pooling processing result, and storing the first partial pooling processing result in the target shift register at a preset position among the four adjacent shift registers; moving each pixel value of the second sub-feature map to those shift registers; according to the pooling instruction, performing a pooling operation on the pixel values in any two vertically adjacent shift registers of the array to obtain a second partial pooling processing result, and storing the second partial pooling processing result in the target shift register; moving each pixel value of the third sub-feature map to those shift registers; according to the pooling instruction, performing a pooling operation on the pixel values in any two horizontally adjacent shift registers of the array to obtain a third partial pooling processing result, and storing the third partial pooling processing result in the target shift register; moving each pixel value of the fourth sub-feature map to those shift registers, and moving the pixel values in each of those shift registers to the target shift register; and obtaining the pooling result corresponding to the target feature map according to the first partial pooling processing result, the second partial pooling processing result and the third partial pooling processing result in each target shift register of the array, and the pixel values of the fourth sub-feature map.
  • The pooling processing includes max pooling; the pooling instruction includes taking the maximum of two values; the target shift register at the preset position includes the shift register at the lower-left corner among any four up, down, left and right adjacent shift registers. Moving each pixel value of the first sub-feature map to at least part of the shift registers included in the shift register array includes: moving each pixel value of the first sub-feature map to the first register of each of those shift registers. Obtaining the first partial pooling processing result and storing it in the target shift register at the preset position among the four adjacent shift registers includes: moving the value in the first register of each first shift register among those shift registers to the first shift register.
  • moving each pixel value included in the second sub-feature map to the partial shift register includes: moving each pixel value included in the second sub-feature map to the above-mentioned partial shift register.
  • the above-mentioned pooling operation is performed on the pixel values in any two upper and lower adjacent shift registers included in the above-mentioned shift register array according to the above-mentioned pooling instruction, and the second partial pooling is obtained.
  • the above-mentioned moving each pixel value included in the third sub-feature map to the partial shift register includes: moving each pixel value included in the third sub-feature map to the above-mentioned In the third register of the partial shift register; the above-mentioned pooling operation is performed on the pixel values in any two left and right adjacent shift registers included in the above-mentioned shift register array according to the above-mentioned pooling instruction, so as to obtain a third partial pooling operation.
  • moving each pixel value included in the fourth sub-feature map to the partial shift register includes: moving each pixel value included in the fourth sub-feature map to the aforementioned part In the fourth register of the shift register; the above-mentioned moving the pixel values in each of the shift registers in the above-mentioned partial shift registers to the above-mentioned target shift register includes: The values in the four registers are moved to the fourth register of the target shift register below the first shift register.
  • Obtaining the pooling result corresponding to the target feature map according to the first partial pooling processing result, the second partial pooling processing result and the third partial pooling processing result in each target shift register of the shift register array, and the pixel values of the fourth sub-feature map, includes: storing the larger of the values in the first register and the second register of each target shift register in the first register of that target shift register; storing the larger of the values in the first register and the third register of each target shift register in the first register of that target shift register; storing the larger of the values in the first register and the fourth register of each target shift register in the first register of that target shift register; and outputting the value stored in the first register of each target shift register to obtain the pooling result corresponding to the target feature map.
  • a plurality of temporary registers are connected to the periphery of the above-mentioned shift register array; the above-mentioned temporary registers are used to store the pixel values that overflow the above-mentioned shift register array when performing a numerical value transfer operation.
  • the number of pixels included in at least some of the sub-feature maps in the above-mentioned several sub-feature maps is consistent with the number of shift registers included in the above-mentioned shift register array.
  • The present application also proposes a pooling method, which may include: acquiring an original feature map; dividing the original feature map into several target feature maps; pooling each target feature map according to the pooling method shown in any of the foregoing embodiments to obtain the pooling result corresponding to each target feature map; and outputting the pooling results corresponding to the target feature maps to obtain the pooling result corresponding to the original feature map.
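  • The divide-then-pool flow above can be sketched in Python as follows. The sketch makes several assumptions that are not stated in the application: square tiles of even side length, map dimensions that are multiples of the tile size, and 2*2 max pooling with a step size of 2 applied to each tile. The names `max_pool_2x2` and `pool_in_tiles` are hypothetical.

```python
from typing import List

Matrix = List[List[int]]


def max_pool_2x2(tile: Matrix) -> Matrix:
    """Stand-in for the per-tile pooling method (2*2 window, step size 2)."""
    return [
        [max(tile[r][c], tile[r][c + 1], tile[r + 1][c], tile[r + 1][c + 1])
         for c in range(0, len(tile[0]), 2)]
        for r in range(0, len(tile), 2)
    ]


def pool_in_tiles(original: Matrix, tile_size: int) -> Matrix:
    """Divide the original map into target maps, pool each, stitch results."""
    height, width = len(original), len(original[0])
    pooled = [[0] * (width // 2) for _ in range(height // 2)]
    for top in range(0, height, tile_size):
        for left in range(0, width, tile_size):
            # One target feature map, sized to fit the (assumed) array.
            target = [row[left:left + tile_size]
                      for row in original[top:top + tile_size]]
            result = max_pool_2x2(target)
            # Write the tile result to its place in the overall output.
            for r, row in enumerate(result):
                for c, value in enumerate(row):
                    pooled[top // 2 + r][left // 2 + c] = value
    return pooled


original_map = [[r * 8 + c for c in range(8)] for r in range(8)]
tiled_result = pool_in_tiles(original_map, tile_size=4)
```

  • Because the tile size is a multiple of the step size, no pooling window straddles two tiles, so the stitched result matches pooling the whole map at once.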
  • The present application also proposes a chip, which may include a controller. The controller is used to: acquire a target feature map; split the target feature map to obtain several sub-feature maps, wherein at least some of the pixel values in the same pooling window of the target feature map are in different sub-feature maps, and the pixel values at the same position in each pooling window are in the same sub-feature map; and process, in parallel, the pixels belonging to different pooling windows in each sub-feature map to obtain the pooling result corresponding to the target feature map.
  • The controller is configured to: load the pixel values respectively included in the sub-feature maps into the shift register array; and, according to the pooling instruction, pool in parallel the pixel values at the same position in the sub-feature maps to obtain the pooling result corresponding to the target feature map.
  • The controller is configured to: determine the shift operation mode of the shift register array corresponding to each sub-feature map according to the position, within the same pooling window, of the pixel values of that sub-feature map; load the sub-feature maps into the shift register array respectively and, for each sub-feature map, perform a shift operation on the pixel values stored in the shift registers of the array according to the shift operation mode determined for that sub-feature map, and perform parallel pooling according to the pooling instruction to obtain the partial pooling results of that sub-feature map for the different pooling windows; and determine the pooling result corresponding to the target feature map according to the partial pooling results of each sub-feature map for the different pooling windows.
  • The controller is configured to: determine the pixel values at odd rows and odd columns of the target feature map as the first sub-feature map; determine the pixel values at odd rows and even columns as the second sub-feature map; determine the pixel values at even rows and odd columns as the third sub-feature map; and determine the pixel values at even rows and even columns as the fourth sub-feature map.
  • The controller is configured to: move each pixel value of the first sub-feature map to at least part of the shift registers included in the shift register array; move each pixel value of the second sub-feature map to those shift registers, so that the computing kernel corresponding to each of those shift registers pools the two received pixel values according to the pooling instruction to obtain a first pooling processing result; move each pixel value of the third sub-feature map to those shift registers, so that the computing kernel corresponding to each of those shift registers pools the first pooling processing result and the received pixel value according to the pooling instruction to obtain a second pooling processing result; move each pixel value of the fourth sub-feature map to those shift registers, so that the computing kernel corresponding to each of those shift registers pools the second pooling processing result and the received pixel value according to the pooling instruction to obtain a third pooling processing result; and output the third pooling processing results obtained by the computing kernels corresponding to those shift registers to obtain the pooling result corresponding to the target feature map.
  • The pooling processing includes max pooling, and the pooling instruction includes taking the maximum of two values. The controller is configured to: move each pixel value of the first sub-feature map to the first register of at least part of the shift registers included in the shift register array; and move each pixel value of the second sub-feature map to the second register of each of those shift registers, so that the computing kernel corresponding to each of those shift registers obtains, according to the pooling instruction, the maximum of the values stored in the first register and the second register, and stores that maximum in the first register as the first pooling processing result.
  • The controller is configured to: move each pixel value of the third sub-feature map to the second register of each of those shift registers, so that the computing kernel corresponding to each of those shift registers obtains, according to the pooling instruction, the maximum of the values stored in the first register and the second register, and stores that maximum in the first register as the second pooling processing result.
  • the controller is configured to: output the value stored in the first register of the partial shift register to obtain a pooling result corresponding to the target feature map.
  • The present application also proposes a chip, which may include a controller. The controller is used to: acquire an original feature map; divide the original feature map into several target feature maps; pool each target feature map to obtain the pooling result corresponding to each target feature map; and output the pooling results corresponding to the target feature maps to obtain the pooling result corresponding to the original feature map.
  • the present application also provides an electronic device, including the chip shown in any of the foregoing embodiments.
  • the present application also provides a computer-readable storage medium on which a computer program is stored, and when the program is executed by the controller, any one of the pooling methods described above is implemented.
  • FIG. 1 is a schematic structural diagram of a shift register array according to an embodiment of the application
  • FIG. 2 is a schematic structural diagram of a PE shown in an embodiment of the application.
  • FIG. 3 is a flowchart of a pooling method according to an embodiment of the present application.
  • FIG. 4 is a schematic diagram of a splitting process of a target feature map shown in an embodiment of the application.
  • FIG. 5 is a schematic diagram of a splitting process of a target feature map shown in an embodiment of the application.
  • FIG. 6 is a schematic diagram of a pooling window shown in an embodiment of the present application.
  • FIG. 7 is a schematic diagram of shifting pixel values of a first sub-feature map according to an embodiment of the present application.
  • FIG. 8 is a schematic diagram of transferring pixel values for a second sub-feature map according to an embodiment of the present application.
  • FIG. 9 is a schematic diagram of transferring pixel values for a third sub-feature map according to an embodiment of the present application.
  • FIG. 10 is a flowchart of a pooling method according to an embodiment of the present application.
  • FIG. 1 is a schematic structural diagram of a shift register array shown in the present application.
  • the shift register array may include a plurality of shift registers arranged vertically and horizontally, and each shift register may uniquely correspond to a computing core (Processing Element, hereinafter referred to as PE), and each PE is used to perform operations according to the values in the shift register.
  • As shown in FIG. 1, the shift register array can be considered to include a plurality of PEs arranged vertically and horizontally. Data can be moved between any two adjacent PEs (that is, between the shift registers corresponding to the PEs), and each row of PEs can obtain data from its corresponding RAM (Random Access Memory). Assume that the size of the shift register array is 8*8 and that the size of the feature map to be input is also 8*8.
  • The feature map may be split into 8 rows of pixel values by a controller (e.g., an array controller, also referred to as a processor), and the 8 rows of pixel values are respectively input into the RAMs corresponding to the rows of PEs.
  • Through data moving instructions, the controller can then move the pixel values in each RAM to the corresponding shift registers according to the position of each pixel value in the feature map, which completes the operation of inputting the feature map into the shift registers.
  • the above-mentioned PE can perform data operation on the data in the shift register in response to the instruction.
  • The PE corresponding to the shift register may include registers and an ALU (arithmetic logic unit).
  • the above-mentioned register may be a register obtained by dividing the storage space of the above-mentioned shift register.
  • the above-mentioned shift register may be configured as several registers (for example, register 1 and register 2 shown in FIG. 2 ) that can move data between each other according to actual requirements.
  • the above-mentioned PE can perform arithmetic processing on the data in the multiple registers according to the arithmetic instruction.
  • The ALU is used to perform arithmetic and logical operations. For example, when the PE receives an operation instruction such as adding or subtracting values or comparing their sizes, the ALU can perform the corresponding operation on the values stored in the registers.
  • The shift register array in the embodiments of the present application includes a plurality of shift registers, each shift register corresponds to its own PE, and each shift register (or each PE) includes a plurality of registers (for example, a first register, a second register, a third register and a fourth register; the number of registers per PE is not limited here).
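  • As a rough software model of the PE described above, the following Python sketch pairs a small register file with an ALU that dispatches on an instruction name. The instruction names, the class `ProcessingElement` and the choice of four registers are assumptions made for illustration; the application does not define an instruction encoding.

```python
import operator
from typing import Callable, Dict, List

# Hypothetical instruction set; the text names adding, subtracting and
# comparing sizes as examples of ALU operations.
ALU_OPS: Dict[str, Callable[[int, int], int]] = {
    "add": operator.add,
    "sub": operator.sub,
    "max": max,
}


class ProcessingElement:
    """Hypothetical PE: a few registers plus an ALU acting on them."""

    def __init__(self, num_registers: int = 4) -> None:
        self.registers: List[int] = [0] * num_registers

    def move(self, dst: int, value: int) -> None:
        # Data-moving instruction: place a value in a register.
        self.registers[dst] = value

    def execute(self, op: str, dst: int, src: int) -> None:
        # Operation instruction: combine two registers, store into dst.
        self.registers[dst] = ALU_OPS[op](self.registers[dst],
                                          self.registers[src])


pe = ProcessingElement()
pe.move(0, 3)
pe.move(1, 5)
pe.execute("max", 0, 1)  # register 0 now holds the larger of 3 and 5
```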
  • This application proposes a pooling method.
  • The method splits at least part of the pixel values in the same pooling window of the target feature map into different sub-feature maps, and processes in parallel the pixels belonging to different pooling windows in each sub-feature map to obtain the pooling result corresponding to the target feature map. This can improve the efficiency of pooling on the chip, reduce the computational burden on the chip, and reduce the difficulty of chip design.
  • FIG. 3 is a flowchart of a pooling method shown in this application. As shown in FIG. 3 , the above-mentioned pooling method may include steps S302 to S306.
  • the above target feature map is a feature map that needs to be pooled.
  • the above target feature map may be a feature map that needs to be pooled after convolution processing.
  • the target feature map may be a target feature map obtained by performing convolution processing on each PE in the shift register array. It is understandable that, the above-mentioned target feature map may be stored in the RAM corresponding to each row of PE.
  • the pooling process usually includes a pooling window of a preset size and a step size of a preset size set according to business requirements.
  • Taking a 2*2 pooling window and a step size of 2 as an example, pooling a feature map can be understood as follows. Starting from the first pixel value in the upper left corner of the feature map, that pixel value is taken as the upper-left element of a pooling window of size 2*2, and the pooling operation within that window is completed.
  • The window then slides two pixel values to the right of the first pixel value to form the next 2*2 pooling window, and the pooling operation is performed on the pixel values in the current pooling window. This is repeated until the whole feature map has been covered.
  • For max pooling, the maximum pixel values output by the pooling windows are combined to obtain the pooling result corresponding to the feature map.
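  • For reference, the sliding-window procedure just described can be written as a short Python function. This is a plain software reference under the assumption of max pooling on a list-of-rows feature map; the function name `max_pool` and its parameters are hypothetical and it does not model the shift register array.

```python
from typing import List

Matrix = List[List[int]]


def max_pool(feature_map: Matrix, window: int = 2, stride: int = 2) -> Matrix:
    """Slide a window*window pooling window over the map with the given step."""
    height, width = len(feature_map), len(feature_map[0])
    return [
        [
            # (r, c) is the upper-left element of the current pooling window.
            max(feature_map[r + dr][c + dc]
                for dr in range(window) for dc in range(window))
            for c in range(0, width - window + 1, stride)
        ]
        for r in range(0, height - window + 1, stride)
    ]


example = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
reference_result = max_pool(example)
```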
  • the pooling window may include several pixel values. Each pixel value can be in a different position of the pooling window. Take the pooling window as 2*2 as an example. The 4 pixel values in the pooling window can be located in the upper left corner, lower left corner, upper right corner and lower right corner of the pooling window, respectively.
  • A splitting method can be determined that satisfies the condition that at least some of the pixel values in the same pooling window are in different sub-feature maps and the pixel values at the same position in each pooling window are in the same sub-feature map.
  • The target feature map is then split according to the determined splitting method to obtain several sub-feature maps.
  • For example, when the pooling window is 2*2, the target feature map can be split so that the split result satisfies this condition, and 4 sub-feature maps are obtained.
  • the correspondence between pooling and splitting schemes may be maintained in advance.
  • parameters such as the pooling window and step size of the pooling process can be determined first. Then, according to the determined parameters, the above-mentioned corresponding relationship is queried to obtain a corresponding splitting scheme, and the above-mentioned target feature map is split according to the above-mentioned splitting scheme.
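  • One way to picture such a pre-maintained correspondence is a lookup table keyed by the pooling parameters, as in the Python sketch below. The table layout, the names `SPLIT_SCHEMES` and `split_for_pooling`, and the representation of a scheme as a list of window positions are all assumptions; only the 2*2 window with step size 2 is taken from the text.

```python
from typing import Dict, List, Tuple

Matrix = List[List[int]]
Offsets = List[Tuple[int, int]]

# Hypothetical correspondence: (window, stride) -> positions inside the
# pooling window, one sub-feature map per position.
SPLIT_SCHEMES: Dict[Tuple[int, int], Offsets] = {
    (2, 2): [(0, 0), (0, 1), (1, 0), (1, 1)],
}


def split_for_pooling(feature_map: Matrix, window: int, stride: int) -> List[Matrix]:
    """Query the scheme for the given parameters and split accordingly."""
    offsets = SPLIT_SCHEMES[(window, stride)]  # KeyError if nothing is registered
    return [
        [row[dc::stride] for row in feature_map[dr::stride]]
        for dr, dc in offsets
    ]


feature = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
sub_feature_maps = split_for_pooling(feature, window=2, stride=2)
```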
  • S306 can then be executed to process in parallel the pixels belonging to different pooling windows in each sub-feature map and obtain the pooling result corresponding to the target feature map.
  • The pixel values respectively included in the sub-feature maps can be loaded into the shift register array, and, according to the pooling instruction, the pixel values at the same position in the sub-feature maps can be pooled in parallel to obtain the pooling result corresponding to the target feature map.
  • The pooling instruction may include an instruction corresponding to the current pooling process, and this instruction can be generated in advance. When the pooling process is max pooling, the pooling instruction can be taking the maximum of two values. When the pooling process is average pooling, the pooling instruction may be summing or averaging.
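  • The difference between the two kinds of pooling instruction can be sketched in Python as follows. The mode strings and the function `pool_values` are hypothetical; the sketch assumes that average pooling is realised by summing pairwise and dividing once at the end, which is one possible reading of "summing or averaging".

```python
from typing import Sequence


def pool_values(values: Sequence[float], mode: str) -> float:
    """Combine the pixel values of one pooling window, two at a time."""
    accumulator = values[0]
    for value in values[1:]:
        if mode == "max":
            accumulator = max(accumulator, value)   # compare-and-keep-max
        elif mode == "avg":
            accumulator = accumulator + value       # running sum
        else:
            raise ValueError(f"unknown pooling mode: {mode}")
    # Average pooling divides the accumulated sum by the window size.
    return accumulator / len(values) if mode == "avg" else accumulator


window_values = [1, 2, 5, 6]
max_result = pool_values(window_values, "max")
avg_result = pool_values(window_values, "avg")
```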
  • the above pooling result may include a pooling result obtained after the target feature map is pooled.
  • the pixel values included in each sub-feature map may be sequentially moved to the shift registers included in the shift register array according to a preset moving method, so that the pixel values in the same position in each sub-feature map are moved into the same shift register.
  • in one preset moving method, each pixel value is moved to the above-mentioned shift register array according to the order of each pixel value in the target feature map.
  • the above-mentioned transfer method is not particularly limited in this application. It is understandable that the pixel values of each sub-feature map are transferred according to the same transfer method to ensure that the pixel values in the same position in each sub-feature map are transferred to the same shift register.
  • the PE corresponding to each shift register can perform parallel pooling processing on the received pixel values according to the pooling instruction to obtain the pooling result corresponding to the above target feature map. For example, take the pooling process as max pooling.
  • each time a shift register receives a new pixel value, it can compare the newly received pixel value with the stored pixel value through its corresponding PE, obtain the maximum value, and overwrite the shift register with the maximum value. Therefore, after all sub-feature maps split from the target feature map have been input, the shift registers hold the maximum pixel value of each pooling window. After that, the maximum pixel values stored in the shift register array are output to obtain the pooling result for the above target feature map.
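  • the running-maximum behaviour of the shift registers can be sketched as follows. This is a software illustration only: the function name `stream_max_pool` and the sample values are assumptions, and the two inner loops stand in for what the PEs would do in parallel in hardware.

```python
def stream_max_pool(sub_maps):
    """Emulate shift registers that each keep a running maximum.

    sub_maps: list of equally sized 2-D lists. Position (r, c) of every
    sub-feature map is routed to the same register (r, c).
    """
    rows, cols = len(sub_maps[0]), len(sub_maps[0][0])
    # One register per position, initialised from the first sub-feature map.
    registers = [row[:] for row in sub_maps[0]]
    for sub_map in sub_maps[1:]:
        for r in range(rows):
            for c in range(cols):
                # The PE compares the new value with the stored value and
                # overwrites the register with the larger of the two.
                registers[r][c] = max(registers[r][c], sub_map[r][c])
    return registers


# Four 2*2 sub-feature maps, i.e. a 4*4 target map with 2*2 windows.
sub_maps = [
    [[1, 5], [3, 0]],
    [[4, 2], [3, 9]],
    [[0, 7], [8, 1]],
    [[2, 6], [1, 4]],
]
result = stream_max_pool(sub_maps)
```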
  • in the split result of the target feature map, all pixel values included in the same pooling window are in different sub-feature maps.
  • the pixel values respectively included in the above-mentioned sub-feature maps can be loaded into the shift register.
  • the pixel values in the same position in the above-mentioned several sub-feature maps are pooled in parallel to obtain the pooling result corresponding to the above-mentioned target feature map, so that multiple PEs corresponding to the shift register array can be used to perform the pooling processing operations of each pooling window in parallel, thereby improving the pooling processing efficiency, reducing the computational burden of the chip, and reducing the difficulty of chip design.
  • the shift operation mode of the shift register array corresponding to each sub-feature map may be determined according to the positions of the pixel values in the same pooling window in each sub-feature map.
  • the above-mentioned several sub-feature maps can be loaded into the shift register array respectively. For each sub-feature map, a shift operation is performed on the pixel values stored in the shift registers of the shift register array according to the shift operation mode determined for that sub-feature map, and parallel pooling processing is then performed according to the pooling instruction to obtain partial pooling results corresponding to different pooling windows in the sub-feature map.
  • the pooling result corresponding to the above target feature map is then determined according to the partial pooling results corresponding to different pooling windows in each sub-feature map.
  • some pixel values in the same pooling window are in the same sub-feature map.
  • the pixel values in the same pooling window in each sub-feature map can be pooled first to obtain the partial pooling results corresponding to different pooling windows in each sub-feature map; then the partial pooling results corresponding to the same pooling window in each sub-feature map are pooled again to obtain the final pooling result.
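  • the two-stage idea (pool inside each sub-feature map, then pool the partial results of the same window) can be sketched in software as below. The split rule used here, sending pixel (r, c) to sub-feature map (r mod s, c mod s) for step size s, is an assumption that generalises the parity split described in this text; the function names are illustrative.

```python
def two_stage_max_pool(fmap, k, s):
    """Max-pool a 2-D list with a k*k window and step s in two stages."""
    h, w = len(fmap), len(fmap[0])
    out_h, out_w = (h - k) // s + 1, (w - k) // s + 1
    # Split: pixel (r, c) goes to sub-feature map (r % s, c % s).
    sub = {(a, b): [row[b::s] for row in fmap[a::s]]
           for a in range(s) for b in range(s)}
    result = [[None] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            partials = []
            for (a, b), m in sub.items():
                # Stage 1: pool the pixels of window (i, j) held by this
                # sub-feature map. They sit at rows i .. i+n_r-1 and
                # columns j .. j+n_c-1 of the sub-feature map.
                n_r = len(range(a, k, s))
                n_c = len(range(b, k, s))
                vals = [m[i + dr][j + dc]
                        for dr in range(n_r) for dc in range(n_c)]
                if vals:
                    partials.append(max(vals))
            # Stage 2: pool the partial results of the same window.
            result[i][j] = max(partials)
    return result


def direct_max_pool(fmap, k, s):
    """Reference max pooling computed window by window."""
    out_h = (len(fmap) - k) // s + 1
    out_w = (len(fmap[0]) - k) // s + 1
    return [[max(fmap[i * s + dr][j * s + dc]
                 for dr in range(k) for dc in range(k))
             for j in range(out_w)] for i in range(out_h)]
```

  • comparing `two_stage_max_pool` against `direct_max_pool` on the same input is a simple way to check that the split preserves the pooling result.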
  • Multiple PEs corresponding to the shift register array can be used to perform the pooling processing operation of each pooling window in parallel, thereby improving the pooling processing efficiency, reducing the computational burden of the chip, and reducing the difficulty of chip design.
  • the number of pixels included in at least some of the above-mentioned sub-feature maps is the same as the number of shift registers included in the above-mentioned shift register array.
  • that is, among the sub-feature maps obtained by splitting the target feature map, the number of pixels included in some or all of the sub-feature maps is the same as the number of the above-mentioned shift registers.
  • Scenario 1: The size of the target feature map is 16*16, the size of the pooling window is 2*2, the step size is 2, the pooling process is max pooling, and the pooling instruction includes taking the larger of two values.
  • the size of the shift register array included in the AI chip for pooling is 8*8.
  • the above target feature map can be split to obtain four sub-feature maps.
  • pixel values in odd rows and odd columns in the target feature map may be determined as the first sub-feature map; pixel values in odd rows and even columns in the target feature map may be determined as the second sub-feature map; the pixel values in even rows and odd columns in the target feature map may be determined as the third sub-feature map; and the pixel values in even rows and even columns in the target feature map may be determined as the fourth sub-feature map.
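  • the parity split can be sketched with list slicing. The function name `split_by_parity` is an assumption; rows and columns are counted from 1 as in the text, so "odd" rows and columns correspond to Python indices 0, 2, 4, and so on.

```python
def split_by_parity(fmap):
    """Split a 2-D list into four sub-feature maps by row/column parity."""
    first = [row[0::2] for row in fmap[0::2]]   # odd rows, odd columns
    second = [row[1::2] for row in fmap[0::2]]  # odd rows, even columns
    third = [row[0::2] for row in fmap[1::2]]   # even rows, odd columns
    fourth = [row[1::2] for row in fmap[1::2]]  # even rows, even columns
    return first, second, third, fourth


# 16*16 target feature map whose pixel value encodes its own position.
target = [[r * 16 + c for c in range(16)] for r in range(16)]
subs = split_by_parity(target)
shapes = [(len(m), len(m[0])) for m in subs]  # four 8*8 sub-feature maps
```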
  • FIG. 4 is a schematic diagram of a splitting process of a target feature map shown in this application.
  • the target feature map is 16*16.
  • black squares indicate pixels in odd rows and odd columns in the target feature map
  • dark gray squares indicate pixels in odd rows and even columns in the target feature map
  • white squares indicate pixels in even rows and odd columns in the target feature map
  • the light gray squares refer to the pixels in the even-numbered rows and even-numbered columns in the target feature map.
  • the target feature map is split according to the aforementioned splitting method to obtain the first to fourth sub-feature maps.
  • the size of each sub-feature map is 8*8, which is consistent with the size of the above-mentioned shift register array.
  • each pixel value included in the above-mentioned first sub-feature map can be moved to at least part of the shift registers corresponding to the above-mentioned shift register array according to the preset moving method.
  • each pixel value included in the first sub-feature map may be moved to the first registers of at least part of the shift registers included in the shift register array according to a preset moving method.
  • if the pixel values need to be moved as a whole among the shift registers in the shift register array, then, according to the moving direction and the moving step size, idle shift registers can be reserved at preset positions in the shift register array to store the moved pixel values. Assuming that the pixel values need to be moved to the right by one step as a whole, at least the idle shift registers immediately to the right of the rightmost column of stored pixel values need to be reserved. Other moving methods are handled in the same way and will not be repeated here. In this way, all the pixel values included in the first sub-feature map can be transferred to the shift registers included in the shift register array.
  • each pixel value included in the second sub-feature map can be respectively moved to the partial shift registers according to the above-mentioned preset moving method, so that the computing kernel corresponding to each shift register can pool the two received pixel values to obtain a first pooling processing result.
  • each pixel value included in the second sub-feature map can be respectively moved to the second registers of the partial shift registers according to the above-mentioned preset moving method, so that each computing kernel can, according to the above-mentioned pooling instruction, obtain the maximum value among the values stored in the first register and the second register, and store the maximum value in the first register as the first pooling processing result. Therefore, the pixel values of the same pooling window that lie in the first sub-feature map and the second sub-feature map can be compared, and the maximum value obtained is stored in the shift register.
  • each pixel value included in the above-mentioned third sub-feature map can be respectively moved to the above-mentioned partial shift registers according to the above-mentioned preset moving method, so that each computing kernel can pool the above-mentioned first pooling processing result with the received pixel value according to the above-mentioned pooling instruction to obtain a second pooling processing result.
  • each pixel value included in the above-mentioned third sub-feature map is respectively moved to the second registers of the above-mentioned partial shift registers, so that each computing kernel can, according to the above-mentioned pooling instruction, obtain the maximum value among the values stored in the first register and the second register, and store the maximum value in the first register as the second pooling processing result. Therefore, the pixel values of the same pooling window that lie in the first, second and third sub-feature maps can be compared, and the maximum value obtained is stored in the shift register.
  • each pixel value included in the above-mentioned fourth sub-feature map can be respectively moved to the above-mentioned partial shift registers according to the above-mentioned preset moving method, so that each computing kernel can pool the above-mentioned second pooling processing result with the received pixel value according to the above-mentioned pooling instruction to obtain a third pooling processing result.
  • each pixel value included in the above-mentioned fourth sub-feature map is respectively moved to the second registers of the above-mentioned partial shift registers according to the above-mentioned preset moving method, so that each computing kernel can obtain the maximum value among the values stored in the first register and the second register, and store the maximum value in the first register as the third pooling processing result.
  • the pixel values of the same pooling window that lie in the first, second, third and fourth sub-feature maps can thus be compared, and the maximum value obtained is stored in the shift register.
  • the third pooling processing result obtained by performing pooling processing in each computing core can be outputted to obtain the pooling result corresponding to the above-mentioned target feature map.
  • the value stored in the first register of the partial shift register can be output to obtain the pooling result corresponding to the target feature map. In this way, the maximum pooling process for the above target feature map can be completed, and the corresponding pooling result can be obtained.
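  • the Scenario 1 flow can be sketched end to end as follows. This is a software emulation under stated assumptions: `first_reg` and `second_reg` model the first and second registers of the 8*8 shift register array, the function names are illustrative, and the element-wise `max` stands in for the computing kernels working in parallel.

```python
def split_by_parity(fmap):
    """Four sub-feature maps by row/column parity (rows/columns counted from 1)."""
    return ([row[0::2] for row in fmap[0::2]],   # odd rows, odd columns
            [row[1::2] for row in fmap[0::2]],   # odd rows, even columns
            [row[0::2] for row in fmap[1::2]],   # even rows, odd columns
            [row[1::2] for row in fmap[1::2]])   # even rows, even columns


def scenario1_max_pool(target):
    """2*2 window, step 2 max pooling of a 16*16 map on an 8*8 register array."""
    first, second, third, fourth = split_by_parity(target)
    # The first sub-feature map is loaded into the first registers.
    first_reg = [row[:] for row in first]
    for sub_map in (second, third, fourth):
        # Each later sub-feature map is loaded into the second registers;
        # every kernel keeps the larger value in its first register.
        second_reg = sub_map
        first_reg = [[max(a, b) for a, b in zip(r1, r2)]
                     for r1, r2 in zip(first_reg, second_reg)]
    # The first registers now hold the third pooling processing result.
    return first_reg


target = [[(r * 31 + c * 17) % 97 for c in range(16)] for r in range(16)]
pooled = scenario1_max_pool(target)
# Reference: maximum of each 2*2 window computed directly.
reference = [[max(target[2 * i][2 * j], target[2 * i][2 * j + 1],
                  target[2 * i + 1][2 * j], target[2 * i + 1][2 * j + 1])
              for j in range(8)] for i in range(8)]
```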
  • the specific process may refer to the above-mentioned embodiment, only the pooling instructions are different, which will not be described in detail here.
  • Scenario 2: The size of the target feature map is 17*17, the size of the pooling window is 3*3, the step size is 2, the pooling process is max pooling, and the pooling instruction is the above-mentioned pooling instruction, which includes taking the larger of two values.
  • the size of the shift register array included in the AI chip for pooling is 9*9.
  • the above target feature map can be split to obtain four sub-feature maps.
  • FIG. 5 is a schematic diagram of a splitting process of a target feature map shown in the present application.
  • the target feature map is 17*17.
  • black squares indicate pixels in odd rows and odd columns in the target feature map
  • dark gray squares indicate pixels in odd rows and even columns in the target feature map
  • white squares indicate pixels in even rows and odd columns in the target feature map
  • the light gray squares refer to the pixels in the even-numbered rows and even-numbered columns in the target feature map.
  • the target feature map is split according to the aforementioned splitting method to obtain the first to fourth sub-feature maps.
  • the first sub-feature map is 9*9
  • the second sub-feature map is 9*8
  • the third sub-feature map is 8*9
  • the fourth sub-feature map is 8*8.
  • the size of the above-mentioned first sub-feature map is consistent with the size of the above-mentioned shift register array.
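  • the four sub-feature map sizes follow directly from counting odd and even rows and columns. A short sketch (the function name `parity_sub_map_shapes` is an assumption) reproduces the sizes listed above:

```python
def parity_sub_map_shapes(height, width):
    """Shapes (rows, columns) of the first to fourth sub-feature maps."""
    odd_rows, even_rows = (height + 1) // 2, height // 2
    odd_cols, even_cols = (width + 1) // 2, width // 2
    return [
        (odd_rows, odd_cols),    # first: odd rows, odd columns
        (odd_rows, even_cols),   # second: odd rows, even columns
        (even_rows, odd_cols),   # third: even rows, odd columns
        (even_rows, even_cols),  # fourth: even rows, even columns
    ]


shapes_17 = parity_sub_map_shapes(17, 17)
shapes_16 = parity_sub_map_shapes(16, 16)
```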
  • FIG. 6 is a schematic diagram of a pooling window shown in this application.
  • FIG. 6 shows a pooling window used when the above target feature map is pooled with a pooling window size of 3*3 and a step size of 2.
  • the dashed box represents a pooling window in the target feature map.
  • the pooling window can include 4 black blocks, 2 dark gray blocks, 2 white blocks and 1 light gray block.
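  • the composition of a window can be checked by classifying each of its nine pixels by row and column parity. In the sketch below (function name and labels are assumptions), "first" to "fourth" name the sub-feature maps drawn as black, dark gray, white and light gray squares respectively; with a step size of 2 every window starts at an odd row and odd column.

```python
from collections import Counter


def window_composition(top, left, k=3):
    """Count pixels per sub-feature map for the k*k window at 1-based (top, left)."""
    counts = Counter()
    for r in range(top, top + k):
        for c in range(left, left + k):
            row_odd, col_odd = r % 2 == 1, c % 2 == 1
            if row_odd and col_odd:
                counts["first"] += 1    # black
            elif row_odd:
                counts["second"] += 1   # dark gray
            elif col_odd:
                counts["third"] += 1    # white
            else:
                counts["fourth"] += 1   # light gray
    return dict(counts)


composition = window_composition(1, 1)
```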
  • the maximum value among the four pixel values in the upper left corner, the upper right corner, the lower left corner, and the lower right corner of the pooling window is determined, that is, the first maximum value among the four pixel values of the first sub-feature map that are adjacent up, down, left, and right and correspond to the pooling window.
  • S63-S64 can be executed to determine the maximum value of the two pixel values at the first-row, second-column position and the third-row, second-column position of the pooling window, that is, the second maximum value among the two vertically adjacent pixel values of the second sub-feature map that correspond to the pooling window.
  • S61 can be executed to move each pixel value included in the first sub-feature map to at least part of the shift registers included in the shift register array according to a preset moving method.
  • each pixel value included in the first sub-feature map may be moved to the first registers of at least part of the shift registers included in the shift register array according to a preset moving method.
  • S62 can be executed, and the computing kernel corresponding to the above-mentioned at least part of the shift registers can perform a pooling operation on the pixel values in any four adjacent shift registers included in the above-mentioned shift register array according to the above-mentioned pooling instruction , obtain the first partial pooling processing result, and store the first partial pooling processing result in the target shift register at the preset position among the four adjacent shift registers.
  • the above-mentioned preset position includes the position of the lower left corner of any four adjacent shift registers. It is understandable that, for the solution in which the above-mentioned preset positions are other positions, reference may be made to this embodiment.
  • the above-mentioned target shift register includes a shift register located at the lower left corner among the above-mentioned four adjacent shift registers.
  • FIG. 7 is a schematic diagram of transferring pixel values of the first sub-feature map according to the present application. It should be noted that any four adjacent shift registers in the shift register array can be regarded as a group of shift registers shown in FIG. 7. Taking the second shift register in the group of shift registers shown in FIG. 7 as an example, in another group of shift registers it may be the first shift register, the third shift register, or the target shift register.
  • FIG. 7 only schematically illustrates the movement flow of pixel values in one group of shift registers; the movement flow of pixel values in other groups of shift registers can be understood with reference to FIG. 7 and will not be described in detail in this application.
  • the shift register at the upper left corner can be regarded as the first shift register, the shift register to the right of the first shift register can be regarded as the second shift register, the shift register below the second shift register can be regarded as the third shift register, and the shift register to the left of the third shift register can be regarded as the above-mentioned target shift register.
  • each computing core can store the larger value in the first register and the second register in each second shift register into the first register in each second shift register.
  • the value in the first register of the second shift registers can be moved to the second register of the third shift register below the second shift register.
  • each computing core can store the larger value in the first register and the second register in each third shift register into the first register in each third shift register.
  • the value in the first register of the third shift registers can be moved to the second register of the target shift register on the left side of the third shift register.
  • each computing core can store the larger value in the first register and the second register in each target shift register as the above-mentioned first partial pooling processing result in the first register of each target shift register.
  • the maximum value of the four pixel values in the upper left corner, upper right corner, lower left corner and lower right corner of the same pooling window, that is, the first maximum value among the four adjacent pixel values in the first sub-feature map, can be determined and stored in the aforementioned target shift register.
  • S63 can be executed to move each pixel value included in the second sub-feature map to the partial shift register according to the preset moving method.
  • each pixel value included in the second sub-feature map can be respectively moved to the second register of the partial shift register according to the preset moving method.
  • each computing core can execute S64, and according to the above-mentioned pooling instruction, perform a pooling operation on the pixel values in any two upper and lower adjacent shift registers included in the above-mentioned shift register array, to obtain a second partial pooling processing result , and store the above-mentioned second partial pooling processing result in the above-mentioned target shift register.
  • FIG. 8 is a schematic diagram of transferring pixel values of the second sub-feature map according to the present application. It should be noted that any two adjacent shift registers in the shift register array can be regarded as a group of shift registers shown in FIG. 8 .
  • FIG. 8 only schematically illustrates the movement flow of pixel values in one group of shift registers, and the movement flow of pixel values in other groups of shift registers can be illustrated with reference to FIG. 8 , and will not be described in detail in this application.
  • the shift register at the upper position in a group of shift registers can be regarded as the first shift register, and the shift register below the first shift register can be regarded as the above target shift register.
  • each computing core may store the larger value in the second register and the third register in each target shift register as the result of the above-mentioned second partial pooling processing in the second register of each target shift register.
  • the maximum value of the two pixel values at the first-row, second-column position and the third-row, second-column position in the same pooling window, that is, the second maximum value among the two vertically adjacent pixel values in the second sub-feature map, can be determined and stored in the above-mentioned target shift register.
  • each pixel value included in the third sub-feature map is respectively moved to the partial shift register according to the preset moving method.
  • each pixel value included in the third sub-feature map is respectively moved to the third register of the partial shift register according to the above-mentioned preset moving method.
  • each computing core can execute S66, and according to the above-mentioned pooling instruction, perform a pooling operation on the pixel values in any two left and right adjacent shift registers included in the above-mentioned shift register array, and obtain a third partial pooling processing result , and store the above-mentioned third partial pooling processing result in the above-mentioned target shift register.
  • FIG. 9 is a schematic diagram of transferring pixel values of the third sub-feature map according to the present application. It should be noted that any two left and right adjacent shift registers in the shift register array can be regarded as a group of shift registers shown in FIG. 9. FIG. 9 only schematically illustrates the movement flow of pixel values in one group of shift registers; the movement flow of pixel values in other groups of shift registers can be understood with reference to FIG. 9 and will not be described in detail in this application.
  • as shown in FIG. 9 (the registers are not shown in FIG. 9), the shift register in the upper left corner of a group of shift registers can be regarded as the first shift register, the shift register to the right of the first shift register can be regarded as the second shift register, the shift register below the second shift register can be regarded as the third shift register, and the shift register to the left of the third shift register can be regarded as the above-mentioned target shift register.
  • each computing core may store the larger value in the third register and the fourth register in each second shift register as the above-mentioned third partial pooling processing result in the third register of each second shift register.
  • in this way, the third maximum value is stored in the above-mentioned second shift register.
  • the above-mentioned third maximum value (the third partial pooling processing result) can be moved to the above-mentioned target shift register. That is, S93 is executed, and the third partial pooling processing result in the third register of the second shift register is moved to the third register of the first shift register on the left side of the second shift register.
  • S94: move the third partial pooling processing result in the third register of the first shift register to the third register of the target shift register below the first shift register. Therefore, the maximum value of the two pixel values at the second-row, first-column position and the second-row, third-column position in the same pooling window, that is, the third maximum value among the two horizontally adjacent pixel values in the third sub-feature map, is stored in the aforementioned target shift register. In some examples, when data is moved from the second shift register to the target shift register, the data can also be moved to the third shift register first, and then moved to the above-mentioned target shift register.
  • each pixel value included in the fourth sub-feature map is respectively moved to the partial shift register according to the above-mentioned preset moving method.
  • each pixel value included in the fourth sub-feature map can be respectively moved to the fourth register of the partial shift register according to the above-mentioned preset moving method.
  • S68 can be executed to move the pixel values in each of the shift registers in the partial shift registers to the target shift register.
  • the values in the fourth register of each of the first shift registers in the above-mentioned partial shift registers may be moved to the fourth register of the target shift register below the first shift register.
  • the above target shift register includes the first partial pooling processing result (the first maximum value), the second partial pooling processing result (the second maximum value), the third partial pooling processing result (the third maximum value), and the pixel value in the middle of the pooling window (the pixel value corresponding to the fourth sub-feature map).
  • the first partial pooling processing result, the second partial pooling processing result, the third partial pooling processing result and the pixel value corresponding to the fourth sub-feature map in each target shift register included in the shift register array can be compared, and the maximum value among them is output to obtain the pooling result corresponding to the above target feature map.
  • each computing core may store the larger value of the first register and the second register of each target shift register into the first register of each target shift register.
  • the larger value of the first register and the third register of each target shift register may be stored in the first register of each target shift register.
  • the larger value of the first register and the fourth register of each target shift register may be stored in the first register of each target shift register.
  • the value stored in the first register of each target shift register is output, and the pooling result corresponding to the above target feature map is obtained.
  • the maximum value obtained by performing max pooling on each pooling window can be stored in the above target shift register, and by outputting the maximum value in each target shift register, the pooling result corresponding to the target feature map can be obtained.
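  • the arithmetic of Scenario 2 can be sketched as below. The sketch reproduces only which values are compared (the four partial results held in each target shift register), not the register-to-register moves shown in the figures; the function name `scenario2_max_pool` and the test data are assumptions.

```python
def scenario2_max_pool(target):
    """3*3 window, step 2 max pooling via four partial results per window."""
    a = [row[0::2] for row in target[0::2]]  # first sub-feature map (9*9)
    b = [row[1::2] for row in target[0::2]]  # second sub-feature map (9*8)
    c = [row[0::2] for row in target[1::2]]  # third sub-feature map (8*9)
    d = [row[1::2] for row in target[1::2]]  # fourth sub-feature map (8*8)
    pooled = []
    for i in range(len(d)):
        row = []
        for j in range(len(d[0])):
            # First partial result: the four corner pixels of the window.
            first_max = max(a[i][j], a[i][j + 1], a[i + 1][j], a[i + 1][j + 1])
            # Second partial result: the two vertically adjacent pixels.
            second_max = max(b[i][j], b[i + 1][j])
            # Third partial result: the two horizontally adjacent pixels.
            third_max = max(c[i][j], c[i][j + 1])
            # The centre pixel comes from the fourth sub-feature map.
            centre = d[i][j]
            row.append(max(first_max, second_max, third_max, centre))
        pooled.append(row)
    return pooled


target = [[(r * 29 + c * 43) % 101 for c in range(17)] for r in range(17)]
pooled = scenario2_max_pool(target)
# Reference: maximum of each 3*3 window with step 2, computed directly.
reference = [[max(target[2 * i + dr][2 * j + dc]
                  for dr in range(3) for dc in range(3))
              for j in range(8)] for i in range(8)]
```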
  • a plurality of temporary registers are connected to the periphery of the above-mentioned shift register array. The temporary registers are used for storing pixel values that overflow the shift register array when a value transfer operation is performed. Therefore, in the process of data transfer, the pixel values that overflow the shift register array do not need to be stored in RAM; they only need to be stored in the temporary registers, thereby improving the data transfer efficiency and thus the pooling efficiency.
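  • the role of the temporary registers can be pictured with a small sketch. It assumes a one-step shift to the right and assumes that the overflowed column may later be shifted back; both function names and the restore step are illustrative and are not described in the application.

```python
def shift_right_with_temp(array):
    """Shift every row one step right; the overflowing column goes to temp registers."""
    temp = [row[-1] for row in array]               # values leaving the array
    shifted = [[None] + row[:-1] for row in array]  # leftmost column becomes idle
    return shifted, temp


def shift_left_restore(shifted, temp):
    """Shift back one step left, refilling the last column from the temp registers."""
    return [row[1:] + [t] for row, t in zip(shifted, temp)]


array = [[1, 2, 3], [4, 5, 6]]
shifted, temp = shift_right_with_temp(array)
restored = shift_left_restore(shifted, temp)
```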
  • FIG. 10 is a method flowchart of a pooling method shown in this application. As shown in Figure 10, the above method may include:
  • S1008 output the pooling result corresponding to each target feature map, and obtain the pooling result corresponding to the above-mentioned original feature map.
  • the original feature map can be divided into several target feature maps first, and then each target feature map can be pooled according to the pooling method shown in any of the above embodiments to obtain the pooling result corresponding to each target feature map. Finally, the pooling result corresponding to each target feature map is output, and the pooling result corresponding to the above original feature map is obtained.
  • efficient pooling of the above-mentioned original feature maps larger than the shift register array can be achieved.
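  • dividing an original feature map into target feature maps can be sketched as follows for the non-overlapping case (2*2 window, step size 2). The sketch assumes that the height and width of the original map are multiples of an even tile size; with overlapping windows such as 3*3 with step size 2, neighbouring tiles would presumably have to share a border row and column, which is not handled here. Function names are illustrative.

```python
def max_pool_2x2(fmap):
    """2*2 window, step 2 max pooling of a 2-D list with even dimensions."""
    return [[max(fmap[2 * i][2 * j], fmap[2 * i][2 * j + 1],
                 fmap[2 * i + 1][2 * j], fmap[2 * i + 1][2 * j + 1])
             for j in range(len(fmap[0]) // 2)]
            for i in range(len(fmap) // 2)]


def pool_large_map(original, tile):
    """Pool an original feature map tile by tile and stitch the results."""
    h, w = len(original), len(original[0])
    out = [[None] * (w // 2) for _ in range(h // 2)]
    for top in range(0, h, tile):
        for left in range(0, w, tile):
            # One target feature map sized to fit the register array.
            target = [row[left:left + tile] for row in original[top:top + tile]]
            pooled = max_pool_2x2(target)
            # Write the tile's pooling result into its place in the output.
            for i, prow in enumerate(pooled):
                out[top // 2 + i][left // 2:left // 2 + len(prow)] = prow
    return out


original = [[(r * 37 + c * 11) % 101 for c in range(32)] for r in range(32)]
tiled = pool_large_map(original, 16)  # four 16*16 target feature maps
```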
  • the present application also proposes a chip.
  • the above-mentioned chip may include a controller; the above-mentioned controller is used to: obtain a target feature map; split the above-mentioned target feature map to obtain several sub-feature maps, wherein at least some of the pixel values in the same pooling window in the above-mentioned target feature map are in different sub-feature maps, and the pixel values in the same position in each pooling window are in the same sub-feature map; and process the pixels belonging to different pooling windows in each sub-feature map in parallel to obtain the pooling result corresponding to the above target feature map.
  • the above-mentioned controller is configured to: load the pixel values respectively included in the above-mentioned several sub-feature maps into the shift register array, and, according to the pooling instruction, pool the pixel values in the same position in the above-mentioned several sub-feature maps in parallel to obtain the pooling result corresponding to the above target feature map.
  • the above-mentioned controller is configured to: determine the shift operation mode of the shift register array corresponding to each sub-feature map according to the positions of the pixel values in the same pooling window in each sub-feature map; load the above-mentioned several sub-feature maps into the shift register array respectively, and for each sub-feature map, perform a shift operation on the pixel values stored in the shift registers of the shift register array according to the shift operation mode determined for the sub-feature map, and perform parallel pooling processing according to the pooling instruction to obtain the partial pooling results corresponding to different pooling windows in the sub-feature map; and determine the pooling result corresponding to the above target feature map according to the partial pooling results corresponding to different pooling windows in each sub-feature map.
  • the controller is configured to: determine the pixel values in odd rows and odd columns in the target feature map as the first sub-feature map; determine the pixel values in odd rows and even columns in the target feature map as the second sub-feature map; determine the pixel values in even rows and odd columns in the target feature map as the third sub-feature map; and determine the pixel values in even rows and even columns in the target feature map as the fourth sub-feature map.
  • the controller is configured to: move each pixel value included in the first sub-feature map to at least part of the shift registers included in the shift register array; move each pixel value included in the second sub-feature map to the above-mentioned partial shift registers, so that the computing kernel corresponding to each shift register performs pooling processing on the two received pixel values according to the above-mentioned pooling instruction to obtain a first pooling processing result; move each pixel value included in the above-mentioned third sub-feature map to the above-mentioned partial shift registers, so that each computing kernel pools the above-mentioned first pooling processing result and the received pixel value according to the above-mentioned pooling instruction.
  • the above-mentioned pooling processing includes max pooling; the above-mentioned pooling instruction includes taking the larger of two values; the above-mentioned controller is configured to: move each pixel value included in the above-mentioned first sub-feature map to the first registers of at least part of the shift registers included in the above-mentioned shift register array; and move each pixel value included in the above-mentioned second sub-feature map to the second registers of the above-mentioned partial shift registers, so that the computing kernel corresponding to each shift register in the partial shift registers obtains, according to the pooling instruction, the maximum value of the values stored in the first register and the second register, and stores the maximum value in the above-mentioned first register as the first pooling processing result.
  • the controller is configured to: move each pixel value included in the third sub-feature map to the second registers of the partial shift registers, so that the computing kernel corresponding to each shift register in the partial shift registers obtains, according to the above-mentioned pooling instruction, the maximum value of the values stored in the first register and the second register, and stores the above-mentioned maximum value in the above-mentioned first register as the above-mentioned second pooling processing result.
  • the above-mentioned controller is configured to: output the value stored in the first register of the above-mentioned partial shift register, and obtain the pooling result corresponding to the above-mentioned target feature map.
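The two-register max pooling flow described above can be sketched in software. This is a minimal illustration, not the chip's implementation: the `ShiftRegister` class, the function name, and the 1x3 window used in the example are assumptions made for the sketch, and the per-cell loop runs sequentially where the hardware would update all cells in parallel.

```python
from dataclasses import dataclass


@dataclass
class ShiftRegister:
    """One cell of a hypothetical shift register array with two registers."""
    r1: float = float("-inf")  # first register: holds the running maximum
    r2: float = float("-inf")  # second register: holds the newly loaded pixel


def max_pool_by_sub_maps(sub_maps):
    """Max-pool by streaming sub-feature maps through two registers per cell.

    sub_maps[k][n] is the pixel at position k of pooling window n.
    """
    cells = [ShiftRegister() for _ in sub_maps[0]]
    # Load the first sub-feature map into the first registers.
    for cell, value in zip(cells, sub_maps[0]):
        cell.r1 = value
    # Each later sub-feature map goes to the second registers; every cell
    # then keeps the larger of its two registers in the first register.
    for sub_map in sub_maps[1:]:
        for cell, value in zip(cells, sub_map):
            cell.r2 = value
            cell.r1 = max(cell.r1, cell.r2)
    # The first registers now hold one pooled value per window.
    return [cell.r1 for cell in cells]


row = [3, 9, 1, 4, 2, 8]                  # two 1x3 pooling windows (assumed)
sub_maps = [row[k::3] for k in range(3)]  # same window position -> same sub-map
pooled = max_pool_by_sub_maps(sub_maps)
```

Because each sub-feature map holds exactly one pixel from every window, each load step advances all windows at once, which is where the parallelism comes from.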
  • the controller is configured to: move each pixel value included in the first sub-feature map to at least part of the shift registers included in the shift register array; perform a pooling operation on the pixel values in any four shift registers of the array that are adjacent above, below, left and right, to obtain a first partial pooling result, and store the first partial pooling result in the target shift register located at the preset position among those four adjacent shift registers; move each pixel value included in the second sub-feature map to the same shift registers and, according to the pooling instruction, obtain a second partial pooling result; and obtain the pooling result corresponding to the target feature map based on the partial pooling results held in each target shift register of the shift register array, including the second and third partial pooling results, and the fourth sub-feature map.
  • the pooling processing includes max pooling; the pooling instruction includes taking the maximum of two values; the preset position is the lower-left corner of any four shift registers adjacent above, below, left and right, so that the target shift register is the shift register at the lower-left corner of those four adjacent shift registers; the controller is configured to: move each pixel value included in the first sub-feature map to the first register of at least part of the shift registers included in the shift register array; move the value in the first register of each first shift register among those shift registers to the second register of the second shift register to the right of the first shift register; store the larger of the values in the first and second registers of each second shift register in the first register of that second shift register; move the value in the first register of each second shift register to the second register of the third shift register below the second shift register; store the larger of the values in the first and second registers of each third shift register in the first register of that third shift register; move the value in the first register of each third shift register to the second register of the target shift register to the left of the third shift register; and store the larger of the values in the first and second registers of each target shift register in the first register of that target shift register as the first partial pooling result.
  • the controller is configured to: move each pixel value included in the second sub-feature map to the second register of those shift registers; move the value in the second register of each first shift register to the third register of the target shift register below the first shift register; and store the larger of the values in the second and third registers of each target shift register in the second register of that target shift register as the second partial pooling result.
  • the controller is configured to: move each pixel value included in the third sub-feature map to the third register of those shift registers; move the value in the third register of each first shift register to the fourth register of the second shift register to the right of the first shift register; store the larger of the values in the third and fourth registers of each second shift register in the third register of that second shift register as the third partial pooling result; move the third partial pooling result from the third register of the second shift register to the third register of the first shift register to its left; and move the third partial pooling result from the third register of the first shift register to the third register of the target shift register below the first shift register.
  • the controller is configured to: move each pixel value included in the fourth sub-feature map to the fourth register of those shift registers; and move the value in the fourth register of each first shift register to the fourth register of the target shift register below the first shift register.
  • the controller is configured to: store the larger of the values in the first and second registers of each target shift register in the first register of that target shift register; store the larger of the values in the first and third registers of each target shift register in the first register of that target shift register; store the larger of the values in the first and fourth registers of each target shift register in the first register of that target shift register; and output the value stored in the first register of each target shift register to obtain the pooling result corresponding to the target feature map.
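The four-sub-feature-map procedure above combines four values from the first sub-feature map, two vertically adjacent values from the second, two horizontally adjacent values from the third, and one value from the fourth. One reading, stated here as an assumption rather than something the text confirms, is that this corresponds to 3x3 max pooling with stride 2 when the feature map is split by row and column parity. The sketch below checks that reading against a direct computation; all function names are hypothetical, and results are indexed by window rather than by the position of the target shift register.

```python
def split_by_parity(fmap):
    """Split a feature map into four sub-feature maps by row/column parity."""
    s1 = [row[0::2] for row in fmap[0::2]]  # even rows, even columns
    s2 = [row[1::2] for row in fmap[0::2]]  # even rows, odd columns
    s3 = [row[0::2] for row in fmap[1::2]]  # odd rows, even columns
    s4 = [row[1::2] for row in fmap[1::2]]  # odd rows, odd columns
    return s1, s2, s3, s4


def pool_3x3_stride2_from_sub_maps(fmap):
    """Combine four partial results per window, mirroring the register steps."""
    s1, s2, s3, s4 = split_by_parity(fmap)
    out_h = (len(fmap) - 3) // 2 + 1
    out_w = (len(fmap[0]) - 3) // 2 + 1
    out = []
    for i in range(out_h):
        out_row = []
        for j in range(out_w):
            # First partial result: four adjacent values of sub-map 1.
            p1 = max(s1[i][j], s1[i][j + 1], s1[i + 1][j], s1[i + 1][j + 1])
            # Second partial result: vertical pair of sub-map 2.
            p2 = max(s2[i][j], s2[i + 1][j])
            # Third partial result: horizontal pair of sub-map 3.
            p3 = max(s3[i][j], s3[i][j + 1])
            # Sub-map 4 contributes a single value (the window centre).
            p4 = s4[i][j]
            out_row.append(max(p1, p2, p3, p4))
        out.append(out_row)
    return out


def pool_3x3_stride2_direct(fmap):
    """Reference: scan every 3x3 window directly."""
    out_h = (len(fmap) - 3) // 2 + 1
    out_w = (len(fmap[0]) - 3) // 2 + 1
    return [[max(fmap[2 * i + di][2 * j + dj]
                 for di in range(3) for dj in range(3))
             for j in range(out_w)]
            for i in range(out_h)]


fmap = [
    [1, 7, 2, 0, 3],
    [4, 5, 9, 1, 2],
    [6, 0, 3, 8, 1],
    [2, 2, 4, 1, 7],
    [0, 5, 1, 3, 6],
]
result = pool_3x3_stride2_from_sub_maps(fmap)
```

Under this reading, neighbouring windows overlap only in pixels of the first three sub-feature maps, which is why those maps need values shifted between adjacent registers while the fourth needs a single move.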
  • a plurality of temporary registers are connected to the periphery of the above-mentioned shift register array; the above-mentioned temporary registers are used to store the pixel values that overflow the above-mentioned shift register array when performing a numerical value transfer operation.
  • the number of pixels included in at least some of the sub-feature maps in the above-mentioned several sub-feature maps is consistent with the number of shift registers included in the above-mentioned shift register array.
  • the present application also proposes a chip.
  • the chip may include a controller configured to: obtain an original feature map; divide the original feature map into several target feature maps; pool each target feature map according to the pooling method of any of the foregoing embodiments to obtain the pooling result corresponding to each target feature map; and output those pooling results to obtain the pooling result corresponding to the original feature map.
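Dividing an original feature map into target feature maps, pooling each one, and combining the outputs can be sketched as follows. This is an illustration under stated assumptions: the tile size stands in for whatever size suits the shift register array, the per-tile pooling is a plain non-overlapping 2x2 max pool rather than the register-based method, and the map dimensions are assumed to be divisible by the tile size. All names are hypothetical.

```python
def max_pool_2x2(tile):
    """Non-overlapping 2x2 max pooling; tile height and width must be even."""
    return [[max(tile[r][c], tile[r][c + 1], tile[r + 1][c], tile[r + 1][c + 1])
             for c in range(0, len(tile[0]), 2)]
            for r in range(0, len(tile), 2)]


def split_into_tiles(fmap, tile_h, tile_w):
    """Cut the original feature map into a grid of target feature maps."""
    return [[[row[c:c + tile_w] for row in fmap[r:r + tile_h]]
             for c in range(0, len(fmap[0]), tile_w)]
            for r in range(0, len(fmap), tile_h)]


def pool_original(fmap, tile_h, tile_w):
    """Pool every tile separately, then stitch the results back together."""
    pooled_rows = []
    for tile_row in split_into_tiles(fmap, tile_h, tile_w):
        pooled_tiles = [max_pool_2x2(tile) for tile in tile_row]
        # Concatenate the k-th output row of every tile in this tile row.
        for k in range(len(pooled_tiles[0])):
            pooled_rows.append([v for tile in pooled_tiles for v in tile[k]])
    return pooled_rows


original = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
tiled_result = pool_original(original, 2, 2)
```

With non-overlapping windows and tile sizes that are multiples of the window size, tiling leaves the result unchanged; overlapping windows would need shared border pixels between tiles.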
  • the present application also provides an electronic device, which includes the chip shown in any of the foregoing embodiments.
  • the electronic device may be a smart terminal such as a mobile phone, or may be other devices that have a camera and can perform image processing.
  • the chip of the embodiment of the present application may be used to perform the pooling task. Since the above chip has higher pooling processing efficiency and higher performance, the use of this chip can assist in improving the processing efficiency of the pooling task, thereby improving the performance of electronic equipment.
  • the present application also provides a computer-readable storage medium on which a computer program is stored, and when the program is executed by the controller, any one of the pooling methods described above is implemented.
  • one or more embodiments of the present application may be provided as a method, system or computer program product. Accordingly, one or more embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, one or more embodiments of the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
  • Embodiments of the subject matter and functional operations described in this application can be implemented in digital electronic circuits, in tangibly embodied computer software or firmware, in computer hardware including the structures disclosed in this application and their structural equivalents, or in a combination of one or more of these.
  • Embodiments of the subject matter described in this application may be implemented as one or more computer programs, i.e., one or more units of computer program instructions encoded on a tangible, non-transitory program carrier for execution by, or to control the operation of, a data processing chip.
  • the program instructions may be encoded on an artificially generated propagated signal, such as a machine-generated electrical, optical or electromagnetic signal, which is generated to encode and transmit information to a suitable receiver chip for execution by the data processing chip.
  • the computer storage medium may be a machine-readable storage device, a machine-readable storage substrate, a random or serial access memory device, or a combination of one or more of these.
  • the processes and logic flows described in this application can be performed by one or more programmable computers executing one or more computer programs to perform corresponding functions by operating on input data and generating output.
  • the processes and logic flows described above can also be performed by, and chips can also be implemented as, special purpose logic circuits, such as FPGAs (field programmable gate arrays) or ASICs (application specific integrated circuits).
  • Computers suitable for the execution of a computer program include, for example, general and/or special purpose microprocessors, or any other type of central processing unit.
  • the central processing unit will receive instructions and data from read only memory and/or random access memory.
  • the basic components of a computer include a central processing unit for implementing or executing instructions and one or more memory devices for storing instructions and data.
  • a computer will also include, or be operatively coupled to, one or more mass storage devices for storing data, such as magnetic, magneto-optical or optical disks, to receive data from them, transmit data to them, or both.
  • the computer does not have to have such a device.
  • the computer may be embedded in another device, such as a mobile phone, personal digital assistant (PDA), mobile audio or video player, game console, global positioning system (GPS) receiver, or a portable storage device such as a universal serial bus (USB) flash drive, to name a few.
  • Computer-readable media suitable for storage of computer program instructions and data include all forms of non-volatile memory, media, and memory devices including, for example, semiconductor memory devices (e.g., EPROM, EEPROM, and flash memory devices), magnetic disks (e.g., internal hard disks or removable disks), magneto-optical disks, and CD-ROM and DVD-ROM disks.
  • the processor and memory may be supplemented by or incorporated in special purpose logic circuitry.

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)

Abstract

Disclosed are a pooling method, a chip, a device, and a storage medium. The method may include: acquiring a target feature map; splitting the target feature map to obtain a plurality of sub-feature maps, where at least some pixel values within the same pooling window of the target feature map are in different sub-feature maps, and pixel values at the same position within the pooling windows are in the same sub-feature map; and processing in parallel the pixels of each sub-feature map that belong to different pooling windows, so as to obtain a pooling result corresponding to the target feature map.
PCT/CN2021/115667 2021-01-29 2021-08-31 Pooling method, chip, device, and storage medium WO2022160703A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110127626.9 2021-01-29
CN202110127626.9A CN112862667A (zh) 2021-01-29 Pooling method, chip, device and storage medium

Publications (1)

Publication Number Publication Date
WO2022160703A1 true WO2022160703A1 (fr) 2022-08-04

Family

ID=75986898

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/115667 WO2022160703A1 (fr) 2021-01-29 2021-08-31 Pooling method, chip, device, and storage medium

Country Status (2)

Country Link
CN (2) CN112862667A (fr)
WO (1) WO2022160703A1 (fr)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112862667A (zh) * 2021-01-29 2021-05-28 成都商汤科技有限公司 Pooling method, chip, device and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110135556A (zh) * 2019-04-04 2019-08-16 平安科技(深圳)有限公司 Systolic-array-based neural network acceleration method, apparatus, computer device and storage medium
US20200175313A1 (en) * 2018-12-03 2020-06-04 Samsung Electronics Co., Ltd. Method and apparatus with dilated convolution
US20200302215A1 (en) * 2017-10-25 2020-09-24 Nec Corporation Information processing apparatus, information processing method, and non-transitory computer readable medium
CN112862667A (zh) * 2021-01-29 2021-05-28 成都商汤科技有限公司 Pooling method, chip, device and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
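CN110232665B (zh) * 2019-06-13 2021-08-20 Oppo广东移动通信有限公司 Max pooling method, apparatus, computer device and storage medium
CN110490813B (zh) * 2019-07-05 2021-12-17 特斯联(北京)科技有限公司 Feature map enhancement method, apparatus, device and medium for convolutional neural networks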

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200302215A1 (en) * 2017-10-25 2020-09-24 Nec Corporation Information processing apparatus, information processing method, and non-transitory computer readable medium
US20200175313A1 (en) * 2018-12-03 2020-06-04 Samsung Electronics Co., Ltd. Method and apparatus with dilated convolution
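CN110135556A (zh) * 2019-04-04 2019-08-16 平安科技(深圳)有限公司 Systolic-array-based neural network acceleration method, apparatus, computer device and storage medium
CN112862667A (zh) * 2021-01-29 2021-05-28 成都商汤科技有限公司 Pooling method, chip, device and storage medium
CN113052760A (zh) * 2021-01-29 2021-06-29 成都商汤科技有限公司 Pooling method, chip, device and storage medium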

Also Published As

Publication number Publication date
CN113052760A (zh) 2021-06-29
CN112862667A (zh) 2021-05-28

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 21922298; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 21922298; Country of ref document: EP; Kind code of ref document: A1)