CN112085167A - Convolution processing method and device, multi-core DSP platform and readable storage medium - Google Patents

Convolution processing method and device, multi-core DSP platform and readable storage medium Download PDF

Info

Publication number
CN112085167A
CN112085167A (application CN202010951445.3A)
Authority
CN
China
Prior art keywords
convolution processing
convolution
kernel
image
processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010951445.3A
Other languages
Chinese (zh)
Inventor
何涛
施慧莉
杨峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Leihua Electronic Technology Research Institute Aviation Industry Corp of China
Original Assignee
Leihua Electronic Technology Research Institute Aviation Industry Corp of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Leihua Electronic Technology Research Institute Aviation Industry Corp of China filed Critical Leihua Electronic Technology Research Institute Aviation Industry Corp of China
Priority to CN202010951445.3A priority Critical patent/CN112085167A/en
Publication of CN112085167A publication Critical patent/CN112085167A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/06Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Neurology (AREA)
  • Image Processing (AREA)

Abstract

A convolution processing method comprising: dividing an image into a plurality of regions; having each convolution processing core correspond to one region and perform convolution processing on the portion of each layer of the image located in its corresponding region; and synthesizing the convolution processing results of the layers. A convolution processing apparatus comprising: a region dividing module for dividing an image into a plurality of regions; a convolution processing module that makes each convolution processing core correspond to one region, each convolution processing core performing convolution processing on the portion of each layer of the image located in its corresponding region; and a comprehensive processing module for synthesizing the convolution processing results of the layers. A multi-core DSP platform comprising: a plurality of convolution processing cores; and a memory storing a computer program configured to implement the convolution processing method described above when executed by each convolution processing core. A computer-readable storage medium storing a computer program which, when executed, is capable of implementing the convolution processing method described above.

Description

Convolution processing method and device, multi-core DSP platform and readable storage medium
Technical Field
The application belongs to the technical field of convolutional neural network processing, and particularly relates to a convolutional processing method and device, a multi-core DSP platform and a computer readable storage medium.
Background
Convolutional neural networks (CNNs) are widely applied in computer-vision fields such as image classification and target recognition and localization. A CNN consists mainly of convolution processing and involves a large number of convolution operations, which can be processed in parallel across different convolution kernels and different image layers, making CNNs well suited to accelerated processing on highly parallel hardware such as GPUs and CPUs.
The principle of convolution processing is shown in fig. 1. Fig. 1 shows an example in which a single-channel image (1, W, H), i.e. a single-channel image with 1 layer, width W, and height H, is convolved with 3 convolution kernels to generate a 3-channel image (3, W, H), i.e. an image with 3 layers, width W, and height H.
The network architecture of actual convolution processing is shown in fig. 2. Most input images are multi-channel images (C, W, H) with C layers, width W, and height H, and there are N convolution kernels; after each convolution kernel is convolved with the multi-channel image, a multi-channel summation is performed to generate a multi-channel output image (N, W, H) with N layers, width W, and height H.
At present, to improve the speed of convolution processing, GPUs mostly use the Nvidia cuDNN library for acceleration, while CPUs mostly use multithreaded parallelism and MPI for multitask acceleration; computational optimizations of convolution mostly convert multi-channel image convolution into matrix operations. Under this GPU/CPU design approach, the weight data and input of the convolution should be stored in fast-access memory. However, the fast-access memory of a multi-core DSP platform is limited compared with GPU and CPU platforms, and in most cases is smaller than the weight data and input of the convolution. Acceleration of convolution processing on a multi-core DSP platform therefore cannot follow the GPU/CPU design approach, and the speed of convolution processing on multi-core DSP platforms is greatly limited.
The present application has been made in view of the above-mentioned technical drawbacks.
It should be noted that the above background disclosure is only intended to assist understanding of the inventive concept and technical solutions of the present invention; it does not necessarily belong to the prior art of the present patent application, and, absent explicit evidence that the above content was disclosed before the filing date of the present application, it should not be used to evaluate the novelty and inventive step of the present application.
Disclosure of Invention
The present application is directed to a convolution processing method and apparatus, a multi-core DSP platform, and a computer-readable storage medium, so as to overcome or alleviate at least one of the technical drawbacks of the known prior art.
The technical scheme of the application is as follows:
One aspect provides a convolution processing method, including:
dividing an image into a plurality of regions;
having each convolution processing core correspond to one region, each convolution processing core performing convolution processing on the portion of each layer of the image located in its corresponding region;
and synthesizing the convolution processing results of the layers.
According to at least one embodiment of the present application, in the convolution processing method, synthesizing the convolution processing results of the layers specifically includes:
superimposing, by each convolution processing core, the convolution processing results of the layers within its corresponding region;
and merging the convolution processing results of the regions.
According to at least one embodiment of the present application, in the convolution processing method, each convolution processing core performs convolution processing on the portion of each layer located in its corresponding region, specifically:
each convolution processing core performs convolution processing on the portion of each layer located in its corresponding region based on a plurality of convolution kernels;
superimposing, by each convolution processing core, the convolution processing results of the layers within its corresponding region specifically includes:
corresponding to each convolution kernel, superimposing, by each convolution processing core, the convolution processing results of the layers within its corresponding region;
merging the convolution processing results of the regions, specifically:
corresponding to each convolution kernel, merging the convolution processing results of the regions.
According to at least one embodiment of the present application, the convolution processing method further includes:
and carrying out edge PADDING processing on the image.
According to at least one embodiment of the present application, the convolution processing method further includes:
importing the convolution kernels.
According to at least one embodiment of the present application, in the convolution processing method, the importing of a convolution kernel specifically includes:
when the weight data size of the convolution kernel is smaller than the shared cache, loading the convolution kernel into the shared cache;
and when the weight data size of the convolution kernel is larger than the shared cache, loading the convolution kernel into the shared cache stage by stage during operation.
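The two-branch import rule above can be sketched in C. `shared_cache`, `SHARED_CACHE_BYTES`, and `import_kernels` are hypothetical names introduced here for illustration, not a real DSP API or the patent's implementation:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch of the kernel-import rule: if the weight data
 * fits in the shared cache, copy it in once; otherwise it must be
 * staged in chunks during operation. All names are illustrative. */
#define SHARED_CACHE_BYTES 4096
static unsigned char shared_cache[SHARED_CACHE_BYTES];

/* Returns 1 if the weights are fully resident in the shared cache,
 * 0 if stage-by-stage loading is required at run time. */
static int import_kernels(const unsigned char *weights, size_t nbytes) {
    if (nbytes <= SHARED_CACHE_BYTES) {
        memcpy(shared_cache, weights, nbytes);
        return 1;               /* fully resident in fast memory */
    }
    return 0;   /* caller streams chunks from external memory */
}
```

In the staged case a real implementation would overlap DMA transfers of the next weight chunk with computation on the current one; that scheduling is omitted here.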
Another aspect provides a convolution processing apparatus including:
a region dividing module for dividing the image into a plurality of regions;
the convolution processing module enables each convolution processing core to correspond to one area, and each convolution processing core performs convolution processing on the part, located in the corresponding area, of each image layer of the image;
and the comprehensive processing module is used for synthesizing the convolution processing results of all the layers.
In yet another aspect, a multi-core DSP platform comprises:
a plurality of convolution processing kernels;
a memory storing a computer program configured to be capable of implementing any of the above-described convolution processing methods when executed by the respective convolution processing cores.
Yet another aspect is a computer-readable storage medium storing a computer program that, when executed, is capable of implementing any of the above-described convolution processing methods.
Drawings
FIG. 1 is a schematic diagram of convolution processing a single-channel image;
FIG. 2 is a schematic diagram of a convolution processing multi-channel image;
fig. 3 is a schematic diagram of a convolution processing method according to an embodiment of the present application.
For the purpose of better illustrating the embodiments, certain features of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product; further, the drawings are for illustrative purposes, and terms describing positional relationships are limited to illustrative illustrations only and are not to be construed as limiting the patent.
Detailed Description
In order to make the technical solutions and advantages of the present application clearer, the technical solutions of the present application will be further clearly and completely described in the following detailed description with reference to the accompanying drawings, and it should be understood that the specific embodiments described herein are only some of the embodiments of the present application, and are only used for explaining the present application, but not limiting the present application. It should be noted that, for convenience of description, only the parts related to the present application are shown in the drawings, other related parts may refer to general designs, and the embodiments and technical features in the embodiments in the present application may be combined with each other to obtain a new embodiment without conflict.
In addition, unless otherwise defined, technical or scientific terms used in the description of the present application shall have the ordinary meaning as understood by one of ordinary skill in the art to which the present application belongs. The terms "upper", "lower", "left", "right", "center", "vertical", "horizontal", "inner", "outer", and the like used in the description of the present application, which indicate orientations, are used only to indicate relative directions or positional relationships, and do not imply that the devices or elements must have a specific orientation, be constructed and operated in a specific orientation, and when the absolute position of the object to be described is changed, the relative positional relationships may be changed accordingly, and thus, should not be construed as limiting the present application. The use of "first," "second," "third," and the like in the description of the present application is for descriptive purposes only to distinguish between different components and is not to be construed as indicating or implying relative importance. The use of the terms "a," "an," or "the" and similar referents in the context of describing the application is not to be construed as an absolute limitation on the number, but rather as the presence of at least one. The word "comprising" or "comprises", and the like, when used in this description, is intended to specify the presence of stated elements or items, but not the exclusion of other elements or items.
Further, it is noted that, unless expressly stated or limited otherwise, the terms "mounted," "connected," and the like are used in the description of the invention in a generic sense, e.g., connected as either a fixed connection or a removable connection or integrally connected; can be mechanically or electrically connected; they may be directly connected or indirectly connected through an intermediate medium, or they may be connected through the inside of two elements, and those skilled in the art can understand their specific meaning in this application according to the specific situation.
The present application is described in further detail below with reference to fig. 1 to 3.
One aspect provides a convolution processing method, including:
dividing an image into a plurality of regions;
having each convolution processing core correspond to one region, each convolution processing core performing convolution processing on the portion of each layer of the image located in its corresponding region;
and synthesizing the convolution processing results of the layers.
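As a minimal sketch of the first step, the image can be split into one horizontal strip per core. `region_for_core` and the strip-shaped partition are assumptions for illustration; the patent does not fix a particular partition shape:

```c
#include <assert.h>

/* Sketch of dividing a W x H image into one region per convolution
 * processing core. Horizontal strips are assumed for simplicity. */
typedef struct { int y0, rows, cols; } Region;

static Region region_for_core(int core, int n_cores, int W, int H) {
    Region r;
    int strip = H / n_cores;          /* rows per core */
    r.y0   = core * strip;
    r.cols = W;
    /* the last core absorbs any remainder rows */
    r.rows = (core == n_cores - 1) ? H - r.y0 : strip;
    return r;
}
```

With 4 cores this reproduces the four equal-size regions of the embodiment below whenever H is divisible by 4.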
In some optional embodiments of the convolution processing method, synthesizing the convolution processing results of the layers specifically includes:
superimposing, by each convolution processing core, the convolution processing results of the layers within its corresponding region;
and merging the convolution processing results of the regions.
In some optional embodiments of the convolution processing method, each convolution processing core performs convolution processing on the portion of each layer located in its corresponding region, specifically:
each convolution processing core performs convolution processing on the portion of each layer located in its corresponding region based on a plurality of convolution kernels;
superimposing, by each convolution processing core, the convolution processing results of the layers within its corresponding region specifically includes:
corresponding to each convolution kernel, superimposing, by each convolution processing core, the convolution processing results of the layers within its corresponding region;
merging the convolution processing results of the regions, specifically:
corresponding to each convolution kernel, merging the convolution processing results of the regions.
In some optional embodiments, the convolution processing method further includes:
performing edge padding (PADDING) on the image, where padding with margin pad is given by:
y[m, n] = x[i, j], with m = i + pad, n = j + pad,
i ∈ [0, W-1], j ∈ [0, H-1], m ∈ [0, 2·pad+W-1], n ∈ [0, 2·pad+H-1],
the entries of y outside the copied range being zero-filled.
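The padding formula can be implemented directly. This is a sketch for a single layer in row-major layout (the patent does not specify a memory layout), with all entries of y outside the copied window left zero:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Edge padding per the formula above: y[m, n] = x[i, j] with
 * m = i + pad (columns) and n = j + pad (rows); the border of y is
 * zero-filled. Single layer, row-major float buffers (an assumption). */
static void pad_image(const float *x, int W, int H, int pad, float *y) {
    int Wp = W + 2 * pad;                       /* padded width  */
    int Hp = H + 2 * pad;                       /* padded height */
    memset(y, 0, (size_t)Wp * Hp * sizeof(float));
    for (int j = 0; j < H; ++j)                 /* rows of x */
        for (int i = 0; i < W; ++i)             /* cols of x */
            y[(j + pad) * Wp + (i + pad)] = x[j * W + i];
}
```

For a 2×2 image with pad = 1 this produces a 4×4 buffer whose central 2×2 block holds the original pixels and whose one-pixel border is zero.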
In some optional embodiments, the convolution processing method further includes:
importing the convolution kernels.
In some optional embodiments of the convolution processing method, the importing of a convolution kernel specifically includes:
when the weight data size of the convolution kernel is smaller than the shared cache, loading the convolution kernel into the shared cache;
and when the weight data size of the convolution kernel is larger than the shared cache, loading the convolution kernel into the shared cache stage by stage during operation.
Regarding the convolution processing method disclosed in the above embodiments, those skilled in the art will understand that it can be applied to a multi-core DSP, with each processing core on the multi-core DSP serving as a convolution processing core. By splitting the convolution of each layer into single convolutions over multiple regions, each convolution processing core only needs to traverse its corresponding region, and the weight data can be moved in stages, so that all convolution operations are performed in fast-access memory, reducing the number of direct accesses and DMA accesses to external memory.
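The per-region traversal described above can be sketched as a plain k×k convolution restricted to one core's rows of the output. `conv_region`, its argument layout, and the `+=` accumulation (so that calling it once per layer superimposes the per-layer results) are illustrative assumptions, not the patent's implementation:

```c
#include <assert.h>

/* One core's share of the work: slide a k x k kernel over only the
 * output rows [y0, y0 + rows) of its region, reading from an already
 * padded single-layer input so no bounds checks are needed.
 * Accumulates with += so calling once per layer superimposes the
 * per-layer results. Names and layout are illustrative. */
static void conv_region(const float *padded, int Wp,  /* padded input, padded width */
                        const float *kernel, int k,   /* k x k kernel, row-major   */
                        int y0, int rows, int W,      /* region rows; W output cols */
                        float *out)                   /* full W-wide output plane   */
{
    for (int r = y0; r < y0 + rows; ++r)
        for (int c = 0; c < W; ++c) {
            float acc = 0.0f;
            for (int u = 0; u < k; ++u)
                for (int v = 0; v < k; ++v)
                    acc += padded[(r + u) * Wp + (c + v)] * kernel[u * k + v];
            out[r * W + c] += acc;
        }
}
```

Because each core writes only its own rows of `out`, the cores never contend for the same output cells, which is what makes the region split embarrassingly parallel.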
In order to make it easier for those skilled in the art to understand and implement the convolution processing method of the multi-core DSP platform disclosed in the present application, the present application provides the following more specific embodiments:
importing 4 convolution kernels into the multi-core DSP platform: when the weight data size of the convolution kernels is smaller than the multi-core DSP shared cache, the convolution kernels are loaded into the shared cache; when the weight data size is larger than the shared cache, the convolution kernels are loaded into the shared cache stage by stage during operation;
importing an image of size (3, W, H) into the multi-core DSP platform, i.e. an image with 3 layers, width W, and height H, and performing edge padding on the image;
dividing the image into a plurality of regions: in this embodiment the multi-core DSP platform has 4 processing cores, i.e. 4 convolution processing cores (convolution processing core 0, convolution processing core 1, convolution processing core 2, and convolution processing core 3), so the image can be divided into 4 regions A, B, C, and D, each of size W × H / 4; the portions of the 3 layers corresponding to region A are A0, A1, and A2; those corresponding to region B are B0, B1, and B2; those corresponding to region C are C0, C1, and C2; and those corresponding to region D are D0, D1, and D2;
assigning each convolution processing core to one region (it should be understood that the number of regions into which the image is divided should not exceed the number of convolution processing cores, so that each region has a convolution processing core corresponding to it); in this embodiment, convolution processing core 0 corresponds to region A, core 1 to region B, core 2 to region C, and core 3 to region D;
each convolution processing core performs convolution processing on the portions of the layers located in its corresponding region based on the plurality of convolution kernels, i.e. the 4 convolution processing cores each convolve the portions of the 3 layers in their regions with the 4 convolution kernels: core 0 convolves portions A0, A1, and A2 of region A with the 4 kernels; core 1 convolves B0, B1, and B2 of region B; core 2 convolves C0, C1, and C2 of region C; and core 3 convolves D0, D1, and D2 of region D;
each convolution processing core, corresponding to each convolution kernel, superimposes the convolution processing results of the layers within its corresponding region, i.e. the 4 convolution processing cores superimpose their per-layer results corresponding to the 4 convolution kernels respectively; for a more concrete expression, assume the 4 convolution kernels are convolution kernel 0, convolution kernel 1, convolution kernel 2, and convolution kernel 3, and write S(X, k) for the superimposed result of convolution kernel k over the three layer portions X0, X1, X2 of region X (the sum of the three per-layer convolution results), where:
corresponding to convolution kernel 0, convolution processing core 0 superimposes its results over A0, A1, A2 to obtain S(A, 0); core 1 superimposes its results over B0, B1, B2 to obtain S(B, 0); core 2 obtains S(C, 0); and core 3 obtains S(D, 0);
corresponding to convolution kernel 1, the cores likewise obtain S(A, 1), S(B, 1), S(C, 1), and S(D, 1);
corresponding to convolution kernel 2, the cores obtain S(A, 2), S(B, 2), S(C, 2), and S(D, 2);
corresponding to convolution kernel 3, the cores obtain S(A, 3), S(B, 3), S(C, 3), and S(D, 3);
the convolution processing results of the regions are then merged, corresponding to each of the 4 convolution kernels:
corresponding to convolution kernel 0, the region results S(A, 0), S(B, 0), S(C, 0), and S(D, 0) are combined, thereby obtaining the convolution processing result of convolution kernel 0;
corresponding to convolution kernel 1, S(A, 1), S(B, 1), S(C, 1), and S(D, 1) are combined, thereby obtaining the convolution processing result of convolution kernel 1;
corresponding to convolution kernel 2, S(A, 2), S(B, 2), S(C, 2), and S(D, 2) are combined, thereby obtaining the convolution processing result of convolution kernel 2;
corresponding to convolution kernel 3, S(A, 3), S(B, 3), S(C, 3), and S(D, 3) are combined, thereby obtaining the convolution processing result of convolution kernel 3.
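The final per-kernel merge then amounts to placing each region's superimposed strip back into the full output plane for that kernel. `merge_region_strip` and the horizontal-strip layout are assumed here for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Sketch of merging: copy one region's superimposed result strip
 * (rows [y0, y0 + rows) of a W-column plane) into the full output
 * plane for one convolution kernel. Names and layout are illustrative. */
static void merge_region_strip(float *full, int W,
                               const float *strip, int y0, int rows) {
    memcpy(full + (size_t)y0 * W, strip, (size_t)rows * W * sizeof(float));
}
```

Running this once per region and per kernel assembles the N output layers; since the strips are disjoint, the copies can also be done concurrently.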
another aspect provides a convolution processing apparatus including:
a region dividing module for dividing the image into a plurality of regions;
the convolution processing module enables each convolution processing core to correspond to one area, and each convolution processing core performs convolution processing on the part, located in the corresponding area, of each image layer of the image;
and the comprehensive processing module is used for synthesizing the convolution processing results of all the layers.
The embodiments are described in a progressive manner in the specification, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
For the apparatus disclosed in the above embodiment, since it corresponds to the method disclosed above, the description is kept brief; for the relevant details refer to the method description, and for the technical effects likewise refer to those of the method, which are not repeated here.
Furthermore, those skilled in the art should realize that the various modules and units of the apparatus disclosed in the embodiments of the present application can be implemented by electronic hardware, computer software, or a combination of both. To clearly explain the interchangeability of hardware and software, the described functions may generally be implemented in either; which implementation is chosen depends on the particular application and the design constraints imposed on the technical solution. Skilled artisans may implement the described functions differently for each particular application, but such implementations should not be considered beyond the scope of the present application.
Yet another aspect provides a multi-core DSP platform comprising:
a plurality of convolution processing kernels;
a memory storing a computer program configured to be capable of implementing any of the above-described convolution processing methods when executed by the respective convolution processing cores.
In some alternative embodiments, the memory may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory; the volatile memory may be random access memory (RAM) and/or cache memory, and the non-volatile memory may be read-only memory (ROM), a hard disk, flash memory, and so forth. The memory may store a computer program that is executed by the processor to implement the functions of the embodiments of the present application and/or other desired functions, and may also store various application programs and various data.
It should be noted that, for clarity and conciseness, not all constituent units of the multi-core DSP platform are shown in the foregoing embodiments; to implement the necessary functions of the multi-core DSP platform, a person skilled in the art may provide and arrange other constituent units as needed.
For the multi-core DSP platform disclosed in the foregoing embodiments, since the convolution processing cores can implement any of the foregoing methods when executing the computer program stored in the memory, the technical effects of those methods apply correspondingly and are not repeated here.
Yet another aspect provides a computer-readable storage medium storing a computer program which, when executed, implements any of the above-described convolution processing methods.
In some alternative embodiments, the computer-readable storage medium may include a memory card of a smart phone, a storage component of a tablet computer, a hard disk of a personal computer, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), portable compact disc read-only memory (CD-ROM), flash memory, any combination of the above, or other suitable storage media.
Having thus described the present application with reference to the preferred embodiments illustrated in the accompanying drawings, it will be understood by those skilled in the art that the scope of the present application is not limited to those specific embodiments. Equivalent modifications or substitutions of the relevant technical features may be made without departing from the principle of the present application, and such modifications or substitutions fall within the scope of the present application.

Claims (9)

1. A convolution processing method, comprising:
dividing an image into a plurality of regions;
assigning each convolution processing core to one region, each convolution processing core performing convolution processing on the part of each layer of the image located in its corresponding region;
and synthesizing the convolution processing results of all the layers.
2. The convolution processing method according to claim 1, wherein
synthesizing the convolution processing results of all the layers specifically comprises:
superposing, for each convolution processing core, the convolution processing results of the layers within its corresponding region;
and merging the convolution processing results of the regions.
3. The convolution processing method according to claim 2, wherein
each convolution processing core performing convolution processing on the part of each layer located in its corresponding region specifically comprises:
each convolution processing core performing convolution processing, based on a plurality of convolution kernels, on the part of each layer located in its corresponding region;
superposing, for each convolution processing core, the convolution processing results of the layers within its corresponding region specifically comprises:
superposing, for each convolution kernel, the convolution processing results of the layers within each convolution processing core's corresponding region;
and merging the convolution processing results of the regions specifically comprises:
merging, for each convolution kernel, the convolution processing results of the respective regions.
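Claims 1 to 3 describe partitioning the image among cores, convolving each layer's portion within a region, superposing the per-layer results, and merging the regions. The following pure-Python sketch (not the patented DSP implementation; function names and the strip-wise partitioning are assumptions, and the cores are simulated sequentially) illustrates the data flow for a single convolution kernel:

```python
def conv2d_valid(region, kernel):
    """Plain 'valid'-mode 2-D convolution of one region with one kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(region) - kh + 1
    out_w = len(region[0]) - kw + 1
    return [[sum(region[i + u][j + v] * kernel[u][v]
                 for u in range(kh) for v in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

def multicore_convolve(layers, kernel, n_regions):
    """Each simulated core handles one horizontal strip of every layer;
    per-layer results are superposed (summed), then strips are merged.
    Assumes the image has at least n_regions rows."""
    h = len(layers[0])
    kh = len(kernel)
    strip = h // n_regions
    merged = []
    for r in range(n_regions):
        top = r * strip
        # Overlap kh-1 rows so each strip's 'valid' output tiles seamlessly.
        bottom = h if r == n_regions - 1 else top + strip + kh - 1
        acc = None
        for layer in layers:                      # superpose across layers
            part = conv2d_valid(layer[top:bottom], kernel)
            acc = part if acc is None else [
                [a + b for a, b in zip(ra, rb)] for ra, rb in zip(acc, part)]
        merged.extend(acc)                        # merge region results
    return merged
```

Because convolution is linear, summing the per-layer results within each region and then concatenating the regions yields the same output as convolving the whole, layer-summed image at once.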
4. The convolution processing method according to claim 1, further comprising:
performing edge padding processing on the image.
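The edge padding step of claim 4 can be illustrated with a minimal helper. This is a hedged sketch only: the claim does not specify the padding value or width, so a symmetric zero-valued border is assumed here, and the function name is hypothetical.

```python
def zero_pad(image, pad):
    """Surround a 2-D image with `pad` rows/columns of zeros on every side."""
    width = len(image[0]) + 2 * pad
    zeros = [[0] * width for _ in range(pad)]
    body = [[0] * pad + row + [0] * pad for row in image]
    return zeros + body + zeros
```

For an odd k-by-k kernel, choosing pad = (k - 1) // 2 makes a subsequent 'valid'-mode convolution preserve the input's spatial size.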
5. The convolution processing method according to claim 1, further comprising:
importing a convolution kernel.
6. The convolution processing method according to claim 5, wherein importing a convolution kernel specifically comprises:
when the weight data size of the convolution kernel is smaller than the shared cache, importing the convolution kernel into the shared cache in a single transfer;
and when the weight data size of the convolution kernel is larger than the shared cache, importing the convolution kernel into the shared cache in stages.
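Claim 6 distinguishes a one-shot import, when the kernel's weight data fits in the shared cache, from a staged import when it does not. A schematic sketch follows; the shared cache is modeled simply as a capacity limit in weight counts, and the function name is hypothetical rather than from the patent:

```python
def import_kernels(weights, cache_size):
    """Yield successive batches of kernel weights, each no larger than
    the shared-cache capacity: one batch if everything fits, otherwise
    a sequence of cache-sized stages."""
    if len(weights) <= cache_size:
        yield weights                                 # single transfer
    else:
        for start in range(0, len(weights), cache_size):
            yield weights[start:start + cache_size]   # staged transfers
```

Each yielded batch corresponds to one transfer into the shared cache; the consumer would process (or distribute to cores) one stage before requesting the next.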
7. A convolution processing apparatus, comprising:
a region dividing module configured to divide an image into a plurality of regions;
a convolution processing module configured to assign each convolution processing core to one region, each convolution processing core performing convolution processing on the part of each layer of the image located in its corresponding region;
and a synthesis module configured to synthesize the convolution processing results of all the layers.
8. A multi-core DSP platform, comprising:
a plurality of convolution processing kernels;
a memory storing a computer program which, when executed by the convolution processing cores, implements the convolution processing method of any one of claims 1 to 6.
9. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed, implements the convolution processing method of any one of claims 1 to 6.
CN202010951445.3A 2020-09-11 2020-09-11 Convolution processing method and device, multi-core DSP platform and readable storage medium Pending CN112085167A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010951445.3A CN112085167A (en) 2020-09-11 2020-09-11 Convolution processing method and device, multi-core DSP platform and readable storage medium

Publications (1)

Publication Number Publication Date
CN112085167A true CN112085167A (en) 2020-12-15

Family

ID=73737461

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010951445.3A Pending CN112085167A (en) 2020-09-11 2020-09-11 Convolution processing method and device, multi-core DSP platform and readable storage medium

Country Status (1)

Country Link
CN (1) CN112085167A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190347847A1 (en) * 2018-05-09 2019-11-14 Massachusetts Institute Of Technology View generation from a single image using fully convolutional neural networks
CN110473137A (en) * 2019-04-24 2019-11-19 华为技术有限公司 Image processing method and device
CN111199273A (en) * 2019-12-31 2020-05-26 深圳云天励飞技术有限公司 Convolution calculation method, device, equipment and storage medium


Similar Documents

Publication Publication Date Title
US10909418B2 (en) Neural network method and apparatus
US11468301B2 (en) Method and apparatus for performing operation of convolutional layer in convolutional neural network
CN109871936B (en) Method and apparatus for processing convolution operations in a neural network
EP3591572A1 (en) Method and system for automatic chromosome classification
EP3489862A1 (en) Method and apparatus for performing operation of convolutional layers in convolutional neural network
Huang The paradigmatic crisis in Chinese studies: Paradoxes in social and economic history
US10482177B2 (en) Deep reading machine and method
CN106779057B (en) Method and device for calculating binary neural network convolution based on GPU
TW202032579A (en) Method, apparatus and device for detecting lesion, and storage medium
Levinthal et al. Common-fate grouping as feature selection
US20220092325A1 (en) Image processing method and device, electronic apparatus and storage medium
CN112001923B (en) Retina image segmentation method and device
JP7104546B2 (en) Information processing equipment, information processing method
EP3651080A1 (en) Electronic device and control method thereof
CN112085167A (en) Convolution processing method and device, multi-core DSP platform and readable storage medium
CN105830160B (en) For the device and method of buffer will to be written to through shielding data
CN114121269A (en) Traditional Chinese medicine facial diagnosis auxiliary diagnosis method and device based on face feature detection and storage medium
CN111340790B (en) Bounding box determination method, device, computer equipment and storage medium
CN106681590B (en) Method and device for displaying screen content of driving recording device
Castro et al. Opencnn: a winograd minimal filtering algorithm implementation in cuda
Agarwal Predictive analysis in health care system using AI
Li et al. Research on object detection of PCB assembly scene based on effective receptive field anchor allocation
CN104765459B (en) The implementation method and device of pseudo operation
CN116400964A (en) Multithreading lock-free data processing method and related equipment
JP2021517310A (en) Processing for multiple input datasets

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination