CN114120082A - Image acceleration convolution calculation method, system, equipment and readable storage medium - Google Patents

Info

Publication number
CN114120082A
Authority
CN
China
Prior art keywords
matrix
shift register
convolution
pixel data
outputting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111393744.0A
Other languages
Chinese (zh)
Inventor
杨柯
吴新春
孙彪
朱书霖
成鑫才
李德鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ningbo Handa Information Technology Co ltd
Southwest Jiaotong University
Original Assignee
Ningbo Handa Information Technology Co ltd
Southwest Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ningbo Handa Information Technology Co ltd, Southwest Jiaotong University filed Critical Ningbo Handa Information Technology Co ltd
Priority to CN202111393744.0A priority Critical patent/CN114120082A/en
Publication of CN114120082A publication Critical patent/CN114120082A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Neurology (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses an image acceleration convolution calculation method, system, equipment and readable storage medium. The method comprises the following steps: step S1: acquiring original pixel data through a camera, and converting the original pixel data into an m × n first matrix, wherein each matrix point corresponds to one pixel point in the original pixel data; step S2: outputting the first matrix to a FIFO module, and performing first-in first-out sequencing to obtain a second matrix; step S3: outputting the second matrix to a read control module, and zero-padding the second matrix to obtain a matrix to be multiplied; step S4: outputting the matrix to be multiplied to an operation module, and performing a convolution operation on the matrix to be multiplied and the convolution kernel matrix. The calculation speed is improved by accelerating the convolution operation of the convolutional neural network on an FPGA, which is fast and parallel and therefore well suited to hardware acceleration of neural networks.

Description

Image acceleration convolution calculation method, system, equipment and readable storage medium
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to a method, a system, a device, and a readable storage medium for image acceleration convolution calculation.
Background
With the development of artificial intelligence, machine learning has entered many fields and is applied in various industries, including medical treatment and security. Deep learning, the leading branch of machine learning, has also developed rapidly in recent years. The convolutional neural network is an algorithm model widely used in deep learning; it has unique advantages in image processing and is usually used as the backbone of image feature extraction models. However, as convolutional neural networks grow more complex and the volume of image data becomes very large while computing resources remain limited, image feature processing is slow.
Therefore, there is a need for a convolution calculation method, system, device and readable storage medium capable of speeding up image processing.
Disclosure of Invention
To solve the above problems, the invention provides an image acceleration convolution calculation method, system, equipment and readable storage medium. The convolution operation of the convolutional neural network is accelerated on an FPGA, whose high speed and parallelism make it well suited to hardware acceleration of neural networks, which addresses the low image processing speed of the prior art.
In a first aspect, the present invention provides an image acceleration convolution calculation method, including the following steps: step S1: acquiring original pixel data through a camera, and converting the original pixel data into an m × n first matrix, wherein each matrix point corresponds to one pixel point in the original pixel data; step S2: outputting the first matrix to a FIFO module, and performing first-in first-out sequencing to obtain a second matrix; step S3: outputting the second matrix to a read control module, and zero-padding the second matrix to obtain a matrix to be multiplied; step S4: outputting the matrix to be multiplied to an operation module, and performing a convolution operation on the matrix to be multiplied and the convolution kernel matrix. The calculation speed is improved by accelerating the convolution operation of the convolutional neural network on an FPGA.
In some embodiments of the present application, in step S1, one pixel synchronization clock transmits one pixel value in the order from left to right and from top to bottom.
In some embodiments of the present application, in step S2, the FIFO module includes a read-write address control circuit and a dual-port RAM, and the width of the FIFO is the bit width of the original pixel data and the depth is the lateral resolution of the original pixel data × 2.
In some embodiments of the present application, step S3 further includes: S31: when the FIFO module is detected to be not empty, outputting a 0 value; S32: sending n read requests, outputting the received data, and outputting two 0 values; S33: repeating step S32 (m-1) times; S34: sending n read requests, outputting the received data, and outputting one 0 value.
In some embodiments of the present application, when detecting that the FIFO module is not empty, the read control module completes sending a value of 0, or sends a read request to the FIFO module, and forwards pixel data output by the FIFO module; and each time the FIFO module read request is sent, the counter is increased by one, the count value returns to the read control module, and the read control module judges whether to send an output 0 value or send the read request according to the count value.
In some embodiments of the present application, the read control module only performs zero padding before and after each row of the second matrix.
In some embodiments of the present application, the operation module includes a single-port read-only memory mirror image and a shift register, the depth of the single-port read-only memory mirror image is 9, an initial value is convolution parameters S1-S9 in a convolution kernel matrix, the shift register includes a first shift register, a second shift register, a third shift register, a fourth shift register and a fifth shift register, the widths of the first shift register and the second shift register are pixel data bit widths, and the depth is n + 2; the widths of the third shift register, the fourth shift register and the fifth shift register are pixel data bit widths, the depth of the third shift register, the fourth shift register and the fifth shift register is 3, the matrix to be multiplied enters the first shift register and the fifth shift register firstly, the output end of the first shift register is connected with the input ends of the second shift register and the third shift register, and the output end of the second shift register is connected with the input end of the fourth shift register.
In a second aspect, an image acceleration convolution computing system is further provided, which includes a camera, configured to collect original pixel data and convert the original pixel data into a first matrix of m × n, where one matrix point corresponds to one pixel point in the original pixel data; the FIFO module converts the first matrix into a second matrix through a first-in first-out sequence; the reading control module is used for zero padding the second matrix to obtain a matrix to be multiplied; and the operation module is used for performing convolution operation on the to-be-multiplied matrix and the convolution kernel matrix.
In a third aspect, there is also provided an image accelerated convolution computing apparatus, including: a memory for storing a computer program; a processor for implementing the steps of the image accelerated convolution calculation method as described above when executing the computer program.
In a fourth aspect, a readable storage medium is also provided, the readable storage medium having stored thereon a computer program, which when executed by a processor, implements the steps of the image accelerated convolution calculation method as described above.
The beneficial effects of the invention are as follows. The invention provides an image acceleration convolution calculation method, system, equipment and readable storage medium, the method comprising the following steps: step S1: acquiring original pixel data through a camera, and converting the original pixel data into an m × n first matrix, wherein each matrix point corresponds to one pixel point in the original pixel data; step S2: outputting the first matrix to a FIFO module, and performing first-in first-out sequencing to obtain a second matrix; step S3: outputting the second matrix to a read control module, and zero-padding the second matrix to obtain a matrix to be multiplied; step S4: outputting the matrix to be multiplied to an operation module, and performing a convolution operation on the matrix to be multiplied and the convolution kernel matrix. The calculation speed is improved by accelerating the convolution operation of the convolutional neural network on an FPGA, which is fast and parallel and therefore well suited to hardware acceleration of neural networks.
Drawings
FIG. 1 is a diagram of an original pixel matrix of the present invention;
FIG. 2 is a diagram of a zero-padding matrix of the present invention;
FIG. 3 is a convolution kernel matrix diagram of the present invention;
FIG. 4 is a schematic of the convolution of the present invention;
FIG. 5 is a block diagram of the system of the present invention;
FIG. 6 is a circuit diagram of the FIFO module according to the present invention;
FIG. 7 is a circuit diagram of a read control module according to the present invention;
FIG. 8 is a flow chart of a read control module system according to the present invention;
FIG. 9 is a diagram of a multiplication matrix according to the present invention;
FIG. 10 is a block diagram of the operational module of the present invention;
FIG. 11 is a Shift RAM data flow diagram of the present invention;
FIG. 12 is a submatrix diagram of the present invention.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly and completely with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only some embodiments, not all embodiments, of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the description of the present invention, it is to be understood that the terms "center", "longitudinal", "lateral", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", etc. indicate orientations and positional relationships based on those shown in the drawings, and are used only for convenience of description and for simplicity of description, and do not indicate or imply that the device or element being referred to must have a particular orientation, be constructed and operated in a particular orientation, and therefore, should not be considered as limiting the present invention. Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or including indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of the described features. In the description of the present invention, "a plurality" means two or more unless specifically defined otherwise.
In the present application, the word "exemplary" is used to mean "serving as an example, instance, or illustration." Any embodiment described herein as exemplary is not necessarily to be construed as preferred or advantageous over other embodiments. The following description is presented to enable any person skilled in the art to make and use the invention. In the following description, details are set forth for the purpose of explanation. It will be apparent to one of ordinary skill in the art that the present invention may be practiced without these specific details. In other instances, well-known structures and processes are not shown in detail to avoid obscuring the description of the invention with unnecessary detail. Thus, the present invention is not intended to be limited to the embodiments shown, but is to be accorded the widest scope consistent with the principles disclosed herein.
At present, to obtain an output image of the same size as the image before convolution, the original image is generally expanded at its periphery: a row or column of zero elements is added on each side and the convolution operation is then performed, so that the output image has the same size as the original image and no original image information is lost. An image with a resolution of m × n is in fact an m × n matrix, as shown in fig. 1, which depicts the original pixel matrix. To keep the output image the same size as the input image after the convolution operation, the original image is not convolved directly; instead, a ring of zero values is added around it to obtain the zero-padding matrix shown in fig. 2. Fig. 3 shows a convolution kernel matrix with a kernel size of 3 × 3, i.e. a 3 × 3 matrix whose values are the convolution parameters. Convolving the zero-padding matrix with the convolution kernel matrix yields the output matrix, as shown in fig. 4.
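As a plain software reference for the padding idea above, the following minimal sketch pads an image with one ring of zeros and slides a 3 × 3 kernel over it. The function names `zero_pad` and `convolve3x3` are illustrative, and the sketch assumes the kernel is applied without flipping (a sliding-window multiply-accumulate); the source does not say which convention fig. 4 uses.

```python
def zero_pad(image):
    """Surround an m x n image (a list of rows) with one ring of zeros."""
    n = len(image[0])
    blank = [0] * (n + 2)
    return [blank[:]] + [[0] + row + [0] for row in image] + [blank[:]]


def convolve3x3(image, kernel):
    """Slide a 3x3 kernel over the zero-padded image (assumed: no kernel flip)."""
    padded = zero_pad(image)
    m, n = len(image), len(image[0])
    out = [[0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            out[i][j] = sum(
                padded[i + di][j + dj] * kernel[di][dj]
                for di in range(3)
                for dj in range(3)
            )
    return out


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
ones = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
same = convolve3x3(image, identity)   # output has the input's size
box = convolve3x3(image, ones)        # corner sums only the 4 real neighbours
```

Because of the ring of zeros, the output keeps the m × n size of the input, and border outputs simply see zeros where the image has no neighbours.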
Example 1: referring to fig. 5, the invention discloses an image acceleration convolution calculation method, which comprises the following steps: step S1: acquiring original pixel data through a camera, and converting the original pixel data into an m × n first matrix, wherein each matrix point corresponds to one pixel point in the original pixel data; step S2: outputting the first matrix to a FIFO module, and performing first-in first-out sequencing to obtain a second matrix; step S3: outputting the second matrix to a read control module, and zero-padding the second matrix to obtain a matrix to be multiplied; step S4: outputting the matrix to be multiplied to an operation module, and performing a convolution operation on the matrix to be multiplied and the convolution kernel matrix. The calculation speed is improved by accelerating the convolution operation of the convolutional neural network on an FPGA.
Example 2: referring to fig. 1, in some embodiments of the present application, in step S1, one pixel value is sent per pixel synchronization clock, in order from left to right and from top to bottom. The camera starts transmission at coordinate P_{11} and continues until P_{mn}.
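The transmission order can be illustrated with a small sketch that assigns one clock index to each pixel in left-to-right, top-to-bottom order. The generator name `raster_scan` and the sample frame are hypothetical; the source only specifies the order and the one-pixel-per-clock rate.

```python
def raster_scan(matrix):
    """Yield (clock, row, col, value), left to right and top to bottom.

    Rows and columns are 1-based to match the P_{11} .. P_{mn} notation.
    """
    clock = 0
    for i, row in enumerate(matrix, start=1):
        for j, value in enumerate(row, start=1):
            yield clock, i, j, value
            clock += 1


frame = [[10, 11, 12], [20, 21, 22]]   # hypothetical 2 x 3 frame
stream = list(raster_scan(frame))
```

With n columns, pixel P_{ij} is sent on clock (i-1)·n + (j-1) when counting clocks from zero.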
Example 3: referring to fig. 6, in some embodiments of the present application, in step S2, the FIFO module includes a read/write address control circuit and a dual-port RAM, the width of the FIFO is the bit width of the original pixel data, and the depth is the lateral resolution × 2 of the original pixel data. The resolution of the image sent by the camera is m × n, and the depth of the FIFO should be set to 2 m.
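A behavioural sketch of such a FIFO is given below, assuming a circular buffer with separate read and write address counters standing in for the read/write address control circuit and the dual-port RAM. The class name `Fifo`, the `line_width` value and the error handling are illustrative choices, not details from the source.

```python
class Fifo:
    """Behavioural FIFO: a RAM array plus separate read and write addresses."""

    def __init__(self, depth):
        self.depth = depth
        self.ram = [0] * depth   # stands in for the dual-port RAM
        self.wr_addr = 0         # write address counter
        self.rd_addr = 0         # read address counter
        self.count = 0           # number of words currently stored

    def empty(self):
        return self.count == 0

    def full(self):
        return self.count == self.depth

    def write(self, value):
        if self.full():
            raise OverflowError("FIFO full")
        self.ram[self.wr_addr] = value
        self.wr_addr = (self.wr_addr + 1) % self.depth
        self.count += 1

    def read(self):
        if self.empty():
            raise IndexError("FIFO empty")
        value = self.ram[self.rd_addr]
        self.rd_addr = (self.rd_addr + 1) % self.depth
        self.count -= 1
        return value


line_width = 4                       # hypothetical lateral resolution
fifo = Fifo(depth=2 * line_width)    # depth = lateral resolution x 2
for pixel in range(1, 7):
    fifo.write(pixel)
first_three = [fifo.read() for _ in range(3)]
```

A depth of two lines gives the reader slack while zeros are being inserted, since the camera keeps writing one pixel per clock during that time.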
Example 4: referring to fig. 7 to 9, in some embodiments of the present application, step S3 further includes the following steps: S31: when the FIFO module is detected to be not empty, outputting a 0 value; S32: sending n read requests, outputting the received data, and outputting two 0 values; S33: repeating step S32 (m-1) times; S34: sending n read requests, outputting the received data, and outputting one 0 value. When the read control module detects that the FIFO module is not empty, it either sends a 0 value or sends a read request to the FIFO module and forwards the pixel data output by the FIFO module. Each time a read request is sent to the FIFO module, the counter is incremented by one and the count value is returned to the read control module, which decides from the count value whether to output a 0 value or to send a read request. In some embodiments of the present application, the read control module only performs zero padding before and after each row of the second matrix.
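Steps S31 to S34 can be sketched in software as follows. This is a simplified model under two stated assumptions: all m·n pixels are already available in the FIFO (real hardware interleaves reads with camera writes), and step S32 runs (m-1) times in total so that exactly m rows are produced, which matches an m × (n+2) matrix to be multiplied. The function name `read_control` is illustrative.

```python
from collections import deque


def read_control(fifo, m, n):
    """Emit the row-padded stream of S31-S34 from a deque of m*n pixels."""
    out = []
    reads = 0                      # counter incremented per read request
    if fifo:                       # S31: FIFO not empty, emit the leading 0
        out.append(0)
    for row in range(m):
        for _ in range(n):         # S32 / S34: n read requests per row
            out.append(fifo.popleft())
            reads += 1
        # Two zeros between rows (end of this row, start of the next),
        # a single zero after the final row (S34).
        out.extend([0, 0] if row < m - 1 else [0])
    return out, reads


m, n = 3, 2
pixels = deque(range(1, m * n + 1))
stream, reads = read_control(pixels, m, n)
rows = [stream[k:k + n + 2] for k in range(0, len(stream), n + 2)]
```

The two zeros after each row are the trailing pad of that row and the leading pad of the next, so every row of the resulting stream is n + 2 values wide and no all-zero rows are added above or below the image.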
Example 5: referring to fig. 10 to 12, in some embodiments of the present application, the operation module includes a single-port read-only memory mirror image and a shift register, where a depth of the single-port read-only memory mirror image is 9, an initial value of the single-port read-only memory mirror image is convolution parameters S1 to S9 in a convolution kernel matrix, the shift register includes a first shift register, a second shift register, a third shift register, a fourth shift register, and a fifth shift register, a width of the first shift register and a width of the second shift register are pixel data bit widths, and a depth of the first shift register and the second shift register is n + 2; the widths of the third shift register, the fourth shift register and the fifth shift register are pixel data bit widths, the depth of the third shift register, the fourth shift register and the fifth shift register is 3, the matrix to be multiplied enters the first shift register and the fifth shift register firstly, the output end of the first shift register is connected with the input ends of the second shift register and the third shift register, and the output end of the second shift register is connected with the input end of the fourth shift register.
In fig. 10, Single Port ROM is the single-port read-only memory mirror image, and Shift RAM is a shift register. Shift_RAM_1, Shift_RAM_2, Shift_RAM_3, Shift_RAM_4 and Shift_RAM_5 are the first, second, third, fourth and fifth shift registers, respectively, and Shift_RAM_3, Shift_RAM_4 and Shift_RAM_5 have initial values of 0. Each time the Shift RAM outputs a value, the counter is incremented by one. The count value indicates which rows of data are currently stored in the shift registers. The count value is input into a multi-selection module, which determines the current calculation mode from the count value and generates a drive code; the drive code is input into the arithmetic logic unit (ALU) and drives the calculation unit to use the corresponding operation formula. The data to be processed sent by the read control circuit first enters Shift_RAM_1 and Shift_RAM_5. The output of Shift_RAM_1 is the input of Shift_RAM_2 and Shift_RAM_3, and the output of Shift_RAM_2 is the input of Shift_RAM_4. The dashed box marks the convolution template; during convolution calculation, the data stored in the convolution template and the parameters of the convolution kernel are read for the convolution operation.
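The register wiring can be checked with a small behavioural model. The sketch below assumes each shift register passes on, in the same clock, the value it pushes out, and that the convolution template consists of Shift_RAM_4, Shift_RAM_3 and Shift_RAM_5 (oldest row first); fig. 10 is not available here, so this ordering is an inference. Under these assumptions the template first holds a complete 3 × 3 block after 2(n+2) + 3 = 2n + 7 inputs.

```python
from collections import deque


class ShiftRegister:
    """Fixed-depth shift register; shift() returns the value pushed out."""

    def __init__(self, depth):
        self.cells = deque([0] * depth, maxlen=depth)

    def shift(self, value):
        out = self.cells[-1]           # oldest value leaves
        self.cells.appendleft(value)   # newest value sits at index 0
        return out


class LineBuffer:
    """Five shift registers wired as described for Shift_RAM_1..5."""

    def __init__(self, n):
        self.sr1 = ShiftRegister(n + 2)
        self.sr2 = ShiftRegister(n + 2)
        self.sr3 = ShiftRegister(3)
        self.sr4 = ShiftRegister(3)
        self.sr5 = ShiftRegister(3)

    def clock(self, value):
        out1 = self.sr1.shift(value)   # input -> Shift_RAM_1
        self.sr5.shift(value)          # input -> Shift_RAM_5
        out2 = self.sr2.shift(out1)    # Shift_RAM_1 -> Shift_RAM_2
        self.sr3.shift(out1)           # Shift_RAM_1 -> Shift_RAM_3
        self.sr4.shift(out2)           # Shift_RAM_2 -> Shift_RAM_4

    def template(self):
        """Rows oldest first, each row listed left to right."""
        regs = (self.sr4, self.sr3, self.sr5)
        return [list(reversed(r.cells)) for r in regs]


n = 4
padded = [[0, 1, 2, 3, 4, 0], [0, 5, 6, 7, 8, 0], [0, 9, 10, 11, 12, 0]]
buf = LineBuffer(n)
for value in [v for row in padded for v in row][:2 * n + 7]:
    buf.clock(value)
window = buf.template()
```

The two depth-(n+2) registers delay the stream by one and two padded rows, so the three depth-3 registers always hold vertically aligned pixels from three consecutive rows.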
Example 6: referring to fig. 12, when 2n+7 data have been received, the data stored in the convolution template in the dashed box is the 3 × 3 sub-matrix at the upper left corner of the to-be-processed matrix. Once the operation module has received 2n+7 data, its output data is valid. The operation module reserves two output ports, and each port outputs one valid pixel per clock cycle. The output data of the operation module, i.e. the output matrix, also follows the top-to-bottom, left-to-right order. Specifically, row 1 of the output matrix is output together with row 2, and the last row is output together with the second-to-last row. That is, after the operation module receives 2n+7 data, it outputs rows 1 and 2 simultaneously on the two output ports, then outputs rows 3 to m-2 one row at a time on a single output port, and finally outputs rows m-1 and m simultaneously on the two output ports. The first row of the output matrix is calculated as:
Q_{1j} = Addr(2,3)×S4 + Addr(2,2)×S5 + Addr(2,1)×S6 + Addr(1,3)×S7 + Addr(1,2)×S8 + Addr(1,1)×S9, j ∈ [1, n]
Rows 2 to m-1 of the output matrix:
Q_{ij} = Addr(2,3)×S1 + Addr(2,2)×S2 + Addr(2,1)×S3 + Addr(1,3)×S4 + Addr(1,2)×S5 + Addr(1,1)×S6 + Addr(3,3)×S7 + Addr(3,2)×S8 + Addr(3,1)×S9, i ∈ [2, m-1], j ∈ [1, n]
Row m of the output matrix:
Q_{mj} = Addr(1,3)×S1 + Addr(1,2)×S2 + Addr(1,1)×S3 + Addr(3,3)×S4 + Addr(3,2)×S5 + Addr(3,1)×S6, j ∈ [1, n]
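The three formulas can be exercised end to end in a short simulation and compared with a direct zero-padded 3 × 3 computation. Several points here are assumptions rather than statements from the source: Addr(1,·), Addr(2,·) and Addr(3,·) are taken to be Shift_RAM_3, Shift_RAM_4 and Shift_RAM_5, column index 1 is the newest value in a register and 3 the oldest, S1 to S9 are read as the kernel in row-major order, and the names `accelerated_conv` and `reference` are illustrative.

```python
from collections import deque


def shift(reg, value):
    """Push value into a fixed-depth register; return the value pushed out."""
    out = reg[-1]
    reg.appendleft(value)
    return out


def accelerated_conv(image, S):
    """Apply the three row formulas; S = [S1..S9], image is m x n, m >= 3."""
    m, n = len(image), len(image[0])
    w = n + 2
    stream = [v for row in image for v in [0] + row + [0]]
    r1, r2 = (deque([0] * w, maxlen=w) for _ in range(2))
    r3, r4, r5 = (deque([0] * 3, maxlen=3) for _ in range(3))
    addr = {1: r3, 2: r4, 3: r5}          # assumed Addr(row, .) mapping

    def dot(row, first):
        # Addr(row,3)*S[first] + Addr(row,2)*S[first+1] + Addr(row,1)*S[first+2]
        a = addr[row]
        return a[2] * S[first - 1] + a[1] * S[first] + a[0] * S[first + 1]

    Q = [[None] * n for _ in range(m)]
    for t, value in enumerate(stream):
        out1 = shift(r1, value)
        shift(r5, value)
        out2 = shift(r2, out1)
        shift(r3, out1)
        shift(r4, out2)
        r, c = divmod(t, w)               # position of the newest value
        if r < 2 or c < 2:
            continue                      # template not yet a full 3x3 block
        j = c - 2
        Q[r - 1][j] = dot(2, 1) + dot(1, 4) + dot(3, 7)   # rows 2..m-1
        if r == 2:
            Q[0][j] = dot(2, 4) + dot(1, 7)               # row 1
        if r == m - 1:
            Q[m - 1][j] = dot(1, 1) + dot(3, 4)           # row m
    return Q


def reference(image, S):
    """Direct zero-padded 3x3 sliding window, kernel not flipped."""
    m, n = len(image), len(image[0])
    pad = [[0] * (n + 2)] + [[0] + row + [0] for row in image] + [[0] * (n + 2)]
    return [[sum(pad[i + di][j + dj] * S[3 * di + dj]
                 for di in range(3) for dj in range(3))
             for j in range(n)] for i in range(m)]


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
S = [1, 2, 3, 4, 5, 6, 7, 8, 9]
result = accelerated_conv(image, S)
```

In this model the six-term formulas for the first and last rows stand in for the missing all-zero rows above and below the image, which is consistent with the read control module padding only before and after each row.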
In a second aspect, an image acceleration convolution computing system is further provided, which includes: a camera, configured to collect original pixel data and convert the original pixel data into an m × n first matrix, where each matrix point corresponds to one pixel point in the original pixel data; a FIFO module, which converts the first matrix into a second matrix through first-in first-out sequencing; a read control module, configured to zero-pad the second matrix to obtain a matrix to be multiplied; and an operation module, configured to perform a convolution operation on the matrix to be multiplied and the convolution kernel matrix. The calculation speed is improved by accelerating the convolution operation of the convolutional neural network on an FPGA, which is fast and parallel and therefore well suited to hardware acceleration of neural networks.
In a third aspect, there is also provided an image accelerated convolution computing apparatus, including: a memory for storing a computer program; a processor for implementing the steps of the image accelerated convolution calculation method as described above when executing the computer program.
In a fourth aspect, a readable storage medium is also provided, the readable storage medium having stored thereon a computer program, which when executed by a processor, implements the steps of the image accelerated convolution calculation method as described above.
The invention has the technical effects that:
The calculation speed is improved by accelerating the convolution operation of the convolutional neural network on an FPGA, which is fast and parallel and therefore well suited to hardware acceleration of neural networks.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and parts that are not described in detail in a certain embodiment may refer to the above detailed descriptions of other embodiments, and are not described herein again.
Having thus described the basic concept, it will be apparent to those skilled in the art that the foregoing detailed disclosure is to be considered merely illustrative and not restrictive of the broad application. Various modifications, improvements and adaptations to the present application may occur to those skilled in the art, although not explicitly described herein. Such modifications, improvements and adaptations are proposed in the present application and thus fall within the spirit and scope of the exemplary embodiments of the present application.
Also, this application uses specific language to describe embodiments of the application. Reference throughout this specification to "one embodiment," "an embodiment," and/or "some embodiments" means that a particular feature, structure, or characteristic described in connection with at least one embodiment of the present application is included in at least one embodiment of the present application. Therefore, it is emphasized and should be appreciated that two or more references to "an embodiment" or "one embodiment" or "an alternative embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, some features, structures, or characteristics of one or more embodiments of the present application may be combined as appropriate.
Similarly, it should be noted that in the preceding description of embodiments of the application, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure aiding in the understanding of one or more of the embodiments. This method of disclosure, however, is not intended to require more features than are expressly recited in the claims. Indeed, the embodiments may be characterized as having less than all of the features of a single embodiment disclosed above.
Numerals describing the number of components, attributes, etc. are used in some embodiments, it being understood that such numerals used in the description of the embodiments are modified in some instances by the use of the modifier "about", "approximately" or "substantially". Unless otherwise indicated, "about", "approximately" or "substantially" indicates that the number allows a variation of ± 20%. Accordingly, in some embodiments, the numerical parameters used in the specification and claims are approximations that may vary depending upon the desired properties of the individual embodiments. In some embodiments, the numerical parameter should take into account the specified significant digits and employ a general digit preserving approach. Notwithstanding that the numerical ranges and parameters setting forth the broad scope of the range are approximations, in the specific examples, such numerical values are set forth as precisely as possible within the scope of the application.
For each patent, patent application publication, and other material cited in this application, such as articles, books, specifications, publications, documents, and the like, the entire contents of which are hereby incorporated by reference into this application, except for application history documents that are inconsistent with or conflict with the contents of this application, and except for documents that are currently or later become incorporated into this application as though fully set forth in the claims below. It is noted that the descriptions, definitions and/or use of terms in this application shall control if they are inconsistent or contrary to the present disclosure.
The image acceleration convolution calculation method, system, equipment and readable storage medium provided by the present invention have been described in detail above. Specific embodiments are used herein to explain the invention, and the description of the above embodiments is only intended to help in understanding the method and core idea of the present invention. Meanwhile, those skilled in the art may, according to the idea of the present invention, make changes to the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (10)

1. An image acceleration convolution calculation method is characterized by comprising the following steps:
step S1: acquiring original pixel data through a camera, and converting the original pixel data into a first matrix of m × n, wherein one matrix point corresponds to one pixel point in the original pixel data;
step S2: outputting the first matrix to an FIFO module, and performing first-in first-out sequencing to obtain a second matrix;
step S3: outputting the second matrix to a reading control module, and performing zero filling on the second matrix to obtain a matrix to be multiplied;
step S4: and outputting the matrix to be multiplied to an operation module, and performing convolution operation on the matrix to be multiplied and the convolution kernel matrix.
2. The method of claim 1, wherein in step S1, a pixel synchronization clock sends a pixel value in the order from left to right and from top to bottom.
3. The method according to claim 1, wherein in step S2, the FIFO module comprises a read-write address control circuit and a dual-port RAM, the width of the FIFO is the bit width of the original pixel data, and the depth of the FIFO is the lateral resolution of the original pixel data × 2.
4. The method of claim 1, wherein in step S3, the method further comprises the following steps:
S31: when the FIFO module is detected to be not empty, outputting a 0 value;
S32: sending n read requests, outputting the received data, and outputting two 0 values;
S33: repeating step S32 (m-1) times;
S34: sending n read requests, outputting the received data, and outputting one 0 value.
5. The image acceleration convolution calculation method according to claim 4, wherein the reading control module finishes sending a 0 value or sends a reading request to the FIFO module when detecting that the FIFO module is not empty, and forwards the pixel data output by the FIFO module; and each time the FIFO module read request is sent, the counter is increased by one, the count value returns to the read control module, and the read control module judges whether to send an output 0 value or send the read request according to the count value.
6. The method according to claim 4, wherein the read control module performs zero padding only before and after each row of the second matrix.
7. The image acceleration convolution calculation method according to claim 1, wherein the operation module includes a single-port read-only memory mirror image and shift registers; the depth of the single-port read-only memory mirror image is 9, and its initial values are the convolution parameters S1-S9 of the convolution kernel matrix; the shift registers include a first shift register, a second shift register, a third shift register, a fourth shift register and a fifth shift register; the width of the first shift register and the second shift register is the pixel data bit width, and their depth is n + 2; the width of the third shift register, the fourth shift register and the fifth shift register is the pixel data bit width, and their depth is 3; the matrix to be multiplied first enters the first shift register and the fifth shift register; the output end of the first shift register is connected to the input ends of the second shift register and the third shift register; and the output end of the second shift register is connected to the input end of the fourth shift register.
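The shift-register arrangement of claim 7 can be simulated one pixel per step as below. This is a sketch under assumptions: the function name `stream_convolve3x3` is hypothetical, the parameters S1-S9 are assumed to be stored in row-major order (S1-S3 for the oldest row, S7-S9 for the newest), each depth-(n + 2) register is modelled as a delay of n + 2 steps, and an output is taken only when all three window rows hold valid data.

```python
from collections import deque


def stream_convolve3x3(stream, n, kernel):
    """stream: padded pixels in raster order (rows of width n + 2); kernel: S1..S9."""
    width = n + 2
    sr1 = deque([0] * width)        # first shift register: delays the input by one row
    sr2 = deque([0] * width)        # second shift register: delays it by a second row
    sr3 = deque([0] * 3, maxlen=3)  # third: window row fed by sr1 (middle row)
    sr4 = deque([0] * 3, maxlen=3)  # fourth: window row fed by sr2 (top row)
    sr5 = deque([0] * 3, maxlen=3)  # fifth: window row fed by the input (bottom row)
    out = []
    for t, pixel in enumerate(stream):
        d1 = sr1.popleft()          # pixel from one row earlier
        sr1.append(pixel)
        d2 = sr2.popleft()          # pixel from two rows earlier
        sr2.append(d1)
        sr5.append(pixel)
        sr3.append(d1)
        sr4.append(d2)
        row, col = divmod(t, width)
        if row >= 2 and col >= 2:   # window completely filled with valid pixels
            window = list(sr4) + list(sr3) + list(sr5)
            out.append(sum(p * k for p, k in zip(window, kernel)))
    return out


# 3 x 3 image [[1, 2, 3], [4, 5, 6], [7, 8, 9]] after row-only zero padding (n = 3).
padded = [0, 1, 2, 3, 0, 0, 4, 5, 6, 0, 0, 7, 8, 9, 0]
print(stream_convolve3x3(padded, n=3, kernel=[1] * 9))
# prints [27, 45, 33]
```

Because the two long registers each hold exactly one padded row, the three short registers always contain vertically aligned pixels from three consecutive rows, so all nine products can be formed at every step without re-reading the image.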
8. An image accelerated convolution computing system, comprising:
a camera for collecting original pixel data, the original pixel data being converted into a first matrix of m × n, wherein one matrix point corresponds to one pixel point in the original pixel data;
a FIFO module for converting the first matrix into a second matrix through first-in first-out sequencing;
a read control module for zero padding the second matrix to obtain a matrix to be multiplied; and
an operation module for performing a convolution operation on the matrix to be multiplied and the convolution kernel matrix.
9. An image accelerated convolution computing device, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the image accelerated convolution calculation method according to any one of claims 1 to 7 when executing said computer program.
10. A readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the image accelerated convolution calculation method according to any one of claims 1 to 7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111393744.0A CN114120082A (en) 2021-11-23 2021-11-23 Image acceleration convolution calculation method, system, equipment and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111393744.0A CN114120082A (en) 2021-11-23 2021-11-23 Image acceleration convolution calculation method, system, equipment and readable storage medium

Publications (1)

Publication Number Publication Date
CN114120082A true CN114120082A (en) 2022-03-01

Family

ID=80439940

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111393744.0A Pending CN114120082A (en) 2021-11-23 2021-11-23 Image acceleration convolution calculation method, system, equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN114120082A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107656899A (en) * 2017-09-27 2018-02-02 深圳大学 A kind of mask convolution method and system based on FPGA
CN108681984A (en) * 2018-07-26 2018-10-19 珠海市微半导体有限公司 A kind of accelerating circuit of 3*3 convolution algorithms
WO2020155044A1 (en) * 2019-01-31 2020-08-06 深圳市大疆创新科技有限公司 Convolution calculation device and method, processor and movable device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
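邵兴龙 (Shao Xinglong): "Design and Implementation of an FPGA-Based Image Contour Extraction System", China Master's Theses Full-text Database, Information Science and Technology *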

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115456860A (en) * 2022-11-09 2022-12-09 深圳市唯特视科技有限公司 Image enhancement method and device based on FPGA, helmet, equipment and medium
CN117745880A (en) * 2024-02-19 2024-03-22 西南交通大学 Medical image filling method, device, equipment and medium for multidimensional nonlinear transformation
CN117745880B (en) * 2024-02-19 2024-05-03 西南交通大学 Medical image filling method, device, equipment and medium for multidimensional nonlinear transformation

Similar Documents

Publication Publication Date Title
CN114120082A (en) Image acceleration convolution calculation method, system, equipment and readable storage medium
CN109740534B (en) Image processing method, device and processing equipment
CN111860398B (en) Remote sensing image target detection method and system and terminal equipment
CN108416327B (en) Target detection method and device, computer equipment and readable storage medium
WO2019127517A1 (en) Data processing method and device, dma controller, and computer readable storage medium
CN110210480B (en) Character recognition method and device, electronic equipment and computer readable storage medium
CN109117940A (en) Forward acceleration method, apparatus and system for a convolutional neural network
CN112613541A (en) Target detection method and device, storage medium and electronic equipment
CN104573737B (en) The method and device of positioning feature point
CN113111201B (en) Digital twin model lightweight method and system
CN109978043B (en) Target detection method and device
CN111862343A (en) Three-dimensional reconstruction method, device and equipment and computer readable storage medium
CN110956131A (en) Single-target tracking method, device and system
CN115129297B (en) Multi-point multiplication operation system, method, graphic processor, electronic device and equipment
CN110610184B (en) Method, device and equipment for detecting salient targets of images
CN113033578B (en) Image calibration method, system, terminal and medium based on multi-scale feature matching
Umeo Linear-time recognition of connectivity of binary images on 1-bit inter-cell communication cellular automaton
CN111931937B (en) Gradient updating method, device and system of image processing model
CN113971630A (en) Projection posture recommendation method and device for converting three-dimensional structure diagram into two-dimensional three-view diagram
CN113468469A (en) Convolution processing method and device of feature graph executed by computer and electronic equipment
CN111797972A (en) Method, device and electronic system for processing data by using convolutional neural network
CN109522125B (en) Acceleration method and device for matrix product transposition and processor
CN112200774A (en) Image recognition apparatus
CN117197000B (en) Quick grid denoising method and device and electronic equipment
CN113554092B (en) Based on R 2 Net underwater fish target detection method, device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20220301
