CN106952215B - Image pyramid feature extraction circuit, device and method - Google Patents

Publication number
CN106952215B
Authority
CN
China
Prior art keywords
image pyramid
data
unit
image
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710107744.7A
Other languages
Chinese (zh)
Other versions
CN106952215A (en
Inventor
刘劲松
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Allwinner Technology Co Ltd
Original Assignee
Allwinner Technology Co Ltd
Application filed by Allwinner Technology Co Ltd
Priority to CN201710107744.7A
Publication of CN106952215A
Application granted
Publication of CN106952215B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/20 Processor architectures; Processor configuration, e.g. pipelining
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/60 Memory management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2200/00 Indexing scheme for image data processing or generation, in general
    • G06T2200/28 Indexing scheme for image data processing or generation, in general involving image processing hardware

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to an image pyramid feature extraction circuit. The circuit comprises: a parallel reusable read-in circuit, which reads in a basic image frame, calculates the coordinates in the basic image frame of N effective data lines of M layers of an image pyramid, and requests and stores the N effective data lines according to the coordinates; an image pyramid feature calculation circuit, which comprises M image pyramid feature calculation units and simultaneously calculates the features of the M layers of the image pyramid from the N effective data lines; and an image pyramid feature confluence output circuit, which merges the M × K image pyramid features output by the image pyramid feature calculation circuit and outputs them to memory. M and N are positive integers greater than or equal to 2, and K is the number of feature channels of each image pyramid feature calculation unit. Because image pyramid feature extraction is implemented in hardware, extensive preprocessing and precision-reducing algorithm optimizations are unnecessary, and the circuit achieves a higher calculation speed at lower bandwidth. The invention also provides an image pyramid feature extraction device and method.

Description

Image pyramid feature extraction circuit, device and method
Technical Field
The invention relates to the field of image signal processing and integrated circuit design, in particular to an image pyramid feature extraction circuit, device and method.
Background
Multi-resolution, multi-angle decomposition of an image is a basic image analysis method. When the representation of the image is over-complete, the required visual information is easier to extract from the image, and the information obtained by calculation is more accurate. For example, constructing an image feature pyramid as the information source for target recognition and computing over it level by level is a very effective way to improve recognition accuracy.
Image pyramid feature extraction is widely used in machine vision algorithms such as SIFT feature extraction, AdaBoost detection and the Lucas-Kanade optical flow method. In these algorithms, pyramid feature extraction serves as a general preprocessing module that addresses changes in target object scale and the semantic description of the image during operation of the algorithm.
Traditional software-based image pyramid feature extraction algorithms have a very large computation load and bandwidth consumption. With the boom in intelligent image-signal-processing hardware and the spread of video applications into everyday life, implementations of machine vision algorithms face challenges in real-time performance and system integration. In practical applications that demand a high frame rate, high recognition accuracy and a long detection distance while system resources are tight, existing pyramid feature extraction runs as a pure software preprocessing step that occupies most of the calculation time, so the overall recognition and detection algorithm cannot meet the real-time requirement of 30 to 60 frames per second. Software-based pyramid feature extraction therefore falls far short of actual requirements in both performance and system terms, which makes product design difficult. Under these conditions, a digital IC implementation that greatly improves calculation speed and optimizes bandwidth is of considerable significance: a good VLSI architecture for pyramid feature extraction can improve the performance and user experience of the whole application system.
Disclosure of Invention
Aiming at the defects in the prior art, the invention provides an image pyramid feature extraction circuit, which overcomes the defects of large operation amount and serious bandwidth consumption of the traditional image pyramid feature extraction algorithm based on software processing.
The technical solution adopted by the invention to solve this technical problem is as follows: the image pyramid feature extraction circuit comprises a parallel reusable read-in circuit, an image pyramid feature calculation circuit and an image pyramid feature confluence output circuit. The parallel reusable read-in circuit reads in a basic image frame, calculates the coordinates in the basic image frame of N effective data lines of M layers of an image pyramid, and requests and stores the N effective data lines according to the coordinates. The image pyramid feature calculation circuit comprises M image pyramid feature calculation units and simultaneously calculates the features of the M layers of the image pyramid from the N effective data lines. The image pyramid feature confluence output circuit merges the M × K image pyramid features output by the image pyramid feature calculation circuit and outputs them to memory. M and N are positive integers greater than or equal to 2, and K is the number of feature channels of each image pyramid feature calculation unit.
The parallel reusable read-in circuit comprises a parallel policy unit and a first SRAM group. The parallel policy unit calculates the coordinates in the basic image frame of the N effective data lines of the M layers of the image pyramid and requests the N effective data lines from the bus according to the coordinates; the first SRAM group comprises N SRAMs for storing the N effective data lines.
The parallel policy unit comprises a coordinate calculation unit, a coordinate sorting unit, a data extraction unit and a minimum coordinate updating unit. The coordinate calculation unit calculates the coordinates in the basic image frame of the N effective data lines of the M layers of the image pyramid; the coordinate sorting unit sorts these coordinates to obtain the minimum coordinate; the data extraction unit extracts two effective data lines according to the minimum coordinate; the minimum coordinate updating unit updates the next coordinate of the image pyramid layer corresponding to the minimum coordinate according to a bilinear interpolation method. The parallel policy unit then executes the coordinate sorting unit again; when the coordinate of a currently selected effective data line is the same as that of an already extracted effective data line, only the effective data lines not yet extracted are selected, and this continues until the extraction of the N effective data lines is complete. The coordinate calculation unit calculates the coordinates of the N effective data lines of the M layers of the image pyramid in the basic image frame according to a bilinear interpolation method.
The parallel reusable read-in circuit further comprises a second SRAM group, which comprises N SRAMs and performs ping-pong storage of the N effective data lines together with the first SRAM group.
The image pyramid feature calculation circuit comprises a data application unit, a multiplexer unit, M image pyramid calculation units and M image pyramid feature calculation units. The data application unit requests the N effective data lines from the parallel reusable read-in circuit; the multiplexer unit comprises N multiplexers and distributes the N effective data lines requested by the data application unit to the M image pyramid calculation units; the M image pyramid calculation units simultaneously calculate the data of the corresponding lines of the M layers of the image pyramid from the N effective data lines; and the M image pyramid feature calculation units simultaneously calculate the features of the image pyramid from the data of the corresponding lines of the M layers.
The multiplexer unit distributes different effective data lines to the M image pyramid calculation units according to the effective data lines hit by each of the M layers of the image pyramid.
The M image pyramid calculation units are M bilinear interpolation calculation units, which simultaneously calculate the data of the corresponding lines of the M layers of the image pyramid from the effective data lines distributed by the multiplexer unit.
The M image pyramid feature calculation units extract image pyramid features with a 5 × 5 feature extraction algorithm.
The image pyramid feature confluence output circuit comprises an image pyramid feature cache unit, a data arbiter unit, a priority updater unit, a cache label receiving unit and an image pyramid feature receiving unit. The image pyramid feature cache unit comprises M × K cache modules, which cache the M × K image pyramid features output by the image pyramid feature calculation circuit and send data read requests to the data arbiter unit; the priority updater unit determines the read-out priority of the image pyramid feature data in the M × K cache modules; the data arbiter unit returns a read response to the highest-priority cache module determined by the priority updater unit and transmits the image pyramid feature data held in that cache module; the cache label receiving unit records the label of the highest-priority cache module and generates the write address of the image pyramid features from that label; and the image pyramid feature receiving unit receives the image pyramid features transmitted by the data arbiter unit and outputs them to memory at that write address.
The M × K cache modules are M × K FIFOs.
The priority policy of the priority updater unit is: among the M × K cache modules, the cache module with the largest absolute difference between its write pointer and its read pointer has the highest data read priority.
When a cache module has received 32 pixel values for the bus, it sends a data read request to the data arbiter unit.
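The arbitration described above can be illustrated in software. The following is a minimal sketch, not the patented circuit: the names `FeatureFifo`, `arbitrate` and `BURST` are hypothetical, the pointers are modeled as non-wrapping counters (an assumption), and write-address generation from the cache label is left out because the text does not specify it.

```python
BURST = 32  # pixel values buffered before a FIFO raises a read request


class FeatureFifo:
    """One of the M x K cache modules, modeled as a simple FIFO."""

    def __init__(self, label):
        self.label = label   # cache label later used to derive a write address
        self.data = []
        self.wr = 0          # write pointer (assumed non-wrapping counter)
        self.rd = 0          # read pointer (assumed non-wrapping counter)

    def push(self, pixel):
        self.data.append(pixel)
        self.wr += 1

    def fill(self):
        # Priority metric: absolute difference of write and read pointers.
        return abs(self.wr - self.rd)

    def requesting(self):
        return self.fill() >= BURST

    def pop_burst(self):
        burst = self.data[:BURST]
        del self.data[:BURST]
        self.rd += len(burst)
        return burst


def arbitrate(fifos):
    """Grant the requesting FIFO with the largest pointer difference.

    Returns (label, burst) for the winner, or None if nobody is requesting.
    """
    requesters = [f for f in fifos if f.requesting()]
    if not requesters:
        return None
    winner = max(requesters, key=lambda f: f.fill())
    return winner.label, winner.pop_burst()
```

Granting the fullest FIFO first is a simple way to keep any one channel from overflowing when the M × K channels fill at different rates.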
The invention also provides an image pyramid feature extraction device which adopts the image pyramid feature extraction circuit as claimed in any one of claims 1 to 13 to complete image pyramid feature extraction.
The invention also provides an image pyramid feature extraction method comprising the following steps: step one, acquiring a basic image frame, calculating the coordinates in the basic image frame of N effective data lines of M layers of an image pyramid, and acquiring the N effective data lines according to the coordinates; step two, simultaneously calculating the features of the M layers of the image pyramid from the N effective data lines; step three, merging the M × K image pyramid features of the M layers of the image pyramid and outputting them to memory. M and N are positive integers greater than or equal to 2, and K is the number of feature channels output by the feature extraction of each of the M layers of the image pyramid.
Step one specifically comprises: calculating the coordinates in the basic image frame of the N effective data lines of the M layers of the image pyramid; sorting these coordinates to obtain the minimum coordinate; extracting two effective data lines from the basic image frame according to the minimum coordinate; updating the next coordinate of the image pyramid layer corresponding to the minimum coordinate according to a bilinear interpolation method; and continuing to sort the latest coordinates and repeat the subsequent steps, wherein, when the coordinate of a currently selected effective data line is the same as that of an already extracted effective data line, only the effective data lines not yet extracted are selected, until the extraction of the N effective data lines is complete.
The coordinates of the N effective data lines of the M layers of the image pyramid in the basic image frame are calculated according to a bilinear interpolation method.
Step two specifically comprises: receiving the N effective data lines; simultaneously calculating the data of the corresponding lines of the M layers of the image pyramid from the N effective data lines; and simultaneously performing the feature calculation of the image pyramid from the data of the corresponding lines of the M layers.
Different effective data lines are distributed according to the effective data lines hit by each of the M layers of the image pyramid.
The calculation for the M layers of the image pyramid is carried out simultaneously according to a bilinear interpolation method.
The image pyramid feature calculation is performed according to a 5 × 5 feature extraction algorithm.
Step three specifically comprises: receiving the M × K image pyramid features of the M layers of the image pyramid and storing them in M × K image pyramid feature cache units; generating the read-out priority of the M × K sets of image pyramid feature data from the read pointers and write pointers of the M × K image pyramid feature cache units; acquiring the image pyramid feature data with the highest read-out priority together with its cache label; generating the write address of the image pyramid feature data from the cache label; and writing the image pyramid features to memory at that write address, until all M × K image pyramid features have been written to memory.
When an image pyramid feature cache unit has received 32 pixel values, it sends a data read request.
The image pyramid feature cache unit with the largest absolute difference between its write pointer and its read pointer has the highest data read priority.
The beneficial effect of the invention is that image pyramid feature extraction is implemented in hardware with targeted optimization for parallel design, providing a high-speed, low-bandwidth image pyramid feature extraction circuit that increases the calculation speed of image pyramid feature extraction and reduces the bandwidth consumed by the operation. This improves the performance and experience of application systems that demand a high frame rate, high recognition accuracy and a long detection distance while system resources are tight, and leaves more headroom for subsequent modules.
Drawings
The invention will be further described with reference to the accompanying drawings and examples, in which:
FIG. 1 is a block diagram of an image pyramid feature extraction circuit 100 according to an embodiment of the invention;
FIG. 2 is a circuit block diagram of a parallel reusable read-in module 200 according to one embodiment of the invention;
FIG. 3 is a circuit block diagram of a parallel policy unit 300 of an embodiment of the invention;
FIG. 4 is a schematic workflow diagram of the parallel policy unit 300 shown in FIG. 3;
FIG. 5 is a circuit block diagram of a parallel reusable read-in module 500 according to one embodiment of the present invention;
FIG. 6 is a schematic diagram of the parallel reusable read-in module of FIG. 2 or 5 reading 8 valid data lines of a 4-layer image pyramid;
FIG. 7 is a block circuit diagram of an image pyramid feature calculation module 700 according to an embodiment of the invention;
FIG. 8 is a schematic diagram of the data distribution and pipeline of the present invention corresponding to the left side of FIG. 6 for obtaining 8 valid data lines;
FIG. 9 is a schematic diagram of the data distribution and pipeline of the present invention corresponding to the right side of FIG. 6 for obtaining 8 valid data lines;
FIG. 10 is a block diagram of the image pyramid feature merge output module 110 according to an embodiment of the invention;
FIG. 11 is a specific circuit of the image pyramid feature extraction circuit 120 according to an embodiment of the present invention;
FIG. 12 is a block flow diagram of a method 130 for image pyramid feature extraction according to an embodiment of the invention;
FIG. 13 is a block flow diagram of an image pyramid feature calculation method 140 in the image pyramid feature extraction method according to an embodiment of the invention;
FIG. 14 is a flowchart illustrating an image pyramid feature merging output method 150 in the image pyramid feature extraction method according to an embodiment of the present invention.
Detailed Description
The preferred embodiments of the present invention will now be described in detail with reference to the accompanying drawings.
Fig. 1 is a circuit block diagram of an image pyramid feature extraction circuit 100 according to an embodiment of the present invention, which includes a parallel reusable read-in circuit 101, an image pyramid feature calculation circuit 103, and an image pyramid feature merging output circuit 105.
The parallel reusable read-in circuit 101 reads in the basic image frame, calculates the coordinates in the basic image frame of the N valid data lines of the M layers of the image pyramid, and requests the N valid data lines from the bus according to the coordinates and stores them. In this patent, the data lines of the basic image frame that participate in the image pyramid calculation are referred to as valid data lines. The image pyramid feature calculation circuit 103 includes M image pyramid feature calculation units; it simultaneously performs the feature calculation of the M layers of the image pyramid from the N valid data lines output by the parallel reusable read-in circuit 101 and outputs M × K image pyramid feature values. The image pyramid feature confluence output circuit 105 receives the M × K image pyramid feature values, merges them and outputs them to memory. M and N may be positive integers greater than or equal to 2, and K may be the number of feature channels of each image pyramid feature calculation unit. In a preferred embodiment, N may be a power of 2 and may be chosen based on bus efficiency and module efficiency. In a specific embodiment, M may be 4, N may be 8 and K may be 3. That is, the image pyramid feature extraction circuit can extract features for at most 4 image pyramid layers in parallel, and each image pyramid feature calculation unit has 3 feature channels, so the circuit outputs at most 12 image pyramid feature values for the 4 image pyramid layers.
In an embodiment, as shown in fig. 2, the parallel reusable read-in circuit 200 may include a parallel policy unit 201 and a first SRAM group 203, and the first SRAM group 203 may include N SRAMs. The parallel policy unit 201 calculates the coordinates in the basic image frame of the N valid data lines of the M layers of the image pyramid and requests the N valid data lines from the bus according to the calculated coordinates; in a specific embodiment, it may calculate the vertical coordinates of these lines. The N SRAMs of the first SRAM group 203 respectively store the N valid data lines of the M layers of the image pyramid that the parallel policy unit requested from the data bus.
In one embodiment, as shown in fig. 3, the parallel policy unit 300 may include a coordinate calculation unit 301, a coordinate sorting unit 303, a data extraction unit 305 and a minimum coordinate updating unit 307. The coordinate calculation unit 301 calculates the coordinates in the basic image frame of the N valid data lines of the M layers of the image pyramid; the coordinate sorting unit 303 sorts the coordinates calculated by the coordinate calculation unit 301 and extracts the minimum coordinate; the data extraction unit 305 requests from the bus the two valid data lines of the basic image frame that correspond to the minimum coordinate and stores them in the first SRAM group; the minimum coordinate updating unit 307 updates the coordinate of the layer to which the minimum coordinate belongs, obtaining the next coordinate that layer will use, while the non-minimum coordinates remain unchanged. The parallel policy unit 300 then executes the coordinate sorting unit 303 and the subsequent units again until the extraction of the N valid data lines is complete. In a specific embodiment, the coordinate calculation unit 301 may calculate the vertical coordinates in the basic image frame of the N valid data lines of the M layers of the image pyramid.
FIG. 4 is a schematic workflow diagram of the parallel policy unit 300 shown in FIG. 3. The coordinate calculation unit 301 calculates the vertical coordinates in the basic image frame of the N valid data lines of the M layers of the image pyramid according to a bilinear interpolation method; the coordinate sorting unit 303 sorts the vertical coordinates to obtain the minimum coordinate; the data extraction unit 305 requests from the bus the two valid data lines of the basic image frame that correspond to the minimum coordinate; the minimum coordinate updating unit 307 updates the next vertical coordinate of the pyramid layer corresponding to the minimum coordinate according to a bilinear interpolation method; the parallel policy unit 300 then executes the coordinate sorting unit 303 and the subsequent steps again, and when the coordinate of a currently selected valid data line is the same as that of an already extracted valid data line, only the valid data lines not yet extracted are selected. This repeats until the parallel policy unit 300 has completed the extraction of the N valid data lines. In this way, the originally unordered valid data lines of the M layers of the image pyramid are arranged into reusable valid data lines, which reduces system bandwidth consumption while increasing the operation speed of the image pyramid feature extraction circuit.
In a specific embodiment, the coordinate calculation unit 301 may calculate, according to a bilinear interpolation method, the vertical coordinate in the basic image frame of one valid data line for each of M consecutive layers of the image pyramid; that vertical coordinate plus 1 is the vertical coordinate in the basic image frame of the other valid data line of the same layer.
In one embodiment, as shown in fig. 5, the parallel reusable read-in module 500 may include a parallel policy unit 501, a first SRAM group 503 and a second SRAM group 505. The first SRAM group 503 and the second SRAM group 505 may each include N SRAMs, and the two groups perform ping-pong storage of the N valid data lines. Ping-pong storage means that while one of the two SRAM groups is being written, the other can be read by the downstream module for calculation. This forms a high-speed interface that hides bus latency and further increases the operation speed of the image pyramid feature extraction circuit.
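The ping-pong scheme can be modeled in a few lines. This is a minimal software sketch under stated assumptions: the class name `PingPongBanks` and its methods are hypothetical, and each SRAM group is represented simply as a Python list of rows.

```python
class PingPongBanks:
    """Two row banks: one is filled from the bus while the other is read."""

    def __init__(self):
        self.banks = [[], []]
        self.write_idx = 0  # index of the bank currently being written

    def write(self, rows):
        # Fill the write-side bank with a new batch of valid data lines.
        self.banks[self.write_idx] = list(rows)

    def read(self):
        # The downstream calculation always reads the opposite bank.
        return self.banks[1 - self.write_idx]

    def swap(self):
        # Exchange roles once the write bank is full and the read bank is consumed.
        self.write_idx = 1 - self.write_idx
```

In hardware the write and read happen concurrently; the sketch only captures the role swap that makes that overlap possible.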
Fig. 6 is a schematic diagram of the 8 valid data lines of a 4-layer image pyramid read by the parallel reusable read-in module of fig. 2 or fig. 5. The left side shows the 8 valid data lines, at a given time, of 4 relatively low layers of the image pyramid; the right side shows the 8 valid data lines, at a given time, of 4 relatively high layers. The 8 valid data lines are not necessarily contiguous in the spatial domain of the image pyramid. Because the correlation between the relatively low layers of the image pyramid is large, more valid data lines can be reused between the layers on the left of fig. 6. For example, valid data lines 1 and 2 can be used to calculate line 1 of layer 1, layer 2, layer 3 and layer 4 of the 4 layers; valid data lines 2 and 3 can be used to calculate line 2 of layer 1, layer 2 and layer 3; and so on. Because the correlation between the relatively high layers is small, fewer valid data lines can be reused between the layers on the right of fig. 6. For example, valid data lines 1 and 2 can only be used to calculate line 1 of layer 1; valid data lines 3 and 4 can only be used to calculate line 3 of layer 2; valid data lines 5 and 6 can only be used to calculate line 5 of layer 3; and valid data lines 7 and 8 are likewise used by a single layer.
The workflow by which the parallel policy unit obtains the valid data lines shown in the left diagram of fig. 6 is as follows. The parallel policy unit first calculates, according to a bilinear interpolation method, the vertical coordinate in the basic image frame of one valid data line for each of the 4 layers of the image pyramid, which gives 4 vertical coordinates. It sorts the 4 vertical coordinates to obtain the minimum vertical coordinate, namely that of layer 1. It requests from the bus the two valid data lines corresponding to the minimum vertical coordinate and to the minimum vertical coordinate plus 1, namely valid data line 1 and valid data line 2, and stores them in the first SRAM group or the second SRAM group. It then updates the next vertical coordinate of layer 1, the layer corresponding to the minimum vertical coordinate, according to the bilinear interpolation method, while the non-minimum vertical coordinates remain unchanged. The parallel policy unit continues to sort the current vertical coordinates and fetch valid data lines until 8 valid data lines have been obtained, which completes one parallel policy operation. Note that when the coordinate of a currently selected valid data line is the same as that of an already extracted valid data line, only the valid data lines not yet extracted are selected. With the parallel policy unit, the originally unordered valid data lines of the 4 pyramid layers are arranged into reusable valid data lines, and the same valid data line never has to be read and stored more than once. This reduces bandwidth and cache consumption and increases the operation speed of the image pyramid feature extraction circuit.
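The loop described above (calculate coordinates, pick the minimum, fetch two lines, update that layer) can be sketched as follows. This is an illustrative model, not the circuit itself: the function name `plan_valid_rows` is hypothetical, and the coordinate mapping `output_line * scale` is an assumed bilinear source-coordinate rule that the text does not spell out.

```python
import math


def plan_valid_rows(scales, n_rows):
    """Plan one parallel-policy pass.

    scales[i] is the assumed vertical scale factor of layer i relative to the
    basic image frame. Returns (fetched, hits): the de-duplicated base-frame
    rows to request from the bus, and per layer the (top_row, weight) pairs
    it consumes, where each output line uses rows top_row and top_row + 1.
    """
    next_out = [0] * len(scales)       # next output line index of each layer
    fetched = []                       # rows already requested from the bus
    hits = [[] for _ in scales]
    while True:
        # Coordinate calculation: vertical source coordinate per layer.
        coords = [next_out[i] * scales[i] for i in range(len(scales))]
        # Coordinate sorting: select the layer with the minimum coordinate.
        layer = min(range(len(scales)), key=lambda i: coords[i])
        top = math.floor(coords[layer])
        # Data extraction: rows `top` and `top + 1`, skipping rows already held.
        new_rows = [r for r in (top, top + 1) if r not in fetched]
        if len(fetched) + len(new_rows) > n_rows:
            break                      # the N-row buffer is full: pass complete
        fetched.extend(new_rows)
        hits[layer].append((top, coords[layer] - top))
        # Minimum coordinate update: advance only the selected layer.
        next_out[layer] += 1
    return fetched, hits
```

With the assumed scales `[1.0, 2.0]` and `n_rows=4`, the pass fetches rows 0 to 3 once each, even though both layers consume rows 0 and 1, which is the reuse the text describes.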
Fig. 7 is a circuit block diagram of an image pyramid feature calculation circuit 700 according to an embodiment of the present invention, which includes a data application unit 701, a multiplexer unit 703, M image pyramid calculation units 705, and M image pyramid feature calculation units 707.
When the first SRAM group or the second SRAM group in the parallel reusable read-in circuit is full, the data application unit 701 sends a data request to the parallel reusable read-in circuit, that is, it requests the valid data lines stored in the first or second SRAM group, one bus width of data per request, for example 64 bits each time. The multiplexer unit 703 may include N multiplexers and simultaneously distributes different valid data lines to the M image pyramid calculation units 705 according to the valid data lines hit by the different layers of the image pyramid. The M image pyramid calculation units 705 perform the image pyramid calculation simultaneously on the received valid data lines; each unit calculates one pixel value of one line of its image pyramid layer at a time and stores the calculated pixel value in RAM. When the RAM holds sufficient data, the M image pyramid feature calculation units 707 perform the image pyramid feature calculation and output M × K image pyramid features, where K is the number of feature channels of each image pyramid feature calculation unit; K may be 3, for example. One image pyramid calculation unit and one image pyramid feature calculation unit form the calculation channel of one image pyramid layer, and there are M independent calculation channels in total.
For example, as shown in the left diagram of fig. 6, image pyramid layer 1 hits valid data lines 1 to 8, so the multiplexer unit 703 distributes valid data lines 1 to 8 to the image pyramid calculation unit, among the M image pyramid calculation units 705, that calculates layer 1. Image pyramid layer 4 hits valid data lines 1, 2, 4, 5, 7 and 8, so the multiplexer unit 703 distributes valid data lines 1, 2, 4, 5, 7 and 8 to the image pyramid calculation unit that calculates layer 4.
In one embodiment, the valid data rows of the first SRAM group or the second SRAM group that each of the M image pyramid calculation units 705 receives are not fixed; each unit may be allocated different valid data rows, which makes the calculation process more efficient. For example, when the parallel reusable read-in module obtains the valid data lines shown on the left side of fig. 6, the multiplexer unit 703 allocates the valid data line pairs (1, 2), (2, 3), (3, 4), (5, 6), (6, 7) and (7, 8) to the first image pyramid calculation unit, and the valid data line pairs (1, 2), (4, 5) and (7, 8) to the fourth image pyramid calculation unit.
In one embodiment, the M image pyramid calculation units 705 may be M bilinear interpolation calculation units, which simultaneously calculate, by interpolation, the data of the corresponding lines of the M layers of the image pyramid from the received valid data lines.
FIG. 8 is a schematic diagram of data distribution and pipelining according to the present invention for the case in the left diagram of fig. 6, in which 8 valid data lines are fetched. PE0, PE1, PE2 and PE3 are four bilinear interpolation calculation units: PE0 calculates the bilinear interpolation of the first image pyramid layer, PE1 that of the second layer, PE2 that of the third layer, and PE3 that of the fourth layer. Row1 corresponds to valid data row 1 in fig. 4, Row2 corresponds to valid data row 2 in fig. 4, and so on. After the parallel reusable read-in module completes the parallel strategy shown in the left diagram of fig. 4 and obtains 8 effective data lines, PE0 can calculate the 1st row of data of image pyramid layer 1 from effective data lines 1 and 2, the 2nd row of data of image pyramid layer 1 from effective data lines 2 and 3, and so on. When the situation in the right diagram of fig. 4 occurs, as shown in fig. 9, only one pipeline is needed to complete the calculation.
In one embodiment, in order to reduce the number of multipliers in the bilinear interpolation calculation unit, an optimized bilinear interpolation algorithm is adopted, and the calculation formula is as follows:
$$I_{xi} \times 2^{8} = I_0 \times 2^{8} + x_f \times 2^{8} \times (I_1 - I_0)$$
$$I_{yi} \times 2^{8} = I_2 \times 2^{8} + x_f \times 2^{8} \times (I_3 - I_2)$$
$$I = \left(I_{xi} \times 2^{8} + y_f \times 2^{8} \times (I_{yi} - I_{xi})\right) \gg 8$$
where $I_0$ is the valid reference point at the upper left corner, $I_1$ is the valid reference point at the upper right corner, $I_2$ is the valid reference point at the lower left corner, and $I_3$ is the valid reference point at the lower right corner. The bilinear interpolation calculation unit starts calculating as soon as the fetched data hits the reference points. Fetched data hitting the reference points means that, once the data of the valid data lines has been input, a line of pixels in the corresponding layer of the image pyramid can be calculated.
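The fixed-point formulas above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the circuit itself: the function name `bilinear_fixed` and the arguments `xf_q` and `yf_q` (the fractional offsets already scaled by 2^8 and held as integers in 0..255) are hypothetical, and taking the integer parts of the two horizontal results before the vertical step is one possible reading of the third formula.

```python
FRAC_BITS = 8  # the 2^8 scale factor that appears in the formulas


def bilinear_fixed(i0, i1, i2, i3, xf_q, yf_q):
    """Interpolate one pixel from four reference points using integers only.

    i0, i1: upper-left and upper-right reference pixels
    i2, i3: lower-left and lower-right reference pixels
    xf_q, yf_q: fractional offsets scaled by 2**8 (assumed range 0..255)
    """
    # Horizontal pass on the upper row: Ixi * 2^8 (one multiplication)
    top_q = (i0 << FRAC_BITS) + xf_q * (i1 - i0)
    # Horizontal pass on the lower row: Iyi * 2^8 (one multiplication)
    bot_q = (i2 << FRAC_BITS) + xf_q * (i3 - i2)
    # Integer parts Ixi and Iyi (assumption: truncation by shifting)
    ixi = top_q >> FRAC_BITS
    iyi = bot_q >> FRAC_BITS
    # Vertical pass (one multiplication), then drop the 8 fractional bits
    return (top_q + yf_q * (iyi - ixi)) >> FRAC_BITS


# Halfway between four reference points 10, 20, 30, 40
print(bilinear_fixed(10, 20, 30, 40, 128, 128))
```

Written this way, each output pixel needs three multiplications and a few shifts and additions, which is consistent with the stated aim of reducing the number of multipliers.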
In an embodiment, the image pyramid feature calculation unit may extract image pyramid features with a 5 × 5 feature extraction algorithm. To produce one feature value, the 5 × 5 feature extraction algorithm needs 5 × 5 = 25 input pixels. To supply these 25 points, each image pyramid feature calculation unit caches 4 lines of data in RAM and is started immediately when the last (fifth) line is input. In other embodiments, other feature extraction algorithms, such as a 3 × 3 feature extraction algorithm, may also be used to achieve the object of the present invention.
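The line caching described above can be illustrated with a short Python sketch. The class name `LineBuffer5x5` and its `push` method are hypothetical, and the sketch works line by line in software, whereas the hardware would start as the pixels of the fifth line stream in. The feature function itself is not specified in the text, so the sketch only forms the 5 × 5 windows.

```python
from collections import deque


class LineBuffer5x5:
    """Keep the 4 most recent lines; form 5x5 windows when a 5th arrives."""

    def __init__(self):
        self.lines = deque(maxlen=4)  # the 4 cached lines

    def push(self, line):
        """Feed one line of pixels; return the 5x5 windows now available.

        Returns an empty list while fewer than 4 lines are cached.
        """
        windows = []
        if len(self.lines) == 4:
            rows = list(self.lines) + [line]  # 4 cached lines + incoming line
            for x in range(len(line) - 4):
                # 5 rows of 5 pixels = the 25 inputs of one feature value
                windows.append([r[x:x + 5] for r in rows])
        self.lines.append(line)  # the oldest line is dropped automatically
        return windows


buf = LineBuffer5x5()
for y in range(6):
    ready = buf.push([y] * 8)
    print(y, len(ready))
```

Only 4 lines need to be stored because the fifth line of each window is the one currently arriving.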
In an application system that requires high recognition accuracy, detection often needs a plurality of feature channels, and increasing the design parallelism and the number of feature channels causes contention for the output channel. Fig. 10 is a circuit block diagram of the image pyramid feature confluence output circuit 110 according to an embodiment of the present invention, which includes an image pyramid feature buffering unit 111, a priority updater unit 113, a data arbiter unit 115, a buffering label receiving unit 117, and an image pyramid feature receiving unit 119.
The image pyramid feature caching unit 111 may include M × K cache modules, where K is the number of feature channels of each image pyramid feature calculation unit. Each cache module is assigned a label from 0 to M × K − 1, and the cache modules respectively receive the M × K image pyramid features output by the image pyramid feature extraction module. The image pyramid feature calculation module stores one pixel value in a cache module at a time; to improve bus efficiency, a cache module sends a data read-out request to the subsequent data arbiter unit 115 only when it holds enough pixels. The priority updater unit 113 determines the read-out priority of the image pyramid feature data in the M × K cache modules. The data arbiter unit 115 receives the data read-out requests sent by the image pyramid feature caching unit 111, returns a read-out response to the highest-priority cache module determined by the priority updater unit 113, and sends the data of that cache module to the image pyramid feature receiving unit 119. When the data arbiter unit 115 sends data, the label of the highest-priority cache module is stored in the cache label receiving unit 117, which generates the write address of the current data from that label. The image pyramid feature receiving unit 119 outputs each received image pyramid feature value to a memory (DDR) at the write address of the data.
In a specific embodiment, since the basic unit of a common DDR is 32 pixels, each request should be an integer multiple of 32 pixel values in order to improve bus efficiency, whereas the input of a cache module is one pixel value at a time. Therefore, when a cache module has received 32 pixel values, it sends a data read-out request to the data arbiter unit 115; the data arbiter unit 115 likewise transmits the 32 pixel values to the image pyramid feature receiving unit 119, and the image pyramid feature receiving unit 119 sends the 32 pixel values to the memory.
In one embodiment, the M × K buffer modules of the image pyramid feature buffer unit 111 may be M × K FIFOs, each assigned a label from 0 to M × K − 1. When the data received by any FIFO reaches 32 pixels, that FIFO immediately sends a data read-out request to the subsequent data arbiter unit 115. The priority updater unit 113 then sets the read-out priorities of the M × K FIFOs according to the absolute difference between the write pointer and the read pointer of each FIFO: the FIFO whose absolute difference is currently the largest has the highest read-out priority. The data arbiter unit 115 returns a read-out response to the FIFO with the highest priority and reads out 32 pixels of data over the following 4 cycles. While sending the data, the data arbiter unit 115 sends the label of the highest-priority FIFO to the buffer label receiving unit 117, which generates the write address of the data from the label, and the image pyramid feature receiving unit 119 writes the 32 pixels of data to the memory at that write address.
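The arbitration policy described above can be sketched in Python. The names `FeatureFifo` and `arbitrate` are hypothetical, and the sketch uses unbounded counters as pointers, whereas hardware pointers would wrap. Address generation is left out because the text does not specify how the write address is derived from the label; the sketch simply returns the label together with the burst.

```python
BURST = 32  # pixels per read-out request, as stated in the text


class FeatureFifo:
    """One labelled FIFO of the feature caching unit (software model)."""

    def __init__(self, label):
        self.label = label
        self.write_ptr = 0
        self.read_ptr = 0
        self.data = []

    def push(self, pixel):
        self.data.append(pixel)
        self.write_ptr += 1

    def level(self):
        # Absolute difference between write pointer and read pointer
        return abs(self.write_ptr - self.read_ptr)

    def requesting(self):
        # A read-out request is raised once 32 pixels are waiting
        return self.level() >= BURST

    def pop_burst(self):
        burst = self.data[self.read_ptr:self.read_ptr + BURST]
        self.read_ptr += BURST
        return burst


def arbitrate(fifos):
    """Grant the requesting FIFO with the largest pointer difference.

    Returns (label, 32 pixels) or None when no FIFO is requesting.
    """
    requesters = [f for f in fifos if f.requesting()]
    if not requesters:
        return None
    winner = max(requesters, key=lambda f: f.level())
    return winner.label, winner.pop_burst()
```

Favouring the fullest FIFO drains the channel that is closest to overflowing first, which is one plausible reason for choosing this priority rule.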
In a specific embodiment, M may take 4, K may take 3, i.e. there are 12 feature output channels. By adopting the image pyramid feature confluence output module 110 shown in fig. 10, the data arbiter unit 115 can simultaneously receive data read-out requests of 12 channels at most, and selectively respond to the data read-out request of one cache module at a time according to the priority updater unit 113, and read out data of 32 pixels in the following 4 cycles, thereby realizing an efficient multi-channel feature extraction confluence output, and solving the problems of competition and write-out bottleneck caused by a parallel structure and multiple output channels.
Fig. 11 is a specific circuit of the image pyramid feature extraction circuit 120 according to an embodiment of the present invention. The image pyramid feature extraction circuit 120 is a 4-layer parallel design structure, and can perform feature extraction of 4 layers of the image pyramid at most at the same time each time.
The parallel reusable read-in circuit 121 includes a parallel policer and 16 SRAMs, where SRAMs 00-70 form the first SRAM group and SRAMs 01-71 form the second SRAM group. The parallel policer executes the parallel policy once according to the flow shown in fig. 4. First, it calculates, by bilinear interpolation, the vertical coordinates in the basic image frame of the effective data rows of the 4 image pyramid layers. It sorts the vertical coordinates, requests two effective data rows from the bus according to the minimum vertical coordinate, and stores them in the first SRAM group or the second SRAM group. It then calculates, by bilinear interpolation, the next vertical coordinate to be used by the layer corresponding to the minimum vertical coordinate, while the non-minimum vertical coordinates remain unchanged. The parallel policer then repeats the coordinate sorting and subsequent steps; when the coordinates of the currently selected effective data rows are the same as those of rows already extracted, only the rows not yet extracted are fetched, until 8 effective data rows have been extracted. The first SRAM group and the second SRAM group implement a ping-pong storage mechanism: while one group is being written, the other group can be read for subsequent calculation, forming a high-speed interface that reduces bus waiting time.
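The ping-pong storage mechanism can be illustrated with a small Python model. The class `PingPongBanks` and its methods are hypothetical, and the model assumes the reader has finished with a group before that group is reused for writing; a real circuit would need a handshake to guarantee this.

```python
class PingPongBanks:
    """Two row stores: one is written while the other is read."""

    def __init__(self, rows_per_bank=8):
        self.banks = [[], []]
        self.write_bank = 0  # index of the group currently being written
        self.rows_per_bank = rows_per_bank

    def write_row(self, row):
        """Store one row.

        Returns the index of the group that has just become full (and is
        therefore ready to be read), or None if the group is still filling.
        """
        bank = self.banks[self.write_bank]
        bank.append(row)
        if len(bank) == self.rows_per_bank:
            full = self.write_bank
            self.write_bank ^= 1               # switch to the other group
            self.banks[self.write_bank] = []   # assumed already consumed
            return full
        return None

    def read_bank(self, index):
        return self.banks[index]
```

With 8 rows per group, writes of the next 8 effective data rows can overlap with the calculation on the previous 8, which is how the mechanism hides bus waiting time.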
When the first SRAM group or the second SRAM group is full, the data application unit in the image pyramid feature calculation circuit 123 sends a data request to that SRAM group and requests one bus width of data each time. The bus width in this embodiment is 64 bits, so the data application unit requests 64 bits of data from the first SRAM group or the second SRAM group each time. In the figure, the 8 P2S blocks form the multiplexer unit; each P2S allocates different effective data rows to the calculation units (PE0-PE3) according to which of the 8 effective data rows are hit in the different layers of the image pyramid. That is, the data each unit reads from the SRAM group is not fixed, and each unit may be allocated different data, which makes the calculation process more efficient.
The image pyramid feature calculation circuit 123 includes four bilinear interpolation calculation units (PE0, PE1, PE2, and PE3) and can perform the calculation for 4 layers of the image pyramid at the same time. In order to reduce the number of multipliers in the bilinear interpolation calculation units, an optimized bilinear interpolation algorithm is adopted, and the calculation formula is as follows:
$$I_{xi} \times 2^{8} = I_0 \times 2^{8} + x_f \times 2^{8} \times (I_1 - I_0)$$
$$I_{yi} \times 2^{8} = I_2 \times 2^{8} + x_f \times 2^{8} \times (I_3 - I_2)$$
$$I = \left(I_{xi} \times 2^{8} + y_f \times 2^{8} \times (I_{yi} - I_{xi})\right) \gg 8$$
where $I_0$ is the valid reference point at the upper left corner, $I_1$ is the valid reference point at the upper right corner, $I_2$ is the valid reference point at the lower left corner, and $I_3$ is the valid reference point at the lower right corner. A bilinear interpolation calculation unit starts calculating when the data it fetches hits the reference points, meaning that a line of pixels in the corresponding layer of the image pyramid can be calculated once the data of the effective data rows has been input. Each bilinear interpolation calculation unit (PE0, PE1, PE2 and PE3) calculates one pixel value of the corresponding line of its image pyramid layer at a time and stores each interpolated pixel value in the RAM strip; when the RAM strip holds enough data, the feature extraction module (Feature Extract) is started immediately. The 4 image pyramid feature calculation channels are completely independent.
The feature calculation module (Feature Extract) adopts a 5 × 5 feature extraction algorithm. To produce one feature value, 5 × 5 = 25 pixels need to be input. To supply these 25 points, each bilinear interpolation calculation unit caches 4 lines of data in Ram Stride, and when the last line is input, the image pyramid feature calculation module (Feature Extract) is started immediately. In other embodiments, other feature extraction algorithms, such as a 3 × 3 feature extraction algorithm, may also be used to achieve the object of the present invention.
In this embodiment, each image pyramid feature calculation module has 3 feature channels, that is, there are 12 feature channels in 4 parallel calculation paths. In order to solve the problems of channel competition and bottleneck writing, an efficient multi-channel characteristic confluence output structure is designed. That is, the image pyramid feature merge output circuit 125 in fig. 11 includes a Data buffer unit, a priority updater (PRI updata), a Data Arbiter (Arbiter), a Data receiving unit (Data fifo), and a buffer Flag receiving unit (Flag fifo, Addr Gen).
The data caching unit comprises 12 FIFOs, each assigned a label from 0 to 11, which respectively cache the 12 image pyramid features output by the image pyramid feature calculation module. To improve bus efficiency, each FIFO immediately sends a read-out request to the data arbiter when the data written into it reaches 32 pixels. The priority updater (PRI updata) determines the read-out priority of the FIFOs according to the difference between the write pointer and the read pointer of each of the 12 FIFOs: the larger the difference, the higher the read-out priority, so the FIFO holding more data obtains the higher priority. The data arbiter (Arbiter) responds to one FIFO request at a time according to the priority policy of the priority updater (PRI updata), reads out 32 pixels of data over the following 4 cycles, and stores the data in the data receiving unit (Data FIFO). While sending the data, the data arbiter (Arbiter) stores the label of the FIFO being served in the cache label receiving unit (Flag FIFO), which generates the write address of the data from the label. The image pyramid feature confluence output circuit 125 writes the data in the data receiving unit (Data FIFO) into the memory (DDR) at that write address, 32 pixels at a time, and when the data of 8 FIFOs are read, 4-parallel image pyramid feature extraction is completed once.
Fig. 12 is a flowchart of an image pyramid feature extraction method 130 according to an embodiment of the invention. Step 131: obtain a basic image frame, calculate the coordinates in the basic image frame of N effective data lines of M layers of the image pyramid, and then obtain the N effective data lines from the basic image frame according to the coordinates. Step 133: simultaneously calculate the features of the M layers of the image pyramid from the N effective data lines. Step 135: merge the M × K image pyramid features of the M layers of the image pyramid and output them to a memory. M and N are positive integers greater than or equal to 2, and K is the number of feature channels output by feature extraction for each of the M layers of the image pyramid. In a preferred embodiment, N may be a power of 2 and may be chosen based on bus efficiency and module efficiency. In one embodiment, the vertical coordinates in the base image frame of the N valid data lines of the M layers of the image pyramid may be calculated. In particular embodiments, M may be 4, N may be 8, and K may be 3. That is, the image pyramid feature extraction method can extract the features of at most 4 image pyramid layers in parallel, and the feature extraction algorithm of each layer has 3 feature channels, so the method outputs at most 12 image pyramid feature values for the 4 image pyramid layers.
In one embodiment, the specific execution flow of step 131 in fig. 12 may be the same as that shown in fig. 4. It comprises the following steps: first, calculating the coordinates in the basic image frame of N effective data lines of M layers of the image pyramid; sorting the coordinates of the N effective data lines in the basic image frame to obtain the minimum coordinate; extracting two effective data lines from the basic image frame according to the minimum coordinate; updating the next coordinate of the image pyramid layer corresponding to the minimum coordinate according to a bilinear interpolation method; and repeating the sorting and subsequent steps on the updated coordinates, where, if the coordinates of the currently selected effective data rows are the same as those of rows already extracted, only the rows not yet extracted are fetched, until the N effective data rows have been extracted. In this way, reusable effective data rows are arranged out of the originally unordered effective data rows of the M layers of the image pyramid, which increases the operation speed of the image pyramid feature extraction algorithm while reducing system bandwidth consumption.
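The row selection flow above can be sketched in Python. This is a minimal illustration under stated assumptions, not the patented circuit: the function name `schedule_rows`, the per-layer `scales` (vertical step in base-frame rows per output row), zero-based row numbering, and the mapping "coordinate = floor(output row × scale)" are all hypothetical. The text only states that the coordinates are obtained by bilinear interpolation.

```python
import math


def schedule_rows(scales, n_rows=8):
    """Collect up to n_rows distinct base-frame rows shared by all layers.

    scales: assumed vertical step of each pyramid layer, in base-frame rows.
    Returns (rows, hits): the extracted row indices in fetch order, and for
    each layer the list of (top, bottom) row pairs it can interpolate from.
    """
    next_out = [0] * len(scales)      # next output row index of each layer
    rows = []                         # base-frame rows extracted so far
    hits = [[] for _ in scales]       # row pairs usable by each layer
    while True:
        # Vertical coordinate of the upper reference row for every layer
        coords = [math.floor(next_out[i] * scales[i])
                  for i in range(len(scales))]
        # Sorting step: the layer holding the minimum coordinate goes first
        layer = min(range(len(scales)), key=lambda i: coords[i])
        top = coords[layer]
        # Two rows are needed (top and top + 1); skip rows already extracted
        needed = [r for r in (top, top + 1) if r not in rows]
        if len(rows) + len(needed) > n_rows:
            break                     # the row budget is used up
        rows.extend(needed)
        hits[layer].append((top, top + 1))
        next_out[layer] += 1          # update only the minimum layer
    return rows, hits


rows, hits = schedule_rows([1.0, 3.0], n_rows=8)
print(rows)
print(hits[1])
```

With the illustrative scales 1.0 and 3.0, the 8 rows 0 to 7 are fetched once each, the dense layer reuses the seven consecutive pairs (0, 1) to (6, 7), and the sparse layer reuses the pairs (0, 1), (3, 4) and (6, 7). In one-based numbering the sparse layer's pairs are (1, 2), (4, 5) and (7, 8), the same pairs given for layer 4 in the example of fig. 6.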
In one embodiment, the coordinates in the base image frame of the N valid data lines of the M layers of the image pyramid may be calculated according to a bilinear interpolation method. In a specific embodiment, the vertical coordinate in the base image frame of one effective data line of each of M consecutive layers of the image pyramid may be calculated according to a bilinear interpolation method, and that vertical coordinate plus 1 is the vertical coordinate in the base image frame of the other effective data line of the same layer.
Fig. 13 is a flowchart of an image pyramid feature calculation method 140 in the image pyramid feature extraction method according to an embodiment of the present invention. Step 141, acquiring N effective data lines extracted from the base image frame in the previous step; step 143, simultaneously calculating data of corresponding rows in the M layers of the image pyramid according to the N effective data rows; step 145, image pyramid feature calculation is simultaneously performed according to the data of the M layers of corresponding rows of the image pyramid.
In one embodiment, different valid data lines may be assigned according to the valid data lines hit in the M layers of the image pyramid. The effective data lines hit in different layers of the image pyramid are the effective data lines, acquired in step 131, that correspond to lines in the different layers of the image pyramid. For example, as shown in the left diagram of fig. 6, image pyramid layer 1 hits effective data lines 1 to 8, which are allocated to the image pyramid calculation flow of layer 1, and image pyramid layer 4 hits effective data lines 1, 2, 4, 5, 7 and 8, which are allocated to the image pyramid calculation flow of layer 4. In this way, the image pyramid calculation flow of each layer may be allocated different effective data lines, which makes the calculation process more efficient.
In one embodiment, feature calculation of the image pyramid may be performed simultaneously according to a bilinear interpolation method.
In one embodiment, an optimized bilinear interpolation algorithm is used, and the calculation formula is as follows:
$$I_{xi} \times 2^{8} = I_0 \times 2^{8} + x_f \times 2^{8} \times (I_1 - I_0)$$
$$I_{yi} \times 2^{8} = I_2 \times 2^{8} + x_f \times 2^{8} \times (I_3 - I_2)$$
$$I = \left(I_{xi} \times 2^{8} + y_f \times 2^{8} \times (I_{yi} - I_{xi})\right) \gg 8$$
where $I_0$ is the valid reference point at the upper left corner, $I_1$ is the valid reference point at the upper right corner, $I_2$ is the valid reference point at the lower left corner, and $I_3$ is the valid reference point at the lower right corner.
In one embodiment, the image pyramid feature calculation may be performed according to a 5 × 5 feature extraction algorithm. To produce one feature value, the 5 × 5 feature extraction algorithm needs 5 × 5 = 25 input pixels. To supply these 25 points, the image pyramid feature calculation flow of each layer caches 4 lines of data in the RAM, and image pyramid feature calculation is performed immediately when the last line is input. In other embodiments, other feature extraction algorithms, such as a 3 × 3 feature extraction algorithm, may also be used to achieve the object of the present invention.
Fig. 14 is a flowchart of an image pyramid feature merging output method 150 in the image pyramid feature extraction method according to an embodiment of the present invention. Step 151: obtain the M × K image pyramid features of the M layers of the image pyramid and store them in the M × K image pyramid feature cache units. Step 153: generate the read-out priority of the M × K image pyramid feature data according to the read pointers and write pointers of the M × K image pyramid feature cache units. Step 155: obtain the image pyramid feature data with the highest read-out priority and its cache label. Step 157: generate the write address of the image pyramid feature data according to the cache label. Step 159: write the image pyramid feature into a memory at that write address, until all M × K image pyramid features have been written into the memory. The method realizes efficient multi-channel feature extraction and confluence output, and solves the contention and write-out bottleneck problems caused by multi-layer parallel computation of the image pyramid and multiple output channels. In a specific embodiment, the image pyramid feature caching unit whose absolute difference between write pointer and read pointer is the largest has the highest data read-out priority.
In an embodiment, since a basic unit of a commonly used DDR is 32 pixels, in order to improve bus efficiency, each application should be an integer multiple of 32 pixel values, and an input of an image pyramid feature buffer unit is one pixel value, and when the image pyramid feature buffer unit receives 32 pixel values, a data read-out request is sent.
It should be understood that the above embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same, and those skilled in the art can modify the technical solutions described in the above embodiments, or make equivalent substitutions for some technical features; and all such modifications and alterations are intended to fall within the scope of the appended claims.

Claims (21)

1. An image pyramid feature extraction circuit, comprising:
the parallel reusable read-in circuit comprises a parallel strategy device unit and a first SRAM group, wherein the parallel strategy device unit comprises a coordinate calculation unit, a coordinate sorting unit, a data extraction unit and a minimum coordinate updating unit;
the coordinate calculation unit receives a basic image frame and calculates the coordinates of N effective data lines of an M layer of an image pyramid in the basic image frame;
the coordinate sorting unit is used for sorting the coordinates of the N effective data lines in the basic image frame to obtain the minimum coordinate;
the data extraction unit is used for extracting two effective data lines in the basic image frame according to the minimum coordinate;
the minimum coordinate updating unit is used for updating the next coordinate of the image pyramid layer corresponding to the minimum coordinate according to a bilinear interpolation method;
the parallel policy unit continues to execute the coordinate sorting unit, and when the coordinates of the extracted effective data rows are the same as the coordinates of the currently selected effective data rows, only the non-extracted effective data rows are currently selected until the extraction of the N effective data rows is completed;
the first SRAM bank comprises N SRAMs for storing the N valid data rows;
the image pyramid feature calculation circuit comprises M image pyramid feature calculation units and is used for simultaneously calculating the features of M layers of the image pyramid according to the N effective data lines;
the image pyramid feature confluence output circuit is used for converging M x K image pyramid features output by the image pyramid feature calculation circuit and outputting the converged M x K image pyramid features to the memory;
wherein M and N are positive integers greater than or equal to 2, and K is the number of feature channels of the image pyramid feature calculation unit.
2. The image pyramid feature extraction circuit of claim 1, wherein the coordinate calculation unit calculates coordinates of the N active data lines of the M layers of the image pyramid in the base image frame according to a bilinear interpolation method.
3. The image pyramid feature extraction circuit of claim 1 or 2, wherein the parallel reusable read-in module further comprises a second SRAM bank, the second SRAM bank comprising N SRAMs and performing ping-pong storage with the first SRAM bank for the N valid data rows.
4. The image pyramid feature extraction circuit of claim 1, wherein the image pyramid feature calculation circuit comprises a data application unit, a multi-way gate unit, M image pyramid calculation units, and M image pyramid feature calculation units:
the data application unit is used for applying the N effective data rows to the parallel reusable read-in module;
the multi-channel gate unit comprises N multi-channel gates and is used for distributing the N effective data rows applied by the data application unit to the M image pyramid calculation units;
the M image pyramid calculation units are used for simultaneously calculating data of M layers of corresponding rows of the image pyramid according to the N effective data rows;
and the M image pyramid feature calculation units are used for simultaneously calculating the features of the image pyramid according to the data of the M layers of corresponding rows of the image pyramid.
5. The image pyramid feature extraction circuit of claim 4, in which the multiplexer unit allocates different rows of valid data to the M image pyramid computation units according to the rows of valid data hit in the M layers of the image pyramid.
6. The image pyramid feature extraction circuit of claim 4, wherein the M image pyramid computation units are M bilinear interpolation computation units that simultaneously compute data of M-level corresponding lines of the image pyramid according to the valid data lines allocated by the multiplexer.
7. The image pyramid feature extraction circuit of claim 4, wherein the M image pyramid feature calculation units perform image pyramid feature calculation using a 5 x 5 feature extraction algorithm.
8. The image pyramid feature extraction circuit of claim 1, wherein the image pyramid feature confluence output circuit comprises an image pyramid feature caching unit, a data arbitrator unit, a priority updater unit, a cache label receiving unit, and an image pyramid feature receiving unit:
the image pyramid feature caching unit comprises M × K caching modules and is used for caching the M × K image pyramid features output by the image pyramid feature extraction module and sending a data reading request to the data arbiter unit;
the priority updater unit is configured to determine a read priority of the feature data of the image pyramid in the M × K cache modules;
the data arbiter unit is configured to return a read-away response to the highest priority cache module determined by the priority updater unit, and transmit image pyramid feature data in the highest priority cache module;
the cache label receiving unit is used for recording the label of the highest priority cache module and generating a writing address of the image pyramid characteristic according to the label of the cache module;
and the image pyramid feature receiving unit is used for receiving the image pyramid features transmitted by the data arbiter and outputting the image pyramid features to the memory according to the writing addresses of the image pyramid features.
9. The image pyramid feature extraction circuit of claim 8, in which the M x K buffer modules are M x K FIFOs.
10. The image pyramid feature extraction circuit of claim 9, in which the priority policy of the priority updater unit is: the cache module, among the M × K cache modules, whose absolute difference between write pointer and read pointer is the largest has the highest data read priority.
11. The image pyramid feature extraction circuit of claim 8, in which a data read-away request is sent to the data arbiter unit when the cache module receives a bus of 32 pixel values.
12. An image pyramid feature extraction device, characterized in that the image pyramid feature extraction is completed by adopting the image pyramid feature extraction circuit according to any one of claims 1 to 11.
13. An image pyramid feature extraction method is characterized by comprising the following steps:
step one, acquiring a basic image frame, and calculating coordinates of N effective data rows of an M layer of an image pyramid in the basic image frame; sorting the coordinates of the N effective data lines in the basic image frame to obtain a minimum coordinate; extracting two effective data lines in the basic image frame according to the minimum coordinate; updating the next coordinate of the image pyramid layer corresponding to the minimum coordinate according to a bilinear interpolation method; continuing the sorting and subsequent steps on the updated coordinates, and when the coordinates of the extracted effective data rows are the same as the coordinates of the currently selected effective data rows, selecting only the non-extracted effective data rows, until the extraction of the N effective data rows is completed;
step two, simultaneously calculating the characteristics of M layers of the image pyramid according to the N effective data lines;
step three, converging and outputting M x K image pyramid characteristics of the M layers of the image pyramid to a memory;
wherein M and N are positive integers greater than or equal to 2, and K is the number of feature channels output by feature extraction in each layer of the M layers in the image pyramid.
14. The method of claim 13, wherein the coordinates of the N valid data lines of the M layers of the image pyramid in the base image frame are calculated according to a bilinear interpolation method.
15. The image pyramid feature extraction method of claim 13, wherein the second step specifically includes:
receiving the N valid data lines;
simultaneously calculating data of M layers of corresponding rows of the image pyramid according to the N effective data rows;
and simultaneously performing feature calculation of the image pyramid according to the data of the corresponding row of the M layers of the image pyramid.
16. The image pyramid feature extraction method of claim 15, wherein different valid data lines are allocated according to valid data lines hit in the M layers of the image pyramid.
17. The method of claim 15, wherein the features of the image pyramid are calculated simultaneously according to a bilinear interpolation method.
18. The method of claim 15, wherein the image pyramid feature calculation is performed according to a 5 x 5 feature extraction algorithm.
19. The image pyramid feature extraction method of claim 13, wherein the third step specifically includes:
receiving the M x K image pyramid features of the M layers of the image pyramid, and storing them in M x K image pyramid feature cache units;
generating read-out priorities for the M x K image pyramid feature data according to the read pointers and write pointers of the M x K image pyramid feature cache units;
acquiring the image pyramid feature data with the highest read-out priority, together with its cache label;
generating a write address for the image pyramid feature data according to the cache label;
and writing the image pyramid features into a memory according to the write address, until all M x K image pyramid features have been written into the memory.
20. The image pyramid feature extraction method of claim 19, wherein a data read-out request is issued when an image pyramid feature cache unit has received 32 pixel values.
21. The image pyramid feature extraction method of claim 19, wherein the image pyramid feature cache unit whose write pointer and read pointer have the largest absolute difference has the highest data read-out priority.
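For illustration, the read-out arbitration of claims 19 to 21 can be modelled in software roughly as below. This is a minimal sketch, not the claimed circuit: the class `FeatureBuffer`, the function `drain`, the non-wrapping pointers, and the address rule (cache label multiplied by a fixed region size, plus the number of values already written for that label) are all assumptions introduced here. Only the 32-value request threshold and the largest-pointer-difference priority come from the claims.

```python
from dataclasses import dataclass, field

BURST = 32  # pixel values that trigger a read-out request (claim 20)


@dataclass
class FeatureBuffer:
    """Hypothetical software model of one feature cache unit."""
    label: int                      # cache label used to derive addresses
    data: list = field(default_factory=list)
    read_ptr: int = 0               # assumed non-wrapping, for simplicity
    write_ptr: int = 0

    def push(self, values):
        self.data.extend(values)
        self.write_ptr += len(values)

    def fill(self):
        # Absolute difference between write and read pointers (claim 21).
        return abs(self.write_ptr - self.read_ptr)

    def pop_burst(self):
        burst = self.data[:BURST]
        del self.data[:BURST]
        self.read_ptr += len(burst)
        return burst


def drain(buffers, region_size):
    """Write bursts to a dict standing in for memory, highest priority first.

    Returns (memory, order), where order lists the labels in service order.
    """
    memory = {}
    written = {buf.label: 0 for buf in buffers}
    order = []
    while True:
        # Only buffers holding at least one full burst request read-out.
        ready = [buf for buf in buffers if buf.fill() >= BURST]
        if not ready:
            break
        # Highest priority: largest write/read pointer difference.
        buf = max(ready, key=FeatureBuffer.fill)
        # Assumed address rule: one fixed-size region per cache label.
        address = buf.label * region_size + written[buf.label]
        burst = buf.pop_burst()
        for offset, value in enumerate(burst):
            memory[address + offset] = value
        written[buf.label] += len(burst)
        order.append(buf.label)
    return memory, order
```

In this model a buffer holding fewer than 32 values is never selected; flushing such a remainder at the end of a frame is not modelled.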
CN201710107744.7A 2017-02-27 2017-02-27 Image pyramid feature extraction circuit, device and method Active CN106952215B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710107744.7A CN106952215B (en) 2017-02-27 2017-02-27 Image pyramid feature extraction circuit, device and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710107744.7A CN106952215B (en) 2017-02-27 2017-02-27 Image pyramid feature extraction circuit, device and method

Publications (2)

Publication Number Publication Date
CN106952215A CN106952215A (en) 2017-07-14
CN106952215B CN106952215B (en) 2020-02-28

Family

ID=59467774

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710107744.7A Active CN106952215B (en) 2017-02-27 2017-02-27 Image pyramid feature extraction circuit, device and method

Country Status (1)

Country Link
CN (1) CN106952215B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019041265A1 (en) * 2017-08-31 2019-03-07 深圳市大疆创新科技有限公司 Feature extraction circuit and integrated image processing circuit
CN116701921B (en) * 2023-08-08 2023-10-20 电子科技大学 Multi-channel time sequence signal self-adaptive noise suppression circuit

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102760053A (en) * 2012-06-20 2012-10-31 东南大学 Human body detection method based on CUDA (Compute Unified Device Architecture) parallel calculation and WCF framework
CN103839066A (en) * 2014-03-13 2014-06-04 中国科学院光电技术研究所 Feature extraction method based on biological vision
CN103914874A (en) * 2014-04-08 2014-07-09 中山大学 Compact SFM three-dimensional reconstruction method without feature extraction
CN105184824A (en) * 2015-09-30 2015-12-23 重庆师范大学 Intelligent agricultural bird repelling system and method based on image sensing network

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102024867B1 (en) * 2014-09-16 2019-09-24 삼성전자주식회사 Feature extracting method of input image based on example pyramid and apparatus of face recognition

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
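GPU-based acceleration algorithm for image feature extraction; 陈鹏; China Master's Theses Full-text Database, Information Science and Technology; 2015-03-15 (No. 3); I138-2669 *
Research on parallel acceleration of image feature extraction algorithms; 杨冬蕾; China Master's Theses Full-text Database, Information Science and Technology; 2015-03-15 (No. 3); I138-2444 *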

Also Published As

Publication number Publication date
CN106952215A (en) 2017-07-14

Similar Documents

Publication Publication Date Title
CN104881666B (en) A kind of real-time binary image connected component labeling implementation method based on FPGA
CN108681984B (en) Acceleration circuit of 3*3 convolution algorithm
US11775430B1 (en) Memory access for multiple circuit components
US11430134B2 (en) Hardware-based optical flow acceleration
Lu et al. A resource-efficient pipelined architecture for real-time semi-global stereo matching
CN111931918B (en) Neural network accelerator
CN112149795A (en) Neural architecture for self-supervised event learning and anomaly detection
CN110390382B (en) Convolutional neural network hardware accelerator with novel feature map caching module
CN107748723B (en) Storage method and access device supporting conflict-free stepping block-by-block access
CN111861883B (en) Multi-channel video splicing method based on synchronous integral SURF algorithm
CN106952215B (en) Image pyramid feature extraction circuit, device and method
CN109658337A (en) A kind of FPGA implementation method of image real-time electronic racemization
CN104503731A (en) Quick identification method for binary image connected domain marker
CN111294520B (en) FPGA-based real-time lucky imaging method and system
CN103793873A (en) Obtaining method and device for image pixel mid value
WO2019136761A1 (en) Three-dimensional convolution device for recognizing human action
Guo et al. DSCA: A Dual Semantic Correlation Alignment Method for domain adaptation object detection
WO2023184754A1 (en) Configurable real-time disparity point cloud computing apparatus and method
WO2021070303A1 (en) Computation processing device
CN113900813B (en) Blind pixel filling method, system and device based on double-port RAM
CN111191780B (en) Averaging pooling accumulation circuit, device and method
CN108460784A (en) A kind of gray scale and binary image dilation erosion processing method based on FPGA
CN115601223A (en) Image preprocessing device, method and chip
CN112035056B (en) Parallel RAM access equipment and access method based on multiple computing units
CN103093485A (en) Full view video cylindrical surface image storage method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant