CN103546752B - Image size compression and traversal method based on a parallel hardware architecture - Google Patents

Info

Publication number
CN103546752B
Authority
CN
China
Prior art keywords
row
compression
address
data
generating module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310482288.6A
Other languages
Chinese (zh)
Other versions
CN103546752A (en)
Inventor
徐向民
陈晓仕
吴岱玲
黄帅凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT
Priority to CN201310482288.6A
Publication of CN103546752A
Application granted
Publication of CN103546752B
Legal status: Active
Anticipated expiration

Landscapes

  • Memory System Of A Hierarchy Structure (AREA)
  • Image Input (AREA)

Abstract

The invention discloses an image size compression and traversal method based on a parallel hardware architecture, comprising the following steps: after the image has been fully stored, address generation module 1 computes the next row address, and the image data at that address in RAM is read into a row buffer; at the same time, address generation module 2 computes the compressed column addresses, and the corresponding row-buffer data is written to the Nth output cache row, where N cycles through 1 to M+K, until one row has been compressed; once M output cache rows have been filled for the first time, address generation module 3 starts working; while the compression level is unchanged, address generation module 3 outputs the data of the M output cache rows from row N-M to row N-1, reading from these rows until the length of the data read equals the row width of the current compression level; when the compression level changes, the operating pointer of address generation module 3 moves down K rows; these steps are repeated until the compression and traversal of the image is complete. The method uses few resources and runs fast.

Description

Image size compression and traversal method based on a parallel hardware architecture
Technical field
The invention relates to the technical field of target feature search, and in particular to an image size compression and traversal method based on a parallel hardware architecture.
Background
As the demand for machine intelligence grows, expectations of human-computer interaction systems keep rising. System resolution is an important technical specification. Real-time processing of large volumes of data is a defining feature of today's intelligent human-computer interaction systems, and it is also the technical bottleneck that limits the development of high-resolution systems.
In the field of target detection there are two technical directions. One is the detection of a specific target, for example identifying whether the person in an image is "Zhang San". The other is target search, for example finding how many "people" an image contains and where they are. The biggest difference between the two is that the latter requires the image to be compressed in size (to adapt to targets of different sizes) and traversed (to search every location in the image where a target might appear).
For the latter, image size compression and traversal refers to compressing a large image in size by a specified ratio and traversing the image with a window of a given size. It is an important front-end preprocessing technique in target detection algorithms. The performance of the compression and traversal method directly affects the speed of back-end processing and the scale of the whole system.
Target detection has to process large amounts of data, and the real-time requirements of human-computer interaction have constrained the development of the technology to a great extent. Existing approaches address this problem mainly in two ways: one reduces algorithm complexity, meeting the real-time requirement at the cost of resolution; the other accelerates processing with a parallel architecture, obtaining higher processing speed and resolution at the cost of resources.
There are three mainstream ways to accelerate image size compression and traversal with a parallel architecture.
The first method stores the whole image in registers, compresses it row by row by generating addresses, and traverses it through line buffers. Line-buffer traversal works as follows: assuming a 20*20 traversal window and a 640*480 original image, 20 buffers of length 640 are instantiated. Data enters at the first row and is shifted; the output of the first row feeds the second row, and so on. Once all 20 row buffers are full, the module outputs 20 values at once, all from the same column. Each additional value output therefore moves the window by one position, which accomplishes the traversal.
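The shifting behaviour of the line buffers described above can be modelled in software. The following Python sketch is only an illustration under stated assumptions: the function name `linebuffer_columns` and the use of a single deque to stand in for the cascaded row buffers are hypothetical and are not taken from the patent.

```python
from collections import deque


def linebuffer_columns(image, window_rows):
    """Model cascaded line buffers: yield one column of vertically aligned
    pixels per incoming pixel, once enough rows have been shifted in."""
    width = len(image[0])
    # (window_rows - 1) full rows plus the current pixel span every tap.
    chain = deque(maxlen=(window_rows - 1) * width + 1)
    for row in image:
        for pixel in row:
            chain.append(pixel)  # shifting in drops the oldest pixel
            if len(chain) == chain.maxlen:
                # Taps spaced one row apart, oldest row first.
                yield [chain[k * width] for k in range(window_rows)]


# Tiny 3x4 image whose pixel value encodes its position as 10*row + column.
image = [[10 * r + c for c in range(4)] for r in range(3)]
columns = list(linebuffer_columns(image, window_rows=2))
print(columns[0], columns[-1], len(columns))
```

For the 20*20 window and 640*480 image of the example, `window_rows` would be 20, each yielded column would hold 20 pixels, and 20 consecutive columns would make up one window.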
The second method selects pixels as the data comes in and stores the compressed images in registers; only the compressed images are stored, and traversal output is performed by generating addresses.
The third method caches the whole image in RAM, reads the image from RAM to compress it, writes the compressed image back to RAM, and then reads the compressed image from RAM into line buffers for traversal.
Whichever of the three methods is used, there are significant limitations in practical applications.
The first limitation is resource consumption. The first method consumes 640*(480+20)*8 registers, and this register consumption translates directly into a larger system area. For the second method, assuming a compression factor of 1.25 and compression levels three through ten, the required storage is 640*480*0.7*8 registers, fewer than the first method uses; but because compression is done at the front end, the buffer region of each level needs its own address arithmetic, which raises algorithm complexity and the consumption of logic resources. The third method uses external memory, so it consumes very few registers and its cost is relatively low.
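The register counts quoted above can be checked with a few lines of arithmetic. The Python sketch below rests on two assumptions that the text does not state explicitly: that the trailing factor 8 is the number of bits per pixel, and that the factor 0.7 is the summed area of the compressed images for levels 3 to 10 relative to the original image.

```python
BITS_PER_PIXEL = 8  # assumption: the trailing *8 in the text is 8 bits per pixel
WIDTH, HEIGHT, WINDOW_ROWS = 640, 480, 20

# First method: whole image plus 20 line-buffer rows held in registers.
method1_bits = WIDTH * (HEIGHT + WINDOW_ROWS) * BITS_PER_PIXEL

# Second method: one compressed copy per level; level c shrinks each side by a**c,
# so its area is a**(-2*c) of the original (assumed origin of the 0.7 factor).
a = 1.25
area_fraction = sum(a ** (-2 * c) for c in range(3, 11))  # levels 3..10
method2_bits = WIDTH * HEIGHT * area_fraction * BITS_PER_PIXEL

print(method1_bits)             # 2560000
print(round(area_fraction, 3))  # 0.708
print(round(method2_bits))
```

Under these assumptions the summed area comes out at about 0.708 of the original image, which matches the 0.7 used in the text, and the second method needs roughly two thirds of the register bits of the first.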
The second limitation is speed. The second method traverses by generating addresses and reads one value at a time; reading 20 values takes 20 clock cycles, which greatly increases the time overhead compared with the first method. The third method has to read and write the image repeatedly and is limited by the data width of the external memory, so it is slower than the second method, at roughly 1/8 of its speed (for 8 compression levels).
The difficulty of image size compression and traversal therefore lies in raising speed as much as possible while using as few hardware resources as possible.
Summary of the invention
The object of the invention is to overcome the shortcomings and deficiencies of the prior art by providing an image size compression and traversal method based on a parallel hardware architecture that uses few resources and runs fast.
The object of the invention is achieved by the following technical solution:
An image size compression and traversal method based on a parallel hardware architecture, comprising the following steps in order:
(1) While the raw image data is being stored into RAM (random access memory), once the first row of data has been stored, it is read into the first row buffer;
(2) After the whole image has been stored, compression begins;
(3) Address generation module 1 computes the next row address from the compression factor and the compression level, and the image data at that address in RAM is read into the second row buffer or the first row buffer;
(4) While step (3) is in progress, address generation module 2 starts operating on the first row buffer or the second row buffer: it computes the compressed column addresses from the compression factor and the compression level, and the row-buffer data at those addresses is written to the Nth output cache row, where N cycles through 1 to M+K;
(5) Step (4) is repeated until one row has been compressed; each time a row is completed, the compressed data of the next row is written to row N+1; if N+K is greater than M+K, N is set to 1;
(6) Steps (3) to (5) are repeated until M output cache rows have been filled for the first time, at which point address generation module 3 starts working;
(7) While the compression level is unchanged, address generation module 3 outputs the data of the M output cache rows from row N-M to row N-1, where N is the number of the output cache row currently being written in step (4); when N-M is less than 1, N-M is taken as N+K, and when N equals 1, N-1 is taken as M+K; data is read from these M output cache rows simultaneously until the length of the data read equals the row width of the current compression level;
(8) When the compression level changes, each row has fewer pixels, so step (5) may finish row N before step (7) has finished processing the output cache data of rows N-M to N-1. Therefore, if step (7) has not finished when step (5) completes row N, steps (4) and (5) continue with row N+1 while the numbers of the M output cache rows being processed in step (7) stay unchanged; if step (7) has finished, it waits for step (5) to complete;
(9) Steps (3) to (8) are repeated until the compression and traversal of the image is complete.
Step (3) is specifically: address generation module 1 computes, from the compression factor a and the current compression level c, the address of the next row for the current compression level, and the row is fetched from RAM and stored in a row buffer. The calculation is: if n rows have already been compressed, the row address of row n+1 is floor(n*a^c), where floor denotes rounding down.
Step (4) is specifically: address generation module 2 computes, from the compression factor a and the current compression level c, the address of the next column for the current compression level, and the data is read out of the row buffer. The calculation is: if n columns have already been compressed, the column address of column n+1 is floor(n*a^c), where floor denotes rounding down.
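The address rule of steps (3) and (4), the floor of n times a raised to the power c, can be tried out directly. The Python sketch below is an illustration only: the function name `compressed_addresses` is hypothetical, and stopping after floor(length / a^c) outputs is an assumption chosen to agree with the 640/(1.25)^3 = 327 example given in the description.

```python
from fractions import Fraction
from math import floor


def compressed_addresses(length, a, c):
    """Source addresses floor(n * a**c) for n = 0, 1, ... along one dimension."""
    scale = Fraction(a) ** c         # exact arithmetic avoids float rounding
    out_len = floor(length / scale)  # assumed compressed length, rounded down
    return [floor(n * scale) for n in range(out_len)]


# Column addresses for a 640-pixel row at compression level 3 with a = 5/4.
cols = compressed_addresses(640, Fraction(5, 4), 3)
print(len(cols), cols[:6], cols[-1])
```

The same function gives the row addresses when called with the image height, since steps (3) and (4) apply the same formula to rows and columns.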
Compared with the prior art, the invention has the following advantages and beneficial effects:
1. Few hardware resources and low cost: the hardware architecture of the traversal method comprises one off-chip RAM, two row buffers, one data selector, M+K output cache rows and three address generation modules. K is determined by the compression factor a and equals a rounded up; for example, if a is 5/4, K equals 2. M is determined by the traversal window height; for a 20*25 window, M = 25. The row buffer size is determined by the original image length; for a 640*480 image, the row buffer length is 640. The output cache length is determined by the compressed length at the first level used; for example, with a compression factor of 1.25, an original image length of 640 and compression starting from the third level, the output cache length is 640/(1.25)^3 = 327 (rounded down).
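The sizing rules in the paragraph above can be collected into one small calculation. The Python sketch below is a hedged illustration: the function name `buffer_plan` and the dictionary keys are hypothetical, and only the rules themselves (K is a rounded up, M is the window height, the output cache length is the image length divided by a^c at the first level and rounded down) come from the text.

```python
from fractions import Fraction
from math import ceil, floor


def buffer_plan(image_width, window_height, a, first_level):
    """Derive the buffer dimensions from the image and window parameters."""
    a = Fraction(a)
    k = ceil(a)        # K: compression factor rounded up
    m = window_height  # M: traversal window height
    return {
        "K": k,
        "M": m,
        "output_cache_rows": m + k,
        "row_buffer_length": image_width,
        "output_cache_length": floor(image_width / a ** first_level),
    }


# The example from the text: 640-wide image, 20*25 window, a = 5/4, start at level 3.
plan = buffer_plan(640, 25, Fraction(5, 4), 3)
print(plan)
```

For this example the plan contains K = 2, M = 25, 27 output cache rows, a row buffer length of 640 and an output cache length of 327.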
2. Fast operation: the invention compresses an image to the specified compression ratios and traverses it with an uninterrupted window within a single compression level; at the transition between two compression levels only a very short wait is needed before traversal of the next level's compressed image can begin, so operation is fast.
Brief description of the drawings
Fig. 1 is the hardware architecture diagram of the image size compression and traversal method based on a parallel hardware architecture according to the invention;
Fig. 2 is a schematic diagram of address generation module 1 and address generation module 2 of the method of Fig. 1 cooperating to perform image compression;
Fig. 3 is a diagram of the traversal operation of the method of Fig. 1.
Detailed description
An image size compression and traversal method based on a parallel hardware architecture, comprising the following steps in order:
(1) While the raw image data is being stored into RAM (random access memory), once the first row of data has been stored, it is read into the first row buffer;
(2) After the whole image has been stored, compression begins;
(3) Address generation module 1 computes the next row address from the compression factor and the compression level, and the image data at that address in RAM is read into the second row buffer or the first row buffer. Specifically, address generation module 1 computes, from the compression factor a and the current compression level c, the address of the next row for the current compression level, and the row is fetched from RAM and stored in a row buffer. The calculation is: if n rows have already been compressed, the row address of row n+1 is floor(n*a^c), where floor denotes rounding down;
(4) While step (3) is in progress, address generation module 2 starts operating on the first row buffer or the second row buffer: it computes the compressed column addresses from the compression factor and the compression level, and the row-buffer data at those addresses is written to the Nth output cache row, where N cycles through 1 to M+K. Specifically, address generation module 2 computes, from the compression factor a and the current compression level c, the address of the next column for the current compression level, and the data is read out of the row buffer. The calculation is: if n columns have already been compressed, the column address of column n+1 is floor(n*a^c), where floor denotes rounding down;
(5) Step (4) is repeated until one row has been compressed; each time a row is completed, the compressed data of the next row is written to row N+1; if N+K is greater than M+K, N is set to 1;
(6) Steps (3) to (5) are repeated until M output cache rows have been filled for the first time, at which point address generation module 3 starts working;
(7) While the compression level is unchanged, address generation module 3 outputs the data of the M output cache rows from row N-M to row N-1, where N is the number of the output cache row currently being written in step (4); when N-M is less than 1, N-M is taken as N+K, and when N equals 1, N-1 is taken as M+K; data is read from these M output cache rows simultaneously until the length of the data read equals the row width of the current compression level;
(8) When the compression level changes, each row has fewer pixels, so step (5) may finish row N before step (7) has finished processing the output cache data of rows N-M to N-1. Therefore, if step (7) has not finished when step (5) completes row N, steps (4) and (5) continue with row N+1 while the numbers of the M output cache rows being processed in step (7) stay unchanged; if step (7) has finished, it waits for step (5) to complete;
(9) Steps (3) to (8) are repeated until the compression and traversal of the image is complete.
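The row bookkeeping of steps (5) and (7) treats the M+K output cache rows as a ring. The Python sketch below illustrates this with hypothetical helper names (`wrap`, `rows_for_traversal`, `next_write_row`). It assumes that row numbers wrap over the full range 1..M+K, which is consistent with the rules "N-M is taken as N+K" and "N-1 is taken as M+K" in step (7); the exact wrap condition for the write row in step (5) is an assumption here.

```python
def wrap(row, total):
    """Map any integer row number onto the circular range 1..total."""
    return (row - 1) % total + 1


def rows_for_traversal(n, m, k):
    """Output cache rows N-M .. N-1 read in step (7), wrapped into 1..M+K."""
    total = m + k
    return [wrap(r, total) for r in range(n - m, n)]


def next_write_row(n, m, k):
    """Row written after row N; assumes the write wraps past M+K back to 1."""
    return wrap(n + 1, m + k)


# Small example with M = 3 window rows and K = 2 spare rows (5 rows in total).
for n in (4, 1, 2):
    print(n, rows_for_traversal(n, 3, 2))
print(next_write_row(5, 3, 2))
```

With N = 1, M = 3 and K = 2 the first row read is N-M = -2, which wraps to N+K = 3, and the last row read is N-1 = 0, which wraps to M+K = 5, matching the two special cases in step (7).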
Fig. 1 is the hardware architecture diagram of the image size compression and traversal method based on a parallel hardware architecture. The RAM (random access memory), address generation module 1, address generation module 2, row buffer 1, row buffer 2 and a data selector constitute the image compression functional module; the output caches numbered 1 to M+K, address generation module 2 and address generation module 3 constitute the image traversal functional module; control signals are exchanged among the three address generation modules.
The image compression steps are described below with reference to Fig. 2:
Step 1: Once an image has been fully stored in RAM, address generation module 1 starts working; its operating pointer points to row buffer 1, and it waits until the first row of data in RAM has been stored into the row buffer that its operating pointer currently points to;
Step 2: After the first row has been stored, address generation module 1 generates the address of the next row and its operating pointer points to row buffer 2; at the same time, address generation module 1 sends a buffering-complete signal to address generation module 2, and address generation module 2 starts working;
Step 3: The operating pointer of address generation module 2 points to row buffer 1 (the operating pointer of address generation module 1 currently points to row buffer 2); address generation module 2 computes the compressed column addresses according to the current compression level and reads the corresponding column data from the row buffer that its operating pointer currently points to;
Step 4: The select pointer of the data selector follows the operating pointer of address generation module 2, so the data selector outputs the data of the row buffer that this pointer points to;
Step 5: After address generation module 2 finishes compressing one row, its operating pointer switches to the other row buffer and it sends a compression-complete signal to address generation module 1; address generation module 1 generates the address of the next row and its operating pointer switches to the other row buffer;
Step 6: Steps 2 to 5 are repeated to complete the multi-level size compression of the image;
The operating pointers of address generation module 1 and address generation module 2 always point to different row buffers, so the two address generation modules operate relatively independently and in parallel, which keeps the data output uninterrupted.
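The alternating use of the two row buffers described above is a ping-pong (double-buffering) scheme. The Python sketch below lists, for each time slot, which row is being loaded into which buffer and which row is being compressed from which buffer. It is a simplified software model: the function name `ping_pong_schedule` is hypothetical, buffers 0 and 1 stand for row buffers 1 and 2, the row indices count fetched rows rather than RAM addresses, and loading and compressing a row are assumed to take one slot each.

```python
def ping_pong_schedule(num_rows):
    """Per time slot: (row loaded, buffer loaded, row compressed, buffer compressed)."""
    slots = []
    for t in range(num_rows + 1):
        # Module 1 loads fetched row t into buffer t % 2 while rows remain.
        load = (t, t % 2) if t < num_rows else (None, None)
        # Module 2 compresses the row loaded in the previous slot.
        comp = (t - 1, (t - 1) % 2) if t >= 1 else (None, None)
        slots.append(load + comp)
    return slots


schedule = ping_pong_schedule(3)
for slot in schedule:
    print(slot)
```

In every slot where both modules are active, the load buffer and the compress buffer differ, which is the property that lets the two modules run in parallel without stalling each other.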
The image traversal steps are described below with reference to Fig. 3:
Step 1: While carrying out compression step 4, address generation module 2 generates the output cache write address, and the data Data2 coming out of the data selector is written into the output cache row that the write pointer currently points to;
Step 2: Step 1 is repeated; after a row has been written, address generation module 2 sends a write-complete signal to address generation module 3 and the write pointer moves to the next output cache row; when the write pointer points to row M+K, it moves to the first row after the write completes;
Step 3: When M rows have been filled for the first time, address generation module 3 starts working; the traversal pointer points to row M, and address generation module 3 operates simultaneously on the M rows at and above row M;
Step 4: Address generation module 3 computes the traversal output address, running from 1 to the image length of the current compression level, and reads M compressed image values at a time from the M output cache rows. If the window size is N*M, the first N values output form one window, and when value N+1 is output it forms a window together with the preceding N-1 values. This step is repeated until the output length equals the image length of the current compression level, which completes one row of window traversal; after waiting for the write-complete signal from address generation module 2, the traversal pointer moves to the next row or down K rows;
In step 4 the traversal pointer moves in one of two ways. When the compression level is unchanged, that is, when the data Data2 being written and the previously written data belong to the compressed image of the same compression level, the traversal pointer moves to the next row. When the compression level changes, that is, when the compression level of the data Data2 being written differs from the compression level of the data the traversal pointer currently points to, the traversal pointer moves down K rows.
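The window formation in traversal step 4 can be sketched in software as a sliding window over the M output cache rows. The Python sketch below is an illustration only: the function name `traverse_windows` and the parameter `window_width` (the window width written as N in step 4) are hypothetical, and the M cache rows are modelled as plain lists.

```python
from collections import deque


def traverse_windows(cache_rows, window_width):
    """Read one column from the M cache rows per step and yield every window."""
    recent = deque(maxlen=window_width)  # the last window_width columns
    for col in range(len(cache_rows[0])):
        recent.append([row[col] for row in cache_rows])  # M values at once
        if len(recent) == window_width:
            # Transpose the stored columns back into M rows of window_width pixels.
            yield [list(r) for r in zip(*recent)]


rows = [[0, 1, 2, 3], [10, 11, 12, 13]]  # M = 2 rows of compressed data
windows = list(traverse_windows(rows, window_width=2))
print(len(windows), windows[0], windows[-1])
```

A row of length L yields L - window_width + 1 windows: the first window is complete after window_width columns, and each further column produces one more window by reusing the preceding window_width - 1 columns.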
The above embodiment is a preferred embodiment of the invention, but embodiments of the invention are not limited by it. Any other change, modification, substitution, combination or simplification made without departing from the spirit and principle of the invention shall be regarded as an equivalent replacement and falls within the scope of protection of the invention.

Claims (3)

1. An image size compression and traversal method based on a parallel hardware architecture, comprising the following steps in order:
(1) While the raw image data is being stored into RAM, once the first row of data has been stored, it is read into the first row buffer;
(2) After the whole image has been stored, compression begins;
(3) Address generation module 1 computes the next row address from the compression factor a and the compression level c, and the image data at that address in RAM is read into the second row buffer or the first row buffer;
(4) While step (3) is in progress, address generation module 2 starts operating on the first row buffer or the second row buffer: it computes the compressed column addresses from the compression factor a and the compression level c, and the row-buffer data at those addresses is written to the Nth output cache row, where N cycles through 1 to M+K;
(5) Step (4) is repeated until one row has been compressed; each time a row is completed, the compressed data of the next row is written to row N+1; if N+K is greater than M+K, N is set to 1;
(6) Steps (3) to (5) are repeated until M output cache rows have been filled for the first time, at which point address generation module 3 starts working;
(7) While the compression level c is unchanged, address generation module 3 outputs the data of the M output cache rows from row N-M to row N-1, where N is the number of the output cache row currently being written in step (4); when N-M is less than 1, N-M is taken as N+K, and when N equals 1, N-1 is taken as M+K; data is read from these M output cache rows simultaneously until the length of the data read equals the row width of the current compression level c;
(8) When the compression level c changes, each row has fewer pixels, so step (5) may finish row N before step (7) has finished processing the output cache data of rows N-M to N-1. Therefore, if step (7) has not finished when step (5) completes row N, steps (4) and (5) continue with row N+1 while the numbers of the M output cache rows being processed in step (7) stay unchanged; if step (7) has finished, it waits for step (5) to complete;
(9) Steps (3) to (8) are repeated until the compression and traversal of the image is complete.
2. The image size compression and traversal method based on a parallel hardware architecture according to claim 1, characterized in that step (3) is specifically: address generation module 1 computes, from the compression factor a and the current compression level c, the address of the next row for the current compression level, and the row is fetched from RAM and stored in a row buffer; the calculation is: if n rows have already been compressed, the row address of row n+1 is floor(n*a^c), where floor denotes rounding down.
3. The image size compression and traversal method based on a parallel hardware architecture according to claim 1, characterized in that step (4) is specifically: address generation module 2 computes, from the compression factor a and the current compression level c, the address of the next column for the current compression level, and the data is read out of the row buffer; the calculation is: if n columns have already been compressed, the column address of column n+1 is floor(n*a^c), where floor denotes rounding down.
CN201310482288.6A 2013-10-15 2013-10-15 Image size compression and traversal method based on a parallel hardware architecture Active CN103546752B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310482288.6A CN103546752B (en) 2013-10-15 2013-10-15 Image size compression and traversal method based on a parallel hardware architecture

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310482288.6A CN103546752B (en) 2013-10-15 2013-10-15 Image size compression and traversal method based on a parallel hardware architecture

Publications (2)

Publication Number Publication Date
CN103546752A CN103546752A (en) 2014-01-29
CN103546752B true CN103546752B (en) 2016-10-05

Family

ID=49969746

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310482288.6A Active CN103546752B (en) 2013-10-15 2013-10-15 Image size compression and traversal method based on a parallel hardware architecture

Country Status (1)

Country Link
CN (1) CN103546752B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN2775974Y (en) * 2004-12-31 2006-04-26 北京中星微电子有限公司 DMA controller for Mpeg-4 movement evaluation method
CN101753950A (en) * 2008-11-28 2010-06-23 深圳迈瑞生物医疗电子股份有限公司 Method for processing ultrasonic image frame interpolation and ultrasonic system thereof
CN102117326A (en) * 2011-02-28 2011-07-06 华南理工大学 Traversal method used for searching image features
CN102263880A (en) * 2010-05-25 2011-11-30 安凯(广州)微电子技术有限公司 Image scaling method and apparatus thereof
CN102333212A (en) * 2010-07-14 2012-01-25 北京大学 Bilinear two-fold upsampling method and system thereof

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW594743B (en) * 2001-11-07 2004-06-21 Fujitsu Ltd Memory device and internal control method therefor

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN2775974Y (en) * 2004-12-31 2006-04-26 北京中星微电子有限公司 DMA controller for Mpeg-4 movement evaluation method
CN101753950A (en) * 2008-11-28 2010-06-23 深圳迈瑞生物医疗电子股份有限公司 Method for processing ultrasonic image frame interpolation and ultrasonic system thereof
CN102263880A (en) * 2010-05-25 2011-11-30 安凯(广州)微电子技术有限公司 Image scaling method and apparatus thereof
CN102333212A (en) * 2010-07-14 2012-01-25 北京大学 Bilinear two-fold upsampling method and system thereof
CN102117326A (en) * 2011-02-28 2011-07-06 华南理工大学 Traversal method used for searching image features

Also Published As

Publication number Publication date
CN103546752A (en) 2014-01-29

Similar Documents

Publication Publication Date Title
CN109598338B (en) Convolutional neural network accelerator based on FPGA (field programmable Gate array) for calculation optimization
CN108681984B (en) Acceleration circuit of 3*3 convolution algorithm
CN105872432B (en) The apparatus and method of quick self-adapted frame rate conversion
US10257456B2 (en) Hardware friendly virtual frame buffer
CN106021182B (en) A kind of row transposition architecture design method based on Two-dimensional FFT processor
CN103151015A (en) Overdrive method, circuit, display panel and display device
CN208766715U (en) The accelerating circuit of 3*3 convolution algorithm
CN102509071B (en) Optical flow computation system and method
CN103647937A (en) An image tracking system and an image data processing method thereof
CN110390382B (en) Convolutional neural network hardware accelerator with novel feature map caching module
CN113792621B (en) FPGA-based target detection accelerator design method
CN103760525A (en) Completion type in-place matrix transposition method
US20140225902A1 (en) Image pyramid processor and method of multi-resolution image processing
CN101599167B (en) Access method of memory
CN101426134A (en) Hardware device and method for video encoding and decoding
CN103106412B (en) Flaky medium recognition methods and recognition device
CN103389413A (en) Real-time statistical method for frequency spectrum histogram
CN104869284A (en) High-efficiency FPGA implementation method and device for bilinear interpolation amplification algorithm
CN103546752B (en) Image size compression and traversal method based on a parallel hardware architecture
CN117115200A (en) Hierarchical data organization for compact optical streaming
CN101796845A (en) Device for motion search in dynamic image encoding
CN105204799A (en) Method for increasing display refreshing rate of multi-channel deep memory logic analyzer
CN112837337B (en) Method and device for identifying connected region of massive pixel blocks based on FPGA
CN1297899C (en) Digital images matching chip
CN103108162A (en) High-definition high-frame-rate real-time video anti-reflection instrument

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant