CN103546752A - Image size compression traversing method on basis of hardware parallel architecture - Google Patents

Publication number
CN103546752A
Authority
CN
China
Legal status
Granted
Application number
CN201310482288.6A
Other languages
Chinese (zh)
Other versions
CN103546752B (en)
Inventor
徐向民
陈晓仕
吴岱玲
黄帅凯
Current Assignee
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date
2013-10-15
Filing date
2013-10-15
Publication date
2014-01-29
Application filed by South China University of Technology SCUT
Priority to CN201310482288.6A
Publication of CN103546752A
Application granted
Publication of CN103546752B
Legal status: Active

Landscapes

  • Memory System Of A Hierarchy Structure (AREA)
  • Image Input (AREA)

Abstract

The invention discloses an image size compression and traversal method based on a hardware parallel architecture. The method comprises: storing an image; having a first address generating module generate the address of the next row and read the corresponding image data in RAM (random access memory) into the second/first row buffer area; having a second address generating module generate the compressed column addresses for the first/second row buffer area and output the corresponding buffered data to the N-th row of an output cache; starting a third address generating module when M rows of the output cache have been filled for the first time; while the compression stage is unchanged, having the third address generating module output the data of the M output cache rows from row N-M to row N-1, reading from the output cache until the length of data read equals the row width of the current compression stage; when the compression stage changes, moving the operating pointer of the third address generating module to the next K rows; and repeating these steps until compression and traversal of the image are complete. N cycles through the values 1 to M+K, advancing each time the compression of a row is completed. The method occupies few resources and runs fast.

Description

An image size compression and traversal method based on a hardware parallel architecture
Technical field
The present invention relates to the technical field of target feature search, and in particular to an image size compression and traversal method based on a hardware parallel architecture.
Background
As demand for machine intelligence grows, so does the demand for human-computer interaction systems. System resolution is an important technical indicator of such systems. Real-time processing of large volumes of data is the main characteristic of existing intelligent human-computer interaction systems, and it is also a technical bottleneck that restricts the development of high-resolution systems.
Target detection has two technical directions. One is the detection of a specific target, for example determining whether the person in an image is "Zhang San". The other is target search, for example finding how many "people" appear in an image and where they are. The main difference between the two is that the latter requires size compression of the image (to adapt to targets of different sizes) and traversal (to search every place in the image where a target may appear).
For the latter, image size compression and traversal refers to compressing a large image in size by a specific ratio and then traversing the image with a window of a given size. It is an important front-end preprocessing technique in target detection algorithms. The performance of the compression and traversal method directly affects the speed of back-end processing and the scale of the whole system.
Target detection has to process a large amount of data, and the real-time requirements of human-computer interaction have restricted its development to a large extent. Existing techniques address this problem mainly in two ways: one reduces algorithm complexity, meeting real-time requirements at the cost of resolution; the other uses a parallel architecture for acceleration, obtaining faster processing and higher resolution at the cost of resources.
There are three mainstream ways of accelerating image size compression and traversal with a parallel architecture.
The first method stores the whole image in registers, compresses it row by row through address calculation, and traverses it using line buffers. Line-buffer traversal works as follows: suppose the traversal window is 20*20 and the original image is 640*480; 20 buffers of length 640 are instantiated. Data enters the first row and is shifted along; data leaving the first row enters the second row, and so on. Once all 20 row buffers are full, the module outputs 20 data items simultaneously, all from the same column. Each output step therefore moves the window by one position, which accomplishes the traversal.
The second method selects pixels as the data is input, so that only the compressed images are stored in registers, and traversal output is performed by address calculation.
The third method caches the whole image in RAM, reads the image from RAM for compression, writes the compressed image back to RAM, and then reads the compressed image from RAM into line buffers for traversal.
Whichever of the three methods is adopted, it is severely limited in practical application.
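The line-buffer traversal used by the first method can be modelled in software roughly as follows. This is a minimal sketch rather than the hardware design: the function name `line_buffer_columns` and the tiny test image are hypothetical, and a `deque` of whole rows stands in for the chain of shift registers, which in hardware would shift one pixel per clock.

```python
from collections import deque

def line_buffer_columns(image, window_h):
    """Yield (row, col, pixels) once window_h rows are buffered.

    pixels holds one value from each buffered row, all in the same column.
    """
    buffers = deque(maxlen=window_h)      # models the stack of line buffers
    for r, row in enumerate(image):
        buffers.append(row)               # the oldest row drops out automatically
        if len(buffers) == window_h:
            for c in range(len(row)):
                yield r, c, [buf[c] for buf in buffers]

# Hypothetical 3x4 image whose pixel value encodes its position (10*row + col).
image = [[10 * r + c for c in range(4)] for r in range(3)]
columns = list(line_buffer_columns(image, window_h=2))
```

Each yielded column corresponds to one output step of the buffer stack; collecting as many consecutive columns as the window is wide gives one window position.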
The first limitation is resource consumption. The first method needs 640*(480+20)*8 registers, and this register consumption translates directly into a larger system area. In the second method, assuming a compression factor of 1.25 and stages from the third to the tenth, the required storage is 640*480*0.7*8 registers, less than in the first method; however, because compression takes place at the front end, the buffers of each stage need independent address arithmetic, which raises algorithm complexity and the consumption of logic resources. In the third method, because external memory is used, register consumption is very small and the cost is relatively low.
The second limitation is speed. In the second method, traversal by address calculation reads one data item at a time, so reading 20 data items takes 20 clock cycles, which greatly increases the time overhead compared with the first method. In the third method, the image must be read and written repeatedly and throughput is limited by the data width of the external memory, so it is slower than the second method, at about 1/8 of its speed (for 8 compression stages).
The difficulty of image size compression and traversal therefore lies in maximizing speed while using as few hardware resources as possible.
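The register counts quoted for the first two methods can be checked with a few lines of arithmetic. The text does not say how the factor 0.7 is obtained; one reading that reproduces it, assumed here, is that one compressed image is stored for each stage from 3 to 10 and that the area of stage c is the original area divided by 1.25^(2c). The constant and variable names are illustrative only.

```python
WIDTH, HEIGHT, WINDOW, BITS = 640, 480, 20, 8

# First method: the whole image plus 20 line buffers, 8 bits per pixel.
method1_bits = WIDTH * (HEIGHT + WINDOW) * BITS

# Second method (assumed reading): one stored image per stage 3..10.
# Each stage shrinks both dimensions, so its area scales by 1 / 1.25**(2*c).
area_fraction = sum(1 / 1.25 ** (2 * c) for c in range(3, 11))
method2_bits = WIDTH * HEIGHT * area_fraction * BITS
```

Under this reading `area_fraction` comes out at roughly 0.71, which matches the 0.7 used in the text, and the second method needs roughly two thirds of the storage of the first.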
Summary of the invention
The object of the invention is to overcome the shortcomings and deficiencies of the prior art by providing an image size compression and traversal method based on a hardware parallel architecture that occupies few resources and runs fast.
The object of the invention is achieved by the following technical scheme:
An image size compression and traversal method based on a hardware parallel architecture, comprising the following steps in order:
(1) The raw image data is stored in RAM (random access memory); once the first row of data has been stored, it is read into the first row buffer area;
(2) After the whole image has been stored, the compression operation starts;
(3) Address generating module one calculates the address of the next row from the compression factor and the compression stage, and the image data at that address in RAM is read into the second/first row buffer area;
(4) While step (3) is carried out, address generating module two operates on the first/second row buffer area: it calculates the compressed column address from the compression factor and the compression stage, and the buffered data at that address is output to row N of the output cache, where N cycles through 1 to M+K;
(5) Step (4) is repeated until the compression of one row is complete. Each time a row is completed, the compressed data of the next row is output to row N+1; if N+K is greater than M+K, N equals 1;
(6) Steps (3) to (5) are repeated until M rows of the output cache have been filled for the first time, at which point address generating module three starts working;
(7) While the compression stage is unchanged, address generating module three outputs the data of the M output cache rows from row N-M to row N-1, where N is the number of the output cache row being operated on in step (4); when N-M is less than 1, N-M is taken as N+K, and when N equals 1, N-1 is taken as M+K. Data is read from these M output cache rows simultaneously until the length of data read equals the row width of the current compression stage;
(8) When the compression stage changes, each row contains fewer pixels, so step (5) may finish row N before step (7) has finished processing the M output cache rows N-M to N-1. Therefore, if step (7) has not finished when step (5) completes row N, steps (4) and (5) continue with row N+1 while the numbering of the M output cache rows operated on in step (7) stays unchanged; if step (7) has finished, it waits for step (5) to complete;
(9) Steps (3) to (8) are repeated until the compression and traversal of the image are complete.
Step (3) is specifically as follows: address generating module one calculates, from the compression factor a and the current compression stage c, the address of the next row for the current compression stage, and that row is fetched from RAM and stored in a buffer area. The calculation is: if n rows have been compressed, the row address of row n+1 is floor(n * a^c), where floor denotes rounding down.
Step (4) is specifically as follows: address generating module two calculates, from the compression factor a and the current compression stage c, the address of the next column for the current compression stage, and the data at that address is read out of the buffer area. The calculation is: if n columns have been compressed, the column address of column n+1 is floor(n * a^c), where floor denotes rounding down.
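The address rule of steps (3) and (4) can be sketched as a single function, since rows and columns use the same formula. The sketch assumes that the expression printed as "ac" in the source means a raised to the power c, which is consistent with the 640/1.25^3 = 327 example given for the output cache length. The name `source_index` is hypothetical, and `Fraction` is used only to keep the arithmetic exact in software.

```python
from fractions import Fraction
from math import floor

def source_index(n, a, c):
    """Source row/column address of output row/column n+1: floor(n * a**c)."""
    return floor(n * Fraction(a) ** c)

# Compression factor 5/4 at stage 3, i.e. a step of 125/64 source pixels.
a = Fraction(5, 4)
first_addresses = [source_index(n, a, 3) for n in range(6)]
```

For this factor and stage the first six addresses are 0, 1, 3, 5, 7 and 9: roughly every second source pixel is kept, with an occasional pair of adjacent pixels where the fractional step accumulates.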
Compared with the prior art, the present invention has the following advantages and beneficial effects:
1. Few hardware resources and lower cost: the hardware architecture of the traversal method comprises one off-chip RAM, two row buffer areas, one data selector, M+K rows of output cache and three address generating modules. K is determined by the compression factor a and equals a rounded up; for example, if a is 5/4, K equals 2. M is determined by the height of the traversal window; for a 20*25 window, M = 25. The size of the buffer areas is determined by the original image length; for example, for a 640*480 image the buffer length is 640. The output cache length is determined by the row length of the first compression stage used; for example, with a compression factor of 1.25, an original image length of 640 and compression starting from the third stage, the output cache length is 640/1.25^3 = 327 (rounded down).
2. High running speed: the invention can compress an image by a specified ratio and traverse it with a window without interruption within one compression stage; at the transition between two compression stages only a very short wait is needed before traversal of the next stage's image can begin.
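The sizing rules in advantage 1 can be collected into a small helper to reproduce the worked numbers (K = 2, M = 25, output cache length 327). The function name `buffer_sizes` and the dictionary keys are hypothetical; only the rules themselves come from the text.

```python
from fractions import Fraction
from math import ceil, floor

def buffer_sizes(width, window_h, a, first_stage):
    """Sizes of the on-chip buffers for a given image width and window height."""
    a = Fraction(a)
    k = ceil(a)                                      # K: compression factor rounded up
    return {
        "K": k,
        "M": window_h,                               # M: traversal window height
        "output_rows": window_h + k,                 # M + K rows of output cache
        "row_buffer_len": width,                     # each of the two row buffer areas
        "output_row_len": floor(width / a ** first_stage),
    }

sizes = buffer_sizes(width=640, window_h=25, a=Fraction(5, 4), first_stage=3)
```

With these example values the on-chip storage is two row buffers of 640 pixels plus 27 output rows of 327 pixels, far less than a full 640*480 frame.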
Brief description of the drawings
Fig. 1 is the hardware architecture diagram of the image size compression and traversal method based on a hardware parallel architecture according to the invention;
Fig. 2 is a diagram of how address generating module one and address generating module two cooperate to perform image compression in the method of Fig. 1;
Fig. 3 is a diagram of the traversal operation of the method of Fig. 1.
Detailed description
An image size compression and traversal method based on a hardware parallel architecture, comprising the following steps in order:
(1) The raw image data is stored in RAM (random access memory); once the first row of data has been stored, it is read into the first row buffer area;
(2) After the whole image has been stored, the compression operation starts;
(3) Address generating module one calculates the address of the next row from the compression factor and the compression stage, and the image data at that address in RAM is read into the second/first row buffer area. Specifically: address generating module one calculates, from the compression factor a and the current compression stage c, the address of the next row for the current compression stage, and that row is fetched from RAM and stored in a buffer area. The calculation is: if n rows have been compressed, the row address of row n+1 is floor(n * a^c), where floor denotes rounding down;
(4) While step (3) is carried out, address generating module two operates on the first/second row buffer area: it calculates the compressed column address from the compression factor and the compression stage, and the buffered data at that address is output to row N of the output cache, where N cycles through 1 to M+K. Specifically: address generating module two calculates, from the compression factor a and the current compression stage c, the address of the next column for the current compression stage, and the data at that address is read out of the buffer area. The calculation is: if n columns have been compressed, the column address of column n+1 is floor(n * a^c), where floor denotes rounding down;
(5) Step (4) is repeated until the compression of one row is complete. Each time a row is completed, the compressed data of the next row is output to row N+1; if N+K is greater than M+K, N equals 1;
(6) Steps (3) to (5) are repeated until M rows of the output cache have been filled for the first time, at which point address generating module three starts working;
(7) While the compression stage is unchanged, address generating module three outputs the data of the M output cache rows from row N-M to row N-1, where N is the number of the output cache row being operated on in step (4); when N-M is less than 1, N-M is taken as N+K, and when N equals 1, N-1 is taken as M+K. Data is read from these M output cache rows simultaneously until the length of data read equals the row width of the current compression stage;
(8) When the compression stage changes, each row contains fewer pixels, so step (5) may finish row N before step (7) has finished processing the M output cache rows N-M to N-1. Therefore, if step (7) has not finished when step (5) completes row N, steps (4) and (5) continue with row N+1 while the numbering of the M output cache rows operated on in step (7) stays unchanged; if step (7) has finished, it waits for step (5) to complete;
(9) Steps (3) to (8) are repeated until the compression and traversal of the image are complete.
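The wrap-around rules of step (7) amount to treating the M+K output cache rows as a ring. The sketch below computes which rows address generating module three reads while row N is being written. The function names are hypothetical, and the write-pointer wrap follows the traversal description (after row M+K the pointer returns to row 1) rather than the "N+K is greater than M+K" wording of step (5).

```python
def read_rows(n, m, k):
    """1-based output cache rows N-M .. N-1, wrapped into the range 1..M+K."""
    total = m + k
    return [((n - m + i - 1) % total) + 1 for i in range(m)]

def next_write_row(n, m, k):
    """Row written after row N; wraps from row M+K back to row 1."""
    return n % (m + k) + 1

# Small example: M = 3 window rows and K = 2 spare rows, 5 cache rows in total.
rows_when_writing_1 = read_rows(1, 3, 2)
rows_when_writing_4 = read_rows(4, 3, 2)
```

With M = 3 and K = 2, writing row 1 means reading rows 3, 4 and 5, which agrees with the stated rules that N-M becomes N+K = 3 and N-1 becomes M+K = 5. The K spare rows are what allow writing to run ahead of reading.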
Fig. 1 shows the hardware architecture of the image size compression and traversal method based on a hardware parallel architecture. The RAM (random access memory), address generating module one, address generating module two, buffer area one, buffer area two and a data selector form the image compression function module; the output caches numbered 1 to M+K, address generating module two and address generating module three form the image traversal function module. Control signals are exchanged among all three address generating modules.
The image compression operation is described below with reference to Fig. 2:
Step 1: Once an image has been stored in RAM, address generating module one starts working. Its operating pointer points to buffer area one, and it waits until the first row of data in RAM has been stored into the buffer area its operating pointer currently points to;
Step 2: After the first row has been stored, address generating module one generates the address of the next row and moves its operating pointer to buffer area two. At the same time, it sends a buffering-complete signal to address generating module two, which starts working;
Step 3: The operating pointer of address generating module two points to buffer area one (the operating pointer of address generating module one currently points to buffer area two). Address generating module two calculates the compressed column addresses for the current compression stage and reads the corresponding column data from the buffer area its operating pointer currently points to;
Step 4: The data selector selects and outputs the data of the buffer area that the operating pointer of address generating module two currently points to;
Step 5: After address generating module two completes the compression of one row, it moves its operating pointer to the other buffer area and sends a compression-complete signal to address generating module one; address generating module one then generates the address of the next row and moves its operating pointer to the other buffer area;
Step 6: Steps 2 to 5 are repeated to complete the multi-stage size compression of an image;
The operating pointers of address generating modules one and two always point to different buffer areas, so the two modules operate relatively independently and in parallel, and data is output without interruption.
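The alternating use of the two buffer areas in steps 1 to 6 can be illustrated with a sequential software model. This is a sketch under assumptions: the function name `compress_image` is hypothetical, the row and column step is taken as a^c, the output height and width are taken as the source size divided by a^c and rounded down (by analogy with the 327 example), and the two modules that run in parallel in hardware are simply executed one after the other.

```python
from fractions import Fraction
from math import floor

def compress_image(image, a, c):
    """Downscale one stage by index selection, alternating two row buffers."""
    scale = Fraction(a) ** c
    out_h = floor(len(image) / scale)
    out_w = floor(len(image[0]) / scale)
    buffers = [None, None]
    write_sel = 0                          # operating pointer of module one
    buffers[write_sel] = image[0]          # step 1: first row fetched from RAM
    result = []
    for n in range(out_h):
        read_sel = write_sel               # module two takes the buffer just filled
        write_sel ^= 1                     # module one switches to the other buffer
        if n + 1 < out_h:                  # prefetch source row floor((n+1) * a**c)
            buffers[write_sel] = image[floor((n + 1) * scale)]
        row = buffers[read_sel]
        result.append([row[floor(j * scale)] for j in range(out_w)])
    return result

# Hypothetical 4x4 image, pixel value 10*row + col, compressed by a = 2, c = 1.
image = [[10 * r + c for c in range(4)] for r in range(4)]
small = compress_image(image, a=2, c=1)
```

Because the row being fetched and the row being compressed never share a buffer, the fetch for row n+1 can overlap the column selection for row n, which is the source of the uninterrupted output described above.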
The image traversal operation is described below with reference to Fig. 3:
Step 1: While carrying out step 4 of the compression operation, address generating module two generates the output cache write address, and the data Data2 output by the data selector is written into the output cache row that the write pointer currently points to;
Step 2: Step 1 is repeated. When a row has been written, address generating module two sends a write-complete signal to address generating module three and moves the write pointer to the next output cache row; when the write pointer currently points to row M+K, it moves to the first row after the write completes;
Step 3: When M rows have been filled for the first time, address generating module three starts working. The traversal pointer points to row M, and address generating module three operates simultaneously on M rows of data in total, namely row M and the rows before it;
Step 4: Address generating module three calculates the traversal output address, from 1 to the image length of the current compression stage, and reads M compressed image data items at a time from the above M output cache rows. If the window size is N*M, one window of data is formed after N data items have been output, and the (N+1)-th output forms a window together with the preceding N-1 outputs. This step is repeated until the output length equals the image length of the current compression stage, which completes the window traversal of one row. After the write-complete signal from address generating module two arrives, the traversal pointer moves to the next row or to the next K rows;
In step 4 the traversal pointer moves in one of two ways. If the compression stage is unchanged, that is, the data Data2 now being written belongs to the compressed image of the same compression stage as the data written before, the traversal pointer moves to the next row. If the compression stage changes, that is, the compression stage of the data Data2 being written differs from that of the data the traversal pointer currently refers to, the traversal pointer moves down K rows.
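Step 4 of the traversal can be sketched as a sliding window over columns read from the M selected output cache rows, together with the rule for moving the traversal pointer. Both function names are hypothetical. The window width, written N in the text's "N*M", is called `window_w` here to avoid confusion with the row number N, and wrapping the traversal pointer at M+K is an assumption made by analogy with the write pointer.

```python
from collections import deque

def windows_from_columns(rows, window_w):
    """rows: the M output cache rows being read; yields M x window_w windows."""
    recent = deque(maxlen=window_w)
    for col in range(len(rows[0])):
        recent.append([row[col] for row in rows])    # M pixels read at once
        if len(recent) == window_w:
            # transpose the buffered columns back into M rows of window_w pixels
            yield [list(r) for r in zip(*recent)]

def advance_traversal_pointer(pointer, stage_changed, m, k):
    """Move the 1-based traversal pointer by 1 row, or by K rows on a stage change."""
    step = k if stage_changed else 1
    return (pointer - 1 + step) % (m + k) + 1

# Two cached rows of length 4 traversed with a window 3 pixels wide.
windows = list(windows_from_columns([[1, 2, 3, 4], [5, 6, 7, 8]], window_w=3))
```

After the first `window_w` columns, every further column read produces one new window, so a row of length L yields L - window_w + 1 window positions.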
The above embodiment is a preferred embodiment of the invention, but embodiments of the invention are not limited to it. Any other change, modification, substitution, combination or simplification made without departing from the spirit and principle of the invention is an equivalent replacement and falls within the protection scope of the invention.

Claims (3)

1. An image size compression and traversal method based on a hardware parallel architecture, comprising the following steps in order:
(1) The raw image data is stored in RAM; once the first row of data has been stored, it is read into the first row buffer area;
(2) After the whole image has been stored, the compression operation starts;
(3) Address generating module one calculates the address of the next row from the compression factor and the compression stage, and the image data at that address in RAM is read into the second/first row buffer area;
(4) While step (3) is carried out, address generating module two operates on the first/second row buffer area: it calculates the compressed column address from the compression factor and the compression stage, and the buffered data at that address is output to row N of the output cache, where N cycles through 1 to M+K;
(5) Step (4) is repeated until the compression of one row is complete. Each time a row is completed, the compressed data of the next row is output to row N+1; if N+K is greater than M+K, N equals 1;
(6) Steps (3) to (5) are repeated until M rows of the output cache have been filled for the first time, at which point address generating module three starts working;
(7) While the compression stage is unchanged, address generating module three outputs the data of the M output cache rows from row N-M to row N-1, where N is the number of the output cache row being operated on in step (4); when N-M is less than 1, N-M is taken as N+K, and when N equals 1, N-1 is taken as M+K. Data is read from these M output cache rows simultaneously until the length of data read equals the row width of the current compression stage;
(8) When the compression stage changes, each row contains fewer pixels, so step (5) may finish row N before step (7) has finished processing the M output cache rows N-M to N-1. Therefore, if step (7) has not finished when step (5) completes row N, steps (4) and (5) continue with row N+1 while the numbering of the M output cache rows operated on in step (7) stays unchanged; if step (7) has finished, it waits for step (5) to complete;
(9) Steps (3) to (8) are repeated until the compression and traversal of the image are complete.
2. The image size compression and traversal method based on a hardware parallel architecture according to claim 1, characterized in that step (3) is specifically: address generating module one calculates, from the compression factor a and the current compression stage c, the address of the next row for the current compression stage, and that row is fetched from RAM and stored in a buffer area; the calculation is: if n rows have been compressed, the row address of row n+1 is floor(n * a^c), where floor denotes rounding down.
3. The image size compression and traversal method based on a hardware parallel architecture according to claim 1, characterized in that step (4) is specifically: address generating module two calculates, from the compression factor a and the current compression stage c, the address of the next column for the current compression stage, and the data at that address is read out of the buffer area; the calculation is: if n columns have been compressed, the column address of column n+1 is floor(n * a^c), where floor denotes rounding down.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310482288.6A CN103546752B (en) 2013-10-15 2013-10-15 A kind of picture size based on hardware concurrent framework compression traversal method

Publications (2)

Publication Number Publication Date
CN103546752A true CN103546752A (en) 2014-01-29
CN103546752B CN103546752B (en) 2016-10-05

Family

ID=49969746

Country Status (1)

Country Link
CN (1) CN103546752B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030088753A1 (en) * 2001-11-07 2003-05-08 Fujitsu Limited Memory device and internal control method therefor
CN2775974Y (en) * 2004-12-31 2006-04-26 北京中星微电子有限公司 DMA controller for Mpeg-4 movement evaluation method
CN101753950A (en) * 2008-11-28 2010-06-23 深圳迈瑞生物医疗电子股份有限公司 Method for processing ultrasonic image frame interpolation and ultrasonic system thereof
CN102117326A (en) * 2011-02-28 2011-07-06 华南理工大学 Traversal method used for searching image features
CN102263880A (en) * 2010-05-25 2011-11-30 安凯(广州)微电子技术有限公司 Image scaling method and apparatus thereof
CN102333212A (en) * 2010-07-14 2012-01-25 北京大学 Bilinear two-fold upsampling method and system thereof

Also Published As

Publication number Publication date
CN103546752B (en) 2016-10-05

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant