CN105160622A - Field programmable gate array (FPGA) based implementation method for image super resolution - Google Patents

Field programmable gate array (FPGA) based implementation method for image super resolution

Info

Publication number
CN105160622A
Authority
CN
China
Prior art keywords
ram
image
fpga
implementation method
ram0
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510623919.0A
Other languages
Chinese (zh)
Other versions
CN105160622B (en)
Inventor
钟雪燕
李春英
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Institute of Railway Technology
Original Assignee
Nanjing Institute of Railway Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Institute of Railway Technology
Priority to CN201510623919.0A
Publication of CN105160622A
Application granted
Publication of CN105160622B
Legal status: Active
Anticipated expiration

Landscapes

  • Image Processing (AREA)

Abstract

The invention provides a field programmable gate array (FPGA) based implementation method for image super-resolution. A cycle control module controls the cyclic scheduling of the random-access memory (RAM) modules to write data. For reading, RAMs with a single input port and dual output ports are used; the depth of each RAM is defined as the number of pixels in one row of the source image and its width as the pixel data width, so that two adjacent rows of source pixels are stored. The weights obtained by a position analysis module are normalized fractions and are mapped into an integer range for computation. The method improves the image processing rate and achieves super-resolution.

Description

Implementation method for image super-resolution based on FPGA
Technical field
The present invention relates to an implementation method for image super-resolution, and in particular to an FPGA-based implementation method for image super-resolution.
Background art
Common image display devices have a fixed resolution. Low-resolution image data must undergo super-resolution processing to obtain a resolution that matches the display device before it can be displayed normally (for example on HDTV, High-Definition TV); this is in essence an image super-resolution process.
Image super-resolution technology is widely used in many fields, such as public safety, medical imaging, military, geology, industry, and consumer electronics. It raises image resolution as far as possible in order to achieve better recognition capability and recognition accuracy.
As the volume of image data grows, higher demands are placed on image processing speed, and implementing image processing in hardware has gradually become an important topic in image processing research.
FPGAs have attracted wide attention for their strong data-processing capability: they process data in a parallel, pipelined manner, which speeds up data processing. Whereas conventional software processes one frame per second or less, FPGA hardware processing can reach 25 to 30 frames per second in real time. Hardware implementation of image processing on FPGAs is therefore worth studying.
Implementing an image processing algorithm on an FPGA requires a balance between algorithm performance and resource usage. Traditional linear interpolation algorithms include nearest-neighbor interpolation, bilinear interpolation, 4-point bicubic interpolation, and 6-point bicubic interpolation. Nearest-neighbor interpolation gives unsatisfactory super-resolution results, while higher-order interpolation methods are complex and hard to implement in hardware.
Summary of the invention
The object of the invention is to provide an FPGA-based implementation method for image super-resolution. It is an FPGA-based bilinear interpolation scheme that uses a two-level cyclic scheduling mechanism built on RAM buffers with a single input port and dual output ports, in order to achieve shared resource allocation and parallel pipelined processing.
The invention provides the following technical solution:
An FPGA-based implementation method for image super-resolution, in which a cycle control module controls the cyclic scheduling of the RAM modules to write data;
when the RAM modules are read, RAMs with a single input port and dual output ports are used; the depth of each such RAM is defined as the number of pixels in one row of the source image and its width as the pixel data width, so that two adjacent rows of source pixels are stored;
the weights obtained by the position analysis module are normalized fractions, and the weights are mapped into an integer range for computation.
Preferably, four RAMs with a single input port and dual output ports, RAM0 to RAM3, are defined in the bilinear interpolation hardware structure. While RAM0 and RAM1 supply the weighted computation of the interpolated pixel values of the target image, RAM2 and RAM3 are written with the source pixel values needed to compute the next row of the target image. When the computation on RAM0 and RAM1 finishes, RAM2 and RAM3 take over the weighted computation and RAM0 and RAM1 start to be written with source pixel values. This gives continuous computation and output in time and parallel reuse of RAM space, which improves computational efficiency.
Further, the cycle control module controls the cyclic scheduling of the four RAM modules to write data. The scheduling has two levels: one between the pair RAM0, RAM1 and the pair RAM2, RAM3, and one within each pair, that is, between RAM0 and RAM1 and between RAM2 and RAM3. The two pairs cyclically switch between the roles of computing target pixel values and being written with source pixel values; within each pair, source pixel values are written cyclically. This structure makes full use of the parallel, pipelined, and resource-reuse features of the FPGA: it keeps the data bandwidth fully used and also saves FPGA space resources.
Further, throughout the computation the weights are obtained from floating-point operations; by converting the floating-point numbers to integers, the whole computation can be carried out on integers. A floating-point number is converted to an integer by shifting it left by a given number of bits, and the result is shifted right by the same number of bits after the multiplication.
The beneficial effects of the invention are as follows: the FPGA-based bilinear interpolation scheme for image super-resolution uses a two-level cyclic scheduling mechanism built on RAM buffers with a single input port and dual output ports to achieve shared resource allocation and parallel pipelined processing. It improves the image processing rate and achieves super-resolution.
Brief description of the drawings
The accompanying drawings provide a further understanding of the invention and form part of the specification; together with the embodiments they serve to explain the invention and do not limit it. In the drawings:
Fig. 1 is the bilinear interpolation hardware structure diagram;
Fig. 2 shows the RAM cyclic scheduling mechanism;
Fig. 3 shows the FPGA resource usage;
Fig. 4 shows the setup and hold times of the algorithm module;
Fig. 5 is the Lena image before interpolation; the resolution of the source image is 512x512;
Fig. 6 is the Lena image after interpolation; the resolution after interpolation is 1024x1024;
Fig. 7 is the histogram of the Lena image before interpolation;
Fig. 8 is the histogram of the Lena image after interpolation.
Detailed description of the embodiments
An FPGA has two opposing properties: (1) it supports parallel processing and pipelining and can reach high performance, but M times the performance costs M times the logic; (2) it supports multiplexing (resource reuse), which reduces logic but increases control complexity. Based on these characteristics, the invention proposes a two-level cyclic scheduling mechanism built on RAM buffers with a single input port and dual output ports to achieve shared resource allocation and parallel pipelined processing. In addition, Xilinx FPGAs are based on a LUT structure; they can implement floating-point operations and multiplication, but at a serious cost in resources. Here all floating-point numbers are therefore converted to integers, and the data operations are carried out in the integer domain.
Four points over-determine a plane, so fitting a plane through them is an over-constrained problem; first-order interpolation on a rectangular grid therefore uses a bilinear function. Let $f(x, y)$ be a function of two variables whose values are known at the four vertices of a square, and whose value at an arbitrary point inside the square is wanted. The bilinear equation
$$f(x, y) = ax + by + cxy + d \tag{1}$$
defines a hyperbolic paraboloid that is fitted to the four known points.
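As a worked illustration, the coefficients of equation (1) can be written out explicitly. This is a sketch under an assumption made only for concreteness: the square is taken to be the unit square, and the corner values are named $f_{00} = f(0,0)$, $f_{10} = f(1,0)$, $f_{01} = f(0,1)$, $f_{11} = f(1,1)$ (these symbols are introduced here and are not part of the original text). Substituting the four corners into equation (1) gives

$$
\begin{aligned}
d &= f_{00}, \\
a &= f_{10} - f_{00}, \\
b &= f_{01} - f_{00}, \\
c &= f_{11} - f_{10} - f_{01} + f_{00}.
\end{aligned}
$$

Substituting back and regrouping yields $f(x, y) = f_{00}(1-x)(1-y) + f_{10}\,x(1-y) + f_{01}(1-x)\,y + f_{11}\,xy$, that is, a weighted sum of the four corner values in which each weight is a product of one horizontal and one vertical distance term.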
The image bilinear interpolation algorithm is carried out in three steps: sampling, horizontal linear interpolation, and vertical linear interpolation. Let $X_s$, $Y_s$ be the sizes of the source image in the X and Y directions and $X_d$, $Y_d$ the sizes of the target image in the X and Y directions. Defining the zoom factor $S$ between the two, the zoom factors in the horizontal and vertical directions are
$$S_x = X_s / X_d \tag{2}$$
$$S_y = Y_s / Y_d \tag{3}$$
Define the set of sampled pixel positions of the source image in the horizontal direction as
$$I_x^s = \{0, 1, 2, 3, \ldots, X_s - 1\} \tag{4}$$
Define the set of sampled pixel positions of the target image in the horizontal direction as
$$I_x^d = \{0, 1, 2, 3, \ldots, X_d - 1\} \tag{5}$$
Defining $R_x^s$ as the mapping between the pixel positions of the two images, formula (2) gives
$$R_x^s = \{0,\ 1 \times S_x,\ 2 \times S_x,\ 3 \times S_x,\ \ldots,\ (X_d - 1) \times S_x\} \tag{6}$$
Hence the position in the source image to which the $i$-th horizontal pixel position $X_d(i)$ of the target image is mapped is
$$R(X_d(i)) = X_d(i) \times S_x \tag{7}$$
The resulting $R(X_d(i))$ is a real number. The target pixel at $X_d(i)$ is interpolated between the source positions $[R(X_d(i))]$ and $[R(X_d(i))] + 1$, where $[\cdot]$ denotes the integer part. The quantities $R(X_d(i)) - [R(X_d(i))]$ and $[R(X_d(i))] + 1 - R(X_d(i))$ are the normalized relative distances from the target point $X_d(i)$ to the source points $[R(X_d(i))]$ and $[R(X_d(i))] + 1$ respectively.
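The mapping from a target column to its source position can be checked with a few lines of software. The following is a minimal sketch, not the patented hardware: the function name `map_to_source` and its parameters are hypothetical, and the integer part is assumed to be the floor of the mapped position.

```python
import math


def map_to_source(i, src_size, dst_size):
    """Map target index i to (integer source index, fractional offset).

    Hypothetical helper: the zoom factor follows formula (2) and the
    mapped position follows formula (7).
    """
    scale = src_size / dst_size      # S_x = X_s / X_d
    r = i * scale                    # R(X_d(i)) = X_d(i) * S_x
    base = math.floor(r)             # integer part [R(X_d(i))]
    frac = r - base                  # normalized distance to the left neighbour
    return base, frac


# Example: doubling a 512-pixel row to 1024 pixels.
for i in range(4):
    print(i, map_to_source(i, 512, 1024))
```

For a 2x enlargement the zoom factor is 0.5, so every second target pixel falls exactly on a source pixel and the others fall halfway between two source pixels.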
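Let
$$F(X_d(i)) = R(X_d(i)) - [R(X_d(i))] \tag{8}$$
Let the target image pixel value be $V_d$ and the source image pixel value be $V_s$; then
$$V_d(X_d(i)) = V_s([R(X_d(i))]) \times (1 - F(X_d(i))) + V_s([R(X_d(i))] + 1) \times F(X_d(i)) \tag{9}$$
Likewise, with $R(Y_d(j)) = Y_d(j) \times S_y$ and $G(Y_d(j)) = R(Y_d(j)) - [R(Y_d(j))]$, interpolation in the vertical direction gives
$$V_d(X_d(i), Y_d(j)) = V_s(X_d(i), [R(Y_d(j))]) \times (1 - G(Y_d(j))) + V_s(X_d(i), [R(Y_d(j))] + 1) \times G(Y_d(j)) \tag{10}$$
where $V_s(X_d(i), \cdot)$ denotes the horizontally interpolated value of formula (9) evaluated on the given source row. Substituting formula (9) into formula (10) gives
$$
\begin{aligned}
V_d(X_d(i), Y_d(j)) ={}& V_s([R(X_d(i))], [R(Y_d(j))]) \times (1 - F(X_d(i))) \times (1 - G(Y_d(j))) \\
&+ V_s([R(X_d(i))] + 1, [R(Y_d(j))]) \times F(X_d(i)) \times (1 - G(Y_d(j))) \\
&+ V_s([R(X_d(i))], [R(Y_d(j))] + 1) \times (1 - F(X_d(i))) \times G(Y_d(j)) \\
&+ V_s([R(X_d(i))] + 1, [R(Y_d(j))] + 1) \times F(X_d(i)) \times G(Y_d(j))
\end{aligned}
\tag{11}
$$
Formula (11) has the same form as formula (1).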
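A floating-point software reference is useful for checking a hardware implementation against. The sketch below is one possible reference model, not the patented circuit. It uses the standard bilinear weighting, in which F and G are the fractional offsets of the mapped position and each of the four source pixels is weighted more heavily the nearer the target point lies to it. The function name `bilinear_resize` and the clamping at the right and bottom borders are assumptions; the text does not say how borders are handled.

```python
import math


def bilinear_resize(src, dst_w, dst_h):
    """Resize a 2-D list of pixel values with bilinear interpolation.

    src is indexed as src[row][column]. Border handling (clamping the
    +1 neighbour to the last row or column) is an assumption.
    """
    src_h, src_w = len(src), len(src[0])
    sx, sy = src_w / dst_w, src_h / dst_h       # zoom factors S_x, S_y
    dst = []
    for j in range(dst_h):
        ry = j * sy                              # mapped row position
        y0 = math.floor(ry)
        g = ry - y0                              # vertical weight G
        y1 = min(y0 + 1, src_h - 1)
        row = []
        for i in range(dst_w):
            rx = i * sx                          # mapped column position
            x0 = math.floor(rx)
            f = rx - x0                          # horizontal weight F
            x1 = min(x0 + 1, src_w - 1)
            value = (src[y0][x0] * (1 - f) * (1 - g)
                     + src[y0][x1] * f * (1 - g)
                     + src[y1][x0] * (1 - f) * g
                     + src[y1][x1] * f * g)
            row.append(value)
        dst.append(row)
    return dst


small = [[0, 100],
         [100, 200]]
big = bilinear_resize(small, 4, 4)
```

Note that the four products per pixel use only two rows of the source image at a time, which is what motivates buffering two adjacent rows in hardware.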
The analysis above shows that any pixel value of the target image is determined by two adjacent points in each of two adjacent rows of the source image. A RAM with a single input port and dual output ports is therefore designed so that the data at two adjacent addresses can be read at the same time. The depth of this RAM is defined as the number of pixels in one row of the source image and its width as the pixel data width, which allows two adjacent rows of source pixels to be stored.
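The behaviour of such a line buffer can be modelled in software before committing to hardware. The following is a minimal behavioural sketch, assuming one write port and two read ports that return the pixels at an address and the next address; the class name `LineRam`, its methods, and the clamping at the end of the row are all hypothetical choices made for illustration.

```python
class LineRam:
    """Behavioural model of a one-write-port, two-read-port line buffer."""

    def __init__(self, depth):
        # depth = number of pixels in one row of the source image
        self.mem = [0] * depth

    def write(self, addr, value):
        """Single input port: store one pixel."""
        self.mem[addr] = value

    def read_pair(self, addr):
        """Dual output ports: return the pixels at addr and addr + 1.

        Clamping the second address at the end of the row is an assumption.
        """
        nxt = min(addr + 1, len(self.mem) - 1)
        return self.mem[addr], self.mem[nxt]


ram = LineRam(depth=4)
for address, pixel in enumerate([10, 20, 30, 40]):
    ram.write(address, pixel)
left, right = ram.read_pair(1)
```

Two such buffers holding adjacent source rows supply all four neighbours of a target pixel in a single read step.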
From the position $(X_d(i), Y_d(j))$ of a pixel in the target image, the position analysis module obtains the positions of the related source pixels, $([R(X_d(i))], [R(Y_d(j))])$, $([R(X_d(i))] + 1, [R(Y_d(j))])$, $([R(X_d(i))], [R(Y_d(j))] + 1)$, and $([R(X_d(i))] + 1, [R(Y_d(j))] + 1)$, together with the weights of the corresponding positions, $F(X_d(i))$, $1 - F(X_d(i))$, $G(Y_d(j))$, and $1 - G(Y_d(j))$. The write control module uses the source positions obtained by the position analysis module to control the writing of the source image into the corresponding RAMs.
As shown in Fig. 1, four RAMs with a single input port and dual output ports are defined in the bilinear interpolation hardware structure. While RAM0 and RAM1 supply the weighted computation of the interpolated pixel values of the target image, RAM2 and RAM3 are written with the source pixel values needed to compute the next row of the target image. When the computation on RAM0 and RAM1 finishes, RAM2 and RAM3 take over the weighted computation and RAM0 and RAM1 start to be written with source pixel values. This gives continuous computation and output in time and parallel reuse of RAM space, which improves computational efficiency.
The cycle control module controls the cyclic scheduling of the four RAM modules to write data. As shown in Fig. 2, the scheduling has two levels: one between the pair RAM0, RAM1 and the pair RAM2, RAM3, and one within each pair, that is, between RAM0 and RAM1 and between RAM2 and RAM3. The two pairs cyclically switch between the roles of computing target pixel values and being written with source pixel values; within each pair, source pixel values are written cyclically. This structure makes full use of the parallel, pipelined, and resource-reuse features of the FPGA: it keeps the data bandwidth fully used and also saves FPGA space resources.
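The two-level scheduling can be illustrated with a small planner that lists, for each target row, which RAM pair is read for computation and which source rows are loaded into the other pair. This is only a sketch of one plausible interleaving: the text does not give the exact timing, so the function `schedule`, the per-target-row granularity, and the rule that the first RAM of a pair receives the upper source row are all assumptions.

```python
import math

# Level 1 alternates between these two pairs; level 2 alternates
# between the two RAMs inside the pair that is being written.
PAIRS = (("RAM0", "RAM1"), ("RAM2", "RAM3"))


def schedule(dst_rows, src_rows, scale_y):
    """Return a list of per-target-row scheduling steps (hypothetical model)."""
    plan = []
    for j in range(dst_rows):
        compute_pair = PAIRS[j % 2]          # pair read for target row j
        write_pair = PAIRS[(j + 1) % 2]      # pair loaded for target row j + 1
        loads = {}
        if j + 1 < dst_rows:
            y0 = math.floor((j + 1) * scale_y)   # upper source row needed next
            y1 = min(y0 + 1, src_rows - 1)       # lower source row (clamped)
            loads = {write_pair[0]: y0, write_pair[1]: y1}
        plan.append({"target_row": j, "compute": compute_pair, "load": loads})
    return plan


for step in schedule(dst_rows=4, src_rows=2, scale_y=0.5):
    print(step)
```

In this model the pair roles swap on every target row, so computation never waits for a write to finish as long as loading two source rows takes no longer than computing one target row.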
The analysis above shows that the four weights obtained by the position analysis module are normalized fractions. Although an FPGA can support floating-point operations, they require a large amount of logic and routing resources and perform poorly, which makes them ill-suited to FPGA computation; the weights are therefore mapped into an integer range. Throughout the computation, $S_x$ and $S_y$ are floating-point numbers and the weights are derived from them, so converting $S_x$ and $S_y$ to integers makes the whole computation integer-only. A floating-point number is converted to an integer by shifting it left by a given number of bits, and the result is shifted right by the same number of bits after the multiplication.
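The shift-left, multiply, shift-right idea is ordinary fixed-point arithmetic and can be sketched as follows. The number of fractional bits (`FRAC_BITS = 8`) and the helper names are assumptions chosen for illustration; the text does not state which bit width the implementation uses.

```python
FRAC_BITS = 8                 # hypothetical fractional precision
ONE = 1 << FRAC_BITS          # fixed-point representation of 1.0


def to_fixed(weight):
    """Scale a normalized weight in [0, 1] to an integer (left shift)."""
    return int(round(weight * ONE))


def interp_fixed(p0, p1, frac):
    """Linear interpolation between two pixels using integers only."""
    f = to_fixed(frac)
    acc = p0 * (ONE - f) + p1 * f      # integer multiply-accumulate
    return acc >> FRAC_BITS            # right shift restores the pixel scale


halfway = interp_fixed(100, 200, 0.5)
quarter = interp_fixed(10, 20, 0.25)
```

The right shift truncates, so the integer result can differ from the floating-point result by less than one grey level; adding half of `ONE` before the shift would round instead, at the cost of one extra adder.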
This embodiment was verified experimentally on a Xilinx Kintex-7 development board. Fig. 3 shows the FPGA resources used by the algorithm, including 645 flip-flops, 2 RAMs, and 11 DSPs; resource utilization is low.
Fig. 4 shows the setup and hold times of the hardware implementation of bilinear interpolation. The figure shows that data need only be held for a minimum of (0.067 + 5.175) = 5.242 ns to be transferred, which means that the clock frequency can reach 190 MHz and that interpolating to 1024x1024 pixels takes 5.5 ms. In practice the ideal maximum frequency is not used; a lower frequency is chosen to avoid data loss or corruption. A rate of 25 to 30 frames per second is generally chosen.
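The quoted figures can be reproduced with simple arithmetic. The sketch below assumes that one output pixel is produced per clock cycle, which the text does not state explicitly but which is consistent with the 5.5 ms figure; the function names are hypothetical.

```python
def max_clock_mhz(t1_ns, t2_ns):
    """Clock frequency in MHz for a minimum period of t1_ns + t2_ns."""
    return 1e3 / (t1_ns + t2_ns)


def frame_time_ms(width, height, clock_mhz):
    """Time to output width * height pixels at one pixel per clock (assumed)."""
    return width * height / (clock_mhz * 1e3)


f_max = max_clock_mhz(0.067, 5.175)          # about 190.8 MHz
t_frame = frame_time_ms(1024, 1024, 190)     # about 5.5 ms
print(round(f_max, 1), round(t_frame, 2))
```

Under the same assumption, a target of 25 to 30 frames per second at 1024x1024 needs a pixel clock of only about 26 to 32 MHz, which leaves a wide margin below the maximum.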
Figs. 5 and 6 show the Lena image before and after interpolation. The source image resolution is 512x512 and the resolution after interpolation is 1024x1024; Fig. 6 can be seen to be somewhat clearer in detail than Fig. 5.
Figs. 7 and 8 compare the histograms of the Lena image before and after interpolation; a histogram allows the continuity of image detail to be analyzed more directly. Comparing Fig. 7 with Fig. 8, the histogram after interpolation is smoother than before, which indicates that the image transitions more continuously and that the histogram has been equalized.
In summary, the invention can effectively improve image resolution and image processing speed. This has been verified on a Kintex-7 development board, processing 25 to 30 frames per second; after interpolation the image detail is clearer, and the histogram shows that the image has been equalized.
The above are only preferred embodiments of the invention and do not limit it. Although the invention has been described in detail with reference to the foregoing embodiments, a person skilled in the art may still modify the technical solutions described in them or replace some of their technical features with equivalents. Any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims (4)

1. An FPGA-based implementation method for image super-resolution, characterized in that a cycle control module controls the cyclic scheduling of the RAM modules to write data;
when the RAM modules are read, RAMs with a single input port and dual output ports are used; the depth of each such RAM is defined as the number of pixels in one row of the source image and its width as the pixel data width, so that two adjacent rows of source pixels are stored;
the weights obtained by the position analysis module are normalized fractions, and the weights are mapped into an integer range for computation.
2. The FPGA-based implementation method for image super-resolution according to claim 1, characterized in that four RAMs with a single input port and dual output ports, RAM0 to RAM3, are defined in the bilinear interpolation hardware structure; while RAM0 and RAM1 supply the weighted computation of the interpolated pixel values of the target image, RAM2 and RAM3 are written with the source pixel values needed to compute the next row of the target image; when the computation on RAM0 and RAM1 finishes, RAM2 and RAM3 take over the weighted computation and RAM0 and RAM1 start to be written with source pixel values, which gives continuous computation and output in time and parallel reuse of RAM space.
3. The FPGA-based implementation method for image super-resolution according to claim 1 or 2, characterized in that the cycle control module controls the cyclic scheduling of the four RAM modules to write data, at two levels: between the pair RAM0, RAM1 and the pair RAM2, RAM3, and within each pair, that is, between RAM0 and RAM1 and between RAM2 and RAM3; the two pairs cyclically switch between the roles of computing target pixel values and being written with source pixel values; within each pair, source pixel values are written cyclically.
4. The FPGA-based implementation method for image super-resolution according to claim 1, characterized in that throughout the computation the weights are obtained from floating-point operations, and by converting the floating-point numbers to integers the whole computation can be carried out on integers; a floating-point number is converted to an integer by shifting it left by a given number of bits, and the result is shifted right by the same number of bits after the multiplication.
CN201510623919.0A 2015-09-25 2015-09-25 The implementation method of image super-resolution based on FPGA Active CN105160622B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510623919.0A CN105160622B (en) 2015-09-25 2015-09-25 The implementation method of image super-resolution based on FPGA

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510623919.0A CN105160622B (en) 2015-09-25 2015-09-25 The implementation method of image super-resolution based on FPGA

Publications (2)

Publication Number Publication Date
CN105160622A CN105160622A (en) 2015-12-16
CN105160622B CN105160622B (en) 2018-08-31

Family

ID=54801465

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510623919.0A Active CN105160622B (en) 2015-09-25 2015-09-25 The implementation method of image super-resolution based on FPGA

Country Status (1)

Country Link
CN (1) CN105160622B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106407153A (en) * 2016-11-23 2017-02-15 诺仪器(中国)有限公司 High-resolution data acquisition method and device
CN115471404A (en) * 2022-10-28 2022-12-13 武汉中观自动化科技有限公司 Image scaling method, processing device and storage medium
CN115829842A (en) * 2023-01-05 2023-03-21 武汉图科智能科技有限公司 Device for realizing picture super-resolution reconstruction based on FPGA

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101527033A (en) * 2008-03-04 2009-09-09 河海大学 Industrial CCD color imaging system based on super-resolution reconstruction and automatic registration
US20110235946A1 (en) * 2010-03-24 2011-09-29 Xerox Corporation Reducing buffer size requirements in an electronic registration system
CN103248797A (en) * 2013-05-30 2013-08-14 北京志光伯元科技有限公司 Video resolution enhancing method and module based on FPGA (field programmable gate array)
CN104748729A (en) * 2015-03-19 2015-07-01 中国科学院半导体研究所 Optimized display device and optimized display method for range-gating super-resolution three-dimensional imaging distance map

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
钟雪燕 等: "基于FPGA的图像超分辨率的硬件化实现", 《现代电子技术》 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106407153A (en) * 2016-11-23 2017-02-15 诺仪器(中国)有限公司 High-resolution data acquisition method and device
CN106407153B (en) * 2016-11-23 2019-08-23 一诺仪器(中国)有限公司 A kind of high-resolution data acquisition method and device
CN115471404A (en) * 2022-10-28 2022-12-13 武汉中观自动化科技有限公司 Image scaling method, processing device and storage medium
CN115829842A (en) * 2023-01-05 2023-03-21 武汉图科智能科技有限公司 Device for realizing picture super-resolution reconstruction based on FPGA
CN115829842B (en) * 2023-01-05 2023-04-25 武汉图科智能科技有限公司 Device for realizing super-resolution reconstruction of picture based on FPGA

Also Published As

Publication number Publication date
CN105160622B (en) 2018-08-31

Similar Documents

Publication Publication Date Title
US10356385B2 (en) Method and device for stereo images processing
US20170024632A1 (en) Performance Enhancement For Two-Dimensional Array Processor
CN107885700B (en) Multi-core implementation method for large-scale matrix convolution
Chen VLSI implementation of a low-cost high-quality image scaling processor
CN105493497A (en) Method and processor for efficient video processing in a streaming environment
Li et al. High throughput hardware architecture for accurate semi-global matching
US20200027219A1 (en) Dense optical flow processing in a computer vision system
Huang et al. A novel interpolation chip for real-time multimedia applications
CN105160622A (en) Field programmable gate array (FPGA) based implementation method for image super resolution
WO2018113224A1 (en) Picture reduction method and device
Dürre et al. A HOG-based real-time and multi-scale pedestrian detector demonstration system on FPGA
CN111967582B (en) CNN convolutional layer operation method and CNN convolutional layer operation accelerator
CN107680028B (en) Processor and method for scaling an image
CN108521824A (en) Image processing apparatus, method and interlock circuit
CN114329324A (en) Data processing circuit, data processing method and related product
Zhang et al. An efficient accelerator based on lightweight deformable 3D-CNN for video super-resolution
CN111028136A (en) Method and equipment for processing two-dimensional complex matrix by artificial intelligence processor
CN109416743B (en) Three-dimensional convolution device for identifying human actions
CN111610963B (en) Chip structure and multiply-add calculation engine thereof
Wang et al. Low-resource hardware architecture for semi-global stereo matching
WO2019006405A1 (en) Hierarchical data organization for dense optical flow
CN108960203B (en) Vehicle detection method based on FPGA heterogeneous computation
US11830114B2 (en) Reconfigurable hardware acceleration method and system for gaussian pyramid construction
CN112132914A (en) Image scale space establishing method and image processing chip
Schneider A processor for an object‐oriented rendering system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant