CN110717852B - FPGA-based field video image real-time segmentation system and method


Info

Publication number
CN110717852B
Authority
CN
China
Prior art keywords
image
fpga
data
segmentation
window
Prior art date
Legal status
Active
Application number
CN201910511004.9A
Other languages
Chinese (zh)
Other versions
CN110717852A (en)
Inventor
张志斌
许冰
斯勤夫
Current Assignee
Inner Mongolia University
Original Assignee
Inner Mongolia University
Priority date
Filing date
Publication date
Application filed by Inner Mongolia University filed Critical Inner Mongolia University
Priority to CN201910511004.9A
Publication of CN110717852A
Application granted
Publication of CN110717852B


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00: General purpose image data processing
    • G06T1/20: Processor architectures; Processor configuration, e.g. pipelining
    • G06T1/60: Memory management
    • G06T5/00: Image enhancement or restoration
    • G06T5/20: Image enhancement or restoration by the use of local operators
    • G06T5/70
    • G06T7/00: Image analysis
    • G06T7/10: Segmentation; Edge detection
    • G06T7/11: Region-based segmentation
    • G06T7/13: Edge detection

Abstract

The invention belongs to the technical field of field crop growth information processing and discloses an FPGA (Field Programmable Gate Array) based system and method for real-time segmentation of field video images. The structure of the image segmentation algorithm is modified to suit a hardware environment, and data and pipeline instructions are added so that the algorithm suits the application scenario of real-time image segmentation. Each module of the image acquisition and display system is designed to build a camera-based video channel, and field video image segmentation is realized by combining field plant super-green feature extraction, a Canny edge detection algorithm, median filtering and threshold segmentation with the FPGA hardware development flow. The FPGA has abundant register resources and a pipeline structure, can process a large amount of image data in real time, and greatly improves the speed of image processing; the development board used by the invention has low cost, small volume and low power consumption, and is well suited to the complex field working environment.

Description

FPGA-based field video image real-time segmentation system and method
Technical Field
The invention belongs to the technical field of field crop growth information processing, and particularly relates to a field video image real-time segmentation system and method based on an FPGA.
Background
Currently, the closest prior art is as follows:
Computer technology has now been applied to many fields of life. Agricultural informatization is the global trend of agricultural development in the twenty-first century and a basic guarantee for agricultural modernization. In the agricultural field, the identification, analysis and processing of crop images using image processing and machine vision techniques have become increasingly popular. Image segmentation is a key step in the image processing pipeline and guides the subsequent processing and analysis of images. Edge detection of video images has therefore long been one of the classic research subjects in digital image processing. Digital images are widely used in many fields, such as aerospace, biomedicine, industrial automation, and military and police applications.
When video images are segmented, the corresponding video image processing algorithms place high demands on processing speed and processing time. Because the CPU has strong processing capability, the traditional approach mostly uses computer software to realize image segmentation, and in most cases a software system can meet basic application requirements. However, as application requirements grow and application environments become complex, the requirements on algorithm processing time, processing speed and power consumption become ever higher. With the increasing data volume of video images, video image processing algorithms become complex, and implementing them with traditional software methods can no longer meet real-time and low-power requirements. For some image preprocessing algorithms with very large computation loads, if CPU-based software on a computer is used, then since most CPUs do not have many cores, the parallel processing capability is weak and the processing speed slow; in many cases the real-time processing requirement cannot be met and the corresponding processing precision also drops, so such algorithms cannot be applied in fields with high real-time requirements such as farmland environments. In view of these shortcomings of software for real-time image processing, accelerating image processing algorithms with FPGAs has attracted increasing attention from researchers.
An FPGA (Field Programmable Gate Array) is a newer technology developed gradually on the basis of programmable devices such as PAL (Programmable Array Logic), GAL (Generic Array Logic) and CPLD (Complex Programmable Logic Device). Compared with a Digital Signal Processor (DSP), the FPGA has higher computing power and working efficiency. Because an FPGA integrates many programmable logic modules, whereas a traditional CPU integrates many non-programmable logic modules, the FPGA, unlike the CPU software execution model, can execute different logic tasks in parallel at the same time by programming the programmable logic modules in advance, processing more tasks than a traditional CPU. The powerful parallel processing capacity and pipeline design of FPGA technology [4] therefore give it unique advantages in image processing algorithms. Compared with the same algorithm in software, a hardware image processing algorithm on an FPGA runs tens to hundreds of times faster, even several orders of magnitude faster; this lightens the load of video image processing to a certain degree while improving the real-time performance of video image processing and the accuracy of image algorithms.
The FPGA is a relatively widely used programmable logic device, also called a programmable ASIC (Application Specific Integrated Circuit). Compared with an ASIC and a DSP, the FPGA is relatively more flexible: it can reprogram its wiring resources according to actual requirements to realize user-defined hardware functions, and its parallel execution and special pipeline design let it implement a large amount of high-speed electronic circuit design at higher speed, so its universality is greater and the development period is markedly shortened.
Image segmentation algorithms are widely applied in image recognition and have been studied extensively, but cases that combine a segmentation algorithm with an FPGA hardware architecture and apply it to agricultural informatization are few, and a static image processing mode is generally adopted. A video image processing approach, however, can not only improve the real-time performance of the segmentation algorithm but also allow complex segmentation algorithms to be realized with improved precision.
At the beginning of the twenty-first century, with the rapid development of FPGA technology in process and application, FPGAs became widely used in digital signal processing, biomedical imaging, computer vision, communication, deep learning, and other fields. Hiren K. et al. realized Active Contour Model (ACM) image segmentation on an FPGA, with segmentation precision above 80% and an algorithm running time of 7.3270 ns; compared with a GPU implementation, it is faster and has lower delay. Mohammad Eslami et al. designed an image segmentation algorithm based on hard thresholding and the Haar wavelet transform, realizing real-time segmentation of cerebral aneurysms in an average time of 5.2 ms. Kofi Appiah used FPGA parallelism to segment moving targets in video in real time, proposing an integrated FPGA-based video segmentation algorithm in which background difference and connected-component labeling execute in parallel in an efficient FPGA pipeline, markedly improving execution efficiency. Seungwon Lee proposed an effective and robust moving-target segmentation method for high-resolution video surveillance systems, with higher computational efficiency, easier embedding into DSP software, and better segmentation results. Guijarro M. proposed a novel automatic segmentation method for images of grain and corn crops, graying field images with multiple color factors, extracting plant greenness, redness and blueness by combining those factors, and finally classifying different kinds of plants and soil with fuzzy clustering. Riomoros proposed an agricultural image segmentation strategy based on the discrete wavelet transform that successfully distinguishes soil from green plants, but much room for improvement remains for weeds and crops owing to their irregular distribution and differing heights.
Wang Guam et al. (2011) implemented the watershed algorithm in FPGA hardware; the processing time per frame of melon image is 20 ms to 40 ms, more than 10 times faster than the software method, realizing real-time segmentation of melon images. A minimum cross-entropy algorithm based on a Pulse Coupled Neural Network (PCNN) has also been realized on an FPGA hardware platform; its segmentation time is 3 orders of magnitude faster than MATLAB simulation, with good segmentation results. Yang Yong et al. [14] proposed an FPGA-based real-time image segmentation algorithm to address poor segmentation under non-uniform illumination; the segmentation algorithm needs only 105 μs to process an image, with good real-time performance.
In summary, image processing algorithms have been studied extensively at home and abroad, but cases that combine a segmentation algorithm with an FPGA hardware architecture and apply it to agricultural informatization are few, and a static image processing mode is generally adopted, which cannot meet real-time requirements.
In summary, the problems of the prior art are as follows:
(1) With the continuous expansion of image data scale, the requirements of digital image segmentation algorithms on image processing speed and real-time performance are ever higher. For some algorithms with millisecond-level real-time requirements, a traditional software platform can hardly complete the corresponding operations within the specified time.
In addition, the prior art does not combine the register resources and pipeline structure of the FPGA, so images of the field crop growth process cannot be segmented in real time while precision is improved, the image processing speed cannot be greatly increased, and real-time operation of agricultural robots cannot adapt to the field working environment.
(2) The prior art lacks a real-time processing technical means for crop growth detection, such as estimating the growth area of crops by real-time image segmentation, or further estimating crop growth height and density by matching after binocular vision image segmentation.
The difficulty of solving the technical problems is as follows:
1) Reasonably decomposing, structuring and optimizing the algorithm described in the embodiments for parallel and pipeline processing based on the FPGA.
2) Within the FPGA architecture, reasonably designing the data transmission, storage, forwarding and calculation units so as to save resources and improve data processing efficiency.
The significance of solving the technical problems is as follows:
the real-time performance of the system algorithm can be improved, the complex algorithm can be realized, and the accuracy, the applicability and the robustness of the system algorithm are further improved.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a field video image real-time segmentation system and method based on an FPGA. The invention adopts a video image processing mode and uses FPGA technology to accelerate the image segmentation algorithm, which can improve the real-time performance of the segmentation algorithm to a certain extent so that it can be applied in practice.
The invention is realized as follows: a field video image real-time segmentation method based on an FPGA, comprising:
Firstly, combining the image segmentation algorithm under software conditions with relevant theoretical knowledge, modifying the structure of the image segmentation algorithm to suit a hardware environment, and adding data and pipeline instructions so that the image segmentation algorithm suits the application scenario of real-time image segmentation.
And secondly, designing each module of the image acquisition display system to build a video channel based on the camera. Meanwhile, the image segmentation algorithm is optimized correspondingly, the project of the image segmentation algorithm is packaged into an IP core, and the IP core is connected into the whole video channel, so that the real-time segmentation display of the image is realized.
And thirdly, designing and realizing a field video image segmentation system based on the FPGA by using Verilog and C + + programming languages and combining the flows of field plant ultragreen feature extraction, a Canny edge detection algorithm, median filtering and threshold segmentation and FPGA hardware development.
Further, in the first step, the modification of the structure of the image segmentation algorithm, which is applicable to the hardware environment, includes the improvement of a memory structure, and the method for improving the memory structure includes: and reading and writing the image data in the memory by adopting a line buffer area and a memory window, wherein the line buffer area is used for storing complete image lines, and the memory window is used for carrying out operation on the image matrix.
The line buffer area continuously stores new image data, the window moves to the right, image pixel points in the window move to the left, and the window is updated by the new image data.
Initially the buffer holds no image data, and the entire line buffer buffers data starting from the buffer's start address until the first line of data is completely written into the line buffer. After the first line of data is buffered, the image data in the first row and first column moves upward until the last line is buffered, and the pixels in the rightmost column come from the current column of the line buffer and the newly input image pixel data. This cycle repeats, with three lines of image data retained in the line buffer, until the whole picture is completely buffered.
And when the line buffer is filled, writing the image data in the line buffer into the window buffer.
After the line buffer area image data is completely buffered, the line buffer area image data is copied to a window buffer area, new image data can fill the window buffer area along with the rightward movement of a window, and the window buffer area moves from left to right and from top to bottom until the whole image data is completely buffered.
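As an illustration, the following is a minimal HLS-style C++ sketch of the line-buffer and window-buffer scheme described above. The image width, the 3 × 3 window size and the pass-through output are hypothetical placeholders, and the hls::LineBuffer and hls::Window classes available in the Vivado HLS video library serve the same purpose.

#include <cstdint>

const int WIDTH = 640;   // assumed image width
const int K     = 3;     // 3x3 window, matching the three buffered lines above

// Push one frame through a K-line buffer and a KxK sliding window.
void window_filter(const uint8_t *in, uint8_t *out, int height) {
    uint8_t line_buf[K][WIDTH];   // K most recent image lines
    uint8_t window[K][K];         // KxK operator window
#pragma HLS ARRAY_PARTITION variable=line_buf complete dim=1
#pragma HLS ARRAY_PARTITION variable=window complete dim=0

    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < WIDTH; ++x) {
#pragma HLS PIPELINE II=1
            uint8_t px = in[y * WIDTH + x];

            // Column shifts up: the oldest line drops out, the new pixel enters.
            for (int r = 0; r < K - 1; ++r)
                line_buf[r][x] = line_buf[r + 1][x];
            line_buf[K - 1][x] = px;

            // Window slides right: pixels move left, a new column enters on the right.
            for (int r = 0; r < K; ++r) {
                for (int c = 0; c < K - 1; ++c)
                    window[r][c] = window[r][c + 1];
                window[r][K - 1] = line_buf[r][x];
            }

            // Placeholder operation: pass the centre pixel through.
            out[y * WIDTH + x] = window[K / 2][K / 2];
        }
    }
}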
Further, in the first step, data and pipeline instructions are added to adapt the image segmentation algorithm to the application scenario of real-time image segmentation, including pipeline optimization, which is performed through the directive #pragma HLS PIPELINE II=1. The pipeline directive defines the number of clock cycles between starts of the loop body using the initiation interval (II) factor.
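For illustration, a hedged sketch of where the directive sits in a pixel loop (the 640 × 480 frame size and the threshold T are assumptions, not values fixed by the invention):

#include <cstdint>

// Binarize one frame; with II=1 a new loop iteration starts every clock
// cycle, so consecutive pixels overlap in the pipeline instead of
// executing strictly one after another.
void binarize(const uint8_t src[640 * 480], uint8_t dst[640 * 480], uint8_t T) {
    for (int i = 0; i < 640 * 480; ++i) {
#pragma HLS PIPELINE II=1
        dst[i] = (src[i] > T) ? 255 : 0;
    }
}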
Further, in the first step, adding data and pipeline instructions to adapt the image segmentation algorithm to the application scenario of real-time image segmentation also comprises array splitting optimization: part of the array structures in Canny edge detection, median filtering and OTSU threshold segmentation are split, using the line buffer array, into arrays of smaller bit width; after an array is divided into smaller arrays or independent elements, the RTL circuit generated from the array structure comprises multiple small memories or registers.
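A hedged sketch of the effect of such splitting, assuming a three-line buffer (the ARRAY_PARTITION directive below is the standard HLS mechanism; the row-sum operation is only a placeholder):

#include <cstdint>

void row_sum(const uint8_t in[3][640], uint32_t out[640]) {
    // Partitioning dim 1 turns the single large memory into three
    // independent row memories, so all three rows can be read in the
    // same clock cycle instead of over three cycles.
#pragma HLS ARRAY_PARTITION variable=in complete dim=1
    for (int x = 0; x < 640; ++x) {
#pragma HLS PIPELINE II=1
        out[x] = in[0][x] + in[1][x] + in[2][x];  // three parallel reads
    }
}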
Further, in the second step, the method for constructing the video channel comprises the following steps:
Firstly, after the design of the image sensor acquisition IP core is finished, it is connected with the VTC IP core, VID_IN IP core, VDMA IP core, VID_OUT IP core and rgb2dvi IP core; after timing constraints and pin constraints, the whole project is synthesized and implemented, and a bitstream file is generated.
The entire project is then exported to the SDK, where read and write operations on the corresponding PS-side registers are programmed.
And finally, downloading the bit stream file to the FPGA board by using the J-Link, building a hardware environment of the whole video channel, and downloading a software program to the FPGA board.
Further, in the third step, the method for extracting the ultragreen features includes:
the field crop image is first divided into R, G, B three channels.
An adder is then used to calculate twice the gray value of the green channel. The output of the adder and the red gray value of the red channel are connected as inputs to a subtracter for subtraction.
And then the output of the subtracter and the blue gray value of the blue channel are connected to another subtracter as inputs for difference calculation.
Finally, the output of the subtracter is compared with a threshold T, and the final classification result is obtained through a data selector: when the output of the subtracter is greater than the threshold T, the selector outputs 255; when it is less than the threshold T, the selector outputs 0. The green parts of the field image are thus isolated.
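A minimal C++ sketch of this adder/subtracter/comparator datapath (the function name and the int16_t widths are illustrative assumptions):

#include <cstdint>

// ExG = 2G - R - B compared against threshold T: the two subtracters,
// the comparator and the data selector of the text map onto the
// arithmetic and the ternary operator here.
uint8_t exg_classify(uint8_t R, uint8_t G, uint8_t B, int16_t T) {
#pragma HLS INLINE
    int16_t twice_g = (int16_t)G + (int16_t)G;  // adder: 2G
    int16_t diff1   = twice_g - (int16_t)R;     // first subtracter
    int16_t exg     = diff1 - (int16_t)B;       // second subtracter
    return (exg > T) ? 255 : 0;                 // comparator + selector
}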
Further, in the third step, the FPGA implementation method for Canny edge detection includes:
(1) Image smoothing: image smoothing is achieved by Gaussian convolution. Firstly the image is line-buffered; after one frame of image data has been buffered, the line-buffered data is written step by step into the window buffer, and a 5 × 5 moving-window Gaussian operator convolves all pixels in the 5 × 5 window buffer until the data in the window buffer is fully processed. Finally, the convolved image pixel points are shifted right by eight bits, and the smoothed image data is output line by line.
(2) Solving gradient direction and magnitude: a 3 × 3 filtered-image calculation kernel window is created, and the kernel is convolved with the Sobel horizontal and vertical 3 × 3 window operators respectively to obtain the horizontal and vertical gradient partial derivatives Gx and Gy; the gradient magnitude G is determined from the sum of the absolute values of Gx and Gy. The direction of the gradient is:
θ = arctan(Gy / Gx).
(3) Non-maxima suppression: the pixel points around the center of the image calculation kernel are computed and compared against the high threshold Th and the low threshold Tl; if the kernel center is greater than Th, it is a strong edge point, and if the kernel center lies between Tl and Th, it is a weak edge point.
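As a sketch of step (2) above, the Sobel convolution and the |Gx| + |Gy| magnitude approximation might look as follows in C++ (the window is assumed already filled by the line-buffer logic; in hardware the arctan for θ would typically be replaced by a CORDIC unit or a quadrant lookup rather than computed literally):

#include <cstdint>
#include <cstdlib>

// Sobel gradient over a 3x3 window; the magnitude is approximated by
// the sum of the absolute horizontal and vertical partial derivatives.
int16_t sobel_magnitude(const uint8_t w[3][3]) {
    int16_t gx = (w[0][2] + 2 * w[1][2] + w[2][2])
               - (w[0][0] + 2 * w[1][0] + w[2][0]);
    int16_t gy = (w[2][0] + 2 * w[2][1] + w[2][2])
               - (w[0][0] + 2 * w[0][1] + w[0][2]);
    return abs(gx) + abs(gy);
}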
Further, in the third step, the FPGA implementation method of median filtering includes:
the median filter with the window size of 5 multiplied by 5 is adopted to firstly perform line buffering on the image, and after the buffering of a frame of image data is finished, the line buffered data is gradually written into the window buffer area.
The FPGA implementation method for OTSU threshold segmentation comprises the following steps:
1) Field image pixel gray-level grading: the field image pixels are compared with the 256 single-channel gray levels. An image pixel is first compared with 0, and the compared value is stored into the Reg0 register through a data selector; meanwhile, the output of comparator 0 is first negated, ANDed with the output of comparator 1, and the result is fed to the data selector, whose output is stored into the Reg1 register. And so on, until all pixel values are stored in the corresponding registers.
2) Calculating the ratios of background and target pixels: the count Reg0 from the Reg0 register and sum, the total number of pixels, are taken as the inputs of a divider, whose output is the background pixel ratio ω₁. The divider output and 1 are taken as the inputs of a subtracter, which outputs the target pixel ratio ω₂.
3) Calculating the average gray values of the target and the background: a picture with a size of M × N resolution is input and a division operation with Reg0 is performed; the calculated and graded image pixel gray values are stored in the registers Reg0 to Reg255.
The output of the divider is multiplied by the gray level 0, and the output of the multiplier is stored in a Reg register; the Reg register thus holds the background average gray value μ₁.
The target average gray value μ₂ is calculated by the inverse of the process used for the background average gray value μ₁.
4) Calculating the between-class variance:
The calculation formula is:
g = ω₁ × (μ − μ₁)² + ω₂ × (μ − μ₂)²
The simplified formula is:
g = ω₁ × ω₂ × (μ₁ − μ₂)²
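A software-style C++ sketch of steps 1) to 4), assuming the hist[] array plays the role of the Reg0 to Reg255 registers (the hardware version evaluates candidate thresholds with dividers, multipliers and subtracters rather than floating point):

#include <cstdint>
#include <cstddef>

// Return the threshold t that maximizes g = w1*w2*(mu1-mu2)^2.
int otsu_threshold(const uint8_t *img, size_t n) {
    double hist[256] = {0};
    for (size_t i = 0; i < n; ++i) hist[img[i]] += 1.0;  // gray-level grading

    double total_mean = 0.0;
    for (int i = 0; i < 256; ++i) total_mean += i * hist[i] / n;

    int best_t = 0;
    double best_g = 0.0, w1 = 0.0, sum1 = 0.0;
    for (int t = 0; t < 256; ++t) {
        w1   += hist[t] / n;                     // background ratio (omega_1)
        sum1 += t * hist[t] / n;
        double w2 = 1.0 - w1;                    // target ratio (omega_2)
        if (w1 <= 0.0 || w2 <= 0.0) continue;
        double mu1 = sum1 / w1;                  // background mean (mu_1)
        double mu2 = (total_mean - sum1) / w2;   // target mean (mu_2)
        double g = w1 * w2 * (mu1 - mu2) * (mu1 - mu2);  // simplified formula
        if (g > best_g) { best_g = g; best_t = t; }
    }
    return best_t;
}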
the invention also aims to provide a method for implementing field video image real-time segmentation system based on FPGA to transmit image data in FPGA logic by an AXI bus.
The image data collected by the camera passes through a Video in to AXI-Stream IP core and becomes AXI Stream.
The field video image real-time segmentation method is executed in HLS image processing. The AXI VDMA performs data interaction with the PS end through AXI interconnection, and stores or reads image data into or out of the DDR.
Axi-Stream to video out converts AXI Stream into image data in RGB format, and displays the image through HDMI controller.
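For illustration only, a hypothetical HLS top-level of such a channel: the AxisPixel struct below is a stand-in for the ap_axiu<24,1,1,1> type that ap_axi_sdata.h provides in a real Vivado HLS project, and the pixel operation is a placeholder for the segmentation pipeline:

#include <cstdint>

// One AXI4-Stream video beat (simplified stand-in).
struct AxisPixel {
    uint32_t data;  // packed RGB pixel
    bool     last;  // TLAST: end-of-line marker
    bool     user;  // TUSER: start-of-frame marker
};

// Consume one 640x480 frame from the video-in stream, apply a per-pixel
// operation, and emit the video-out stream.
void seg_top(const AxisPixel in[640 * 480], AxisPixel out[640 * 480]) {
#pragma HLS INTERFACE axis port=in
#pragma HLS INTERFACE axis port=out
    for (int i = 0; i < 640 * 480; ++i) {
#pragma HLS PIPELINE II=1
        AxisPixel p = in[i];
        p.data = ~p.data & 0x00FFFFFF;  // placeholder pixel operation
        out[i] = p;
    }
}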
The invention further aims to provide a field operation robot for implementing the FPGA-based field video image real-time segmentation method.
In summary, the advantages and positive effects of the invention are as follows:
the FPGA (field Programmable Gate array) has rich register resources and a pipeline structure, can process a large amount of image data in real time, and greatly improves the image processing speed.
The invention designs and realizes a video image acquisition, processing and display system for an agricultural robot. The system can acquire field video images in real time in a complex environment; it adapts the super-green area extraction, Canny edge detection, OTSU threshold segmentation and median filtering algorithms to the FPGA hardware environment so that acquired image data can be segmented rapidly and accurately, and the segmented images can be shown on a display through an interface.
In the method, the image greenness extraction, Canny edge detection, OTSU threshold segmentation and median filtering algorithms are optimized with parallel pipelines, and the four modules of the image threshold segmentation algorithm are designed in parallel, effectively improving the image processing speed. Experimental results show that the segmentation algorithm used by the invention needs only 2.44 ms for an image of 640 × 480 resolution and only 16.01 ms for an image of 1920 × 1080 resolution, about ten times the speed of the PC software implementation tested by the invention. The design occupies 2824 flip-flops (FF), 4625 lookup tables (LUT) and 4 KB of block RAM (BRAM), realizing the algorithm with few resources. System experiments show that the FPGA-based field video image real-time segmentation system can complete the acquisition, buffering, filtering and segmentation of field video images and display the processed result on a display; the algorithm realized on the FPGA is more efficient than the traditional software method and meets the expected requirements.
The market price of the Intel Core i7-8500H (2.2 GHz) CPU used for comparison is about 3000 yuan, while the ZYNQ-7000 series development board used by the invention costs about 600 yuan, so the cost is lower than that of a traditional complete machine built around a CPU. Secondly, the development board used by the invention is small and consumes little power, making it better suited to the complex field working environment. Finally, the development board uses an ARM + FPGA architecture; compared with a traditional CPU chip, it can realize large-scale professional applications at lower cost in practice.
Drawings
Fig. 1 is a flowchart of an ultragreen feature extraction algorithm provided in an embodiment of the present invention.
FIG. 2 is a diagram of the effect of extracting supergreen characteristics of maize plants provided by the embodiment of the present invention.
Fig. 3 is a diagram of a 5 × 5 median filter provided by an embodiment of the present invention.
Fig. 4 is an overall architecture diagram of a field video image real-time segmentation system of an FPGA according to an embodiment of the present invention.
Fig. 5 is a block diagram of the internal structure of the OV7725 according to the embodiment of the present invention.
Fig. 6 is a flowchart of an OV7725 camera initialization code according to an embodiment of the present invention.
Fig. 7 is an initialization timing diagram of the OV7725 provided by the embodiment of the present invention.
FIG. 8 is a SCCB timing diagram of serial data provided by an embodiment of the invention.
Fig. 9 is an IP core diagram of an image acquired by a camera according to an embodiment of the present invention.
Fig. 10 is an IP core timing simulation diagram of an image acquired by a camera according to an embodiment of the present invention.
In the figure: a) timing simulation diagram I; b) timing simulation diagram II; c) timing simulation diagram III.
Fig. 11 is a diagram of image format conversion of a camera according to an embodiment of the present invention.
Fig. 12 is a timing diagram of VGA row field provided by the embodiment of the present invention. In the figure: (a) VGA row timing; (b) VGA field timing.
Fig. 13 is a circuit diagram of an HDMI display RTL according to an embodiment of the present invention.
Fig. 14 is a block diagram of an AXI VDMA provided by an embodiment of the invention.
FIG. 15 is a block diagram of a read/write operation design according to an embodiment of the present invention.
Fig. 16 is a diagram of a row buffer architecture provided by an embodiment of the invention.
Fig. 17 is a diagram of a memory window buffer architecture according to an embodiment of the present invention.
FIG. 18 is a comparison of pipelined and non-pipelined execution provided by an embodiment of the present invention. In the figure: a) non-pipelined; b) pipelined.
FIG. 19 is a diagram of a parallel optimization of a line buffer array according to an embodiment of the present invention.
FIG. 20 is a field image green region classification chart provided by an embodiment of the invention.
Fig. 21 is a flow chart of the Canny algorithm provided by the embodiment of the present invention.
Fig. 22 is a hardware implementation diagram of a Sobel operator according to the embodiment of the present invention.
Fig. 23 is a diagram of a pipeline optimization of the Canny algorithm provided by the embodiment of the present invention.
Fig. 24 is a diagram of a design of a median filtering window according to an embodiment of the present invention.
Fig. 25 is a diagram of a hardware design for median filtering according to an embodiment of the present invention.
FIG. 26 is a field image pixel gray scale map provided by an embodiment of the present invention.
Fig. 27 is a calculation chart of the image background pixel ratio according to the embodiment of the present invention.
Fig. 28 is a calculation chart of the target and background mean gray-scale values according to the embodiment of the present invention.
Fig. 29 is a hardware configuration diagram of a video path provided in the embodiment of the present invention.
Fig. 30 is a block diagram of a camera capture module system according to an embodiment of the present invention.
Fig. 31 is an RTL circuit diagram of a camera image capturing module according to an embodiment of the present invention.
Fig. 32 is an RTL circuit diagram of an image processing path provided by an embodiment of the present invention.
FIG. 33 shows field crop segmentation results provided by embodiments of the present invention. In the figure: a) original plant image; b) plant super-green gray-scale map; c) Canny edge detection; d) OTSU threshold segmentation; e) original plant image; f) plant super-green gray-scale map; g) Canny edge detection; h) OTSU threshold segmentation; i) original plant image; j) plant super-green gray-scale map; k) Canny edge detection; l) OTSU threshold segmentation.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and do not limit the invention.
With the continuous expansion of image data scale, the requirements of digital image segmentation algorithms on image processing speed and real-time performance are ever higher. For some algorithms with millisecond-level real-time requirements, a traditional software platform can hardly complete the corresponding operations within the specified time. In addition, the prior art does not combine the register resources and pipeline structure of the FPGA, so images of the field crop growth process cannot be segmented in real time while precision is improved, the image processing speed cannot be greatly increased, and real-time operation of agricultural robots cannot adapt to the field working environment.
Aiming at the problems in the prior art, the invention provides a field video image real-time segmentation method based on an FPGA, and the invention is described in detail below with reference to the accompanying drawings.
The field video image real-time segmentation method based on the FPGA provided by the embodiment of the invention comprises the following steps:
Firstly, combining the image segmentation algorithm under software conditions with relevant theoretical knowledge, modifying the structure of the image segmentation algorithm to suit a hardware environment, and adding data and pipeline instructions so that the image segmentation algorithm suits the application scenario of real-time image segmentation.
And secondly, designing each module of the image acquisition display system to build a video channel based on the camera. Meanwhile, the image segmentation algorithm is optimized correspondingly, the project of the image segmentation algorithm is packaged into an IP core, and the IP core is connected into the whole video channel, so that the real-time segmentation display of the image is realized.
And thirdly, designing and realizing a field video image segmentation system based on the FPGA by using Verilog and C + + programming languages and combining the flows of field plant ultragreen feature extraction, a Canny edge detection algorithm, median filtering and threshold segmentation and FPGA hardware development.
In an embodiment of the present invention, in a first step, modifying a structure of an image segmentation algorithm, which is applicable to a hardware environment, includes improving a memory structure, where the method for improving the memory structure includes: and reading and writing the image data in the memory by adopting a line buffer area and a memory window, wherein the line buffer area is used for storing complete image lines, and the memory window is used for carrying out operation on the image matrix.
The line buffer area continuously stores new image data, the window moves to the right, image pixel points in the window move to the left, and the window is updated by the new image data.
Initially the buffer holds no image data, and the entire line buffer buffers data starting from the buffer's start address until the first line of data is completely written into the line buffer. After the first line of data is buffered, the image data in the first row and first column moves upward until the last line is buffered, and the pixels in the rightmost column come from the current column of the line buffer and the newly input image pixel data. This cycle repeats, with three lines of image data retained in the line buffer, until the whole picture is completely buffered.
And when the line buffer is filled, writing the image data in the line buffer into the window buffer.
After the line buffer area image data is completely buffered, the line buffer area image data is copied to a window buffer area, new image data can fill the window buffer area along with the rightward movement of a window, and the window buffer area moves from left to right and from top to bottom until the whole image data is completely buffered.
In the embodiment of the invention, in the first step, data and pipeline instructions are added to adapt the image segmentation algorithm to the application scenario of real-time image segmentation, including pipeline optimization, which is performed with the directive #pragma HLS PIPELINE II=1. The pipeline directive defines the number of clock cycles between starts of the loop body using the initiation interval (II) factor.
In the embodiment of the invention, in the first step, adding data and pipeline instructions to adapt the image segmentation algorithm to the application scenario of real-time image segmentation also comprises array splitting optimization: part of the array structures in Canny edge detection, median filtering and OTSU threshold segmentation are split, using the line buffer array, into arrays of smaller bit width; after an array is divided into smaller arrays or independent elements, the RTL circuit generated from the array structure comprises multiple small memories or registers.
In the embodiment of the invention, in the second step, the method for constructing the video channel comprises the following steps:
Firstly, after the design of the image sensor acquisition IP core is finished, it is connected with the VTC IP core, VID_IN IP core, VDMA IP core, VID_OUT IP core and rgb2dvi IP core; after timing constraints and pin constraints, the whole project is synthesized and implemented, and a bitstream file is generated.
The entire project is then exported to the SDK, where read and write operations on the corresponding PS-side registers are programmed.
And finally, downloading the bit stream file to the FPGA board by using the J-Link, building a hardware environment of the whole video channel, and downloading a software program to the FPGA board.
In the embodiment of the present invention, in the third step, the method for extracting an ultragreen feature includes:
the field crop image is first divided into R, G, B three channels.
An adder is then used to calculate twice the gray value of the green channel. The output of the adder and the red gray value of the red channel are connected as inputs to a subtracter for subtraction.
And then the output of the subtracter and the blue gray value of the blue channel are connected to another subtracter as inputs for difference calculation.
Finally, the output of the subtracter is compared with a threshold T, and the final classification result is obtained through a data selector: when the output of the subtracter is greater than the threshold T, the selector outputs 255; when it is less than the threshold T, the selector outputs 0. The green parts of the field image are thus isolated.
In the embodiment of the present invention, in the third step, the FPGA implementation method for Canny edge detection includes:
(1) Image smoothing: image smoothing is achieved by Gaussian convolution. Firstly the image is line-buffered; after one frame of image data has been buffered, the line-buffered data is written step by step into the window buffer, and a 5 × 5 moving-window Gaussian operator convolves all pixels in the 5 × 5 window buffer until all data in the window buffer is processed. Finally, the convolved image pixel points are shifted right by eight bits, and the smoothed image data is output line by line.
(2) Solving gradient direction and magnitude: a 3 × 3 filtered-image calculation kernel window is created, and the kernel is convolved with the Sobel horizontal and vertical 3 × 3 window operators respectively to obtain the horizontal and vertical gradient partial derivatives Gx and Gy; the gradient magnitude G is determined from the sum of the absolute values of Gx and Gy. The direction of the gradient is:
θ = arctan(Gy / Gx).
(3) Non-maxima suppression: the pixel points around the center of the image calculation kernel are computed and compared against the high threshold Th and the low threshold Tl; if the kernel center is greater than Th, it is a strong edge point, and if the kernel center lies between Tl and Th, it is a weak edge point.
In the embodiment of the present invention, in the third step, the FPGA implementing method for median filtering includes:
the median filter with the window size of 5 multiplied by 5 is adopted to firstly carry out line buffering on the image, and after the buffering of a frame of image data is finished, the line buffered data is gradually written into a window buffer area.
The FPGA implementation method for OTSU threshold segmentation comprises the following steps:
1) Field image pixel gray-level grading: the field image pixels are compared with the 256 single-channel gray levels. An image pixel is first compared with 0, and the compared value is stored into the Reg0 register through a data selector; meanwhile, the output of comparator 0 is first negated, ANDed with the output of comparator 1, and the result is fed to the data selector, whose output is stored into the Reg1 register. And so on, until all pixel values are stored in the corresponding registers.
2) Calculating the ratios of the target and background pixels: the input value of the Reg0 register is the value of the Reg0 register from the previous step, and sum is the total number of pixels; Reg0 and sum are taken as the inputs of a divider, whose output is the background pixel ratio ω₁. The divider output and 1 are taken as the inputs of a subtracter, which outputs the target pixel ratio ω₂.
3) Calculating the average gray values of the target and the background: a picture with a size of M × N resolution is input and a division operation with Reg0 is performed; the calculated and graded image pixel gray values are stored in the registers Reg0 to Reg255.
The output of the divider is multiplied by the gray level 0, and the output of the multiplier is stored in a Reg register; the Reg register thus holds the background average gray value μ₁.
The target average gray value μ₂ is calculated by the inverse of the process used for the background average gray value μ₁.
4) Calculating the between-class variance:
The calculation formula is:
g = ω₁ × (μ − μ₁)² + ω₂ × (μ − μ₂)²
The simplified formula is:
g = ω₁ × ω₂ × (μ₁ − μ₂)²
the embodiment of the invention provides a field video image real-time segmentation system based on an FPGA (field programmable gate array), wherein image data are transmitted in FPGA logic by an AXI (advanced extensible interface) bus.
The image data collected by the camera is changed into AXI Stream through a Video in to AXI-Stream IP core.
The field video image real-time segmentation method is executed in HLS image processing. The AXI VDMA performs data interaction with the PS end through AXI interconnection, and stores or reads image data into or out of the DDR.
Axi-Stream to video out converts AXI Stream into image data in RGB format, and displays the image through HDMI controller.
The present invention is further described below with reference to specific assays.
1. The development board adopted by the invention is of the ZYNQ-7000 series, a lightweight FPGA development platform developed by Microphase Technology.
The development board from Microphase Technology is an embedded development platform based on the Xilinx ZYNQ-7000 series SoC and adopts the popular FPGA + ARM SoC (System on Chip) solution. The ZYNQ-7000 architecture integrates one dual-core ARM Cortex-A9 processor with Xilinx 7-series Field Programmable Gate Array (FPGA) logic. The FPGA + ARM architecture provides an upgrade alternative in fields traditionally served by separate CPU and FPGA devices; this single-chip SoC solution has great advantages in price and development difficulty and can be widely applied to high-frame-rate video image processing, hardware-accelerated industrial real-time control, artificial intelligence, and so on. Two DDR3 chips, each with a capacity of 256 MB and a data width of 16 bits, are configured on the development board; together they reach a 32-bit width, providing sufficient memory space for the image data temporarily stored by the system.
The chip supports the High-Level Synthesis (HLS) technology provided by Xilinx, which allows programmable design of an FPGA hardware platform directly from C, C++ and SystemC language specifications without manually creating RTL circuits, accelerating IP core design and development to a certain extent. HLS supports both the ISE and Vivado design environments, reducing the time developers spend becoming familiar with the software environment and shortening the design and development cycle of digital systems. The invention adopts HLS technology for the hardware design of the super-green feature extraction, Canny edge detection, median filtering and OTSU threshold segmentation algorithms; on the premise of meeting the hardware conditions and being free of syntax errors, all steps of the algorithms are converted into RTL circuits, and the algorithms are accelerated in hardware.
1.1 FPGA type selection analysis.
ZYNQ is the FPGA + ARM solution proposed by Xilinx; its on-chip bus adopts the AXI bus protocol (AMBA 3.0), which offers higher performance and bandwidth and lower delay than the common AHB bus protocol (AMBA 2.0). With digital systems growing ever more complex and hardware/software co-design ever more difficult, the low-cost FPGA + ARM SoC solution is becoming a trend [15].
The ZYNQ-7000 series chip integrates a dual-core ARM Cortex™-A9 processor produced by Xilinx and an FPGA programmable logic unit in a single chip, forming a single-chip SoC solution of PS (Processing System) plus PL (Programmable Logic). The chip reduces the development difficulty of the FPGA, and its cost advantage has led to wide application. The PS part takes the ARM processor as the core Application Processing Unit (APU) and integrates rich peripherals such as DDR, QSPI Flash, USB, UART, SPI, CAN and Ethernet. The PL part provides users with a high-precision DSP processor, an XADC analog-to-digital converter and other common logic units. PS and PL interact over the AXI bus based on the ARM AMBA 3.0 protocol; this on-chip bus has high performance, high bandwidth and low delay, and through flexible FPGA programming a high-performance SoC solution is easier to design.
1.2 FPGA technology.
The FPGA (Field Programmable Gate Array) is a high-performance programmable logic device first introduced by Xilinx in the 1980s. Used as a semi-custom circuit in the Application Specific Integrated Circuit (ASIC) field, the FPGA overcomes the shortcomings of earlier programmable devices, namely an insufficient number of gate circuits and the impossibility of modifying a custom circuit [16]. Compared with a dedicated integrated circuit, the FPGA has lower processing speed, but its cost is lower and it can be reconfigured after the logic circuit design has taken shape, so it is more flexible and convenient than an ASIC. The FPGA provides a reconfigurable hardware solution for specific applications; compared with the traditional CPU software approach, the FPGA has lower power consumption and can be applied in practice on a large scale [17].
With the emergence of a new generation of FPGA technology, FPGA chips have become faster, more highly integrated and more flexible, and can be applied on a large scale in complex sequential and combinational logic circuits. Particularly in high-frame-rate video image processing, the parallel operation mode and pipeline design of the FPGA can process a large amount of image data in real time, faster than a software method. In some fields with higher requirements on processing speed and real-time performance, compared with the traditional software method, the FPGA offers greater flexibility, better stability, smaller volume, lower cost and higher practical value.
1.3 main method of image segmentation.
Image segmentation divides an image into a number of mutually disjoint small regions [19], each possessing characteristic properties different from the others; the target region of interest is finally extracted from among them. Image segmentation can separate a target region of interest from the background region, facilitating subsequent processing and analysis of the image. Image segmentation is a precondition for effective feature matching and target recognition; some conventional image segmentation methods are described next.
1.3.1 threshold-based image segmentation method.
Threshold segmentation (thresholding) selects an optimal threshold according to the gray features of an image and assigns each pixel point in the image to either the target region or the background region. In general, threshold segmentation methods fall into two major categories: global thresholding and local thresholding. Since the global threshold method segments the whole image with one and the same threshold, it can be used when the gray difference between target and background is large and the contrast is obvious. In most cases, however, the contrast between target and background varies across the image, so it is difficult to separate the target from the background with a uniform threshold; in actual processing, different thresholds can then be applied to sub-regions of the image according to its local features.
Because threshold segmentation operates on image gray values, it is simpler than other segmentation methods and can segment an image quickly and effectively. However, it ignores the spatial and textural features of the image, and for images with complex backgrounds its segmentation effect at the edges is not ideal and is relatively poor.
1.3.2 edge-based image segmentation method.
An image edge is the boundary between two pixel points with obviously different brightness values. Edge-based segmentation generally works on the gray values of the image: edges are usually regions of large gray-value change, and they can be extracted by computing the gradient of the gray values. Edge segmentation of an image mainly comprises the following three key steps [24]:
(1) Edge detection: firstly the image is smoothed with a Gaussian low-pass filter to reduce image noise; then the local gradient (magnitude and direction) of each pixel point in the smoothed image is calculated, detecting all edges in the image.
(2) Non-maxima suppression: the local maxima are marked as an edge and other non-maxima will be suppressed and set to zero. The main purpose of this step is to remove potential false edges.
(3) Hysteresis thresholding and edge linking: two thresholds are used, a low threshold T_low and a high threshold T_high, to threshold the pixel points after non-maximum suppression. A pixel whose gradient value is smaller than T_low is considered a non-edge pixel, and a pixel whose gradient value is greater than T_high is considered a strong edge pixel. Pixels with gradient values between T_low and T_high are called weak pixels; for these, one checks whether an adjacent pixel has a gradient value greater than T_high; if so the pixel is an edge, otherwise a non-edge. This process is called hysteresis thresholding. Finally, the edge pixels are connected to obtain meaningful edges.
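A minimal sketch of the per-pixel double-threshold classification just described (threshold values are application-dependent assumptions; the neighbourhood check that promotes weak pixels to edges is left out for brevity):

#include <cstdint>

// 2 = strong edge, 1 = weak pixel (kept only if a neighbour is strong),
// 0 = non-edge.
uint8_t hysteresis_label(int16_t grad, int16_t t_low, int16_t t_high) {
    if (grad > t_high) return 2;
    if (grad > t_low)  return 1;
    return 0;
}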
Compared with other image segmentation methods, edge detection computes quickly and locates edge pixel points more accurately. Its drawback is that edge extraction is easily disturbed by noise; especially when a complex image is segmented, the segmented edges may be blurred, discontinuous or missing, and the resulting erroneous edges affect the efficiency of subsequent image processing.
1.3.3 region-based segmentation method.
The principle of the region-based image segmentation method is to group pixels with similar properties into the same region, i.e. to assign the target pixels and background pixels in an image to different regions [25]. Region segmentation methods fall mainly into two types: region growing, and region splitting and merging. The former takes a single pixel point as a seed and merges into the same region those neighboring pixels whose properties are the same as or similar to the seed's, and so on, finally obtaining the segmented region. The latter is the inverse process of region growing: starting from the global image, the image region is split continuously into sub-regions, then all foreground sub-regions are merged to obtain the foreground target, realizing the segmentation of the target region and the background region.
The watershed algorithm initially used in the invention realizes segmentation of the field image with the following flow: first an RGB color image is loaded and grayed; then Gaussian filtering removes noise from the image; edge detection with the Canny operator obtains the contours of the image, and the contours of different regions are numbered. Finally each region of the image is filled with color, and the segmented, filled image is fused with the original image to obtain the final segmented image.
as can be seen from the segmentation effect, Canny edge detection is not suitable for images with more edges, so the whole segmentation effect is not good. Especially in the field environment with complex background, the watershed segmentation algorithm is not suitable for segmenting crops and cannot separate plants from the soil background, so the algorithm is not suitable for segmenting the crops in the field.
1.3.4 other segmentation methods.
The image segmentation method based on mathematical graph theory maps an image onto an undirected weighted graph: the pixel points of the image serve as the vertices of the graph, the adjacency relations between pixel points serve as its edges, and the weights on the edges indicate the similarity and difference between adjacent pixel points. A graph-theoretic segmentation method essentially divides the graph into several subgraphs such that the similarity inside each subgraph is maximal while the similarity between subgraphs is minimal [27]. There are many further ways of segmenting images.
2. Segmentation algorithms suitable for field crops.
As the segmentation results show, the watershed-based image segmentation algorithm is not suitable for field crop images. Compared with ordinary images, field crop images contain many more edges, so segmentation with the watershed algorithm performs poorly, and region growing is difficult to realize in a hardware environment. To better segment field crop images, the invention mainly introduces methods suitable for the preprocessing, edge detection and segmentation of field crop images.
2.1 super-green feature extraction algorithm of field crops.
When processing field crop images, firstly all pixel points in the images need to be divided into two types: plants and background. The background removal is a critical step, and if the processing is not proper, the wrong classification is likely to be caused.
Several researchers have used color characteristics to separate plants from the soil background. Rasmussen, Meyer and Camargo-Neto [29], and Kirk et al. used color characteristics to distinguish green plants from soil and to estimate leaf area. Therefore, in order to extract the green plants in field crop images, the invention adopts an ultragreen feature extraction algorithm to distinguish the green crops from the background. The algorithm comes from Woebbecke et al. [28]-[30], who between 1992 and 1995 compared various color indices for separating green plants from a soil background and arrived at a better algorithm for separating plants and soil.
ExG (Excess Green index) is an ultragreen index that can extract the green channel of an RGB color-space image and separate plants from soil well. ExG provides a clear contrast between plant and soil, producing an ultragreen grayscale image close to a binary image. The ExG index finds wide application in the separation of plants from non-plants. The main operation is shown in formula (2-1).
ExG=2G-B-R (2-1)
ExG can extract the green part of an image and convert the original RGB color image into an ultragreen grayscale image, separating the green plant part of the crop image from the soil background. The specific calculation of the ultragreen feature extraction algorithm is shown in formula (2-2) [33]:
ExG(x, y) = 0, if 2G − R − B < 0
ExG(x, y) = 2G − R − B, if 0 ≤ 2G − R − B ≤ 255 (2-2)
ExG(x, y) = 255, if 2G − R − B > 255
After the image is subjected to super-green feature extraction, the calculated super-green index ExG is compared with an image threshold T (the calculation method of the threshold will be described in the next section), and when ExG > T, it is determined to be a green plant, otherwise, it is determined to be a soil background. The specific operation flow chart of the algorithm is shown in fig. 1.
After ultragreen feature extraction of the field crop image, the green plants appear gray while the soil background appears black, as shown in fig. 2, the green plants are successfully separated from the soil background.
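A per-pixel sketch of this extraction and classification step (illustrative C++; the function names are assumptions, and the clipping to [0, 255] follows formula (2-2)):

    #include <algorithm>
    #include <cstdint>

    // Clamped excess-green value of one pixel: ExG = 2G - R - B.
    uint8_t exgPixel(uint8_t r, uint8_t g, uint8_t b) {
        int exg = 2 * g - r - b;
        return static_cast<uint8_t>(std::clamp(exg, 0, 255));
    }

    // ExG > T  =>  green plant; otherwise soil background (T from OTSU, next section).
    bool isPlant(uint8_t r, uint8_t g, uint8_t b, uint8_t t) {
        return exgPixel(r, g, b) > t;
    }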
2.2 OTSU adaptive threshold segmentation.
OTSU, also known as the maximum inter-class variance method, is an adaptive threshold segmentation algorithm proposed by the Japanese scholar Nobuyuki Otsu in 1979. The algorithm can segment the foreground and background of an image well without manually set threshold parameters, is simple to compute, and suits field crop images in complex environments better than other segmentation algorithms.
The algorithm traverses the histogram information of the image and computes the inter-class variance between target and background; the gray value at which this variance is maximal is the most appropriate segmentation threshold T. The basic steps of the OTSU algorithm are as follows:
For an image I(x, y), denote the segmentation threshold between foreground (i.e., target) and background as T, the proportion of foreground pixels in the whole image as ω1 with mean gray value μ1, the proportion of background pixels as ω2 with mean gray value μ2, the mean gray value of the whole image as μ, and the inter-class variance as g.
Assuming the image size is M × N, let N1 be the number of pixels whose gray value is smaller than the threshold T, and N2 the number of pixels whose gray value is greater than T. Then:
ω1 = N1 / (M × N)
ω2 = N2 / (M × N)
N1 + N2 = M × N (2-3)
ω1 + ω2 = 1
μ = ω1μ1 + ω2μ2
g = ω1(μ1 − μ)² + ω2(μ2 − μ)²
A traversal method is adopted: the threshold T that maximizes the inter-class variance g is taken as the final threshold. The threshold T obtained here is the threshold against which the ultragreen index of the previous section is compared: when the gray value of a pixel is greater than the threshold, the pixel is set to 1 (white); otherwise it is set to 0 (black). Finally, the whole image is converted into a binary image, and the background and the target are separated.
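The traversal can be sketched in software form as follows (illustrative C++ using the simplified variance g = ω1ω2(μ1 − μ2)², which appears as formula (4-3) later; this is not the hardware module itself):

    #include <array>
    #include <cstdint>
    #include <vector>

    // Return the OTSU threshold of an 8-bit grayscale image.
    int otsuThreshold(const std::vector<uint8_t>& img) {
        std::array<int, 256> hist{};                 // gray-level histogram
        for (uint8_t p : img) ++hist[p];

        const double total = static_cast<double>(img.size());
        double sumAll = 0.0;
        for (int i = 0; i < 256; ++i) sumAll += i * static_cast<double>(hist[i]);

        double sum1 = 0.0, n1 = 0.0, bestG = -1.0;
        int bestT = 0;
        for (int t = 0; t < 256; ++t) {              // traverse all candidate thresholds
            n1 += hist[t];                           // pixels with gray value <= t
            if (n1 == 0 || n1 == total) continue;
            sum1 += t * static_cast<double>(hist[t]);
            double w1 = n1 / total, w2 = 1.0 - w1;
            double mu1 = sum1 / n1;
            double mu2 = (sumAll - sum1) / (total - n1);
            double g = w1 * w2 * (mu1 - mu2) * (mu1 - mu2);
            if (g > bestG) { bestG = g; bestT = t; } // keep the maximizing threshold
        }
        return bestT;
    }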
2.3 median filtering.
After the image has been threshold-segmented, the tiny noise in it needs to be removed so that the edges of the binary image become smoother. The invention removes the noise with median filtering, because compared with other filtering methods it preserves edge information better and protects image detail while reducing image noise. The invention adopts a 5 × 5 Median Filter sliding window: all pixels in the window are sorted, the median is taken, and the median is assigned to the pixel at the center point. The specific operation is shown in fig. 3.
As can be seen from fig. 3, the gray values of all pixels in the window are sorted, the median is taken and written to the window center point, and so on until all pixels of the image have been traversed; by repeated sorting, taking the median and filling the window center, the median filtering of the whole image is completed.
The invention is further described below in connection with the design of the system functional modules.
According to the image acquisition and display sequence of the system, the invention covers the design and implementation of the following parts: camera sensor configuration; image sensor IP core design; image display module design; and image data storage module design.
Fig. 4 shows the overall architecture of the system of the invention. As can be seen from fig. 4, image data is transferred within the FPGA logic over an AXI bus. Image data acquired by the camera passes through the Video In to AXI4-Stream IP core and becomes an AXI stream, and the image algorithm of the invention is executed in the HLS image-processing block. The AXI VDMA exchanges data with the PS side through the AXI interconnect, storing image data into the DDR or reading it out. AXI4-Stream to Video Out converts the AXI stream into image data in RGB format, which the HDMI controller can then display.
1. A camera is provided.
The OV7725 used by the invention is a 1/4-inch CMOS VGA (640 × 480) image sensor manufactured by OmniVision with the following properties: high sensitivity and low voltage, suitable for embedded applications. Through the standard SCCB configuration interface, the corresponding registers can be configured to output video images in RawRGB, RGB (RGB565/RGB555/RGB444), YUV422, YCbCr and other formats, at VGA, QVGA and various image sizes from 40 × 30 up to CIF (352 × 288). The sensor enhances edges, suppresses noise, supports a frame synchronization mode, and supports image scaling, translation and windowing. Image control functions include Automatic Exposure Control (AEC), Automatic White Balance (AWB), Automatic Band-pass Filtering (ABF) and Automatic Black Level Calibration (ABLC), and parameters such as color saturation, sharpness and gamma calibration can be adjusted.
1.1 OV7725 registers.
The OV7725 camera has 172 registers; configuring them through the SCCB bus interface yields image data with better image quality.
2. Camera register configuration interface design. The system adopts an OmniVision OV7725 CMOS camera as the image acquisition sensor. The FPGA initializes the OV7725 camera through the SCCB (OmniVision Serial Camera Control Bus) protocol, writing values into the corresponding registers to control all image acquisition parameters, so the image data format and transmission mode are completely controllable. Fig. 5 is a block diagram of the internal structure of the OV7725.
The main interface for image acquisition comprises the following pins, as shown in table 1.
Table 1 main pins for image acquisition
Because the hardware platform used by the invention has no IIC interface, two GPIO ports on the PL side are used to simulate the IIC interface signals. According to the IIC protocol and the corresponding timing, an I/O module simulates the IIC timing to complete the configuration of the OV7725 image sensor. The process is as follows: following the timing characteristics of the SCL clock, the register addresses are written through the SDA interface, and the parameter values of the corresponding working mode are written into those register addresses, i.e., first the device address, then the register address, and finally the data; the image data is D[9:0]. The key code for simulating the SCCB timing with the I/O ports is sketched below.
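The original listing is reproduced only as an image in the source; the following C-style sketch illustrates the three-phase SCCB write cycle described above (gpioSetSCL/gpioSetSDA and waitHalfBit are hypothetical helpers, and the bus is assumed idle with both lines high):

    #include <cstdint>

    // Hypothetical helpers driving the two PL-side GPIO pins.
    void gpioSetSCL(bool level);
    void gpioSetSDA(bool level);
    void waitHalfBit();                        // half of one SCCB bit period

    // Shift one byte out MSB-first; the slave samples SDA while SCL is high.
    void sccbWriteByte(uint8_t byte) {
        for (int i = 8; i >= 0; --i) {         // 8 data bits + 1 don't-care bit
            bool bit = (i == 0) ? true : ((byte >> (i - 1)) & 1);
            gpioSetSDA(bit);
            waitHalfBit();
            gpioSetSCL(true);
            waitHalfBit();
            gpioSetSCL(false);
        }
    }

    // Three-phase write cycle: device address, register address, register value.
    void sccbWriteReg(uint8_t regAddr, uint8_t value) {
        gpioSetSDA(false);                     // START: SDA falls while SCL is high
        gpioSetSCL(false);
        sccbWriteByte(0x42);                   // OV7725 device write address
        sccbWriteByte(regAddr);
        sccbWriteByte(value);
        gpioSetSDA(false);                     // STOP: SCL rises, then SDA rises
        gpioSetSCL(true);
        gpioSetSDA(true);
    }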
1.1.1 SCCB initialization timing.
SCCB is a communication protocol based on, and following, the IIC bus protocol; such an interface generally has two signals, a data signal SDA and a clock signal SCL. For the camera to work normally, the configuration of the camera registers must be completed over this communication bus. The OV7725 has 172 registers in total: some read-only, some write-only, and some readable and writable. Only after the camera has been initialized and the corresponding registers configured can it work normally and deliver images of the required quality. The code flow chart for initializing the camera is shown in fig. 6.
In the initialization function, 0x80 is first written into the camera's COM7 register through the SCCB write function to reset it, and the image output format is then set to RGB565. According to the OV7725 chip documentation, the chip number is stored in register 0x1c; if communication works correctly, 0x70 can be read out, and the relevant registers are configured only after the chip number has been read correctly. By configuring the registers, the invention realizes the corresponding OV7725 acquisition functions: capturing 640 × 480 images, supporting a 60 fps acquisition frame rate, outputting RGB565 VGA-format images, configuring the internal clock, disabling white balance, and so on.
Configuration flow of the OV7725 image sensor registers: following the timing of the SCL clock bus, the parameter values of the working mode are written through the bus interface SDA into the corresponding register addresses so as to acquire the video image data D[9:0]. The line sync signal HREF and the field sync signal VSYNC of the image are obtained while the image data is output. Fig. 7 is the timing diagram for OV7725 initialization.
3. Design of the SCCB bus controller.
The SCCB bus controller is mainly used for configuring an internal register of the OV7725 camera and initializing the OV7725 according to requirements so as to obtain the required image quality. Because the ZYNQ hardware platform used by the invention has no IIC interface, the bus time sequence of the SCCB is simulated by utilizing the programmable capability of the FPGA and referring to the time sequence of the SCCB protocol through two pins of the GPIO interface so as to realize the configuration of a camera register and the transmission of image data to the DDR.
The SCCB bus of the camera mainly depends on two signal lines to complete data communication between the master equipment and the slave equipment. The clock SCL signal line is used by the master device to drive the slave device to receive and transmit data. The SDA signal line is used for selecting a device address and transmitting a data signal. The slave addresses transmitted through the SCCB bus cannot be duplicated and a slave can only have one address value. The driving signal is generated by the master device regardless of whether data is written or read between the master device and the slave device. The master device of the system is an FPGA board, and the slave device is an OV7725 camera. When the FPGA board and the OV7725 camera transmit data, firstly, the device address of the OV7725 camera is searched, namely, the slave device communicating with the FPGA board is determined to be the OV7725 camera through address byte matching, wherein the high 7 bits in the address bytes are the device address, and the lowest bit is the reading and writing flag bit. An SCCB timing diagram for serial data is shown in fig. 8.
According to the transmission timing diagram of the SCCB bus, the invention makes the following settings for the reading and writing of the register:
(1) writing of the register: when the SCCB carries out writing configuration on the camera register, firstly, the address of the OV7725 camera is written, then, the address of the register is written, and finally, a specific numerical value is written into the register. The write address of the OV7725 camera used in the invention is 0x42, the first 8 bits of data of the written register represent the address of the register, and the last 8 bits of data represent the value set by the register.
(2) Reading of the register: when the SCCB completes the read configuration of one register, it needs to read the device address of the OV7725 camera once, then read the address of the corresponding register, and finally read the specific value in the register. Likewise, the reading address of the OV7725 camera used in the present invention is also 0x42.
4. Image sensor interface IP core design. The invention packages the hardware-language description of the image sensor interface with the Vivado IP design tool and introduces the IP into the hardware system; the packaged image sensor interface IP core is shown in fig. 9.
The IP core main port description of the present invention is shown in table 2:
Table 2 IP core main ports
The IP core contains 3 source files: OV_sensor.v, cmos_decode.v and count_reset.v. In the OV_sensor.v program, one register stage is applied to cmos_data_i, cmos_href_i and cmos_vsync_i; registering removes some glitches, improves the stability of the image data, and improves the image quality considerably. cmos_decode.v is the key part of the IP core, mainly realizing the decoded output of RGB565 and the timing matching with the vid_in IP. The count_reset.v source file implements a delayed reset of the signal.
5. Video capture encoding
The video acquisition and decoding part of the invention is mainly embodied in the cmos_decode.v file. The reset signal is first delayed by 5 clock cycles. Hardware description differs from software: in an FPGA, slow signal transitions or long delays may leave insufficient setup and hold time and cause reset errors in practice. The invention therefore delays the reset signal by 5 cycles, i.e., other processing starts only after the reset signal has stabilized. Edge detection is then applied to the line and field signals: the falling edge of the vsync signal is captured as vsync_start and its rising edge as vsync_stop. The vsync_start signal marks the start of image data acquisition, and the vsync_stop signal marks the end of acquiring the complete frame of image data. Finally, the frame rate is counted from the vsync_start signal and the pixel clock, and a toggle is made when the count reaches 15. The main role of the frame-rate count is to control the output enable signal out_en. To output and display the image data on a display screen, the image data needs format conversion, which is described in the next section.
The logic correctness of the image acquisition code is verified through the time sequence simulation of the image sensor interface IP core, and fig. 10 shows the function simulation result of the camera acquisition module.
As can be seen from fig. 10, the vsync_start signal is asserted on the falling edge of vsync, and the vsync_stop signal is asserted on the rising edge of vsync. The line and field signals vs_o and hs_o also meet the requirements of the invention, verifying the correctness of the module.
6. Image format conversion.
The image data collected by the OV7725 camera is in RGB565 format, and in order to display the image, the image in RGB565 format must be converted into RGB888 to be displayed on the screen, and the conversion process of the present invention is shown in fig. 11.
The conversion process is simple: the 16-bit RGB565 data is split into the 8-bit data of the three RGB channels, and the spare bits are filled with 0. The invention implements this process in the hardware language; in the OV_sensor.v file, the video output decoding is implemented by statements of the following form.
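The Verilog statements themselves appear only as an image in the source; the bit-level operation they perform can be expressed as follows (C++ used for illustration):

    #include <cstdint>

    // Expand one RGB565 pixel to RGB888: shift each channel into its 8-bit
    // position and fill the spare low-order bits with 0, as described above.
    void rgb565ToRgb888(uint16_t pix, uint8_t& r, uint8_t& g, uint8_t& b) {
        r = static_cast<uint8_t>(((pix >> 11) & 0x1F) << 3);  // 5-bit red
        g = static_cast<uint8_t>(((pix >> 5)  & 0x3F) << 2);  // 6-bit green
        b = static_cast<uint8_t>(( pix        & 0x1F) << 3);  // 5-bit blue
    }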
7. Image display driver module design.
With the development of high-definition multimedia technology and devices, the conventional VGA (Video Graphics Array) interface technology can no longer meet actual requirements. The High Definition Multimedia Interface (HDMI) is a digital video/audio interface that offers a larger amount of transmitted information and a faster transmission speed than the VGA interface.
8. HDMI image display.
Because the hardware platform used by the invention has no dedicated HDMI output chip ADV7511, the HDMI interface function is realized by simulating the HDMI timing through the I/O interface. Compared with an ADV7511-based implementation, this method saves hardware cost and is practical when the FPGA development board chip resources are ample; the maximum output resolution can reach 1080p. HDMI adopts the same transmission principle as DVI (Digital Visual Interface), i.e., the TMDS protocol. The I/O structure on the PL side of ZYNQ supports TMDS, so the FPGA I/O can directly drive or receive HDMI signals. The HDMI display image can thus be simulated through the I/O interface, converting the VGA timing and data into TMDS data and HDMI timing usable for HDMI display.
The VGA timing is divided mainly into line timing and field timing, as shown in fig. 12. VGA adopts progressive scanning, from top to bottom and from left to right. When the line control signal changes from high to low, holds for the sync period, and jumps back to high, the line is synchronized; it then passes in turn through the horizontal blanking back porch, the active line data and the horizontal blanking front porch, and the display of image data is completed during the active line data. Finally, when the line control signal changes from high to low again, the scan of one line of data is complete. The field timing is similar to the line timing and is not described in greater detail here. It should be noted that the effective display of a frame of image is possible only during the period when the line and field data are simultaneously active.
Table 3 shows several VGA resolution timing parameters that are commonly used.
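The table itself is reproduced only as an image in the source; as one concrete example, the standard 640 × 480 @ 60 Hz mode (published VESA values, not taken from the original table) can be recorded as follows:

    // Line and field timing of one VGA mode, in pixel clocks and lines.
    struct VgaTiming {
        int hActive, hFrontPorch, hSync, hBackPorch;   // line timing
        int vActive, vFrontPorch, vSync, vBackPorch;   // field timing
        double pixelClockMHz;
    };

    // 640x480 @ 60 Hz: 800 clocks per line, 525 lines per frame.
    constexpr VgaTiming kVga640x480_60{
        640, 16, 96, 48,
        480, 10,  2, 33,
        25.175
    };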
The invention adopts the RGB2DVI IP core of Digilent to complete the HDMI display of the image. The IP core converts the RGB signals output by the VGA module into DVI signals, i.e., TMDS format; connecting an HDMI display to the HDMI interface of the FPGA development board then displays the image.
The process of simulating the HDMI display image through the I/O ports is as follows: a PLL locks the image data stream clock into two clocks, used respectively as the pixel clock and the serial clock of the rgb2dvi IP core. From the pixel clock, the serial clock and the image's line, field and enable signals, the rgb2dvi IP core converts the VGA image data into TMDS image data. The HDMI display RTL circuit is shown in fig. 13.
9. Video timing.
The Video Timing Controller (VTC) IP core is provided officially by Xilinx. It is integrated on the PL side to generate the appropriate video signals to drive the HDMI display. The video timing controller is mainly used to monitor and generate video timing signals so as to synchronize the video data.
The video timing signals are used to detect the horizontal sync, vertical sync and blanking signals and the active-pixel range of the video stream. In the design of the invention, this IP core cooperates with the AXI4-Stream IP core (Vid In) to generate the required video timing signals to drive the HDMI display screen to display the image data.
10. AXI4-Stream Subset Converter module.
The AXI4-Stream Subset Converter module adopted by the invention is mainly used to extract the 24-bit red, green and blue signals from 32-bit data (data read from the DDR is read in units of 32 bits). The default storage order of an image to be displayed in the DDR is BGR (Blue, Green, Red), i.e., bits [7:0], [15:8] and [23:16] respectively constitute the 24-bit image data. The rgb2dvi module used in the invention to generate the TMDS signals expects the pixel ordering R, B, G, so the display order must be adjusted through the configuration of the TDATA Remap String. Although it seems inessential, this step is important for image display: a wrong remapping could make the displayed image scrambled or impossible to display at all.
11. Data transmission module design.
11.1 VDMA IP core configuration.
The CMOS camera image acquisition interface IP core adopted by the invention reads video data (640 × 480) from the OV7725 camera; after processing by a series of video preprocessing IP cores, the video data is sent to the DDR memory for temporary storage using the VDMA IP core.
AXI VDMA is a soft IP core developed by Xilinx, mainly used to provide a high-speed data access channel between an AXI4-Stream video-type target IP and the system memory. A data stream in AXI4-Stream format cannot directly drive the HDMI display output; the data stream must be accompanied by a video enable signal and horizontal/field synchronization signals before an HDMI display can show the video.
The IP core has two AXI4-Stream interfaces: AXI Memory Map to Stream (MM2S) and AXI4-Stream to Memory Map (S2MM). MM2S is mainly used to output memory data converted into an AXI4-Stream-format video stream, and S2MM is mainly used to convert a received AXI4-Stream-format video stream into memory data and store it in the memory. The two ports MM2S and S2MM are independent of each other and can operate simultaneously.
The invention writes the AXI-Stream-format image data stream into the DDR3 through the VDMA write channel; the read channel reads the image data back from the DDR3, and the image is finally shown through the display device. Because the VDMA can control up to 32 frames and switch between them freely, the invention uses a multi-buffer mode to realize the display of images. A block diagram of the AXI VDMA is shown in fig. 14.
There are five main interface types: AXI4-Lite, AXI Memory Map Write, AXI Memory Map Read, AXI4-Stream Write (S2MM) and AXI4-Stream Read (MM2S); the invention mainly uses the latter three interfaces.
In the design of the invention, the Address Width is set to 32 bits according to the size parameters of each frame image and the specific read/write space of the VDMA, giving an addressing space of up to 4 GB. In addition, when the VDMA IP is configured in the PL portion, since the number of frame buffers is 3, three START_ADDRESS values must be configured for S2MM and MM2S, and the interval between the addresses must be no less than the total number of bytes of one frame image.
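A minimal sketch of that address layout (illustrative C++; the base address and the 4-byte-per-pixel frame size are assumptions consistent with the 32-bit reads mentioned above):

    #include <cstdint>

    constexpr uint32_t kFrameBufferBase = 0x10000000u;      // assumed DDR base address
    constexpr uint32_t kFrameBytes      = 640u * 480u * 4u; // one frame of 32-bit pixels
    constexpr int      kNumFrameBuffers = 3;

    // START_ADDRESS for frame buffer i: buffers spaced by at least one frame.
    constexpr uint32_t frameBufferAddress(int i) {
        return kFrameBufferBase + static_cast<uint32_t>(i) * kFrameBytes;
    }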
To make the video display smoothly over HDMI, the tearing between two consecutive frames that easily arises when a single frame is both read and written must be avoided; this problem is well solved by ping-pong read/write operation between banks. The design idea of the invention is as follows: the DDR frame buffer is set up as a read buffer and a write buffer, so that the input and output bandwidth of the data lies within the effective range of the SDRAM.
In this way, the two read/write buffers can perform read and write operations on the image simultaneously without affecting each other. The specific implementation is shown in fig. 15.
The data transmission process across clock domains realized by the ping-pong read/write operation of the invention is as follows: at the initial time t0, in clock domain 1, data is written into the read buffer while the data in the write buffer is read out. At the next time t1, in clock domain 2, data is written into the write buffer while the data in the read buffer is read out. When the data in the read buffer has been completely read and the data in the write buffer has been completely written, one ping-pong read/write operation is complete.
The invention builds the video image acquisition process, designs each module for the storage and display of FPGA image data, and explains the design of each module in detail. The hardware framework of the whole video image system is realized through the design and implementation of these main modules. Effective solutions are provided for the problems that occur while building the video channel, and the possible consequences of incorrect designs are also explained.
The invention is further described below in connection with the FPGA-based field crop segmentation algorithm.
This part analyzes the design and implementation of the ultragreen feature extraction, Canny edge detection, median filtering and OTSU threshold segmentation algorithms on the FPGA platform. The four algorithms are redesigned with the Vivado HLS software of Xilinx so that they better suit the parallel computation of the FPGA, and the execution speed of the algorithms is improved to a certain extent while using fewer resources.
1. Hardware structure optimization. The invention adapts the design to the FPGA hardware environment mainly through the following three hardware structures.
1.1 memory structure design.
In processing-intensive image and video applications, the biggest bottlenecks are memory access and the setup of memory buffers. The invention writes the image data into memory as a stream, then processes the video stream, and finally reads the memory and outputs a real-time video stream. Because FPGA resources are limited, the entire image cannot be stored inside the FPGA, so buffering must be used to optimize each stage of the image algorithm. In HLS, image data is accessed in AXI-Stream fashion: picture data must be read in order, from left to right and from top to bottom, and cannot be read randomly or in reverse, whereas a buffer inside the FPGA can be read randomly.
Therefore, processing a data stream on an FPGA requires a memory architecture to access data in the memory buffer multiple times. The invention uses the line buffer area and the memory window to read and write the image data in the memory, the line buffer area is mainly used for storing the complete image line, and the memory window mainly carries out operation on the image matrix. Fig. 16 shows a line buffer memory architecture used on an input frame image.
The line buffer continuously stores new image data: as the window moves right, the image pixels in the window move left, so the window is updated with the new image data. At first there is no image data in the buffer, and the whole line buffer fills starting from its start address until the first line of data has been completely written into the line buffer. After the first line is buffered, the image data in each column moves upward as the following lines complete, and the pixels of the rightmost column come from the current column of the line buffer and the newly input image pixel data. This cycle repeats, with the line buffer retaining three lines of image data, until the whole picture has been buffered.
When the line buffer is filled, the image data in the line buffer is written into the window buffer, and fig. 17 is a memory window buffer architecture diagram of the image data.
After the line buffer area image data is completely buffered, the line buffer area image data is copied to a window buffer area, new image data can fill the window buffer area along with the rightward movement of a window, and the window buffer area moves from left to right and from top to bottom until the whole image data is completely buffered. The memory window buffer architecture is used in median filtering, Canny edge detection and OTSU threshold segmentation algorithms.
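In HLS-style C++ these two structures reduce to a few shift operations per pixel; a minimal sketch under assumed names and a 640-pixel line width:

    #include <cstdint>

    constexpr int WIDTH = 640;   // assumed line width
    constexpr int K     = 5;     // window size of the 5x5 operators

    static uint8_t lineBuf[K - 1][WIDTH];  // the last K-1 image lines
    static uint8_t window[K][K];           // the K x K operating window

    // Called once per incoming pixel; x is the current column, pix the new pixel.
    void updateBuffers(int x, uint8_t pix) {
        // The window slides right over the image, so its contents shift left.
        for (int r = 0; r < K; ++r)
            for (int c = 0; c < K - 1; ++c)
                window[r][c] = window[r][c + 1];
        // New rightmost column: the buffered lines of this column plus the new pixel.
        for (int r = 0; r < K - 1; ++r)
            window[r][K - 1] = lineBuf[r][x];
        window[K - 1][K - 1] = pix;
        // Update the line buffer column: each buffered line moves up by one.
        for (int r = 0; r < K - 2; ++r)
            lineBuf[r][x] = lineBuf[r + 1][x];
        lineBuf[K - 2][x] = pix;
    }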
1.2 pipeline optimization.
The invention redesigns the algorithms with pipelines so that different functions in an algorithm can run concurrently, i.e., the next function can start before the previous function has finished. The pipeline design reduces the latency of the whole loop and increases throughput, and the acceleration is especially obvious for algorithms containing a time-consuming function: since all functions execute in parallel, even a function that takes a long time does not hold up the execution of subsequent functions. This differs from the usual software approach, in which a time-consuming function must finish before the subsequent functions start, correspondingly reducing the execution speed of the algorithm. Fig. 18 shows a comparison of loop execution with and without pipelining.
In C/C++ and similar languages, the operations in a loop body execute in sequence, and the next iteration of the loop can only start after the last operation of the current iteration has completed. A loop pipeline, by contrast, allows the operations in the loop to proceed concurrently. Suppose one iteration comprises three functions: without a pipeline, three clock cycles separate the starts of consecutive iterations, and the whole loop needs nine clock cycles to complete; with a pipeline, only one clock cycle separates consecutive iterations, and while one iteration executes function 2, function 1 of the next iteration runs in parallel with it, so the clock latency between iterations is shortened and the whole loop completes in five clock cycles.
The invention performs pipeline optimization with the directive #pragma HLS PIPELINE II=1. The pipeline directive uses a loop initiation interval (II) factor to define the number of clock cycles between starts of the loop body. In the pipeline optimization of the Canny algorithm in the invention, the AXIS2GrayArray function, GrayArray2AXIS function, Gaussian filter function, Sobel function and the non-maximum-suppression loop all use II = 1, which means a new loop body starts every clock cycle with no idle waiting clock. The loop-body pipeline exploits the parallelism between loop iterations, thereby improving the performance of the hardware.
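A minimal illustration of the directive (HLS C++; the loop body is a stand-in, not the original code):

    #include <cstdint>

    // With II=1 a new loop body starts every clock cycle, even though one
    // body takes several cycles to finish.
    void scaleRow(const uint8_t in[640], uint8_t out[640]) {
        for (int x = 0; x < 640; ++x) {
    #pragma HLS PIPELINE II=1
            out[x] = static_cast<uint8_t>((in[x] * 3) >> 2);  // stand-in operation
        }
    }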
1.3 array split optimization.
The invention optimizes each step of the field crop segmentation algorithm with the array-splitting directives HLS ARRAY_RESHAPE and HLS ARRAY_PARTITION. Parts of the array structures in Canny edge detection, median filtering and OTSU threshold segmentation are split into arrays of smaller bit width, making them better suited to the FPGA architecture. ARRAY_RESHAPE joins array elements by increasing the bit width, which reduces the use of RAM blocks. When an array is split into smaller arrays, the benefit of partitioning shows itself: the arrays can be accessed in parallel, and more data can be accessed in one clock cycle, improving the design throughput.
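For example, a line buffer split by rows so that all five rows are readable in the same cycle might be directed as follows (a sketch with assumed array names, not the original source):

    #include <cstdint>

    void filterFrame(/* stream arguments omitted */) {
        static uint8_t lineBuf[5][640];
        // Split dimension 1 into five independent memories: one 5-element
        // column of the window can then be read in a single clock cycle.
    #pragma HLS ARRAY_PARTITION variable=lineBuf complete dim=1

        static uint8_t pixels[25];
        // Reshape: pack several elements into one wider word, trading
        // RAM-block count for bit width.
    #pragma HLS ARRAY_RESHAPE variable=pixels cyclic factor=5 dim=1

        // ... filtering loops as in the earlier sketches ...
    }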
After an array is divided into smaller arrays or individual elements, the generated RTL circuit contains multiple small memories or registers instead of the single large memory of a conventional software environment, which effectively increases the number of memory read/write ports. The line buffer array used in the invention is split as shown in fig. 19: the number of read/write ports of the split line buffer increases, and 5 array elements can be accessed simultaneously in one clock cycle, fully exploiting the parallel-computation advantage of the FPGA.
2. FPGA implementation of ultragreen feature extraction.
The ultragreen feature extraction of a field crop image only needs the gray values of the red, green and blue channels of the current pixel, so no buffer needs to be designed for caching; the hardware implementation of the algorithm is simple and mainly performs the classification of the green regions of the image.
2.1 Green threshold calculation module.
As shown in fig. 20, the invention first splits the field crop image into the three channels R, G and B, and then uses an adder to compute twice the green-channel value. The output of the adder and the red-channel gray value are connected as inputs to a subtracter; the output of that subtracter and the blue-channel gray value are fed as inputs to another subtracter; finally the subtracter output is compared with the threshold T, and the final classification result is obtained through a data selector.
The output of the subtractor is compared with a threshold T, and when the output of the subtractor is greater than the threshold T, the result obtained by the selector is 255, and when the output of the subtractor is less than the threshold T, the result obtained by the selector is 0. In this way, the green color of the field image can be isolated.
3. FPGA implementation of Canny edge detection.
The invention realizes the Canny algorithm on the FPGA, thereby meeting the requirement of real-time image processing. The flow of the algorithm is shown in fig. 21.
The invention uses the FPGA platform to realize the Canny algorithm, and mainly carries out hardware design on the following three steps:
(1) image smoothing: image smoothing is achieved by gaussian convolution. Firstly, line buffering is carried out on an image, after the buffering of a frame of image data is finished, the line buffered data is gradually written into a window buffer area, and a 5 multiplied by 5 moving window Gaussian operator is used for carrying out convolution on all pixels in the 5 multiplied by 5 window buffer area until the data in the window buffer area is completely processed. And finally, shifting the pixel points of the convolved image by eight bits to the right, and outputting the image data after being smoothed according to lines. The output of this stage will be used as input for the next stage.
(2) Gradient direction and magnitude: the invention creates a 3 × 3 computation-kernel window over the filtered image and convolves the kernel with the horizontal and vertical 3 × 3 Sobel operators respectively to obtain the horizontal and vertical gradient partial derivatives Gx and Gy. Since implementing a square root in hardware needs more resources, the gradient magnitude G is determined by the sum of the absolute values of Gx and Gy. The direction of the gradient is:
θ = arctan(Gy / Gx) (4-1)
the hardware implementation of the Sobel operator is shown in fig. 22.
(3) Non-maximum suppression and thresholding: the pixels around the center point of the image computation kernel are calculated, and the center is compared with the high threshold Th and the low threshold Tl. If the kernel center point is greater than Th, it is a strong edge point of the image; if it lies between Tl and Th, it is a weak edge point; otherwise it is not counted as an edge point of the image. In the invention this step consists of simple comparison operations, so it executes relatively fast.
3.1 algorithm implementation.
On the HLS platform, the pixels of an image can only be read one by one. Table 4 (data from the Xilinx Vivado HLS 2017.4 software) shows the time required after accelerating the algorithm on the FPGA: the whole algorithm needs 276,994 clock cycles to execute. The resource utilization is reasonable: 7,088 LUTs (look-up tables) are used in total, the overall utilization rate of this resource is 40%, and the utilization rates of the other resources are also low.
Table 4 Canny algorithm resource occupancy
In the invention, the AXIS2GrayArray function, the GrayArray2AXIS function, the Gaussian filter function, the Sobel function and the non-maximum-suppression loop all use II = 1, i.e., one loop body starts per clock cycle; the loop-body pipeline exploits the parallelism between loop iterations, thereby improving the performance of the hardware.
The pipeline optimization diagram of the Canny algorithm in the present invention is shown in fig. 23.
As can be seen from fig. 23, C0 to C14 are consecutive clock cycles; within them the functions of the algorithm execute in pipeline fashion, with no waiting time between the functions, so the execution speed is high.
4. FPGA implementation of median filtering.
The main idea of median filtering is to traverse the pixels of the image one by one, sort the pixels of each neighborhood, and replace each pixel with the median of its neighboring pixels. When median filtering is implemented in software, the whole image can be accessed. The FPGA differs from software: images in the FPGA are transmitted as a stream, and storing the whole image in memory is not supported, so part of the image data must be buffered first. The invention adopts a median filter with a 5 × 5 window: the image is first buffered by lines, and after the line buffering of the image data is finished, the line-buffered data is written step by step into the window buffer. The design of the median filter window is shown in fig. 24.
The invention uses the FPGA internal memory (BRAM) to cache the image lines. The first 4 lines of image data are buffered in BRAM, and the data in the line buffer is then written into the sliding-window buffer. The line buffer and window buffer were described in detail above and are not repeated here. With the sliding window, the invention can read the pixel values of the 5 × 5 window simultaneously; in the hardware environment, 25 pixels can be read and processed in one clock cycle, improving the execution efficiency of the algorithm.
Fig. 25 shows the median filtering hardware design. The invention uses 5 adders to sum the pixel values in each BRAM row respectively, and the outputs of the 5 adders serve as the inputs of another adder, whose output is the sum of the pixels of the whole window. Because the window adopted by the invention is 5 × 5 and the pixels in it are binary image pixels, the sum of the pixel values in the window is at most 25 and at least 0; selecting the median 12 as a threshold and comparing it with the window's pixel sum yields the median-filtered image.
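For a binary image the window median reduces to the counting comparison just described; an equivalent-logic sketch (illustrative C++):

    #include <cstdint>

    // Median of a 5x5 binary window: since each pixel is 0 or 1, the median
    // is 1 exactly when more than half of the 25 pixels (i.e. > 12) are 1.
    uint8_t binaryMedian5x5(const uint8_t w[5][5]) {
        int rowSum[5] = {0, 0, 0, 0, 0};      // five adders, one per BRAM row
        for (int r = 0; r < 5; ++r)
            for (int c = 0; c < 5; ++c)
                rowSum[r] += w[r][c];
        int sum = rowSum[0] + rowSum[1] + rowSum[2] + rowSum[3] + rowSum[4];
        return (sum > 12) ? 1 : 0;            // compare with the median 12
    }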
5. FPGA implementation of OTSU threshold segmentation.
According to the algorithm steps of OTSU threshold segmentation, the FPGA implementation can be divided into four modules: a field-image pixel gray-level grading module, a target/background pixel-ratio calculation module, a target/background mean-gray-value calculation module and an inter-class variance calculation module. The implementation of OTSU threshold segmentation on the FPGA is explained starting from these four modules.
5.1 Field image pixel gray-level grading.
The invention first compares the field-image pixels with the 256 single-channel gray levels. Owing to the parallel nature of the FPGA, the 256 comparisons are carried out in parallel. As shown in fig. 26, an image pixel is compared with 0 and the compared value is stored into the Reg0 register through a data selector; meanwhile, the output of comparator 0 is inverted, passed through an AND gate with the output of comparator 1, and fed into the next data selector, whose output is stored into the Reg1 register. And so on, until all pixel values are stored in the corresponding registers.
5.2 target to background pixel ratio calculation.
The invention calculates the ratios of foreground and background pixels of the image according to the algorithm steps of OTSU. As shown in fig. 27, the input of Reg0 comes from the value of the Reg0 register in the previous step, and sum is the total number of pixels; Reg0 and sum serve as the inputs of a divider, whose output is the background pixel ratio ω1. The divider output and 1 serve as the inputs of a subtracter, which outputs the target pixel ratio ω2. In the FPGA, the 256 repeated calculations of the target and background pixel ratios are carried out simultaneously, and this design improves the calculation speed to a large extent.
5.3 target and background mean gray level values.
Fig. 28 shows the hardware design for calculating the mean gray values of the target and the background. First, a picture of M × N resolution is input and divided by Reg0; the calculated and graded image pixel gray levels are stored in the registers Reg0 to Reg255. The output of the divider is multiplied by the gray level 0, and the output of the multiplier is stored in the Reg register; at this moment, the Reg register holds the mean gray value μ1 of the background. The calculation scheme for the mean gray value μ2 of the target is the inverse process of the calculation scheme for the background mean μ1.
5.4 Inter-class variance calculation.
The calculation of the inter-class variance is the last step of the algorithm; its formula is:
g = ω1 × (μ − μ1)² + ω2 × (μ − μ2)² (4-2)
the simplified formula is:
g = ω1 × ω2 × (μ1 − μ2)² (4-3)
the hardware scheme of the inter-class variance designed by the invention is simpler, the ratio of image foreground and background pixels calculated in the first three steps of the algorithm and the average gray value of the target and the background are used as input, and the calculation of the inter-class variance can be completed by using three multipliers and a subtracter.
The invention applies these optimization schemes to the hardware design of the algorithms it uses. The FPGA implementations of the ultragreen feature extraction algorithm, Canny edge detection, median filtering and the OTSU threshold algorithm suitable for field video images are analyzed separately, and the algorithm structures are redesigned so that they can execute in parallel, fully exploiting the computational advantages of the FPGA and improving the execution efficiency of the algorithms to a great extent.
The invention is further described below in connection with the experiments.
1. Video channel construction.
After the design of the image sensor acquisition IP core is finished, it is connected with the VTC (Video Timing Controller) IP core, the VID_IN (Video In to AXI4-Stream) IP core, the VDMA (AXI Video Direct Memory Access) IP core, the VID_OUT (AXI4-Stream to Video Out) IP core and the RGB2DVI (RGB to DVI Video Encoder) IP core. After timing constraints and pin constraints are applied, the whole project is synthesized and implemented, and a bitstream file can be generated. The entire project is then exported to the SDK, where the read/write programming of the corresponding registers is done on the PS side. Finally, the bitstream file is downloaded into the FPGA board with J-Link, building the hardware environment of the whole video channel; the software program can then be downloaded to the FPGA board. The hardware setup of the entire video path is shown in fig. 29.
The image data collected by the OV7725 camera complies with the SCCB protocol and is transmitted to the FPGA board according to the SCL timing in RGB565 format. After acquisition by the IP core designed in the invention, the image data is converted into RGB888 format; the VID_IN IP core converts it into AXI4-Stream, and the image is then read out of or written into the DDR memory through the read/write channels of the VDMA, realizing the image storage design. To display the image, the image data must be read from the DDR through the read channel of the VDMA and passed in AXI4-Stream format to the VID_OUT IP core. Since the invention displays the image on an HDMI display screen, the rgb2dvi IP core of Digilent is used to encode the 24-bit RGB video data, pixel clock and synchronization signals into a TMDS link, so that the video image is displayed on the HDMI display screen.
2. Design and implementation of the camera image acquisition module.
Fig. 30 is a block diagram of the camera image acquisition module system of the invention. Since the OV7725 camera complies with the SCCB protocol, an IIC module must be written to initialize the OV7725 camera. Through the initial configuration of the camera, image data in RAW format is acquired and converted into RGB888-format image data by shifting and zero-filling. The RGB888-format image data is written into the DDR3 memory through the S2MM write channel of the VDMA. To read and write the image data in the DDR memory, the VDMA IP core uses two HP AXI Slave ports to operate on the image data.
After the whole module is designed, the hardware unit needs to be instantiated and elaborated. If the hardware code contains errors that violate the HDL language rules, elaboration fails and the code must be corrected further. After the code passes, in order to run the program on the FPGA board, I/O constraints must be added to the placement and routing, i.e., pin constraints and level-standard constraints are added according to the schematic of the FPGA board, and the whole project is then synthesized to generate the board-level download bitstream file. Finally, the FPGA development board is connected with the OV7725 camera to drive the camera and perform board-level verification and debugging of the camera image acquisition module. In the pin allocation diagram, the first column is the pin name, the second column the input/output characteristics of the pin, the third column the interface to which the pin belongs, and the fifth column the pin number that the invention assigns to the pin according to the schematic. After pin allocation is finished, the download file is generated by synthesis and board-level verification can be carried out.
The RTL circuit of the camera image acquisition module of the invention is shown in fig. 31. As can be seen from fig. 31, the image data cmos_data[7:0] collected from the OV7725 camera is first staged in a register in the FPGA; after the image decoding module cmos_decode, the original eight-bit image data becomes 24-bit image data, and the other line/field signals are transmitted to the next module together with the image data in RGB888 format.
The invention connects the camera image acquisition module to the whole video image channel, analyzes the whole video channel, configures pin constraint, synthesizes, generates a bit stream file and downloads the bit stream file to the FPGA board to complete the whole debugging process. The RTL circuit diagram of the resulting video path is shown in fig. 32.
As can be seen from fig. 32, two GPIO pins are led out as input/output ports, and these two pins are used to drive the OV7725 camera to normally acquire image data, and according to the SCCB protocol, the present invention is programmed to make these two pins simulate SCL and SDA signals to complete initialization of the camera.
3. Verification of the field crop segmentation algorithm results.
Different maize plants were selected to test the field crop segmentation algorithm; all the maize plant images were taken in a test field of Inner Mongolia University, and the field crop segmentation algorithm was realized on the FPGA through the hardware optimization design described above. The segmentation effect of the crops on the FPGA is shown in fig. 33: the segmentation algorithm segments the green crops in the image through ultragreen feature extraction, Canny edge detection and OTSU threshold segmentation, and filters the noise in the image through median filtering. Although the segmentation effect of the algorithm differs to a certain extent from the software implementation, accurate segmentation of the green plants is achieved to a considerable degree.
As can be seen from fig. 33, after ultragreen feature extraction, Canny edge detection, median filtering and OTSU threshold segmentation, the maize plants are separated from the soil background. The image segmentation algorithm is realized with few resources: the segmentation algorithm hardware occupies 2,824 flip-flops (FF) and 4,625 look-up tables (LUT), the block RAM (BRAM) used is only 4 KB, and the segmentation effect is good.
6. Comparison of software and hardware implementations.
The invention realizes the field crop segmentation algorithm on the FPGA hardware architecture. Table 5 is a comparison of software and hardware implementations of the Canny algorithm, and Table 6 compares software and hardware for the three algorithms of field plant ultragreen feature extraction, threshold segmentation and median filtering. Table 5 compares the time and frame rate of the Canny algorithm on the FPGA hardware platform used by the invention and on three different CPU software platforms; it can be seen that the Canny algorithm executes efficiently on the FPGA ZYNQ-7010 experimental board, and the frame rate achieved on the FPGA board is about 3 times that of the equivalent algorithm on the experimental CPU.
Table 5 Canny algorithm software and hardware implementation comparison
Table 6 threshold segmentation software and hardware implementation comparison table
Table 6 shows the software/hardware comparison results of the field plant ultragreen feature extraction, threshold segmentation and median filtering algorithms used in the invention. The table shows that the average speed of the image segmentation algorithm implemented on the FPGA is about dozens of times that of the software implementation on the experimental PC, and the smaller the image size, the better the speed improvement.
7. Cost-performance analysis.
As of March 2019, the market price of the Intel Core i7-8500H (2.2 GHz) CPU used for comparison is about 3000 RMB, while the ZYNQ-7000 series development board used by the invention costs about 600 RMB, so the cost is lower than that of a conventional CPU-based complete machine. Moreover, the development board used by the invention is small in volume and low in power consumption, making it better suited to the complex field working environment. Finally, the development board uses an ARM + FPGA architecture; compared with a traditional CPU chip, it can in practice realize large-scale professional applications at lower cost.
The present invention will be further described with reference to effects.
The invention first analyzes the image segmentation algorithm under software conditions and, combining the relevant theoretical knowledge, changes the structure of the algorithm to a certain extent to suit the hardware environment; data and pipeline instructions are added, which accelerates the execution of the algorithm and makes it better suited to the application scenario of real-time image segmentation.
Secondly, a video channel based on the OV7725 camera is built through the design of each module of the image acquisition display system. Meanwhile, HLS software of Xilinx company is used for correspondingly optimizing the image segmentation algorithm, after simulation, synthesis and verification are passed, the project of the whole algorithm is packed into an IP core, and the IP core is connected to the whole video channel, so that the real-time segmentation and display function of the image is realized.
Finally, after sufficient preparation and in-depth study of the underlying theory and techniques, Vivado 2017.4 and Vivado HLS 2017.4 were used as development platforms, with the Verilog and C++ programming languages, to design and realize the FPGA-based field video image segmentation system, combining the flows of field plant ultragreen feature extraction, the Canny edge detection algorithm, median filtering and threshold segmentation with FPGA hardware development. The invention realizes the real-time segmentation of field video images and can provide certain help for field robots in acquiring data.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (9)

1. A field video image real-time segmentation method based on FPGA is characterized by comprising the following steps:
firstly, combining an image segmentation algorithm under a software condition with relevant theoretical knowledge, modifying the structure of the image segmentation algorithm suitable for a hardware environment, and adding data and pipeline instructions to enable the image segmentation algorithm to be suitable for an application scene of real-time image segmentation;
secondly, designing each module of the image acquisition display system to build a video channel based on a camera; meanwhile, the image segmentation algorithm is correspondingly optimized, the project of the image segmentation algorithm is packaged into an IP core, and the IP core is connected into the whole video channel, so that the real-time segmentation display of the image is realized;
thirdly, designing and realizing the FPGA-based field video image segmentation system using the Verilog and C++ programming languages, combining field plant super-green feature extraction, the Canny edge detection algorithm, median filtering and threshold segmentation with the FPGA hardware development flow;
wherein in the first step, modifying the structure of the image segmentation algorithm to suit the hardware environment includes improving the memory structure, and the method for improving the memory structure comprises: reading and writing image data in memory by means of a line buffer and a memory window, wherein the line buffer stores complete image lines and the memory window carries out operations on the image matrix;
as the line buffer continuously stores new image data, the window moves to the right, the image pixels within the window shift to the left, and the window is updated with the new image data;
the buffer area has no image data, and the whole line buffer area starts to buffer data from the initial address of the buffer area until the first line of data is completely written into the line buffer area; after the data buffering of the first row is finished, the image data of the first row and the first column can move upwards until the data buffering of the last row is finished, and the pixels of the rightmost column are from the current column of the buffer area and the newly input image pixel point data; repeating the circulation, wherein the line buffer zone keeps three lines of image data until the whole picture is completely buffered;
when the line buffer is full, the image data in the line buffer are written into the window buffer: after the line buffer has been fully filled, its image data are copied into the window buffer; as the window moves to the right, new image data fill the window buffer, and the window buffer traverses the image from left to right and from top to bottom until the whole image has been buffered.
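The memory structure of claim 1 can be pictured with a short HLS-style C++ sketch. This is a minimal illustration only, assuming an 8-bit grayscale stream, a 3×3 window and an image width of 640; `window_3x3`, `pixel_t` and `WIDTH` are hypothetical names, not identifiers from the patent.

```cpp
#include <ap_int.h>

typedef ap_uint<8> pixel_t;
const int WIDTH = 640;  // assumed image width

// Per incoming pixel: update the line buffer column and slide the window.
void window_3x3(pixel_t in_pix, int col, pixel_t window[3][3]) {
    // Line buffer holding the three most recent image rows, as in the claim.
    static pixel_t line_buf[3][WIDTH];

    // Shift this column of the line buffer upward and insert the new
    // pixel at the bottom, so the buffer always keeps three rows.
    line_buf[0][col] = line_buf[1][col];
    line_buf[1][col] = line_buf[2][col];
    line_buf[2][col] = in_pix;

    // Slide the 3x3 window one pixel to the left and refill its rightmost
    // column from the current column of the line buffer.
    for (int r = 0; r < 3; r++) {
        for (int c = 0; c < 2; c++)
            window[r][c] = window[r][c + 1];
        window[r][2] = line_buf[r][col];
    }
}
```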
2. The FPGA-based field video image real-time segmentation method as claimed in claim 1, wherein in the first step, adding data and pipeline directives to adapt the image segmentation algorithm to real-time image segmentation includes pipeline optimization, the pipeline optimization being performed through the directive #pragma HLS PIPELINE II=1; the pipeline directive uses an initiation interval (II) factor to define the number of clock cycles between successive starts of the loop body.
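As a hedged illustration of the directive named in claim 2, the following HLS-style C++ loop is pipelined with an initiation interval of one clock cycle; the function name and array size are illustrative assumptions.

```cpp
#include <ap_int.h>

// With II=1, a new loop iteration starts every clock cycle once the
// pipeline fills, so one pixel per cycle is accepted and produced.
void scale_row(ap_uint<8> in[640], ap_uint<8> out[640]) {
    for (int i = 0; i < 640; i++) {
#pragma HLS PIPELINE II=1
        out[i] = in[i] >> 1;  // trivial per-pixel operation for illustration
    }
}
```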
3. The FPGA-based field video image real-time segmentation method of claim 1, wherein in the first step, adding data and pipeline directives to adapt the image segmentation algorithm to real-time image segmentation further comprises array-splitting optimization,
wherein part of the array structures in Canny edge detection, median filtering and OTSU threshold segmentation are split, using the line buffer array, into arrays of smaller bit width; after an array is divided into smaller arrays or individual elements, the RTL circuit generated from the array structure contains multiple small memories or registers.
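The array-splitting idea of claim 3 corresponds to what Vivado HLS exposes as the ARRAY_PARTITION directive; the sketch below is an assumed illustration (names and sizes invented), not the patented circuit. Partitioning the line buffer along its first dimension maps each image row to its own memory, so all three rows can be read in the same clock cycle.

```cpp
#include <ap_int.h>

ap_uint<8> read_column(int col) {
    static ap_uint<8> line_buf[3][640];
// Split the single large array into three independent row memories.
#pragma HLS ARRAY_PARTITION variable=line_buf complete dim=1
    // With the partition in place, these three reads proceed in parallel
    // instead of contending for one BRAM port.
    return line_buf[0][col] + line_buf[1][col] + line_buf[2][col];
}
```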
4. The FPGA-based field video image real-time segmentation method of claim 1, wherein in the second step, the video channel construction method comprises:
firstly, after the design of the image sensor acquisition IP core is finished, connecting it with the VTC IP core, the VID_IN IP core, the VDMA IP core, the VID_OUT IP core and the rgb2dvi IP core; after applying timing constraints and pin constraints, synthesizing and implementing the whole project and generating a bitstream file;
then, exporting the whole project to the SDK and, in the SDK, programming read and write operations on the corresponding registers of the PS side;
and finally, downloading the bitstream file to the FPGA board using J-Link, building the hardware environment of the whole video channel, and downloading the software program to the FPGA board.
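A minimal sketch of the PS-side register programming of claim 4, in the style of a Xilinx SDK standalone application, might look as follows; the base address, offset and control word below are placeholders, not the actual values used by the invention.

```cpp
#include "xil_io.h"

#define VDMA_BASEADDR  0x43000000u  // hypothetical AXI VDMA base address
#define VDMA_CR_OFFSET 0x00u        // hypothetical control register offset

int main(void) {
    // Write a control word to the peripheral register ...
    Xil_Out32(VDMA_BASEADDR + VDMA_CR_OFFSET, 0x1);
    // ... and read it back to verify the channel state.
    u32 status = Xil_In32(VDMA_BASEADDR + VDMA_CR_OFFSET);
    return (status & 0x1) ? 0 : -1;
}
```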
5. The FPGA-based field video image real-time segmentation method of claim 1, wherein in the third step, the super-green feature extraction method comprises:
firstly, splitting the field crop image into the three channels R, G and B;
then, computing twice the gray value of the green channel with an adder; the output of the adder and the red gray value of the red channel are connected as inputs to a subtracter for subtraction;
the output of that subtracter and the blue gray value of the blue channel are connected as inputs to another subtracter for a further difference;
finally, comparing the output of the subtracter with a threshold T and obtaining the final classification result through a data selector: when the output of the subtracter is greater than the threshold T the selector outputs 255, and when it is smaller than the threshold T the selector outputs 0; the green parts of the field image are thereby separated out.
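The adder/subtracter/selector datapath of claim 5 computes the super-green value 2G − R − B and binarizes it; a minimal C++ sketch, with the function name and threshold handling as assumptions, could look as follows.

```cpp
#include <ap_int.h>

ap_uint<8> super_green(ap_uint<8> r, ap_uint<8> g, ap_uint<8> b, int T) {
    // Adder: twice the green channel; subtracters: remove red, then blue.
    // A signed intermediate keeps 2G - R - B from wrapping around.
    int exg = 2 * (int)g - (int)r - (int)b;
    // Data selector: binarize against the threshold T.
    return (exg > T) ? ap_uint<8>(255) : ap_uint<8>(0);
}
```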
6. The FPGA-based field video image real-time segmentation method as set forth in claim 1, wherein in the third step, the implementation method of the FPGA for Canny edge detection comprises:
(1) image smoothing: smoothing the image by Gaussian convolution; the image is first line-buffered, and once a frame of image data has been buffered, the line-buffered data are written step by step into the window buffer, and all pixels in the 5×5 window buffer are convolved with a 5×5 moving-window Gaussian operator until all data in the window buffer have been processed; finally, the pixels of the convolved image are shifted right by eight bits, i.e. divided by 256, and the smoothed image data are output row by row;
(2) computing the gradient direction and magnitude: a 3×3 computation-kernel window of the filtered image is created and convolved with the Sobel horizontal and vertical 3×3 window operators respectively, giving the horizontal and vertical gradient partial derivatives G_x and G_y; the gradient magnitude G is obtained from the sum of the absolute values of G_x and G_y; the gradient direction is:
θ = arctan(G_y / G_x);
(3) non-maxima suppression: the pixels around the center point of the computation kernel are evaluated and compared against a high threshold Th and a low threshold Tl; if the kernel center point is greater than Th, it is a strong edge point of the image; if the kernel center point lies between Tl and Th, it is a weak edge point.
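Step (2) of claim 6 approximates the gradient magnitude by |G_x| + |G_y| to avoid a hardware square root; the following plain C++ sketch (window layout and function name assumed) shows the computation on one 3×3 window.

```cpp
#include <cstdlib>

int sobel_magnitude(const int w[3][3]) {
    // Horizontal and vertical Sobel 3x3 operators.
    const int kx[3][3] = {{-1, 0, 1}, {-2, 0, 2}, {-1, 0, 1}};
    const int ky[3][3] = {{-1, -2, -1}, {0, 0, 0}, {1, 2, 1}};
    int gx = 0, gy = 0;
    for (int r = 0; r < 3; r++)
        for (int c = 0; c < 3; c++) {
            gx += kx[r][c] * w[r][c];
            gy += ky[r][c] * w[r][c];
        }
    // |Gx| + |Gy| replaces the costly sqrt(Gx^2 + Gy^2) in hardware.
    return std::abs(gx) + std::abs(gy);
}
```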
7. The FPGA-based field video image real-time segmentation method of claim 1, wherein in the third step, the FPGA implementation method of median filtering comprises:
a median filter with a 5×5 window is adopted: the image is first line-buffered, and once a frame of image data has been buffered, the line-buffered data are written step by step into the window buffer;
the FPGA implementation method for OTSU threshold segmentation comprises the following steps:
1) grading the field image pixel gray levels: the field image pixels are compared against the 256 single-channel gray levels; an image pixel is compared with 0, and the compared value is stored in the Reg0 register through a data selector; meanwhile, the output of comparator 0 is inverted, ANDed with the output of comparator 1, and fed into the data selector, whose output is stored in the Reg1 register; and so on, until all pixel values are stored in the corresponding registers;
2) calculating the ratios of target and background pixels: the foreground and background pixel ratios of the image are calculated respectively; the inputs of the divider are the value of the Reg0 register from the previous step and sum, the total number of pixels, and the output of the divider is the background pixel ratio ω₁; the divider output and 1 are fed to a subtracter, which outputs the target pixel ratio ω₂;
3) calculating the average gray values of the target and the background: a picture of M×N resolution is input and divided by Reg0, the graded image pixel gray values having been stored in the registers Reg0 to Reg255; the output of the divider is multiplied by the gray level 0, the output of the multiplier is stored in the Reg register, which then holds the background average gray value μ₁; the target average gray value μ₂ is calculated by the inverse of the process used for the background average gray value μ₁;
4) calculating the variance between classes:
the calculation formula is as follows:
g = ω₁ × (μ − μ₁)² + ω₂ × (μ − μ₂)²
the simplified formula is:
g = ω₁ × ω₂ × (μ₁ − μ₂)²
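The simplified between-class variance g = ω₁ × ω₂ × (μ₁ − μ₂)² can be exercised with a short software sketch; the histogram-driven threshold search below is an illustrative assumption rather than the register-level datapath of the claim.

```cpp
#include <cstdint>

// Pick the gray level maximizing the between-class variance g.
int otsu_threshold(const uint32_t hist[256], uint32_t total) {
    double best_g = 0.0;
    int best_t = 0;
    for (int t = 0; t < 256; t++) {
        // Background / foreground pixel counts and gray sums at threshold t.
        uint64_t n1 = 0, s1 = 0, n2 = 0, s2 = 0;
        for (int i = 0; i <= t; i++)     { n1 += hist[i]; s1 += (uint64_t)i * hist[i]; }
        for (int i = t + 1; i < 256; i++) { n2 += hist[i]; s2 += (uint64_t)i * hist[i]; }
        if (n1 == 0 || n2 == 0) continue;
        double w1 = (double)n1 / total, w2 = (double)n2 / total;  // ratios ω₁, ω₂
        double u1 = (double)s1 / n1,   u2 = (double)s2 / n2;      // means μ₁, μ₂
        double g = w1 * w2 * (u1 - u2) * (u1 - u2);  // simplified formula of claim 7
        if (g > best_g) { best_g = g; best_t = t; }
    }
    return best_t;
}
```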
8. An FPGA-based field video image real-time segmentation system implementing the FPGA-based field video image real-time segmentation method of claim 1, wherein the image data of the system are transmitted within the FPGA logic over an AXI bus;
the image data acquired by the camera are converted into an AXI Stream by the Video In to AXI-Stream IP core;
the field video image real-time segmentation method is executed in the HLS image-processing core; the AXI VDMA exchanges data with the PS side through the AXI interconnect and stores image data into, or reads them out of, the DDR;
the AXI-Stream to Video Out core converts the AXI Stream back into image data in RGB format, and the image is displayed through the HDMI controller.
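How the HLS core of claim 8 might expose AXI-Stream ports can be sketched as follows; the 24-bit RGB beat format, the function name and the pass-through body are assumptions, standing in for the actual segmentation pipeline.

```cpp
#include <hls_stream.h>
#include <ap_axi_sdata.h>

typedef ap_axiu<24, 1, 1, 1> axi_pixel;  // 24-bit RGB beat with video sideband

void seg_top(hls::stream<axi_pixel> &s_in, hls::stream<axi_pixel> &s_out) {
#pragma HLS INTERFACE axis port=s_in
#pragma HLS INTERFACE axis port=s_out
#pragma HLS INTERFACE ap_ctrl_none port=return
    axi_pixel p = s_in.read();
    // ... segmentation stages (super-green, Canny, median, OTSU) go here ...
    s_out.write(p);
}
```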
9. A field work robot implementing the FPGA-based field video image real-time segmentation method of claim 1.
CN201910511004.9A 2019-06-13 2019-06-13 FPGA-based field video image real-time segmentation system and method Active CN110717852B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910511004.9A CN110717852B (en) 2019-06-13 2019-06-13 FPGA-based field video image real-time segmentation system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910511004.9A CN110717852B (en) 2019-06-13 2019-06-13 FPGA-based field video image real-time segmentation system and method

Publications (2)

Publication Number Publication Date
CN110717852A CN110717852A (en) 2020-01-21
CN110717852B true CN110717852B (en) 2022-09-16

Family

ID=69208798

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910511004.9A Active CN110717852B (en) 2019-06-13 2019-06-13 FPGA-based field video image real-time segmentation system and method

Country Status (1)

Country Link
CN (1) CN110717852B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111340835A (en) * 2020-03-27 2020-06-26 天津光电通信技术有限公司 FPGA-based video image edge detection system
CN111860781A (en) * 2020-07-10 2020-10-30 逢亿科技(上海)有限公司 Convolutional neural network feature decoding system realized based on FPGA
CN112801851B (en) * 2021-01-28 2021-12-17 太原科技大学 Hardware system for cutting field plant leaves and cutting method thereof
CN112906550B (en) * 2021-02-09 2022-07-19 哈尔滨理工大学 Static gesture recognition method based on watershed transformation
CN113269726A (en) * 2021-04-29 2021-08-17 中国电子科技集团公司信息科学研究院 Hyperspectral image target detection method and device
CN113139519B (en) * 2021-05-14 2023-12-22 陕西科技大学 Target detection system based on fully programmable system-on-chip
CN113284116A (en) * 2021-05-28 2021-08-20 荆门汇易佳信息科技有限公司 Real-time image analysis type red wine semi-finished product motion detection system
CN113450377A (en) * 2021-06-15 2021-09-28 石河子大学 ZYNQ-based cotton plant edge detection method
CN114449131A (en) * 2021-11-29 2022-05-06 王乐群 Moving target detection system based on ZYNQ acceleration
CN114359260B (en) * 2022-01-18 2023-11-03 成都理工大学 Method and device for detecting defects on surface of tobacco rod
CN114445445B (en) * 2022-04-08 2022-07-01 广东欧谱曼迪科技有限公司 Artery segmentation method and device for CT image, electronic device and storage medium
CN115460350B (en) * 2022-09-02 2024-01-12 白犀牛智达(北京)科技有限公司 Image processing method and system based on FPGA
CN116563087A (en) * 2023-05-12 2023-08-08 深圳聚源视芯科技有限公司 Gradient calculation and caching device and resource-saving cost calculation method
CN116934905B (en) * 2023-09-18 2023-11-17 晨达(广州)网络科技有限公司 Real-time processing method for network image

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8744164B2 (en) * 2010-04-06 2014-06-03 Institute For Systems Biology Automated analysis of images using bright field microscopy
CN109146948B (en) * 2018-07-27 2021-07-30 内蒙古大学 Crop growth phenotype parameter quantification and yield correlation analysis method based on vision
CN109614884A (en) * 2018-11-21 2019-04-12 江苏科技大学 A kind of vehicle environmental cognitive method based on FPGA

Also Published As

Publication number Publication date
CN110717852A (en) 2020-01-21

Similar Documents

Publication Publication Date Title
CN110717852B (en) FPGA-based field video image real-time segmentation system and method
CN110196053B (en) FPGA-based real-time field robot vision navigation method and system
He et al. Delving into salient object subitizing and detection
Ma et al. Optimised single pass connected components analysis
US10964000B2 (en) Techniques for reducing noise in video
CN109753878B (en) Imaging identification method and system under severe weather
CN110246201B (en) Pencil drawing generation method based on thread-level parallelism
Abdelgawad et al. High level synthesis of canny edge detection algorithm on Zynq platform
DE102018128592A1 (en) Generating an image using a map representing different classes of pixels
Tan et al. Resource minimization in a real-time depth-map processing system on FPGA
CN108765377A (en) A kind of white blood cell count(WBC) system and method based on SOC
Ye Real-time image edge detection system design and algorithms for artificial intelligence fpgas
Vitabile et al. Efficient rapid prototyping of image and video processing algorithms
Bilal et al. Rapid prototyping of image contrast enhancement hardware accelerator on FPGAs using high-level synthesis tools
Clukey Architecture for Real-Time, Low-SWaP Embedded Vision Using FPGAs
Nickel et al. High-performance AKAZE implementation including parametrizable and generic HLS modules
Khalil et al. A hardware design and implementation for accelerating motion detection using (System On Chip) SOC
CN113450377A (en) ZYNQ-based cotton plant edge detection method
Khalil et al. An Enhanced System on Chip-Based Sobel Edge Detector
Zhang et al. An ultra-high-speed hardware accelerator for image reconstruction and stereo rectification on event-based camera
Östgren FPGA acceleration of superpixel segmentation
CN113222831B (en) Feature memory forgetting unit, network and system for removing image stripe noise
Chen et al. Low-light image enhancement and acceleration processing based on ZYNQ
WO2024077833A1 (en) Image processing method and system based on high-level synthesis tool
Contreras Design of an FPGA-based smart camera and its application towards object tracking: a thesis presented in partial fulfilment of the requirements for the degree of Master of Engineering in Electronics and Computer Engineering at Massey University, Manawatu, New Zealand

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant