CN109558817B - Airport runway detection method based on FPGA acceleration - Google Patents


Info

Publication number
CN109558817B
Authority
CN
China
Prior art keywords
image
data
threshold
sub
kernel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811369168.4A
Other languages
Chinese (zh)
Other versions
CN109558817A (en)
Inventor
侯彪
焦李成
金晓飞
马晶晶
马文萍
白静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University
Priority to CN201811369168.4A
Publication of CN109558817A
Application granted
Publication of CN109558817B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G06V20/13 Satellite images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds

Abstract

The invention discloses an airport runway detection method based on FPGA acceleration. First, an airport runway detection algorithm based on local threshold segmentation is written as a Kernel; the Kernel is then compiled into an AOCX executable file by the AOC compiler. The host converts the original high-resolution SAR image data into a one-dimensional array and sends it over the PCIe interface to the memory of the FPGA board by calling the clEnqueueWriteBuffer function provided by OpenCL, and the AOCX executable file runs on the FPGA board to produce the final processing result. Finally, the host reads the result of the airport runway detection algorithm based on local threshold segmentation back by reading the Buffer and displays it. The invention delivers the result of the algorithm quickly and has great advantages in processing massive data.

Description

Airport runway detection method based on FPGA acceleration
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to an airport runway detection method based on FPGA acceleration.
Background
Target detection in SAR images is a research topic of wide interest at home and abroad, and detecting and identifying airport runways as a specific target is significant for both military and civil use. Because a high-resolution SAR image contains a large amount of data, conventional airport runway detection methods run too slowly on large images to meet real-time processing requirements.
To cope with the uneven illumination of SAR images, local threshold segmentation is adopted; it offers high segmentation accuracy and good robustness. The method works well on high-resolution SAR images, but the high resolution and the complexity of the algorithm make its running time long, so it is unsuitable for systems with strict real-time requirements.
Disclosure of Invention
The technical problem to be solved by the invention is to provide, in view of the above defects in the prior art, an airport runway detection method based on FPGA acceleration. The method addresses the slow detection speed of existing detection algorithms on large SAR images, which is caused by the large data volume and the complex algorithm structure, and thereby achieves real-time processing.
The invention adopts the following technical scheme:
An airport runway detection method based on FPGA acceleration: first, an airport runway detection algorithm based on local threshold segmentation is written as a Kernel; the Kernel is then compiled into an AOCX executable file by the AOC compiler; the host converts the original high-resolution SAR image data into a one-dimensional array and sends it over the PCIe interface to the memory of the FPGA board by calling the clEnqueueWriteBuffer function provided by OpenCL, and the AOCX executable file runs on the FPGA board to produce the final processing result; finally, the host reads the result of the airport runway detection algorithm based on local threshold segmentation back by reading the Buffer and displays the processing result.
Specifically, the airport runway detection algorithm based on local threshold segmentation comprises the following steps:
S201, preprocessing the high-resolution SAR image by median filtering, which enhances the correlation of pixels within the target and background regions;
S202, performing local threshold segmentation on the filtered image to obtain the airport runway area;
S203, performing a morphological closing operation on the resulting binary image to remove edge burrs;
S204, applying a ratio edge detection algorithm to the detected runway to determine its orientation.
Further, in step S201, the median filter is written as a Kernel function as follows:
S2011, setting the work-group size of the median filtering Kernel function and dividing the data to be processed into a plurality of work-groups, which process the image data in parallel;
S2012, loading data from the global memory of the FPGA board into local memory as float2 vectors to obtain the data of each work-group, and synchronizing the loaded data with a barrier function;
S2013, taking the median of the data of each work-group obtained in step S2012 to obtain the filtered value of each pixel in each work-group, and finally obtaining the filtered image.
Further, in step S202, the local threshold segmentation step is as follows:
S2021, estimating the width ω of the runway from the resolution of the image as follows:
$\omega = \omega_0 / \lambda$
where $\omega_0$ is the true width of the runway and $\lambda$ is the resolution of the SAR image;
S2022, partitioning the image from left to right into n sub-image blocks $\{A_1, A_2, \ldots, A_n\}$, each with the same number of rows as the original image and a number of columns equal to twice the runway width; then dividing each sub-image $A_i$ into two halves $B_{2i-1}$ and $B_{2i}$ to obtain 2n half blocks $\{B_1, B_2, B_3, B_4, \ldots, B_{2n-1}, B_{2n}\}$; $B_{2i-1}$ and $B_{2i}$ form the i-th sub-image block $A^1_i$, and $B_{2i}$ and $B_{2i+1}$ form the i-th sub-image block $A^2_i$;
S2023, performing histogram statistics on each sub-image to obtain a statistical image histogram;
S2024, applying the maximum between-class variance method (OTSU) to the histogram of each sub-image to obtain the optimal segmentation threshold;
S2025, optimizing the algorithm with a loop unrolling directive, and taking the element-wise minimum of the two threshold arrays obtained in step S2024 as the final segmentation threshold T:
$T = \min(T_1(A^1), T_2(A^2))$
where $T_1(A^1)$ is the threshold array of the first sub-image sequence and $T_2(A^2)$ is the threshold array of the second sub-image sequence;
S2026, vectorizing the light compensation function with the num_simd attribute so that each work-item handles multiple data elements, and correcting the threshold array obtained in step S2025 to obtain the corrected threshold $T^*$:
$T^* = f(T, d)$
where T is the threshold before correction and d is the light compensation constant;
S2027, segmenting the image with the corrected threshold obtained in step S2026 as follows:
Figure BDA0001869370130000031
where f(x, y, i) is the segmented image, g(x, y, i) is the median-filtered image, i is the index of the sub-image block containing (x, y) after partitioning, and $T_i^*$ is the threshold of the i-th sub-image block.
Further, in step S2023, when the Kernel function for histogram statistics is written, the sub-image data are copied from the global memory of the FPGA board to the corresponding work-group, the Kernel is optimized within the group by full loop unrolling, and intra-group synchronization is implemented with a barrier function.
Further, in step S2024, when the Kernel function for OTSU is written, the float4 data type is used to coalesce memory accesses, which reduces the number of memory accesses, and the Kernel is optimized by loop unrolling.
Further, in step S2026, the threshold array is arranged from left to right and the position of its smallest element is found. Starting from that position, the thresholds to the left are compared in turn, with d subtracted from each threshold on the left; the thresholds to the right are then compared in turn: if the threshold on the right is greater than the threshold on the left and the difference is smaller than 2d, its value is kept unchanged, otherwise d is added;
let Ind be the index of the smallest element in the threshold array and i denote the i-th sub-image; for sub-image blocks with i < Ind:
Figure BDA0001869370130000041
for sub-image blocks with i > Ind:
Figure BDA0001869370130000042
where $T_i^*$ is the corrected threshold of the i-th sub-image block and
Figure BDA0001869370130000043
is the threshold before correction.
Further, in step S203, the closing operation is performed as one dilation followed by one erosion, specifically:
S2031, dilating the segmented binary image once; when the Kernel of the dilation algorithm is written, the number of work-groups is allocated first, the data to be processed are then copied from the global memory of the FPGA board to local memory, each work-group processes the data in one dilation window, and intra-group synchronization is performed with a barrier function;
S2032, eroding the dilated binary image once; when the Kernel of the erosion algorithm is written, the number of work-groups is allocated first, the data to be processed are then copied from the global memory of the FPGA board to local memory, each work-group processes the data in one erosion window, and intra-group synchronization is performed with a barrier function.
Further, in step S204, the specific steps of the ratio edge detection algorithm are as follows:
S20411, performing horizontal and vertical edge detection with a radius of 7 on the image obtained in step S203 using a sliding window;
S20412, summing the edge ratios in the horizontal direction and in the vertical direction separately;
S20413, taking the direction with the larger sum as the orientation of the runway.
Further, the specific process of writing the Kernel function is as follows:
S20421, padding the periphery of the image with zeros so that global_work_size is evenly divisible by the work-group size;
S20422, using OpenCL built-in functions: native_sqrt to compute square roots and native_divide to perform division;
S20423, using full loop unrolling, which tells the compiler the required number of loop iterations.
Compared with the prior art, the invention has at least the following beneficial effects:
The invention provides an airport runway detection method based on FPGA acceleration. A Kernel program is obtained by describing the detection algorithm based on local threshold segmentation in a high-level language, and the Kernel program is compiled into an executable file. Running the generated executable file on the FPGA implements the airport runway detection algorithm and yields the processing result; this greatly shortens the time the algorithm needs to run and markedly improves its execution speed.
Furthermore, when the target algorithm is described in the high-level language, settings are made for the FPGA board; for example, work-groups and work-items are chosen sensibly to partition the target data, which optimizes memory access and improves the execution efficiency of the Kernel program. This data partitioning greatly speeds up the median filtering and sub-image partitioning steps of the algorithm.
Furthermore, when the target algorithm is described in the high-level language, settings are made for the FPGA board; for example, vectorization and memory access coalescing are used, which raises the throughput of the FPGA board and greatly improves the running performance of the algorithm.
Furthermore, when the target algorithm is described in the high-level language, settings are made for the FPGA board; for example, Buffer types are used appropriately so that unnecessary data transfers are avoided.
Furthermore, when the target algorithm is described in the high-level language, settings are made for the FPGA board; for example, host-side memory is allocated with the aligned_malloc function, and because the memory is then aligned, a large amount of time is saved when it is read, which speeds up the algorithm.
In conclusion, the invention parallelizes the detection algorithm and optimizes it appropriately for the FPGA, so that the algorithm runs markedly faster. For high-resolution SAR images, the method delivers the result of the algorithm quickly and has great advantages in processing massive data.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a flow chart of the operation of the target algorithm on the FPGA board.
Detailed Description
The invention provides an airport runway detection method based on FPGA acceleration. First, the airport runway detection algorithm is written as a Kernel program; the Kernel is then compiled into an AOCX executable file by the AOC compiler. The host sends the high-resolution SAR image to the FPGA board through the PCIe interface, and the AOCX executable file runs on the FPGA board to produce the final processing result. Finally, the host reads the result of the algorithm back by reading the Buffer and displays it. The invention accelerates the conventional algorithm with an FPGA board, and the execution speed of the algorithm improves markedly after acceleration.
Referring to fig. 1, the method for detecting an airport runway based on FPGA acceleration of the present invention includes the following steps:
S1, writing the airport runway detection algorithm based on local threshold segmentation as Kernel functions in a .cl file using the OpenCL language, and compiling the .cl file into an AOCX file with the AOC compiler;
the Kernel program is compiled with the Altera SDK for OpenCL to generate the corresponding AOCX executable file, which can be executed by the FPGA.
S2, sending the high-resolution SAR image data to be processed to the FPGA board through the PCIe interface; under the control of the host program executed by the CPU, the AOCX executable file generated in step S1 runs on the FPGA board, and the processing result of the target data is finally obtained;
the high-resolution SAR image data to be processed are sent to the FPGA board through the PCIe interface as follows: the host sends the high-resolution SAR image data to the memory of the FPGA board using the Buffer write mechanism provided by OpenCL.
The host converts the original high-resolution SAR image data into a one-dimensional array and sends it to the memory of the FPGA board by calling the clEnqueueWriteBuffer function provided by OpenCL.
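The conversion of the image into a one-dimensional array can be illustrated with a minimal Python sketch. It assumes a row-major layout, which the text does not specify; the names `flatten_image` and `pixel_at` are hypothetical and are not part of the host program described here.

```python
def flatten_image(image):
    """Flatten a 2-D list of pixels row by row (row-major order is an assumption)."""
    height = len(image)
    width = len(image[0]) if height else 0
    flat = []
    for row in image:
        if len(row) != width:
            raise ValueError("all rows must have the same length")
        flat.extend(row)
    return flat, height, width


def pixel_at(flat, width, row, col):
    """Recover pixel (row, col) from the flattened array."""
    return flat[row * width + col]


image = [
    [10, 20, 30],
    [40, 50, 60],
]
flat, height, width = flatten_image(image)
print(flat)                         # [10, 20, 30, 40, 50, 60]
print(pixel_at(flat, width, 1, 2))  # 60
```

With this layout, a Kernel that knows the image width can address any pixel of the flat buffer by the same index arithmetic.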
Referring to FIG. 2, the airport runway detection algorithm based on local threshold segmentation runs on the FPGA board in the following steps:
S201, preprocessing the SAR image, mainly by median filtering, which enhances the correlation of pixels within the target and background regions; the median filter is written as a Kernel function as follows:
S2011, setting the work-group size of the median filtering Kernel function and dividing the data to be processed into a plurality of work-groups that process the image data in parallel. In this embodiment, the data to be processed are a 1000 × 1128 SAR image and the work-group size is set to 8 × 8, so the image is divided into 125 × 141 work-groups;
S2012, loading data from the global memory of the FPGA board into local memory as float2 vectors to obtain the data of each work-group, and synchronizing the loaded data with a barrier function. Using a vector data type coalesces memory accesses, turning two accesses into one, which greatly improves memory access efficiency and avoids access conflicts;
S2013, setting the filtering window to 3 × 3, sorting the data in the window and taking the median, with a loop unrolling directive specifying an unroll factor of 3; loop unrolling lets the compiler increase the amount of work the kernel performs in each clock cycle.
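The 3 × 3 median filter and the work-group count of this embodiment can be sketched sequentially in Python. This is an illustration only, not the OpenCL Kernel: the border handling (clamping to the nearest pixel) is an assumption, and the names `median_filter_3x3` and `work_group_grid` are hypothetical.

```python
def median_filter_3x3(image):
    """Return a 3 x 3 median-filtered copy of a 2-D list of gray values."""
    height, width = len(image), len(image[0])
    filtered = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = []
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    # Assumption: border pixels reuse their nearest neighbour.
                    yy = min(max(y + dy, 0), height - 1)
                    xx = min(max(x + dx, 0), width - 1)
                    window.append(image[yy][xx])
            window.sort()
            filtered[y][x] = window[4]  # middle of the 9 sorted values
    return filtered


def work_group_grid(height, width, group=8):
    """Number of work-groups per dimension when the size divides evenly."""
    return height // group, width // group


noisy = [
    [10, 10, 10],
    [10, 200, 10],
    [10, 10, 10],
]
print(median_filter_3x3(noisy))     # [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
print(work_group_grid(1000, 1128))  # (125, 141)
```

The isolated bright pixel is removed, and the grid size matches the 125 × 141 work-groups stated for the 1000 × 1128 image with 8 × 8 work-groups.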
S202, performing local threshold segmentation on the filtered image to obtain the airport runway area, specifically:
S2021, estimating the width of the runway from the resolution of the image:
$\omega = \omega_0 / \lambda$
where $\omega_0$ is the true width of the runway and $\lambda$ is the resolution of the SAR image;
S2022, dividing the image into sub-images:
the image is partitioned from left to right into n sub-image blocks $\{A_1, A_2, \ldots, A_n\}$, each with the same number of rows as the original image and a number of columns equal to twice the runway width (the last sub-image may differ); each sub-image $A_i$ is then divided into two halves $B_{2i-1}$ and $B_{2i}$, giving 2n half blocks $\{B_1, B_2, B_3, B_4, \ldots, B_{2n-1}, B_{2n}\}$; $B_{2i-1}$ and $B_{2i}$ form the i-th sub-image block $A^1_i$ (this sequence is identical to the sequence of sub-image blocks obtained in the first partitioning step), and $B_{2i}$ and $B_{2i+1}$ form the i-th sub-image block $A^2_i$;
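The two overlapping sub-image sequences can be sketched as column ranges in Python. The sketch assumes that the runway width in pixels is the true width divided by the resolution, and that the last block of each sequence is simply clipped at the image edge; the function name and the numbers in the example (a 45 m runway, 3 m per pixel, a 60-column image) are hypothetical.

```python
def column_ranges(image_width, runway_width_m, resolution_m_per_px):
    """Return (w, first, second): runway width in pixels and two block sequences.

    Each block is a half-open column range (start, stop). The first sequence
    starts at column 0; the second is shifted right by one half block (w columns).
    """
    w = max(1, round(runway_width_m / resolution_m_per_px))
    block = 2 * w  # each sub-image spans twice the runway width
    first = [(s, min(s + block, image_width)) for s in range(0, image_width, block)]
    second = [(s, min(s + block, image_width)) for s in range(w, image_width, block)]
    return w, first, second


w, first, second = column_ranges(60, 45.0, 3.0)
print(w)       # 15
print(first)   # [(0, 30), (30, 60)]
print(second)  # [(15, 45), (45, 60)]
```

Because the second sequence is offset by half a block, a runway that straddles a block boundary in one sequence lies inside a single block of the other.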
S2023, performing histogram statistics on each sub-image to obtain a statistical image histogram;
when the Kernel function for histogram statistics is written, the sub-image data are copied from the global memory of the FPGA board to the corresponding work-group, the Kernel is optimized within the group by full loop unrolling, and intra-group synchronization is implemented with a barrier function.
S2024, applying the maximum between-class variance method (OTSU) to the histogram of each sub-image to obtain the optimal segmentation threshold;
when the Kernel function for OTSU is written, the float4 data type is used to coalesce memory accesses, which reduces the number of memory accesses, and the Kernel is optimized by loop unrolling;
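Histogram statistics followed by OTSU thresholding can be sketched sequentially in Python. This is the textbook form of the maximum between-class variance method, assuming 256 integer gray levels; it is not the vectorized OpenCL Kernel, and the function names are hypothetical.

```python
def histogram(pixels, levels=256):
    """Count how many pixels take each integer gray level."""
    counts = [0] * levels
    for value in pixels:
        counts[value] += 1
    return counts


def otsu_threshold(hist):
    """Return the gray level that maximizes the between-class variance."""
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    weight_bg = 0   # number of pixels at or below the candidate level
    sum_bg = 0      # sum of their gray values
    best_level = 0
    best_variance = -1.0
    for level, count in enumerate(hist):
        weight_bg += count
        if weight_bg == 0:
            continue
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break
        sum_bg += level * count
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        variance = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if variance > best_variance:
            best_variance = variance
            best_level = level
    return best_level


pixels = [50] * 100 + [200] * 100   # dark runway-like pixels and bright surroundings
print(otsu_threshold(histogram(pixels)))  # 50
```

In this convention, pixels at or below the returned level form the dark class, so the two groups of the example are separated.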
S2025, in terms of gray-level characteristics, the gray value of the runway area is very low and differs greatly from that of the surrounding scene; the algorithm is optimized with a loop unrolling directive, and the element-wise minimum of the two threshold arrays obtained in S2024 is taken as the final segmentation threshold:
$T = \min(T_1(A^1), T_2(A^2))$
where $T_1(A^1)$ is the threshold array of the first sub-image sequence and $T_2(A^2)$ is the threshold array of the second sub-image sequence;
S2026, optimizing the light compensation algorithm by vectorization and correcting the threshold;
the light compensation function is vectorized with the num_simd attribute so that each work-item handles multiple data elements, which increases the amount of computation per work-item and achieves single instruction, multiple data (SIMD) execution. Before vectorization, the reqd_work_group_size attribute is used to set the work-group size, which must be evenly divisible by the vectorization factor; in this embodiment the vectorization factor is set to 2 and the work-group size to 6;
the threshold array obtained in step S2025 is corrected; with T denoting the threshold before correction, $T^*$ the corrected threshold, and d the light compensation constant:
$T^* = f(T, d)$
The threshold array is arranged from left to right and the position of its smallest element is found. Starting from that position, the thresholds to the left are compared in turn, with d subtracted from each threshold on the left; the thresholds to the right are then compared in turn: if the threshold on the right is greater than the threshold on the left and the difference is smaller than 2d, its value is kept unchanged, otherwise d is added.
Let Ind be the index of the smallest element in the threshold array and i denote the i-th sub-image;
for sub-image blocks with i < Ind,
Figure BDA0001869370130000101
for sub-image blocks with i > Ind,
Figure BDA0001869370130000102
where $T_i^*$ is the corrected threshold of the i-th sub-image block,
Figure BDA0001869370130000103
is the threshold before correction, and d is the light compensation constant;
S2027, segmenting the image with the threshold: the image is segmented using the threshold obtained in S2026 according to the following formula:
Figure BDA0001869370130000104
where f(x, y, i) is the segmented image, g(x, y, i) is the median-filtered image, i is the index of the sub-image block containing (x, y) after partitioning, and $T_i^*$ is the threshold of the i-th sub-image block.
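Combining the two threshold arrays and applying a per-block threshold can be sketched in Python. The segmentation formula itself is only available as a figure, so the sketch rests on stated assumptions: a pixel is marked 1 when its gray value is below the threshold of its block (the runway being dark), blocks are equal-width column blocks, and the uncorrected minimum thresholds stand in for the corrected ones. All names are hypothetical.

```python
def combine_thresholds(first, second):
    """Element-wise minimum of the two per-block threshold arrays."""
    return [min(a, b) for a, b in zip(first, second)]


def segment(image, block_width, thresholds):
    """Binarize each pixel against the threshold of its column block.

    Assumption: 1 marks a runway candidate (gray value below the block
    threshold) and 0 marks background.
    """
    result = []
    for row in image:
        out_row = []
        for x, value in enumerate(row):
            i = min(x // block_width, len(thresholds) - 1)
            out_row.append(1 if value < thresholds[i] else 0)
        result.append(out_row)
    return result


t = combine_thresholds([50, 120], [60, 100])
print(t)                                    # [50, 100]
print(segment([[10, 90, 40, 200]], 2, t))   # [[1, 0, 1, 0]]
```

The value 90 is rejected in the first block (threshold 50), while 40 is accepted in the second block (threshold 100), which shows how a local threshold adapts to each column block.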
S203, performing a morphological closing operation on the resulting binary image to remove edge burrs; the closing Kernel function is written as follows:
bit operations replace division operations, and the loop unrolling factor is set to 4;
the closing operation is implemented as one dilation followed by one erosion, programmed as follows:
S2031, dilating the segmented binary image once; when the Kernel of the dilation algorithm is written, the number of work-groups is allocated first, the data to be processed are then copied from the global memory of the FPGA board to local memory, each work-group processes the data in one dilation window, and intra-group synchronization is performed with a barrier function;
S2032, eroding the dilated binary image once; when the Kernel of the erosion algorithm is written, the number of work-groups is allocated first, the data to be processed are then copied from the global memory of the FPGA board to local memory, each work-group processes the data in one erosion window, and intra-group synchronization is performed with a barrier function.
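The closing operation (one dilation followed by one erosion) can be sketched sequentially in Python. The 3 × 3 square window and the clipping of windows at the image border are assumptions, since the text does not give the structuring element; the function names are hypothetical.

```python
def _morph(image, pick):
    """Apply pick (max or min) over each pixel's 3 x 3 neighbourhood."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [
                image[yy][xx]
                for yy in range(max(y - 1, 0), min(y + 2, height))
                for xx in range(max(x - 1, 0), min(x + 2, width))
            ]
            out[y][x] = pick(window)
    return out


def dilate(image):
    return _morph(image, max)   # a pixel becomes 1 if any neighbour is 1


def erode(image):
    return _morph(image, min)   # a pixel stays 1 only if all neighbours are 1


def close_binary(image):
    return erode(dilate(image))


# A 3 x 3 block of ones with a one-pixel hole, centred in a 7 x 7 image.
ring = [[1 if 2 <= y <= 4 and 2 <= x <= 4 and (y, x) != (3, 3) else 0
         for x in range(7)] for y in range(7)]
closed = close_binary(ring)
print(closed[3][3])                    # 1
print(sum(sum(row) for row in closed)) # 9
```

The hole is filled while the outline of the block is preserved, which is the burr-removal and gap-filling effect the closing step is used for.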
S204, applying a ratio edge detection algorithm to the detected runway to determine its orientation; the ratio edge detection algorithm proceeds as follows:
S20411, performing horizontal and vertical edge detection with a radius of 7 on the image obtained in step S203 using a sliding window;
S20412, summing the edge ratios in the horizontal direction and in the vertical direction separately;
S20413, taking the direction with the larger sum as the orientation of the runway.
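One possible reading of the orientation test can be sketched in Python. The text does not give the exact form of the ratio detector, so this sketch makes several assumptions: the edge strength is one minus the ratio of the means on the two sides of a pixel, each side is a line of `radius` pixels rather than a full window, a constant `eps` avoids division by zero on binary images, and "horizontal" means that edges running left to right dominate. All names are hypothetical.

```python
def ratio_strength(mean_a, mean_b, eps=1.0):
    """Ratio-of-averages edge strength in [0, 1); 0 when both means are equal."""
    a, b = mean_a + eps, mean_b + eps
    return 1.0 - min(a / b, b / a)


def runway_direction(image, radius=7):
    """Sum edge strengths in both directions and report the dominant one."""
    height, width = len(image), len(image[0])
    steps = range(1, radius + 1)
    horizontal_sum = 0.0   # above-versus-below contrast: edges running left to right
    vertical_sum = 0.0     # left-versus-right contrast: edges running top to bottom
    for y in range(radius, height - radius):
        for x in range(radius, width - radius):
            above = sum(image[y - k][x] for k in steps) / radius
            below = sum(image[y + k][x] for k in steps) / radius
            left = sum(image[y][x - k] for k in steps) / radius
            right = sum(image[y][x + k] for k in steps) / radius
            horizontal_sum += ratio_strength(above, below)
            vertical_sum += ratio_strength(left, right)
    return "horizontal" if horizontal_sum >= vertical_sum else "vertical"


strip = [[1 if y in (9, 10) else 0 for _ in range(40)] for y in range(20)]
transposed = [list(column) for column in zip(*strip)]
print(runway_direction(strip))       # horizontal
print(runway_direction(transposed))  # vertical
```

A strip that spans the image from left to right yields contrast only between the pixels above and below it, so the horizontal sum dominates; transposing the image reverses the outcome.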
The specific process of writing the Kernel function is as follows:
S20421, padding the periphery of the image with zeros so that global_work_size is evenly divisible by the work-group size;
S20422, using OpenCL built-in functions: native_sqrt computes square roots and native_divide performs division; these functions can be mapped to one or more native device instructions and are therefore more efficient than the ordinary sqrt function and division.
S20423, using full loop unrolling, which tells the compiler the required number of loop iterations.
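The zero padding of step S20421 can be sketched in Python. The sketch pads only the bottom and right edges, which is an assumption (the text says the periphery is padded without giving the layout); the function names are hypothetical.

```python
def padded_size(size, work_group_size):
    """Round size up to the next multiple of work_group_size."""
    return ((size + work_group_size - 1) // work_group_size) * work_group_size


def zero_pad(image, group_h, group_w):
    """Pad with zeros (bottom and right, by assumption) to work-group multiples."""
    height, width = len(image), len(image[0])
    new_h = padded_size(height, group_h)
    new_w = padded_size(width, group_w)
    padded = [row + [0] * (new_w - width) for row in image]
    padded += [[0] * new_w for _ in range(new_h - height)]
    return padded


print(padded_size(1000, 8))          # 1000
print(padded_size(1001, 8))          # 1008
print(zero_pad([[1, 2, 3]], 2, 2))   # [[1, 2, 3, 0], [0, 0, 0, 0]]
```

A size that is already a multiple of the work-group size, such as 1000 with groups of 8, is left unchanged, so padding only costs extra work when it is actually needed.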
S3, the host reads the result of the algorithm back by reading the Buffer and displays the processing result; the FPGA processing result is sent to the host as follows:
the host transfers the processing result from the memory of the FPGA to the host using the Buffer read mechanism provided by OpenCL.
The clEnqueueReadBuffer function provided by OpenCL is called to transfer the processing result from the memory of the FPGA to the host; the FPGA board used is the de5net_a7.
The execution process of the CPU comprises the following steps:
initializing the OpenCL runtime environment;
creating a Buffer by calling the clCreateBuffer function provided by OpenCL;
sending data to the FPGA board by calling the clEnqueueWriteBuffer function provided by OpenCL;
executing the Kernel function on the FPGA board by calling the clEnqueueNDRangeKernel function provided by OpenCL;
sending the processing result from the memory of the FPGA to the host by calling the clEnqueueReadBuffer function provided by OpenCL.
Table 1 compares the execution time of the algorithm before and after acceleration:
Figure BDA0001869370130000121
Because each accelerated step processes the data on the FPGA board in groups and copies the data from global memory to local memory on the board, work-items that process data in parallel can access local memory, which greatly reduces memory access time and hence the execution time. The closing algorithm and the runway orientation algorithm use low-level bit operations in place of multiplication and division, and the runway orientation algorithm additionally uses built-in functions, so these two algorithms show the largest improvement.
The above-mentioned contents are only for illustrating the technical idea of the present invention, and the protection scope of the present invention is not limited thereby, and any modification made on the basis of the technical idea of the present invention falls within the protection scope of the claims of the present invention.

Claims (7)

1. An airport runway detection method based on FPGA acceleration, characterized in that an airport runway detection algorithm based on local threshold segmentation is written as a Kernel; the Kernel is then compiled into an AOCX executable file by the AOC compiler; the host sends the original high-resolution SAR image data to the FPGA board through the PCIe interface, specifically: the host converts the high-resolution SAR image data into a one-dimensional array and sends it to the memory of the FPGA board using the Buffer write mechanism provided by OpenCL, by calling the clEnqueueWriteBuffer function provided by OpenCL; the AOCX executable file runs on the FPGA board to obtain the final processing result; finally, the host reads the result of the airport runway detection algorithm based on local threshold segmentation back by reading the Buffer and displays the processing result;
the airport runway detection algorithm based on local threshold segmentation specifically comprises the following steps:
S201, preprocessing the high-resolution SAR image by median filtering, which enhances the correlation of pixels within the target and background regions, the median filter being written as a Kernel function as follows:
S2011, setting the work-group size of the median filtering Kernel function and dividing the data to be processed into a plurality of work-groups, which process the image data in parallel;
S2012, loading data from the global memory of the FPGA board into local memory as float2 vectors to obtain the data of each work-group, and synchronizing the loaded data with a barrier function;
S2013, taking the median of the data of each work-group obtained in step S2012 to obtain the filtered value of each pixel in each work-group, and finally obtaining the filtered image;
S202, performing local threshold segmentation on the filtered image to obtain the airport runway area, wherein the local threshold segmentation comprises the following steps:
S2021, estimating the width $\omega$ of the runway in the image from the image resolution as follows:
$$\omega = \omega_0 / \lambda$$
where $\omega_0$ is the real width of the runway and $\lambda$ is the resolution of the SAR image;
S2022, partitioning the image into blocks from left to right, each sub-image having the same number of rows as the original image and a number of columns equal to 2 times the runway width, so that the image is divided into n sub-image blocks $\{A_1, A_2, \ldots, A_n\}$; then dividing each sub-image $A_i$ into two halves $B_{2i-1}$ and $B_{2i}$ to obtain 2n half blocks $\{B_1, B_2, B_3, B_4, \ldots, B_{2n-1}, B_{2n}\}$; $B_{2i-1}$ and $B_{2i}$ constitute sub-image block $A1_i$, and $B_{2i}$ and $B_{2i+1}$ constitute sub-image block $A2_i$;
S2023, performing histogram statistics on each sub-image of $A1_i$ and $A2_i$ to obtain the image histograms;
S2024, applying the maximum between-class variance method (OTSU) to $A1_i$ and $A2_i$ to obtain the optimal segmentation threshold from the histogram of each sub-image;
S2025, optimizing the algorithm with a loop unrolling directive, and taking the minimum of the corresponding thresholds in the two threshold arrays obtained in step S2024 as the final segmentation threshold array T, as follows:
$$T = \min(T_1(A1), T_2(A2))$$
where $T_1(A1)$ is the threshold array of the first sub-image sequence and $T_2(A2)$ is the threshold array of the second sub-image sequence;
S2026, vectorizing the optical compensation function using the num_simd directive so that each work-item carries out multiple processing operations, and applying threshold correction to the threshold array T obtained in step S2025 to obtain the corrected threshold array $T^*$, as follows:
$$T^* = f(T, d)$$
where T is the threshold array before correction and d is the optical compensation amount;
S2027, segmenting the image using the corrected threshold array obtained in step S2026, as follows:
Figure FDA0002780517660000021
where f(x, y, i) is the segmented image, g(x, y, i) is the median-filtered image, i is the index of the sub-image block $A_i$ in which (x, y) lies after partitioning, and $T_i^*$ is the corrected threshold of sub-image block $A_i$;
S203, performing a closing operation on the resulting binary image to remove edge burrs;
and S204, applying a ratio edge detection algorithm to the detected runway to determine the direction in which the runway runs.
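The overlapping block layout of step S2022 can be illustrated with a minimal Python sketch. Python is used here only for illustration; the claimed implementation is an OpenCL Kernel. The function names, the 0-based column ranges, and the choice to ignore leftover columns that do not fill a half block are all assumptions, not part of the claim.

```python
def half_block_bounds(num_cols, runway_width_px):
    """Column ranges [start, end) of the half blocks B_1..B_2n.

    Each half block is one runway width wide (hypothetical helper).
    Assumption: trailing columns that do not fill a half block are ignored.
    """
    w = runway_width_px
    n_half = num_cols // w
    return [(k * w, (k + 1) * w) for k in range(n_half)]


def overlapping_blocks(half_blocks):
    """Build the two sub-image sequences from the half blocks.

    A1_i joins B_{2i-1} and B_{2i}; A2_i joins B_{2i} and B_{2i+1},
    so the A2 sequence is shifted by one half block relative to A1.
    """
    last = len(half_blocks) - 1
    a1 = [(half_blocks[k][0], half_blocks[k + 1][1]) for k in range(0, last, 2)]
    a2 = [(half_blocks[k][0], half_blocks[k + 1][1]) for k in range(1, last, 2)]
    return a1, a2


if __name__ == "__main__":
    halves = half_block_bounds(num_cols=40, runway_width_px=5)
    seq1, seq2 = overlapping_blocks(halves)
    print(seq1)
    print(seq2)
```

Because the second sequence straddles the boundaries of the first, a runway that is cut by a block boundary in one sequence lies fully inside a block of the other.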
2. The method according to claim 1, wherein in step S2023, when the histogram statistics kernel function is written, the sub-image data composed of two half blocks is copied from the global memory of the FPGA board to the corresponding work-group, the kernel is optimized within the group by full loop unrolling, and intra-group synchronization is implemented with the barrier function.
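As a rough illustration of the histogram statistics of step S2023, the following Python sketch counts gray levels over one sub-image given as a column range. It is a sequential stand-in, not the claimed work-group Kernel; the function name, the list-of-rows image layout, and the assumption of integer gray levels in [0, levels) are hypothetical.

```python
def block_histogram(image, col_start, col_end, levels=256):
    """Histogram of the columns [col_start, col_end) of an image.

    image: list of rows, each a list of integer gray levels (assumed
    to lie in [0, levels)). Returns a list of `levels` counts.
    """
    hist = [0] * levels
    for row in image:
        for value in row[col_start:col_end]:
            hist[value] += 1
    return hist


if __name__ == "__main__":
    img = [[0, 1, 1],
           [2, 1, 0]]
    print(block_histogram(img, 0, 3, levels=4))
```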
3. The method according to claim 1, wherein in step S2024, when the OTSU kernel function is written, the float4 data type is used to coalesce memory accesses, thereby reducing the number of memory accesses, and the kernel is optimized by loop unrolling.
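For reference, the maximum between-class variance (OTSU) threshold of a histogram can be computed as in the Python sketch below. This is the textbook sequential form, given only to make the step concrete; it does not reproduce the float4 access pattern or the loop unrolling of the claimed kernel, and the function name is hypothetical.

```python
def otsu_threshold(hist):
    """Return the gray level t that maximizes the between-class variance.

    Class 0 holds levels <= t, class 1 holds levels > t. The variance is
    compared up to the constant factor 1 / total**2, which does not change
    the maximizer. Ties keep the lowest level.
    """
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    weight0 = 0          # number of pixels in class 0
    sum0 = 0             # sum of gray levels in class 0
    best_level, best_var = 0, -1.0
    for level, count in enumerate(hist):
        weight0 += count
        sum0 += level * count
        if weight0 == 0:
            continue     # class 0 still empty
        weight1 = total - weight0
        if weight1 == 0:
            break        # class 1 empty from here on
        mean0 = sum0 / weight0
        mean1 = (sum_all - sum0) / weight1
        between_var = weight0 * weight1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_level, best_var = level, between_var
    return best_level


if __name__ == "__main__":
    hist = [0] * 256
    hist[10] = 10
    hist[200] = 10
    print(otsu_threshold(hist))
```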
4. The method according to claim 1, wherein in step S2026, the obtained threshold array T is arranged from left to right and the position of its smallest element is located; proceeding leftward from that position, d is subtracted from each threshold on the left in turn; proceeding rightward, if a threshold is greater than the threshold to its left and the difference is less than 2d, its value is kept unchanged, otherwise d is added to it;
letting Ind be the index of the minimum element in the threshold array T, when i < Ind, specifically:
$$T_i^* = T_i - d$$
when i > Ind, specifically:
Figure FDA0002780517660000031
where $T_i^*$ is the corrected threshold of sub-image block $A_i$, and $T_i$ and $T_{i-1}$ are thresholds before correction.
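The correction rule of this claim can be sketched in Python as follows. The branch for i > Ind follows the prose wording of the claim (keep the value when the threshold exceeds its left neighbour by less than 2d, otherwise add d), since the corresponding formula is only available as a figure; leaving the minimum element itself unchanged is an assumption, and the function name is hypothetical.

```python
def correct_thresholds(thresholds, d):
    """Apply the left/right compensation around the minimum threshold.

    thresholds: uncorrected thresholds T, ordered left to right.
    d: compensation amount.
    Comparisons use the uncorrected neighbours T_i and T_{i-1}.
    """
    ind = thresholds.index(min(thresholds))  # first position of the minimum
    corrected = list(thresholds)             # element at ind stays as is (assumption)
    for i in range(ind):                     # left of the minimum: subtract d
        corrected[i] = thresholds[i] - d
    for i in range(ind + 1, len(thresholds)):
        diff = thresholds[i] - thresholds[i - 1]
        if not (0 < diff < 2 * d):           # otherwise branch: add d
            corrected[i] = thresholds[i] + d
    return corrected


if __name__ == "__main__":
    print(correct_thresholds([50, 40, 30, 33, 45, 44], d=5))
```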
5. The method according to claim 1, wherein in step S203 the closing operation is implemented by performing a dilation operation followed by an erosion operation, comprising the following steps:
S2031, performing one dilation on the segmented binary image; when the Kernel of the dilation algorithm is written, the number of work-groups is first allocated, the data to be processed is then copied from the global memory of the FPGA board to local memory, each work-group processes the data within one dilation window, and intra-group synchronization is performed with the barrier function;
S2032, performing one erosion on the dilated binary image; when the Kernel of the erosion algorithm is written, the number of work-groups is first allocated, the data to be processed is then copied from the global memory of the FPGA board to local memory, each work-group processes the data within one erosion window, and intra-group synchronization is performed with the barrier function.
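The dilation-then-erosion order of this claim can be illustrated with a small sequential Python sketch on a binary image. The square window, its default radius of 1, and the clipping of the window at the image border are assumptions made for the example; the claim does not specify them, and the helper names are hypothetical.

```python
def _morph(img, radius, reduce_fn):
    """Apply reduce_fn (max or min) over a square window around each pixel.

    Assumption: the window is clipped at the image border.
    """
    rows, cols = len(img), len(img[0])
    out = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            window = [img[j][i]
                      for j in range(max(0, y - radius), min(rows, y + radius + 1))
                      for i in range(max(0, x - radius), min(cols, x + radius + 1))]
            out[y][x] = reduce_fn(window)
    return out


def dilate(img, radius=1):
    return _morph(img, radius, max)


def erode(img, radius=1):
    return _morph(img, radius, min)


def closing(img, radius=1):
    """Closing = one dilation followed by one erosion."""
    return erode(dilate(img, radius), radius)


if __name__ == "__main__":
    # 7x7 image with a 3x3 ring of ones and a one-pixel hole in the middle
    img = [[0] * 7 for _ in range(7)]
    for y in range(2, 5):
        for x in range(2, 5):
            img[y][x] = 1
    img[3][3] = 0
    for row in closing(img):
        print(row)
```

In this example the closing fills the one-pixel hole while the outline of the 3x3 block is left as it was.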
6. The method according to claim 1, wherein in step S204, the ratio edge detection algorithm comprises the following specific steps:
S20411, performing horizontal and vertical edge detection with a radius of 7 on the image obtained in step S203 using a sliding window;
S20412, summing the edge ratios in the horizontal direction and in the vertical direction separately;
and S20413, taking the direction with the larger sum as the direction of the runway.
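One possible reading of steps S20411 to S20413 is sketched below in Python, using a ratio-of-averages edge measure. Several points are assumptions rather than statements of the claim: the edge strength is taken as 1 - min(m1/m2, m2/m1) for the mean values m1 and m2 of the two half windows, a small eps guards against division by zero, border pixels without a full window are skipped, and a strong response between the upper and lower half windows is interpreted as a horizontally running runway. All names are hypothetical.

```python
def _mean(img, y0, y1, x0, x1):
    """Mean of img over rows [y0, y1) and columns [x0, x1)."""
    values = [img[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    return sum(values) / len(values)


def _ratio_strength(m1, m2, eps):
    """Ratio edge strength in [0, 1): 0 for equal means (assumed form)."""
    a, b = m1 + eps, m2 + eps
    return 1.0 - min(a / b, b / a)


def runway_direction(img, radius=7, eps=1e-6):
    """Sum ratio edge strengths for both orientations and compare the sums."""
    rows, cols = len(img), len(img[0])
    horiz_sum = vert_sum = 0.0
    for y in range(radius, rows - radius):
        for x in range(radius, cols - radius):
            up = _mean(img, y - radius, y, x - radius, x + radius + 1)
            down = _mean(img, y + 1, y + radius + 1, x - radius, x + radius + 1)
            left = _mean(img, y - radius, y + radius + 1, x - radius, x)
            right = _mean(img, y - radius, y + radius + 1, x + 1, x + radius + 1)
            horiz_sum += _ratio_strength(up, down, eps)     # edges running left-right
            vert_sum += _ratio_strength(left, right, eps)   # edges running up-down
    return "horizontal" if horiz_sum >= vert_sum else "vertical"


if __name__ == "__main__":
    # 12x12 binary image with a two-row horizontal strip
    strip = [[1 if 5 <= y <= 6 else 0 for _ in range(12)] for y in range(12)]
    print(runway_direction(strip, radius=2))
```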
7. The method of claim 6, wherein the Kernel function is written as follows:
S20421, padding the periphery of the image with zeros so that global_work_size is evenly divisible by the work-group size;
S20422, using OpenCL built-in functions: the native_sqrt function to compute square roots and the native_divide function to perform division;
S20423, using full loop unrolling, with the required number of loop iterations made known to the compiler.
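The padding of step S20421 amounts to rounding each image dimension up to the next multiple of the work-group size. A minimal Python sketch follows; the helper names are hypothetical, and placing all padding at the bottom and right edges is an assumption (the claim only says the periphery is filled with zeros).

```python
def padded_size(size, group_size):
    """Smallest multiple of group_size that is >= size."""
    return ((size + group_size - 1) // group_size) * group_size


def zero_pad(img, group_rows, group_cols):
    """Zero-pad img so both dimensions divide evenly by the work-group size.

    Assumption: padding is appended at the bottom and right edges only.
    """
    rows, cols = len(img), len(img[0])
    out_rows = padded_size(rows, group_rows)
    out_cols = padded_size(cols, group_cols)
    out = [[0] * out_cols for _ in range(out_rows)]
    for y in range(rows):
        out[y][:cols] = img[y]   # copy the original row, keep trailing zeros
    return out


if __name__ == "__main__":
    img = [[1] * 5 for _ in range(3)]
    padded = zero_pad(img, group_rows=2, group_cols=4)
    print(len(padded), len(padded[0]))
```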
CN201811369168.4A 2018-11-16 2018-11-16 Airport runway detection method based on FPGA acceleration Active CN109558817B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811369168.4A CN109558817B (en) 2018-11-16 2018-11-16 Airport runway detection method based on FPGA acceleration

Publications (2)

Publication Number Publication Date
CN109558817A CN109558817A (en) 2019-04-02
CN109558817B true CN109558817B (en) 2021-01-01

Family

ID=65866371

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811369168.4A Active CN109558817B (en) 2018-11-16 2018-11-16 Airport runway detection method based on FPGA acceleration

Country Status (1)

Country Link
CN (1) CN109558817B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110782438B (en) * 2019-10-18 2022-10-04 苏州中科全象智能科技有限公司 Image detection method based on maximum inter-class variance method of FPGA (field programmable Gate array)
CN112396031B (en) * 2020-12-04 2023-06-30 湖南傲英创视信息科技有限公司 Target detection method and system based on heterogeneous operation platform

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7796780B2 (en) * 2005-06-24 2010-09-14 Objectvideo, Inc. Target detection and tracking from overhead video streams
CN101982835B (en) * 2010-11-12 2012-02-08 西安电子科技大学 Level set method for edge detection of SAR images of airport roads
CN103295420B (en) * 2013-01-30 2015-12-02 吉林大学 A kind of method of Lane detection
CN104142845B (en) * 2014-07-21 2018-08-17 中国人民解放军信息工程大学 CT image reconstructions back projection accelerated method based on OpenCL-To-FPGA
CN108596885B (en) * 2018-04-16 2021-12-28 西安电子科技大学 CPU + FPGA-based rapid SAR image change detection method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant