CN114694063B - Hardware implementation method and system for extracting and selecting feature points of video stream in real time


Info

Publication number
CN114694063B
Authority
CN
China
Prior art keywords
image
points
point
module
feature
Prior art date
Legal status
Active
Application number
CN202210284325.1A
Other languages
Chinese (zh)
Other versions
CN114694063A
Inventor
耿莉
李佳霖
龚一帆
张良基
雷莹
Current Assignee
Xian Jiaotong University
Original Assignee
Xian Jiaotong University
Priority date
Filing date
Publication date
Application filed by Xian Jiaotong University
Priority to CN202210284325.1A
Publication of CN114694063A
Application granted
Publication of CN114694063B
Legal status: Active
Anticipated expiration

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a hardware implementation method and system for extracting and selecting feature points of a video stream in real time. By simplifying the combinational logic on the critical path and designing a 4-point parallel processing and pipeline structure, the speed of the algorithm is effectively improved. Aiming at the poor real-time performance and low precision of the feature point selection process, the invention optimizes the image buffer structure and designs a feature point register and a block-selection storage structure, thereby effectively reducing the amount of computation and improving the precision and speed of the algorithm.

Description

Hardware implementation method and system for extracting and selecting feature points of video stream in real time
Technical Field
The invention belongs to the technical field of computer vision, and particularly relates to a hardware implementation method and system for extracting and selecting feature points of a video stream in real time.
Background
Simultaneous Localization and Mapping (SLAM) refers to a mobile platform carrying specific sensors that, without prior knowledge of the environment, constructs a map model of the environment during its motion while simultaneously estimating its own pose within it. As a precondition of autonomous navigation tasks for mobile robots, this technology has always been a research hotspot in the robotics field. Visual SLAM, which uses a camera as the main sensor, has broad application prospects thanks to advantages such as low cost and rich information.
In visual SLAM, the feature point processing part of the front-end visual odometer is a crucial module of the whole system, because the quality of feature point selection directly affects the accuracy of camera pose estimation and map construction. Traditional visual SLAM schemes usually extract feature points such as ORB, SIFT, and SURF, but these feature extraction methods are complex and consume considerable time and resources.
To address this problem, an existing hardware implementation of ORB feature point extraction with good real-time performance uses a pipeline structure on an FPGA to improve speed. In the results, however, texture-rich regions yield many features while weakly textured regions yield none; this spatially uneven distribution of feature points easily increases pose-estimation error, so the method has poor accuracy and is not suitable for engineering applications in real scenes.
For this reason, aiming at the problem of uneven feature point selection, another method uses a quadtree and sorting to achieve uniform feature point selection, but the iterative computations of uncertain count, complicated logic branches, and repeated memory reads it introduces lead to poor real-time performance, so it cannot handle the real-time processing of a high-speed video stream well.
Disclosure of Invention
In view of the above shortcomings of the prior art, the technical problem to be solved by the invention is to provide a hardware implementation method and system for extracting and selecting feature points of a video stream in real time that has been successfully deployed on an FPGA platform, achieves good results, and can be used to improve the real-time performance and accuracy of a visual SLAM system.
The invention adopts the following technical scheme:
a hardware implementation method for extracting and selecting video stream feature points in real time comprises the following steps:
s1, converting an original image in a video stream into a gray image, downsampling the gray image, and integrating pixel units of the downsampled gray image;
s2, reading one pixel unit obtained in the step S1, constructing a sliding window, performing FAST feature judgment on the central point of the sliding window, and calculating the score of the FAST feature point;
s3, performing non-maximum suppression operation, mask shielding judgment and image rectangular area judgment on the neighborhood of the FAST feature points by using the FAST feature point score obtained in the step S2, and taking the center point meeting the condition as a candidate feature point to participate in subsequent selection;
s4, calculating Harris corner response values of 4 center points of the sliding window and the blocks where the 4 center points are located by using the sliding window obtained in the step S2, and then carrying out edge correction on the Harris corner response values of the 4 center points to obtain corrected response values;
s5, based on the corrected response value in the step S4, if the response value R of the current feature point curr The response value R of the corresponding column in the characteristic point register is larger than or equal to col And the judgment condition in the step S3 is met, the current characteristic point is stored in the corresponding position in the shared memory, and the characteristic point register is updated;
s6, repeating the steps S1 to S5, and traversing all pixel units of the current image until a primary selected feature point set in the current image is obtained;
and S7, after a frame of the video image has been processed in step S6, the initially selected feature point set is read, the number of feature points required by the task is screened out and output, completing the hardware implementation of extracting and selecting feature points of the video stream in real time.
Specifically, in step S1, 24-bit color information of the original image is converted into 8-bit gray information, the storage size of the original image in the video stream is reduced to 1/12 of the original size through downsampling operation, and every 4 adjacent 8-bit pixel points in the downsampled gray image are integrated into 1 32-bit pixel unit and stored in the DDR3 chip in sequence.
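The step-S1 preprocessing can be sketched in software as follows. This is a minimal behavioral model, assuming standard integer luma weights and 2×2 decimation; the function names and the exact grayscale coefficients are illustrative, not taken from the patent.

```python
# Sketch of step S1: 24-bit color -> 8-bit gray, downsample, pack into
# 32-bit bus-aligned pixel units. Coefficients/decimation are assumptions.

def rgb24_to_gray8(r, g, b):
    """One 24-bit RGB pixel -> 8-bit gray (fixed-point approx of 0.299R+0.587G+0.114B)."""
    return (77 * r + 150 * g + 29 * b) >> 8

def downsample_2x2(gray, width, height):
    """Keep every other pixel in both directions: 1/4 of the pixels remain."""
    return [gray[y * width + x]
            for y in range(0, height, 2)
            for x in range(0, width, 2)]

def pack_pixel_units(gray):
    """Pack every 4 adjacent 8-bit pixels into one 32-bit unit (bus-width aligned)."""
    assert len(gray) % 4 == 0
    units = []
    for i in range(0, len(gray), 4):
        p0, p1, p2, p3 = gray[i:i + 4]
        units.append(p0 | (p1 << 8) | (p2 << 16) | (p3 << 24))
    return units
```

Combining the 24-bit-to-8-bit conversion (1/3) with 2×2 downsampling (1/4) gives the 1/12 storage reduction stated in the text.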
Specifically, in step S2, a sliding window is constructed using a line buffer and a shift register, and the score of the center point of the sliding window is:

Score_p = max_{x=1..16} |V_p − V_x|

where V_p is the gray value of the center-point pixel and V_x is the pixel value of each of the 16 pixels on the circle of radius 3 centered on the center point.
Specifically, in step S2, the FAST feature judgment of the center point of the sliding window proceeds as follows:
a constant matrix for the corner detector is constructed according to the threshold t, and all pixels are classified into 3 classes; then, taking the 16 pixels on the circle of radius 3 centered on the center point, the pixels are grouped into 8 diagonally opposite pairs according to the classification, non-feature points are rejected, and points satisfying the threshold are judged further. The classification standard is:

label(x) = 2'b01, if V_x ≥ V_p + t;  label(x) = 2'b10, if V_x ≤ V_p − t;  label(x) = 2'b00, otherwise

where 2'b01, 2'b00, and 2'b10 are classification labels represented as 2-bit fixed-point numbers, and t is the preset threshold.
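The classification and score above can be modeled compactly in software. This is an illustrative sketch: the assignment of the 2'b01/2'b10 labels to "brighter"/"darker" and the contiguous-arc length of 9 are assumptions consistent with the standard FAST test described elsewhere in this document.

```python
# Behavioral model of the FAST test: 2-bit labels per circle pixel against
# threshold t, corner score = max |V_p - V_x|. Label semantics assumed.

BRIGHTER, SIMILAR, DARKER = 0b01, 0b00, 0b10

def classify(v_center, v_x, t):
    """2-bit label of one circle pixel relative to the center pixel."""
    if v_x >= v_center + t:
        return BRIGHTER
    if v_x <= v_center - t:
        return DARKER
    return SIMILAR

def fast_score(v_center, circle16):
    """Score of a candidate: max |V_p - V_x| over the 16 circle pixels."""
    return max(abs(v_center - v) for v in circle16)

def is_fast_corner(v_center, circle16, t, arc=9):
    """True if some run of `arc` contiguous circle pixels is all brighter
    or all darker than the center by at least t (with circular wrap-around)."""
    labels = [classify(v_center, v, t) for v in circle16]
    doubled = labels + labels          # handle runs that wrap around
    for lab in (BRIGHTER, DARKER):
        run = 0
        for l in doubled:
            run = run + 1 if l == lab else 0
            if run >= arc:
                return True
    return False
```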
Specifically, in step S3, performing non-maximum suppression operation, mask judgment, and image rectangular region judgment on the FAST feature points specifically includes:
comparing the scores of the 4 center points and their 14 adjacent candidate corner points; a center point is retained only when its score Score_p satisfies Score_p ≥ Score_x for all neighboring candidate corner points, completing the non-maximum suppression operation. The data at the position corresponding to the current pixel is then read for mask shielding judgment, after which the peripheral edges of the downsampled gray image are cropped and the feature points in the cropped part are eliminated.
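The two step-S3 checks for one center point can be sketched as below. The crop margins `dw`/`dh` and function names are illustrative placeholders, not the patent's identifiers.

```python
# Minimal model of step S3 for one center point: non-maximum suppression
# against neighboring candidate scores, plus the rectangular-region test
# after cropping dw/dh pixels from each image edge.

def passes_nms(score_center, neighbor_scores):
    """Keep the center only if its score >= every neighboring candidate's score."""
    return all(score_center >= s for s in neighbor_scores)

def in_valid_region(x, y, width, height, dw, dh):
    """Reject points inside the cropped border of the downsampled image."""
    return dw <= x < width - dw and dh <= y < height - dh
```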
Specifically, in step S4, the corner response value R is:

R = det(A) − k · (trace A)²,  with A = scale · Σ_{(x,y)∈W} [ I_x², I_xI_y ; I_xI_y, I_y² ]

where I_x and I_y are the gradients of the image pixels in the x and y directions respectively, W is an image window of size 7×7, k = 0.04, and scale = 1.732873552846874.
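A floating-point software reference for the Harris response over one window can be written as follows, assuming the common form R = det(A) − k·trace(A)²; the fixed-point and shift-based details of the hardware are not reproduced, and the uniform `scale` application is an assumption.

```python
# Software reference for the Harris corner response over one 7x7 window W.
# ix, iy: lists of the x/y gradients of all 49 pixels in W.

def harris_response(ix, iy, k=0.04, scale=1.0):
    """R = det(A) - k * trace(A)^2 for the scaled autocorrelation matrix A."""
    sxx = sum(gx * gx for gx in ix) * scale          # sum of Ix^2
    syy = sum(gy * gy for gy in iy) * scale          # sum of Iy^2
    sxy = sum(gx * gy for gx, gy in zip(ix, iy)) * scale  # sum of Ix*Iy
    det = sxx * syy - sxy * sxy
    trace = sxx + syy
    return det - k * trace * trace
```

A pure edge (gradient in only one direction) gives a negative response, while strong gradients in both directions give a large positive response, which is the behavior the selection stage relies on.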
Specifically, in step S4, the edge correction specifically includes:
according to the image resolution, the remaining part of the image is divided into M×N blocks of size 14×12, and each block finally outputs only one optimal feature point as an initially selected feature point; the response values of feature points within 2 pixels of a block edge are multiplied by a penalty factor p to guide the selection of the initially selected feature points.
Specifically, in step S5, N response-value registers are designed with initial values of 0. When the response value R_curr of the current feature point is greater than or equal to the response value R_col of the corresponding column in the register, the current feature point is stored in the shared memory at the position corresponding to the block, and the feature point register is updated with the response value R_curr; the feature point register is refreshed to all 0s each time 14 rows of pixels have all been detected.
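The register-bank behavior described above can be modeled as below; the shared memory is stood in for by a dictionary, and the class and method names are illustrative.

```python
# Behavioral model of the per-column feature-point register bank of step S5:
# one best-so-far response per block column, refreshed to zero after each
# band of 14 image rows has been scanned.

class FeatureRegisterBank:
    def __init__(self, n_cols):
        self.r_col = [0] * n_cols      # best response seen in each block column
        self.shared_mem = {}           # (block_row, block_col) -> best point

    def update(self, block_row, block_col, point, r_curr):
        """Store the point and update the register when R_curr >= R_col."""
        if r_curr >= self.r_col[block_col]:
            self.r_col[block_col] = r_curr
            self.shared_mem[(block_row, block_col)] = point

    def refresh(self):
        """Called after all 14 rows of a block band have been scanned."""
        self.r_col = [0] * len(self.r_col)
```

Keeping the running maximum in registers, instead of re-reading the shared memory on every candidate, is what lets the pipeline sustain one pixel unit per clock.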
Specifically, step S6 specifically includes:
and continuously reading the next pixel unit in the image stream every clock period, realizing pipeline=1, simultaneously processing 4 pixels at a time, and circulating the steps S2-S5 in a pipeline mode until a primary characteristic point set in the current frame is obtained, and then releasing the storage space of the current image in the DDR3 chip.
The other technical scheme of the invention is that the hardware implementation system for extracting and selecting the characteristic points of the video stream in real time comprises:
the conversion module converts an original image in the video stream into a gray image, downsamples the gray image, and integrates pixel units of the downsampled gray image;
the computing module reads one pixel unit obtained by the conversion module, constructs a sliding window, judges the FAST characteristic of the central point of the sliding window and computes the score of the FAST characteristic point;
the judging module is used for carrying out non-maximum value inhibition operation, mask shielding judgment and image rectangular area judgment on the neighborhood of the FAST feature points by using the FAST feature point score obtained by the calculating module, and taking the center point meeting the condition as a candidate feature point to participate in subsequent selection;
the correction module is used for calculating Harris corner response values of 4 center points of the sliding window and the blocks where the 4 center points are located by utilizing the sliding window obtained by the calculation module, and then carrying out edge correction on the Harris corner response values of the 4 center points to obtain corrected response values;
the updating module, based on the response value corrected by the correction module: if the response value R_curr of the current feature point is greater than or equal to the response value R_col of the corresponding column in the feature point register and the judgment conditions of the judging module are met, the current feature point is stored at the corresponding position in the shared memory and the feature point register is updated;
the processing module continues to read the next pixel unit obtained in the conversion module in sequence, and traverses all pixel units of the current image through the calculation module, the judgment module, the correction module and the updating module until a primary selected characteristic point set in the current image is obtained;
and the implementation module is used for reading the initially selected feature point set after the processing module processes a frame of video image, screening the number of feature points required by a task, outputting the feature points and completing the hardware implementation of extracting and selecting the feature points of the video stream in real time.
Compared with the prior art, the invention has at least the following beneficial effects:
the invention discloses a hardware implementation method for extracting and selecting characteristic points of video streams in real time, which divides the whole implementation scheme into a preprocessing module, a real-time processing module and a post-processing module according to the characteristic that one pixel is input in one clock period of an image input signal, not only meets the real-time processing requirement of the image input signal through pipeline design and parallel design, but also saves hardware resource consumption by fast screening in the time gap of two frames of images by utilizing the post-processing module, and meanwhile, under the condition of not affecting the real-time performance, the characteristic point selection mechanism is introduced, so that uneven distribution of the characteristic points is further avoided, and the algorithm precision is effectively improved.
Furthermore, as camera resolution increases, storing original images would occupy a large amount of precious memory space in an embedded system. The invention therefore converts the 24-bit color information of the original image into 8-bit gray information and performs a downsampling operation, finally reducing the storage space of the image in the embedded system to 1/12 of that required by the original image acquired by the camera.
Furthermore, in order to facilitate subsequent rapid screening, a FAST score function is set to evaluate the quality of the extracted FAST feature points, and the score function has high parallelism and is easy to realize rapidly by hardware.
Furthermore, FAST feature judgment is performed simultaneously on the 18 points at the center of the window, preparing data for the subsequent non-maximum suppression of the 4 window-center points.
Furthermore, a non-maximum suppression operation is performed in the 3×3 neighborhood of the 4 window-center points; a center point is retained as a candidate feature point only when its score is not smaller than that of its neighbors, which prevents extracted feature points from sitting next to each other. Meanwhile, because of camera motion, feature points at the image edge cannot be accurately matched in the next frame and therefore produce errors, so the invention only extracts feature points from the valid rectangular region of the image. In addition, the mask shielding interface can further avoid excessive concentration of feature points through top-level enable control.
Furthermore, the characteristic quality of the candidate characteristic points can be more accurately described by calculating the Harris angular point response value and used for carrying out fine selection on the residual candidate characteristic points.
Furthermore, the invention divides the preprocessed image into blocks of size 14×12, and each block can finally output at most one feature point; however, along the shared edge of two adjacent blocks, two feature points no more than 4 pixels apart may still be extracted.
Furthermore, in hardware the image signal is actually input line by line, and the feature point extraction and preliminary selection of one row of blocks can be completed only after all 14 rows of pixels have been input. Therefore, in the hardware implementation, the invention designs a response-value register for temporarily storing the maximum response value in the current block. By comparing the response value R_curr of the current feature point in the pipeline with the response value R_col in the register, frequent reads and writes of the shared memory are avoided, ensuring the processing efficiency of the pipeline in the hardware implementation.
Furthermore, according to the signal characteristics of one pixel input in one period of the image signal, the real-time module integrally adopts a pipeline design, one pixel unit can be processed in each clock period, and finally the real-time characteristic point processing of the image input signal is realized on FPGA hardware.
In summary, the invention provides a new hardware implementation structure that accelerates the traditional corner detection method and re-optimizes the scheme for uniform feature point selection. By designing 4-point parallel processing and a pipeline structure, optimizing the image cache structure, and designing the feature point register and block storage structure, the amount of computation is effectively reduced, the degree of algorithm parallelism is improved, and the accuracy and speed of the algorithm are finally improved.
The technical scheme of the invention is further described in detail through the drawings and the embodiments.
Drawings
FIG. 1 is a diagram of an image data matrix buffer shift architecture design;
FIG. 2 is a schematic diagram of a shift register matrix and FAST detection;
FIG. 3 is a schematic diagram of image blocking and response value correction;
FIG. 4 is a block diagram of a feature point extraction and selection algorithm according to the present invention;
FIG. 5 is a graph showing an example of the results of the present invention;
FIG. 6 is a graph of accuracy versus raw data of the present invention;
FIG. 7 is a timing simulation waveform diagram of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In the description of the present invention, it will be understood that the terms "comprises" and "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in the present specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations, e.g., a and/or B, may represent: a exists alone, A and B exist together, and B exists alone. In addition, the character "/" herein generally indicates that the front and rear associated objects are an "or" relationship.
It should be understood that although the terms first, second, third, etc. may be used to describe the preset ranges, etc. in the embodiments of the present invention, these preset ranges should not be limited to these terms. These terms are only used to distinguish one preset range from another. For example, a first preset range may also be referred to as a second preset range, and similarly, a second preset range may also be referred to as a first preset range without departing from the scope of embodiments of the present invention.
Depending on the context, the word "if" as used herein may be interpreted as "at … …" or "at … …" or "in response to a determination" or "in response to detection". Similarly, the phrase "if determined" or "if detected (stated condition or event)" may be interpreted as "when determined" or "in response to determination" or "when detected (stated condition or event)" or "in response to detection (stated condition or event), depending on the context.
Various structural schematic diagrams according to the disclosed embodiments of the present invention are shown in the accompanying drawings. The figures are not drawn to scale, wherein certain details are exaggerated for clarity of presentation and may have been omitted. The shapes of the various regions, layers and their relative sizes, positional relationships shown in the drawings are merely exemplary, may in practice deviate due to manufacturing tolerances or technical limitations, and one skilled in the art may additionally design regions/layers having different shapes, sizes, relative positions as actually required.
The invention provides a hardware implementation method for extracting and selecting feature points of a video stream in real time. Aiming at the large amount of computation and long time consumption of the feature point extraction process, the speed of the algorithm is effectively improved by simplifying the combinational logic on the critical path and designing a 4-point parallel processing and pipeline structure; aiming at the poor real-time performance and low precision of the feature point selection process, the invention optimizes the image buffer structure and designs a feature point register and a block-selection storage structure, thereby effectively reducing the amount of computation and improving the precision and speed of the algorithm.
The invention discloses a hardware implementation method for extracting and selecting video stream feature points in real time, which comprises the following steps:
s1, converting an original image into a gray level image, downsampling, integrating 4 8-bit pixel points into 1 32-bit pixel units, and storing the pixel units, wherein the pixel units are aligned with the width of a data bus. Initializing a shared memory and a characteristic point register to 0;
before feature point extraction, data preprocessing and module initialization work are performed. If the original image is directly processed, the storage efficiency is low, and a large amount of logic resources are needed, so the invention converts 24-bit color information of the original image into 8-bit gray information, and then downsampling operation is carried out to reduce the storage size of the image to 1/12 of the original size. Meanwhile, the data bus width of a processor part on the FPGA is 32 bits, so that 4 pixels are combined into one pixel unit, and the throughput and the parallelism of an algorithm are improved. And finally, initializing the shared memory and the key register to ensure the normal transmission of the data on the pipeline.
S2, reading a pixel unit, constructing a sliding window of size 9×12 in the real-time module using a line buffer and a shift register, and simultaneously performing FAST feature judgment and score calculation for the 18 center points;
referring to fig. 1, a 32-bit pixel unit is read from a data stream, and an image data sliding window is obtained by organizing a line buffer with a line width formed by 8 64-bit units and a shift register matrix with a size of 9×12 formed by 8-bit registers in an on-chip memory. After obtaining the pixel window shown in fig. 2, if the pixel unit (P 44 -P 47 ) If the maximum value suppression operation is performed, it is necessary to perform FAST feature point determination for pixels in the 3×3 region. For this purpose, a total of 18 points (P 33 -P 38 ,P 43 -P 48 ,P 53 -P 58 ) And (5) performing calculation.
In the overall implementation of the FAST algorithm, six-fold parallel loop unrolling and a three-stage pipeline design are adopted: 6 pixels of one row are processed at the same time, and all 18 pixels are processed in three passes. In the specific hardware implementation, the invention first constructs the constant matrix of the corner detector according to the threshold t, stores it in the SROM, and classifies all pixels into 3 classes. Then, taking the 16 pixels on the circle of radius 3 centered on the center point, non-feature points are judged; according to the previous classification, the pixels are divided into 8 diagonally opposite pairs, points satisfying the threshold are judged further, and non-feature points are directly scored 0. The classification criteria are as follows:
and (3) carrying out cyclic parallel unfolding processing on the rest processes, and screening whether the continuous 9 pixel values are larger than or smaller than a threshold value than the pixel value of the central point in parallel, simultaneously calculating the absolute value of the pixel value difference between the central point and 16 points on the circle, and selecting the maximum value as the final score of the FAST corner, wherein the mathematical expression is as follows:
wherein V is p The gray value of the pixel, V, is the center point x The pixel value of 16 pixel points with the center point as the center and the radius of 3.
S3, performing non-maximum value suppression operation, mask shielding judgment and image rectangular region judgment in the 3X 3 neighborhood of the FAST feature point by using the result obtained in the step S2;
in order to avoid that a plurality of characteristic points detected by the algorithm are close together, the corner Score array Score [18 ] obtained in the step S2 is utilized]By comparing the scores of the center 4 point and its 14 neighboring candidate corner points, only the Score at the center point is calculated p Satisfy not less than all neighboring point scores Score x And reserving the center point for subsequent judgment. Meanwhile, considering the feature points at the edge of the image, the camera cannot find accurate matching in the next frame due to the motion of the camera, so that errors are generated. Therefore, the invention cuts the peripheral edges of the image, reduces the width by 2 delta omega, reduces the height by 2 delta h, and eliminates the characteristic points positioned at the edges of the image.
The invention also designs a mask shielding interface for the upper-layer application, through which excessive concentration of feature points can be further avoided by upper-layer control. The mask is stored in a BRAM whose bit width equals the image width w and whose depth equals the image height h. The judgment is made by reading the bit corresponding to the current pixel: 1'b1 represents strobe (keep) and 1'b0 represents mask.
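The per-pixel mask lookup can be modeled as below; the BRAM row words are stood in for by plain Python integers, and the function name is illustrative.

```python
# Small model of the mask interface: one bit per pixel, stored row by row
# with bit width equal to the image width w; 1 gates the pixel through,
# 0 masks it out.

def mask_allows(mask_rows, x, y):
    """Read the bit for pixel (x, y): 1'b1 = strobe (keep), 1'b0 = mask."""
    return (mask_rows[y] >> x) & 1 == 1
```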
S4, simultaneously calculating the Harris corner response values of the 4 center points and the 14×12 block (row, col) in which they lie, then performing edge correction: if a feature point is close to a block boundary, its response value is multiplied by the corresponding penalty coefficient p to obtain the corrected response value;
in the overall implementation of the Harris corner response function, ten times of cyclic parallel expansion and seven-stage pipeline design are adopted, an autocorrelation matrix A of 10 pixels in one row is calculated at the same time, an A matrix of all pixel points on a 7X 10 window W is obtained through seven times of processing, and the A matrix can be used for representing gray level change conditions in different directions of an area where a center point is located. Designing four-time circulation parallel structure, respectively and simultaneously calculating 7X 7 adjacent areas where each central point isI x I y ,/>And calculating the corner response value R by using the three parameter sums.
The invention uses 3 sets of 10 registers to accumulate the matrix parameters computed by the parallel structure, optimizing the timing. To facilitate hardware implementation, the scale coefficient scale corresponding to the 7×7 window is pre-stored as a fixed-point number, and shift operations are used to minimize the quantization-induced calculation error. The calculation formula of the response value R adopted in the invention is:

R = det(A) − k · (trace A)²

with scale = 1.73287352846874 and k = 0.04 in the present invention.
In the original software algorithm, after the corner response values are obtained, the feature points are stored in dynamically allocated memory; during iteration, the dynamic memory region is repeatedly read and the control logic repeatedly evaluated, which greatly reduces the real-time performance of the algorithm. To solve this problem, the parallelism of the algorithm is improved.
Referring to fig. 3, after the peripheral edges of the image are removed, the invention divides the remaining image into M×N blocks of size 14×12 according to the image resolution, and each block finally outputs only one optimal feature point as an initially selected feature point of the real-time module, reducing the computational complexity of the subsequent screening. In this process, to further avoid excessive concentration of the initially selected feature points, the invention multiplies the response values of feature points within 2 pixels of a block edge by a penalty factor p, avoiding the simultaneous selection of P2 and P3 or of P2 and P4. When the response values do not differ greatly, this guides the algorithm to select P1 and P5 as the initially selected feature points as far as possible.
S5, combining the results of step S4: if the response value R_curr of the current feature point is greater than or equal to the response value R_col of the corresponding column in the register, the current feature point is stored at the corresponding position in the shared memory and the feature point register is updated;
Since the image stream is input line by line, from the top left to the bottom right of the image, the initially selected feature points within a row of blocks can only be obtained after all 14 rows of pixels have been input. Therefore, the hardware implementation designs a response value register of size N with all initial values 0; only when the response value R_curr of the current feature point is greater than or equal to the response value R_col of the corresponding column in the register is the current feature point stored at the position in the shared memory corresponding to its block, with the larger value updated into the feature point register. After all 14 rows of pixels have been detected, the feature point register is refreshed to all 0s and the next row of blocks is detected.
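The register update over one 14-row band of blocks can be modeled as below. This is a behavioral Python sketch, not the RTL: `best` plays the role of the size-N response value register and `winners` the per-block slots in the shared memory (both names are illustrative):

```python
def scan_block_row(points, n_cols, block_w=14):
    """Simulate the size-N response register over one row of 14x12 blocks.

    `points` is a stream of (x, y, response) candidates in raster order for
    one 14-row band. best[c] mimics the response register of block column c
    (initialized to 0) and winners[c] its shared-memory slot.
    """
    best = [0.0] * n_cols                 # response value register, all 0
    winners = [None] * n_cols             # shared-memory positions
    for x, y, r in points:
        c = x // block_w                  # block column of this candidate
        if r >= best[c]:                  # R_curr >= R_col: keep larger value
            best[c] = r
            winners[c] = (x, y, r)
    return winners  # register is then refreshed to 0 for the next band
```

After the band is fully scanned, each column holds at most one winner, which is exactly the one-feature-point-per-block behavior described above.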
S6, continue reading the next pixel unit in the image stream and repeat steps S2 to S5 until all initially selected feature points in the current frame are obtained;
the next pixel element in the image stream is read continuously every clock cycle, i.e. pipeline=1 is implemented, processing 4 pixels at a time. And (5) circulating the steps S2 to S5 in a pipeline mode until a primary selected characteristic point set in the current frame is obtained, and storing the primary selected characteristic point set in a shared memory among the modules. Because the invention is limited by the input signal, each clock period needs to process the corresponding effective data, the invention calls the real-time characteristic point detection and preliminary screening pipeline part for realizing the steps S2-S5 as a real-time processing module.
S7, in the post-processing module, read the initially selected feature point set from the shared memory, eliminate the zero points, select the number of feature points required by the task through sorting, and output them through a BRAM interface.
This step is realized by the post-processing module: the initially selected feature points in the shared memory are first read out and the zero points removed; the number of feature points required by the task is then selected through sorting, and the result is passed to the next-stage IP or CPU through a BRAM interface. Since this part has no strict throughput limitation and is computationally light, it only needs to complete in the gap between two frames.
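The post-processing step can be sketched as a simple filter-sort-truncate, mirroring the description above (representing the shared memory as a flat list with empty slots is an assumption of this sketch):

```python
def post_process(candidates, k):
    """Post-processing sketch: drop zero entries, sort by response, keep top k.

    `candidates` holds one (x, y, response) entry per block, with None or a
    zero response where a block produced no feature point, mirroring the
    shared-memory layout between the real-time and post-processing modules.
    """
    nonzero = [c for c in candidates if c is not None and c[2] > 0]
    nonzero.sort(key=lambda c: c[2], reverse=True)   # strongest first
    return nonzero[:k]
```

Because this runs once per frame over at most M×N block winners rather than over every pixel, it easily fits in the inter-frame gap.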
In still another embodiment of the present invention, a hardware implementation system for extracting and selecting feature points of a video stream in real time is provided, where the system can be used to implement the above hardware implementation method for extracting and selecting feature points of a video stream in real time, and in particular, the hardware implementation system for extracting and selecting feature points of a video stream in real time includes a conversion module, a calculation module, a judgment module, a correction module, an update module, a processing module, and an implementation module.
The conversion module converts an original image in the video stream into a gray image, downsamples the gray image, integrates every 4 adjacent 8-bit pixel points in the downsampled gray image into one 32-bit pixel unit, and then stores the 32-bit pixel units sequentially in the DDR3 chip;
the calculation module reads one pixel unit stored in the DDR3 chip by the conversion module, constructs a sliding window in the FPGA using a line buffer and a shift register, performs FAST feature judgment on the 18 points in the center of the sliding window, and calculates the scores of the FAST feature points;
the judgment module performs the non-maximum suppression operation, mask shielding judgment and image rectangular area judgment on the 3×3 neighborhood of the FAST feature points using the FAST feature point scores obtained by the calculation module, taking the center points that meet the conditions as candidate feature points for subsequent selection;
the correction module calculates the Harris corner response values of the 4 points in the center of the sliding window and the 14×12 blocks (row, col) in which those 4 points lie using the sliding window obtained in step S2, then performs edge correction on the Harris corner response values of the 4 points: if a feature point is less than 2 pixels from a block boundary, its response value is multiplied by the corresponding penalty coefficient p to obtain the corrected response value;
the update module, based on the response value corrected by the correction module, stores the current feature point to the corresponding position in the shared memory and updates the feature point register if the response value R_curr of the current feature point is greater than or equal to the response value R_col of the corresponding column in the feature point register and the judgment conditions in the judgment module are met;
the processing module continues to read, in order, the next pixel unit obtained by the conversion module, traversing all pixel units of the current image through the calculation module, judgment module, correction module and update module until the initially selected feature point set of the current image is obtained, then releases the storage space of the current image in the DDR3 chip and continues processing the next frame of video stored by the conversion module;
and the implementation module, after the processing module has processed one frame of video image, reads the initially selected feature point set in the shared memory through the FPGA, eliminates the zero points, selects the number of feature points required by the task through sorting, and outputs the feature points of the current image through a BRAM interface, completing the hardware implementation of real-time extraction and selection of video stream feature points.
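The pixel packing performed by the conversion module above can be sketched in software as follows; the byte order within each 32-bit unit (first pixel in the low byte) is an assumption, since it is not specified here:

```python
def pack_pixels(gray):
    """Pack every 4 adjacent 8-bit pixels into one 32-bit word.

    `gray` is a flat list of 8-bit grayscale values whose length is a
    multiple of 4; each group of 4 becomes one 32-bit pixel unit as
    described for the conversion module (byte order is assumed).
    """
    words = []
    for i in range(0, len(gray), 4):
        w = (gray[i] | (gray[i + 1] << 8) |
             (gray[i + 2] << 16) | (gray[i + 3] << 24))
        words.append(w)
    return words

def unpack_word(w):
    """Recover the 4 pixels from one 32-bit pixel unit."""
    return [(w >> (8 * i)) & 0xFF for i in range(4)]
```

Packing 4 pixels per word is what lets the real-time module consume one 32-bit unit, i.e. 4 pixels, per clock cycle.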
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The invention performs a hardware acceleration design for the traditional corner detection method and optimizes a uniform feature point selection scheme around the hardware's critical points. Based on the ICE-BA open-source algorithm framework, it has been successfully deployed on an FPGA; the overall algorithm framework is shown in FIG. 4, and the single-point FPGA resource consumption of the hardware system is shown in Table 1. The method performs well in both speed and accuracy while balancing the consumption of hardware resources.
Table 1 Statistics of hardware resource consumption of the present invention

Name                      Resources used
Number of Slice LUTs      34585
Number of Slice FF        30582
Number of DSP48Es         37
Number of BRAM/FIFO       6.5
In terms of speed, the real-time module realizes a pipeline design with pipeline = 1, achieving real-time processing of the image stream. The post-processing module performs rapid screening in the time gap between two frames, so pipeline operation is not affected and area is saved. In terms of accuracy, repeated experiments verify that the feature point extraction and selection method of the invention improves positioning accuracy by more than 20% compared with the original algorithm. In addition, the algorithm has good scalability: multi-point parallel processing within one clock cycle can be realized, which satisfies the real-time processing requirements of a single high-resolution stream or multiple ordinary high-speed video streams.
Taking mh_05 in the EuRoC dataset as an example, the algorithm was tested using an image stream with a resolution of 752×480. Fig. 5 shows the running results of the algorithm, fig. 7 shows the hardware timing waveforms, and the accuracy evaluation results are shown in fig. 6 and Table 2.
Table 2 comparison of algorithm accuracy before and after modification
The invention realizes real-time processing of the image stream data in every clock cycle, and the average error and root mean square error are improved by more than 30% compared with the algorithm before modification, meeting the practical requirements of a visual odometry hardware system.
In summary, the invention discloses a hardware implementation method and system for extracting and selecting feature points of a video stream in real time: a 4-point parallel pipelined structure raises the processing speed to real time, while the optimized block-based buffering and screening structure reduces the amount of computation and improves the accuracy and speed of feature point selection.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above is only for illustrating the technical idea of the present invention, and the protection scope of the present invention is not limited by this, and any modification made on the basis of the technical scheme according to the technical idea of the present invention falls within the protection scope of the claims of the present invention.

Claims (10)

1. The hardware implementation method for extracting and selecting the video stream feature points in real time is characterized by comprising the following steps:
s1, converting an original image in a video stream into a gray image, downsampling the gray image, and integrating pixel units of the downsampled gray image;
s2, reading one pixel unit obtained in the step S1, constructing a sliding window, performing FAST feature judgment on the central point of the sliding window, and calculating the score of the FAST feature point;
s3, performing non-maximum suppression operation, mask shielding judgment and image rectangular area judgment on the neighborhood of the FAST feature points by using the FAST feature point score obtained in the step S2, and taking the center point meeting the condition as a candidate feature point to participate in subsequent selection;
s4, calculating Harris corner response values of 4 center points of the sliding window and the blocks where the 4 center points are located by using the sliding window obtained in the step S2, and then carrying out edge correction on the Harris corner response values of the 4 center points to obtain corrected response values;
S5, based on the corrected response value in step S4, if the response value R_curr of the current feature point is greater than or equal to the response value R_col of the corresponding column in the feature point register and the judgment condition in step S3 is met, storing the current feature point to the corresponding position in the shared memory and updating the feature point register;
s6, repeating the steps S1 to S5, and traversing all pixel units of the current image until a primary selected feature point set in the current image is obtained;
and S7, after the processing of the video image of the frame in the step S6, reading the initially selected feature point set, screening the number of feature points required by the task, and outputting the number of feature points to finish the hardware implementation of extracting and selecting the feature points of the video stream in real time.
2. The method according to claim 1, wherein in step S1, the 24-bit color information of the original image is converted into 8-bit gray scale information, the storage size of the original image in the video stream is reduced to 1/12 of the original size by the downsampling operation, and every 4 adjacent 8-bit pixel points in the downsampled gray image are integrated into one 32-bit pixel unit and stored sequentially in a DDR3 chip.
3. The method for implementing real-time extraction and selection of video stream feature points according to claim 1, wherein in step S2, a sliding window is constructed by using a line buffer and a shift register, and a Score of a center point of the sliding window is specifically:
wherein V_p is the gray value of the pixel at the center point, and V_x are the pixel values of the 16 pixel points on the circle of radius 3 centered on the center point.
4. The method for implementing real-time extraction and selection of feature points of a video stream according to claim 1, wherein in step S2, performing FAST feature judgment on a center point of a sliding window specifically includes:
constructing a constant matrix for the corner detector according to the threshold t and classifying all pixels into 3 classes; then, taking the center point as the circle center and the 16 pixel points of radius 3 around it, judging non-feature points from the classifications in 8 diagonal pairs, and continuing to judge the points that meet the threshold, wherein the classification standard is as follows:
wherein 2'b01, 2'b00 and 2'b10 are classification labels represented by 2-bit fixed-point numbers, and t is a preset threshold.
5. The method for implementing real-time extraction and selection of video stream feature points according to claim 1, wherein in step S3, performing non-maximum suppression operation, mask judgment and image rectangular region judgment on FAST feature points is specifically:
comparing the scores of the 4 center points with those of the 14 adjacent candidate corner points; when the score Score_p of a center point is greater than or equal to the scores Score_x of all neighboring candidate corner points, the center point is retained, completing the non-maximum suppression operation; and reading the data at the position corresponding to the current pixel for mask shielding judgment, then cropping the peripheral edge of the downsampled gray image and eliminating the feature points of the cropped part.
6. The hardware implementation method for extracting and selecting feature points of a video stream in real time according to claim 1, wherein in step S4, the corner response value R is:
wherein I_x and I_y are the gradients of the image pixels in the x and y directions respectively, W is an image window of size 7×7, k = 0.04, and scale = 1.732873552846874.
7. The method for implementing real-time extraction and selection of video stream feature points according to claim 1, wherein in step S4, edge correction specifically comprises:
dividing the remaining part of the image into M×N blocks of size 14×12 according to the image resolution; each block finally outputs only one optimal feature point as an initially selected feature point, and the response values of feature points within 2 pixels of a block edge are multiplied by a penalty factor p to guide the selection of the initially selected feature points.
8. The method for real-time extraction and selection of video stream feature points according to claim 1, wherein in step S5, a response value register of size N is designed with all initial values 0; when the response value R_curr of the current feature point is greater than or equal to the response value R_col of the corresponding column in the register, the current feature point is stored at the position in the shared memory corresponding to its block, and the response value R_curr is updated into the feature point register; the feature point register is refreshed to all 0s each time all 14 rows of pixels have been detected.
9. The hardware implementation method for extracting and selecting feature points of a video stream in real time according to claim 1, wherein step S6 is specifically:
and continuously reading the next pixel unit in the image stream every clock period, realizing pipeline=1, simultaneously processing 4 pixels at a time, and circulating the steps S2-S5 in a pipeline mode until a primary characteristic point set in the current frame is obtained, and then releasing the storage space of the current image in the DDR3 chip.
10. A hardware implementation system for extracting and selecting feature points of a video stream in real time, comprising:
the conversion module converts an original image in the video stream into a gray image, downsamples the gray image, and integrates pixel units of the downsampled gray image;
the computing module reads one pixel unit obtained by the conversion module, constructs a sliding window, judges the FAST characteristic of the central point of the sliding window and computes the score of the FAST characteristic point;
the judging module is used for carrying out non-maximum value inhibition operation, mask shielding judgment and image rectangular area judgment on the neighborhood of the FAST feature points by using the FAST feature point score obtained by the calculating module, and taking the center point meeting the condition as a candidate feature point to participate in subsequent selection;
the correction module is used for calculating Harris corner response values of 4 center points of the sliding window and the blocks where the 4 center points are located by utilizing the sliding window obtained by the calculation module, and then carrying out edge correction on the Harris corner response values of the 4 center points to obtain corrected response values;
the updating module is used for updating the response value R of the current characteristic point based on the response value corrected by the correction module curr The response value R of the corresponding column in the characteristic point register is larger than or equal to col The judgment conditions in the judgment module are met, the current feature points are stored in the corresponding positions in the shared memory, and the feature point register is updated;
the processing module continues to read the next pixel unit obtained in the conversion module in sequence, and traverses all pixel units of the current image through the calculation module, the judgment module, the correction module and the updating module until a primary selected characteristic point set in the current image is obtained;
and the implementation module is used for reading the initially selected feature point set after the processing module processes a frame of video image, screening the number of feature points required by a task, outputting the feature points and completing the hardware implementation of extracting and selecting the feature points of the video stream in real time.
CN202210284325.1A 2022-03-22 2022-03-22 Hardware implementation method and system for extracting and selecting feature points of video stream in real time Active CN114694063B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210284325.1A CN114694063B (en) 2022-03-22 2022-03-22 Hardware implementation method and system for extracting and selecting feature points of video stream in real time


Publications (2)

Publication Number Publication Date
CN114694063A CN114694063A (en) 2022-07-01
CN114694063B true CN114694063B (en) 2024-04-02

Family

ID=82139367

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210284325.1A Active CN114694063B (en) 2022-03-22 2022-03-22 Hardware implementation method and system for extracting and selecting feature points of video stream in real time

Country Status (1)

Country Link
CN (1) CN114694063B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111583093A (en) * 2020-04-27 2020-08-25 西安交通大学 Hardware implementation method for ORB feature point extraction with good real-time performance

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101984463A (en) * 2010-11-02 2011-03-09 中兴通讯股份有限公司 Method and device for synthesizing panoramic image


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Parallel optimization and hardware implementation of the SURF algorithm; Zhao Chunyang; Zhao Huaici; Journal of Computer-Aided Design & Computer Graphics; 2015-02-15 (02); full text *
FPGA implementation of an adaptive-threshold FAST feature point detection algorithm; Cheng Biao; Huang Lu; Information Technology and Network Security; 2018-10-10 (10); full text *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant