CN114943891A - Prediction frame matching method based on feature descriptors - Google Patents

Prediction frame matching method based on feature descriptors Download PDF

Info

Publication number
CN114943891A
CN114943891A CN202210417188.4A CN202210417188A CN114943891A CN 114943891 A CN114943891 A CN 114943891A CN 202210417188 A CN202210417188 A CN 202210417188A CN 114943891 A CN114943891 A CN 114943891A
Authority
CN
China
Prior art keywords
gradient
matching
descriptor
prediction
prediction frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210417188.4A
Other languages
Chinese (zh)
Inventor
邵巍
李帅
肖扬
王光泽
姚文龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qingdao University of Science and Technology
Original Assignee
Qingdao University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qingdao University of Science and Technology filed Critical Qingdao University of Science and Technology
Priority to CN202210417188.4A priority Critical patent/CN114943891A/en
Publication of CN114943891A publication Critical patent/CN114943891A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a deep-learning prediction frame matching method based on feature descriptors, relating to target-object recognition and matching in the field of image recognition. The method determines the support region of each prediction frame produced by a deep-learning recognition network, computes the gradients of that region to construct a scale-invariant feature descriptor, and uses the resulting feature vector to match the recognition results. Simulation results show that the method is robust to translation, rotation and scale transformations, which is of significance for the development of fields such as target tracking and visual navigation.

Description

Prediction frame matching method based on feature descriptors
Technical Field
The invention belongs to the technology of target-object recognition and matching in the field of image recognition, relates to image recognition and artificial-intelligence-based image feature matching, and particularly relates to a prediction frame matching method based on feature descriptors.
Background
With the development of artificial intelligence, deep learning networks are widely applied to image recognition, and artificial-intelligence technology based on deep learning is one of the development directions of future image recognition. Mining deep image features with a feature extraction network effectively improves the detection rate of target objects; however, a recognition network alone has difficulty completing the matching of target objects, and its robustness is poor under large-scale viewpoint changes.
In view of this, how to design an algorithm capable of matching the recognition results is a problem to be solved urgently by those skilled in the art.
Disclosure of Invention
The invention addresses the technical problem that existing recognition networks have difficulty completing target matching; it combines an artificial-intelligence recognition network with a feature-descriptor matching algorithm and provides a prediction frame matching algorithm based on feature descriptors.
The deep-learning prediction frame matching algorithm based on feature descriptors provided by the invention comprises selection of the prediction Frame Support Range (FSR), generation of the feature descriptors, and matching of the feature descriptors.

First, the image is down-sampled and Gaussian-blurred to construct a scale-space pyramid. The purpose of the scale-space pyramid is to simulate the multi-scale character of the image data; the prediction frame is described at every level of the pyramid, and the minimum circumscribed circle of the prediction frame is selected as the prediction frame support area, so that the pixel information contained in the sampling area does not change after the image is rotated. The prediction frame described in this way is therefore guaranteed to have scale invariance.

Then the main direction of the descriptor is determined from the pixel gradients of the support region. Taking the main direction as the starting point, the support region is divided evenly into eight sector regions (F_1, F_2, ..., F_8), each sector being a sub-region of the FSR. Each sub-region contributes one component vector: the pixels of the sub-region are accumulated, as gray-gradient projection values, into the eight directions from 0° to 315° of the image coordinates, giving 8 × 8 feature components, which are normalized with respect to the main direction. A Gaussian weighting is applied to the gradient of each pixel to reduce the influence of noise at peripheral points on the feature values, and a 64-dimensional vector descriptor is formed. The 64-dimensional vector is then brightness-normalized to reduce the impact caused by illumination variation.

Finally, target matching is completed by comparing the prediction-box descriptors of the two point sets, with descriptor similarity measured by the Euclidean distance. The prediction frames of the previous frame image are traversed in turn, and for each of them the corresponding prediction frame in the next frame image is sought. A Euclidean-distance threshold between descriptors is then set, and whether the prediction frames extracted from the two images are similar is judged against this threshold.
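By way of illustration only, a minimal Python sketch of the support-region selection described above follows; the (x1, y1, x2, y2) box format and the helper name support_region are assumptions made for the example and are not part of the disclosure.

```python
import numpy as np

def support_region(box):
    """Frame Support Range (FSR): the minimum circumscribed circle of a
    prediction box, i.e. the circle whose diameter is the box diagonal and
    whose center is the intersection of the diagonals."""
    x1, y1, x2, y2 = box                                  # assumed corner format
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0             # center of the box
    radius = np.hypot(x2 - x1, y2 - y1) / 2.0             # half of the diagonal
    return cx, cy, radius

# Example: a 40 x 30 prediction box
print(support_region((10, 20, 50, 50)))                   # (30.0, 35.0, 25.0)
```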
Compared with the prior art, the invention has the advantages and positive effects that:
the invention provides a prediction frame matching method based on a feature descriptor, and provides a solution to the problems that the matching of a target object is difficult to complete by an identification network and the robustness is not high under the large-scale visual angle transformation.
The method provided by the invention is invariant to scaling, translation, rotation and illumination changes of the image, and is independent of the size of the prediction frame.
Drawings
Various aspects of the invention will become more apparent to the reader upon reading the detailed description with reference to the accompanying drawings, in which:
fig. 1 shows an overall flow chart of the algorithm of the present invention.
Fig. 2 is a schematic view of the support area.
Fig. 3 shows the relationship between the number of support region partitions and the correct matching rate.
Fig. 4 is a schematic diagram of support area division.
FIG. 5 is a diagram illustrating the weight of the support region of the prediction box.
Fig. 6 is a feature descriptor gradient histogram.
Fig. 7 shows the matching result under scale transformation.
Fig. 8 shows the matching result under rotation transformation.
Fig. 9 shows the matching result under translation transformation.
Detailed Description
Fig. 1 shows a general flow chart of the algorithm of the present invention. The embodiments of the present invention will be described in further detail with reference to the drawings.
Referring to fig. 1, the support area is determined first: the image is down-sampled and Gaussian-blurred to construct a scale-space pyramid. The purpose of the scale-space pyramid is to simulate the multi-scale character of the image data and to describe the prediction frame at every level of the pyramid, so that the described prediction frame has scale invariance. As shown in fig. 2, the minimum circumscribed circle of the prediction box is taken as the prediction box support area: when the corresponding region undergoes a slight change of proportion, each fan-shaped sub-region retains more of its original pixels than a rectangular one would, while the area of the region stays the same. In the feature descriptor generation process, the information inside the prediction frame is first blurred to reduce the influence of noise on matching; to avoid destroying local information, Gaussian filtering is adopted instead of mean filtering. The Gaussian function of the two-dimensional normal distribution, which is also used later when weighting the pixels, is:
$$G(x, y) = \frac{1}{2\pi\sigma^{2}} \exp\!\left(-\frac{x^{2}+y^{2}}{2\sigma^{2}}\right) \qquad (1)$$
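As a sketch of how the scale-space pyramid and the Gaussian weighting of formula (1) might be realised (OpenCV/NumPy, with an assumed sigma value of 1.6; not part of the original disclosure):

```python
import cv2
import numpy as np

def build_scale_pyramid(img, levels=4, sigma=1.6):
    """Scale-space pyramid: Gaussian blur followed by 2x down-sampling per level."""
    pyramid = [cv2.GaussianBlur(img, (0, 0), sigma)]
    for _ in range(1, levels):
        prev = pyramid[-1]
        down = cv2.resize(prev, (prev.shape[1] // 2, prev.shape[0] // 2),
                          interpolation=cv2.INTER_AREA)
        pyramid.append(cv2.GaussianBlur(down, (0, 0), sigma))
    return pyramid

def gaussian_weight(dx, dy, sigma):
    """Two-dimensional Gaussian of formula (1), evaluated at an offset (dx, dy)
    from the support-region center; used later to weight pixel gradients."""
    return np.exp(-(dx ** 2 + dy ** 2) / (2.0 * sigma ** 2)) / (2.0 * np.pi * sigma ** 2)
```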
On each level of the pyramid, the gradient and direction distribution of the pixels inside the prediction frame support area are collected, and each prediction box is assigned a reference direction from the support-region pixel gradients. To achieve complete rotational invariance, when the gradient and direction of a point are calculated, an adaptive adjustment is made according to the position of that point. For a bivariate function f(x, y), the horizontal gradient g_h(x_0, y_0) and the vertical gradient g_v(x_0, y_0) at a point (x_0, y_0) can be expressed as:
$$g_h(x_0, y_0) = f(x_0 + 1,\, y_0) - f(x_0 - 1,\, y_0) \qquad (2)$$
$$g_v(x_0, y_0) = f(x_0,\, y_0 + 1) - f(x_0,\, y_0 - 1) \qquad (3)$$
At a pixel point (x_0, y_0) of the image, the gradient magnitude m(x, y) and direction α(x, y) are:
$$m(x, y) = \sqrt{g_h(x_0, y_0)^{2} + g_v(x_0, y_0)^{2}} \qquad (4)$$
$$\alpha(x, y) = \arctan\!\left(\frac{g_v(x_0, y_0)}{g_h(x_0, y_0)}\right) \qquad (5)$$
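A minimal NumPy sketch of formulas (2)-(5), computing per-pixel gradients, magnitudes and directions for a grayscale array (an assumption for the example; border pixels are simply left at zero):

```python
import numpy as np

def pixel_gradients(img):
    """Central-difference gradients g_h, g_v and the resulting magnitude
    m(x, y) and direction alpha(x, y) in degrees, in [0, 360)."""
    img = img.astype(np.float64)
    gh = np.zeros_like(img)
    gv = np.zeros_like(img)
    gh[:, 1:-1] = img[:, 2:] - img[:, :-2]      # f(x0+1, y0) - f(x0-1, y0)
    gv[1:-1, :] = img[2:, :] - img[:-2, :]      # f(x0, y0+1) - f(x0, y0-1)
    magnitude = np.hypot(gh, gv)                # formula (4)
    direction = np.degrees(np.arctan2(gv, gh)) % 360.0   # formula (5), full circle
    return magnitude, direction
```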
After the gradient calculation is completed for all pixel points, the gradient magnitudes are summed over the divided gradient direction intervals to construct a gradient histogram. If the division of gradient directions is too fine, the computation for constructing the descriptor increases and the matching becomes more sensitive to noise; conversely, if the division is too coarse, image detail features are lost and the matching effect deteriorates. Fig. 3 shows the relationship between the number of direction intervals and the matching accuracy. When the gradient directions are divided into one interval every 45 degrees, i.e. eight intervals in total, a good matching effect is obtained at a small computational cost. Fig. 4 is a schematic diagram of the support area division in the case of eight intervals.
For the ith directional interval, the gradient magnitude summation can be expressed as:
$$M_i = \sum_{\alpha(x, y) \in \Theta_i} w(x, y)\, m(x, y) \qquad (6)$$
where the summation runs over the support-region pixels whose gradient direction falls in the i-th interval Θ_i, and w(x, y) is a weight coefficient that governs the contribution of the pixel gradient magnitude to interval i; its calculation is given in formula (7). When the image rotates, the information at the edge of the sensing area changes accordingly, whereas pixels close to the central area remain relatively stable. A corresponding weight coefficient is therefore assigned to each pixel, according to its position, when the gradient magnitudes are summed.
$$w(x, y) = \frac{G(x - x_c,\, y - y_c)}{\sum_{(u, v) \in \mathrm{FSR}} G(u - x_c,\, v - y_c)} \qquad (7)$$

where (x_c, y_c) is the center of the support region.
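The weighted eight-bin histogram of formulas (6) and (7) might be sketched as follows; this is a simplified version in which the Gaussian weights are evaluated per pixel over the circular support region rather than per 4 × 4 grid cell, and the choice of sigma as half the radius is an assumption.

```python
import numpy as np

def gradient_histogram(magnitude, direction, cx, cy, radius, sigma=None):
    """Sum of Gaussian-weighted gradient magnitudes M_i in eight 45-degree
    direction intervals, restricted to the circular support region (FSR)."""
    if sigma is None:
        sigma = radius / 2.0                              # assumed value
    h, w = magnitude.shape
    ys, xs = np.mgrid[0:h, 0:w]
    dist2 = (xs - cx) ** 2 + (ys - cy) ** 2
    inside = dist2 <= radius ** 2                         # pixels of the FSR
    weight = np.exp(-dist2 / (2.0 * sigma ** 2))
    weight /= weight[inside].sum()                        # normalized Gaussian weights
    bins = (direction // 45).astype(int) % 8              # eight direction intervals
    hist = np.zeros(8)
    for i in range(8):
        sel = inside & (bins == i)
        hist[i] = np.sum(weight[sel] * magnitude[sel])    # M_i = sum of w * m
    return hist
```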
For a rectangular sensing region, the region is divided into a 4 × 4 grid, and the weight coefficient of each cell is obtained from the Gaussian function shown in formula (1); the result is shown in fig. 5. Further, combining the circumscribed-circle approximation of the sensing region from step one, the grid weight coefficients are adjusted according to the coverage relation between the circle and the cells of the rectangular grid; the circumscribed-circle support region is shown in fig. 2. The sum of the gradient magnitudes in each direction interval is then calculated with these grid weight coefficients, and the gradient histogram is constructed, in which the horizontal axis is the eight divided gradient direction intervals and the vertical axis is the sum of the gradient magnitudes in the corresponding interval. The direction interval with the largest sum of gradient magnitudes is selected as the main direction interval, and the descriptor is constructed by traversing all intervals counterclockwise starting from that interval. For the histogram shown in fig. 6, in which the k-th bin is the main direction bin, the corresponding descriptor can be expressed as:
$$D = (M_k,\, M_{k+1},\, \ldots,\, M_8,\, M_1,\, \ldots,\, M_{k-1}) \qquad (8)$$

When the image scale changes, the total number of pixels changes, and when the image brightness changes, the pixel values change as a whole, but the proportions between the sums of gradient magnitudes in the different direction intervals remain relatively stable. The descriptor D is therefore normalized. Let the maximum value in D be M_max and the minimum value be M_min; the normalized descriptor D' can then be expressed as:

$$D' = \frac{D - M_{\min}}{M_{\max} - M_{\min}} \qquad (9)$$
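A sketch of the descriptor assembly of formulas (8) and (9): the histogram is cyclically shifted so that the main-direction bin comes first (the counterclockwise traversal of the intervals), then min-max normalized. It follows the eight-bin histogram of the detailed description and is not part of the original disclosure.

```python
import numpy as np

def descriptor_from_histogram(hist):
    """Start the descriptor at the main-direction bin (largest sum), traverse
    the remaining bins cyclically, then apply min-max normalization."""
    k = int(np.argmax(hist))                    # main direction interval k
    d = np.roll(hist, -k)                       # D = (M_k, ..., M_8, M_1, ..., M_{k-1})
    d_min, d_max = d.min(), d.max()
    if d_max > d_min:
        d = (d - d_min) / (d_max - d_min)       # D' = (D - M_min) / (M_max - M_min)
    return d
```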
Finally, the prediction frames in the two images are matched. A set of prediction box descriptors is established from the feature descriptors, and target matching is completed by comparing the descriptors of the two point sets, with descriptor similarity measured by the Euclidean distance. The prediction frames of the previous frame image are traversed in turn, and for each of them the corresponding prediction frame in the next frame image is sought. Because the vector magnitudes of different descriptors differ, it is difficult to set a Euclidean-distance threshold of general validity, so the matching quality is measured by a relative distance instead: for a descriptor D_i^A of the previous image, let its best match in the next image be D_j^B and its second-best match be D_j'^B; the match is judged to hold when

$$\frac{d(D_i^A,\, D_j^B)}{d(D_i^A,\, D_{j'}^B)} < T \qquad (10)$$

where d(·, ·) denotes the Euclidean distance between descriptors and T is the threshold.
Whether the prediction frames extracted from the two images are similar is thus judged against this threshold, and the nearest neighbor ratio criterion suppresses mismatches. The matching results are shown in figures 7-9. Under translation the correct matching rate is 83.05% with a mismatching rate of 1.69%; under rotation the correct matching rate is 81.36% with a mismatching rate of 6.78%; under scale change the correct matching rate is 89.25% with a mismatching rate of 2.94%.
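A minimal sketch of the matching stage, using the Euclidean distance and the nearest neighbor ratio test described above (the ratio threshold of 0.8 is an assumed value, not taken from the disclosure):

```python
import numpy as np

def match_descriptors(descs_prev, descs_next, ratio_thresh=0.8):
    """For each descriptor of the previous frame, find its nearest neighbor in
    the next frame and keep the match only when the best distance is clearly
    smaller than the second-best one (nearest neighbor ratio test)."""
    matches = []
    for i, d in enumerate(descs_prev):
        dists = np.array([np.linalg.norm(d - e) for e in descs_next])
        if dists.size < 2:
            continue
        order = np.argsort(dists)
        best, second = dists[order[0]], dists[order[1]]
        if second > 0 and best / second < ratio_thresh:
            matches.append((i, int(order[0])))
    return matches
```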
In conclusion, the method performs prediction frame matching on the image identified by the deep learning network, and the final result shows that the prediction frame matching method based on the feature descriptor realizes prediction frame matching and has important significance for target identification, target tracking, visual navigation and the like.
The above description concerns only preferred embodiments of the present invention and is not intended to limit the invention to these forms; any person skilled in the art may make modifications or changes to obtain equivalent embodiments, or apply the invention in other fields, without departing from its technical scope.

Claims (4)

1. A deep learning prediction box matching method based on feature descriptors is characterized by comprising the following steps:
step A, selecting a prediction frame support area;
taking the circular area whose diameter is the diagonal of the prediction frame and whose center is the intersection point of the diagonals as the prediction Frame Support Range (FSR), wherein the FSR contains all pixel information inside the prediction frame;
step B, generating a feature descriptor for the extracted prediction frame, wherein the step B comprises the following steps;
b1, constructing a frame descriptor within the support region;
and C, performing prediction frame matching on the generated feature descriptors.
2. The support area determination method according to claim 1, characterized in that:
and A1, taking the minimum circumscribed circle of the prediction box as the prediction box support area, wherein, when the corresponding region rotates, each fan-shaped sub-region retains a larger proportion of its initial pixels than a rectangular region would, while the area of the region stays the same;
a2, adopting Gaussian weighting to reduce the influence of the peripheral sub-regions, so that the descriptor has scaling invariance; under the action of the Gaussian pyramid the descriptor also has scale invariance;
3. A descriptor construction method according to claim 2, characterized in that:
b11, a method for constructing the descriptor in a circular support area using sector-area division; when the descriptor is generated from the information in the prediction frame, blurring is first applied to reduce the influence of noise on matching, and Gaussian filtering is adopted instead of mean filtering so as to avoid destroying local information; the Gaussian function of the two-dimensional normal distribution, used later in weighting the pixels, is:
$$G(x, y) = \frac{1}{2\pi\sigma^{2}} \exp\!\left(-\frac{x^{2}+y^{2}}{2\sigma^{2}}\right)$$
collecting, on each level of the pyramid, the gradient and direction distribution of the pixels in the prediction frame support area, and taking the direction with the highest accumulated support-region pixel gradient as the main direction of the prediction box; to achieve complete rotational invariance, when the gradient and direction of a point are calculated, an adaptive adjustment is made according to the position of that point; for a bivariate function f(x, y), the horizontal gradient g_h(x_0, y_0) and the vertical gradient g_v(x_0, y_0) at a point (x_0, y_0) can be expressed as:
$$g_h(x_0, y_0) = f(x_0 + 1,\, y_0) - f(x_0 - 1,\, y_0)$$
$$g_v(x_0, y_0) = f(x_0,\, y_0 + 1) - f(x_0,\, y_0 - 1)$$
at a pixel point (x_0, y_0) of the image, the gradient magnitude m(x, y) and direction α(x, y) are:
$$m(x, y) = \sqrt{g_h(x_0, y_0)^{2} + g_v(x_0, y_0)^{2}}$$
$$\alpha(x, y) = \arctan\!\left(\frac{g_v(x_0, y_0)}{g_h(x_0, y_0)}\right)$$
respectively calculating the horizontal gradient g_h and the vertical gradient g_v of each pixel point, and then performing gradient vector synthesis according to the formulas in B11 to obtain the gradient direction and magnitude of the pixel point.
And B12, after the gradient calculation is completed for all pixel points, the gradient magnitudes are summed over the divided gradient direction intervals to construct a gradient histogram; if the division of gradient directions is too fine, the computation for constructing the descriptor increases and the matching becomes more sensitive to noise, whereas if the division is too coarse, image detail features are lost and the matching effect deteriorates; experiments show that dividing the gradient directions into one interval every 45 degrees, i.e. eight intervals in total, gives a good matching effect at a small computational cost.
For the ith directional interval, the gradient magnitude summation can be expressed as:
$$M_i = \sum_{\alpha(x, y) \in \Theta_i} w(x, y)\, m(x, y)$$
wherein w(x, y) is a weight coefficient that governs the contribution of the pixel gradient magnitude to interval i and is obtained by normalizing a Gaussian function; when the image is rotated, the weight coefficient causes the gradient magnitudes of the central area to dominate the sum, so that the descriptor remains stable.
$$w(x, y) = \frac{G(x - x_c,\, y - y_c)}{\sum_{(u, v) \in \mathrm{FSR}} G(u - x_c,\, v - y_c)}$$
4. The matching method according to claim 2, characterized in that, in step C, matching according to the descriptors comprises the following steps:
c1, establishing a prediction box descriptor set through the feature descriptors;
the similarity measure of the descriptor with 128 dimensions, C2, is represented by the euclidean distance. And circularly acquiring the prediction frames from the previous frame image, and matching the corresponding prediction frame in the next frame aiming at each prediction frame in the previous frame image. Secondly, setting Euclidean distance threshold values among descriptors, considering the difference of vector lengths among different descriptors, setting the threshold values with generality is difficult, and here, the matching effect is measured by using relative distance. For descriptor
Figure FDA0003605290670000023
When matching, if the best matching item is
Figure FDA0003605290670000024
Then judgeThe matching conditions are as follows:
Figure FDA0003605290670000025
CN202210417188.4A 2022-04-20 2022-04-20 Prediction frame matching method based on feature descriptors Pending CN114943891A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210417188.4A CN114943891A (en) 2022-04-20 2022-04-20 Prediction frame matching method based on feature descriptors

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210417188.4A CN114943891A (en) 2022-04-20 2022-04-20 Prediction frame matching method based on feature descriptors

Publications (1)

Publication Number Publication Date
CN114943891A true CN114943891A (en) 2022-08-26

Family

ID=82906316

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210417188.4A Pending CN114943891A (en) 2022-04-20 2022-04-20 Prediction frame matching method based on feature descriptors

Country Status (1)

Country Link
CN (1) CN114943891A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116229318A (en) * 2023-02-24 2023-06-06 云贵亮 Information analysis system based on branch data
CN116229318B (en) * 2023-02-24 2023-09-22 湖北联投咨询管理有限公司 Information analysis system based on branch data

Similar Documents

Publication Publication Date Title
CN113592845A (en) Defect detection method and device for battery coating and storage medium
CN110334762B (en) Feature matching method based on quad tree combined with ORB and SIFT
CN107945111B (en) Image stitching method based on SURF (speeded up robust features) feature extraction and CS-LBP (local binary Pattern) descriptor
CN111145228B (en) Heterologous image registration method based on fusion of local contour points and shape features
Ma et al. Region-of-interest detection via superpixel-to-pixel saliency analysis for remote sensing image
CN108304883A (en) Based on the SAR image matching process for improving SIFT
CN112150520A (en) Image registration method based on feature points
CN110309808B (en) Self-adaptive smoke root node detection method in large-scale space
CN111369605B (en) Infrared and visible light image registration method and system based on edge features
Musci et al. Assessment of binary coding techniques for texture characterization in remote sensing imagery
CN113392856B (en) Image forgery detection device and method
Yang et al. Robust semantic template matching using a superpixel region binary descriptor
CN108550165A (en) A kind of image matching method based on local invariant feature
CN110929598B (en) Unmanned aerial vehicle-mounted SAR image matching method based on contour features
Li et al. Deformable dictionary learning for SAR image change detection
CN111199245A (en) Rape pest identification method
del-Blanco et al. Robust people indoor localization with omnidirectional cameras using a grid of spatial-aware classifiers
CN103413312A (en) Video target tracking method based on neighborhood components analysis and scale space theory
CN114359591A (en) Self-adaptive image matching algorithm with edge features fused
CN112907580A (en) Image feature extraction and matching algorithm applied to comprehensive point-line features in weak texture scene
CN114943891A (en) Prediction frame matching method based on feature descriptors
CN111105436B (en) Target tracking method, computer device and storage medium
CN104268550A (en) Feature extraction method and device
CN116129191B (en) Multi-target intelligent identification and fine classification method based on remote sensing AI
CN112101283A (en) Intelligent identification method and system for traffic signs

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination