Image feature processing method and device for enhanced presentation of dynamic images
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to an image feature processing method and an image feature processing device for use in enhanced presentation of a dynamic image.
Background
Augmented reality (AR) is a rendering technology that combines computer graphics and computer vision: it detects and matches image features and maps a three-dimensional model, together with the virtual and the real space, onto the image.
As an extension of the way human beings observe things in the real world, augmented reality enhances the acquisition of digital information about objects (including text, images, three-dimensional models, voice and the like) and enables dynamic interaction with virtual objects through simulation. It is therefore widely applied in media and entertainment, education and medical training, virtual surgical simulation, industrial manufacturing assistance, intelligent driving guidance and many other fields, bringing the convenience of intelligent digital interaction to human production and daily life.
It is well known that augmented reality requires acquiring and extracting valid features from a template image. The limited acquisition efficiency and resolution of hardware devices directly affect image recognition and the quality of the enhanced model presentation. In addition, when augmented reality is used in practical applications, the influence of various factors in a complex environment on the recognition accuracy of the template image must be overcome. Therefore, how to detect template image features quickly and accurately, perform spatial positioning and virtual-real occlusion interaction, extract the natural features of key regions, and overcome the influence of environmental factors on the augmented presentation has become a research hotspot in augmented reality.
Researchers at home and abroad have studied these problems in depth. Traditional augmented reality template image feature extraction mainly comprises simple marker feature extraction and natural feature extraction. Simple marker feature extraction, being crude in form, can hardly provide a good user experience in practical application scenarios and is gradually being replaced by natural feature extraction. However, although natural feature extraction has gradually become the leading approach in augmented reality applications, its algorithmic complexity is high, it cannot meet the real-time requirement of feature matching in augmented reality applications, and it is to a certain extent affected by image noise and transformations.
Therefore, it is necessary to provide an image feature processing method that reduces the algorithmic difficulty of extracting features from the augmented reality template image, fully meets the real-time requirement of feature extraction and matching, and is, to a certain extent, unaffected by image noise and transformations.
Disclosure of Invention
The technical problem to be solved by the embodiments of the present invention is to provide an image feature processing method and apparatus for use in enhanced presentation of a dynamic image, which can reduce the algorithmic difficulty of extracting features from the augmented reality template image, fully meet the real-time requirement of feature extraction and matching, and are, to a certain extent, unaffected by image noise and transformations.
In order to solve the above technical problem, an embodiment of the present invention provides an image feature processing method for use in dynamic image enhanced presentation, where the method includes:
acquiring a template image and an identification image;
detecting feature points of the template image and of the identification image respectively according to a preset FAST algorithm;
performing smoothing and blurring on the template image and the identification image, both containing feature key points, according to a preset BRIEF method, so that the feature points of both images can be represented by binary codes;
and, based on the binary-coded feature points of the template image and the identification image, calculating, by means of the Hamming distance, a plurality of feature points whose degree of matching between the template image and the identification image meets a preset condition, and outputting these feature points.
The specific steps of respectively detecting the feature points of the template image and the feature points of the identification image according to a preset FAST algorithm comprise:
reading a current target image; wherein the current target image is the template image or the identification image;
determining, in the current target image, a circle whose center is an arbitrary pixel P and whose radius lies within a range of 4 pixels, selecting the gray values of 16 pixels on the determined circle, and comparing the gray values of the 16 selected pixels against a preset gray threshold range;
and if the gray values of more than 8 consecutive pixels on the circle are all greater than or all less than the gray value of pixel P, selecting pixel P as a key point of the current target image.
The specific steps of performing smoothing and blurring on both the template image and the identification image containing feature points according to a preset BRIEF method, so that the feature points of both images can be represented by binary codes, comprise:
reading a current target image; the current target image is a template image containing characteristic points or an identification image containing the characteristic points;
taking a feature point extracted from the current target image as the center, selecting a window of a certain size, randomly selecting N pairs of pixels within the selected window, and comparing the pixel values of each selected pair according to formula (1), so that the feature points in the current target image can be represented by binary codes:
where P(x1) is the pixel value of the random point x1 = (u1, v1); P(x2) is the pixel value of the random point x2 = (u2, v2); ui is the horizontal coordinate of pixel i in the current target image; and vi is the vertical coordinate of pixel i in the current target image.
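Formula (1) itself is not reproduced in this text. A hedged reconstruction, based on the classical BRIEF binary test and consistent with the symbols defined above, is:

```latex
% Hedged reconstruction of formula (1): the classical BRIEF binary test,
% using the symbols P(x_1), P(x_2) with x_i = (u_i, v_i) defined above.
\tau\bigl(P;\, x_1, x_2\bigr) =
\begin{cases}
  1, & P(x_1) < P(x_2), \\
  0, & P(x_1) \ge P(x_2),
\end{cases}
\qquad
f_N(P) = \sum_{k=1}^{N} 2^{\,k-1}\,\tau\bigl(P;\, x_{1,k}, x_{2,k}\bigr).
```

With N = 256 point pairs, this yields the 256-bit binary code referred to below.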
Wherein the binary code is a 256-bit binary code.
The specific steps of calculating, by means of the Hamming distance and based on the binary-coded feature points of the template image and the identification image, a plurality of feature points whose degree of matching between the template image and the identification image meets a preset condition, and outputting these feature points, comprise:
calculating the Hamming distance between the feature point of the template image and the distance center point of the identification image according to formula (2):
where Pn is a feature point of the template image, and Pc,i is the distance center point of the identification image;
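Formula (2) is likewise not reproduced here. Assuming the 256-bit feature codes described above, the Hamming distance between Pn and Pc,i is conventionally written as follows (a hedged reconstruction, not a formula copied from the original text):

```latex
% Hedged reconstruction of formula (2): Hamming distance between the binary
% feature codes of P_n (template image) and P_{c,i} (identification image).
D\bigl(P_n,\, P_{c,i}\bigr) = \sum_{k=1}^{256} \Bigl( b_k(P_n) \oplus b_k(P_{c,i}) \Bigr).
```

Here b_k(·) denotes the k-th bit of a feature code and ⊕ is the exclusive-or operation.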
if the calculated Hamming distance is less than or equal to a preset threshold, taking the distance center point of the identification image as a cluster center point, and further computing the Hamming distance against the other feature points under that cluster center point to obtain the best matching result;
and if the calculated Hamming distance is greater than the preset threshold, ending the matching, and recording and outputting the obtained matching result.
The embodiment of the present invention further provides an image feature processing apparatus for enhanced presentation of dynamic images, including:
an image acquisition unit for acquiring a template image and an identification image;
the image feature extraction unit is used for respectively detecting the feature points of the template image and the identification image according to a preset FAST algorithm;
the image feature processing unit is used for performing smoothing and blurring on the template image and the identification image, both containing feature key points, according to a preset BRIEF method, so that the feature points of both images can be represented by binary codes;
and the image feature matching unit is used for calculating, by means of the Hamming distance, a plurality of feature points whose degree of matching between the template image and the identification image meets a preset condition, and outputting these feature points.
Wherein the image feature extraction unit includes:
the first image reading module is used for reading a current target image; wherein the current target image is the template image or the identification image;
the pixel-point comparison module is used for determining, in the current target image, a circle whose center is an arbitrary pixel P and whose radius lies within a range of 4 pixels, selecting the gray values of 16 pixels on the determined circle, and comparing the gray values of the 16 selected pixels against a preset gray threshold range;
and the feature key point extraction module is used for selecting pixel P as a key point of the current target image if the gray values of more than 8 consecutive pixels on the circle are all greater than or all less than the gray value of pixel P.
Wherein the image feature processing unit includes:
the second image reading module is used for reading the current target image; the current target image is a template image containing characteristic points or an identification image containing the characteristic points;
the feature point processing module is used for selecting a window of a certain size centered on a feature point extracted from the current target image, randomly selecting N pairs of pixels within the selected window, and comparing the pixel values of each selected pair according to formula (1), so that the feature points in the current target image can be represented by binary codes:
where P(x1) is the pixel value of the random point x1 = (u1, v1); P(x2) is the pixel value of the random point x2 = (u2, v2); ui is the horizontal coordinate of pixel i in the current target image; and vi is the vertical coordinate of pixel i in the current target image.
Wherein the image feature matching unit includes:
a Hamming distance calculation module, configured to calculate, according to formula (2), the Hamming distance between a feature point of the template image and the distance center point of the identification image:
where Pn is a feature point of the template image, and Pc,i is the distance center point of the identification image;
the feature matching module is used for taking the distance center point of the identification image as a cluster center point if the calculated Hamming distance is less than or equal to a preset threshold, and further performing the matching calculation against the other feature points under that cluster center point to obtain the best matching result;
and the matching result output module is used for ending the matching if the calculated Hamming distance is greater than the preset threshold, and recording and outputting the obtained matching result.
The embodiment of the invention has the following beneficial effects:
the invention is based on FAST (Features from Accelerated Segments Test) feature point detection and BRIEF (Binary Robust Independent basic Features) feature description vector creation algorithm, ensures the rotation invariance of feature points, obviously optimizes and improves the efficiency and accuracy of feature detection, greatly accelerates the speed of feature descriptor creation, and avoids the influence of high-frequency noise points of acquisition equipment environment factors and images on Binary descriptor over sensitivity, thereby reducing the algorithm difficulty of augmented reality template image feature extraction, fully meeting the real-time requirement of feature extraction and matching, and not being influenced by image environment noise points and transformation to a certain extent.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention, and other drawings derived from them by those skilled in the art without inventive effort also fall within the scope of the present invention.
FIG. 1 is a flowchart of an image feature processing method for use in a dynamic image enhanced presentation according to an embodiment of the present invention;
fig. 2 is a schematic diagram of a detection result of a FAST feature point in an application scene of an image feature processing method for use in enhanced presentation of a dynamic image according to an embodiment of the present invention;
fig. 3 is a schematic diagram of a feature matching result at different positions of an image in an application scene of an image feature processing method for use in enhanced presentation of a dynamic image according to an embodiment of the present invention;
fig. 4 is a schematic diagram of a feature matching result in an actual augmented reality environment in an application scene of an image feature processing method for use in dynamic image augmented presentation according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of an image feature processing apparatus for use in enhanced presentation of dynamic images according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings.
As shown in fig. 1, in an embodiment of the present invention, there is provided an image feature processing method for use in a dynamic image enhanced presentation, where the method includes the following steps:
step S1, acquiring a template image and an identification image;
step S2, detecting feature points of the template image and of the identification image respectively according to a preset FAST algorithm;
step S3, performing smoothing and blurring on the template image and the identification image, both containing feature key points, according to a preset BRIEF method, so that the feature points of both images can be represented by binary codes;
and step S4, based on the binary-coded feature points of the template image and the identification image, calculating, by means of the Hamming distance, a plurality of feature points whose degree of matching between the template image and the identification image meets a preset condition, and outputting these feature points.
Specifically, in step S1, a template image and a recognition image are acquired.
In step S2, the current target image is read first, the current target image being the template image or the identification image. Next, a circle whose center is an arbitrary pixel P and whose radius lies within a range of 4 pixels is determined in the current target image, the gray values of 16 pixels on the determined circle are selected, and the gray values of the 16 selected pixels are compared against a preset gray threshold range. Finally, if the gray values of more than 8 consecutive pixels on the circle are all greater than or all less than the gray value of pixel P, pixel P is selected as a key point of the current target image.
Experiments have shown that comparing only four equally spaced pixels on the circle has the same effect as traversing all 16 pixels, while shortening the search time by a factor of 4, as shown in fig. 2.
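As an illustration only, the following Python sketch implements the segment test described above, with the four equally spaced pixels used as a quick rejection test before the full traversal, which is one common way to realize the speed-up just mentioned. The circle offsets, the threshold value t, and the function name are assumptions made for this sketch and are not taken from the original specification.

```python
import numpy as np

# 16 offsets on a Bresenham-style circle around the candidate pixel P
# (a common choice; the exact radius is an assumption, not specified here).
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]

def is_fast_keypoint(gray, y, x, t=20, n=9):
    """Segment test: P = gray[y, x] is a key point if more than 8 consecutive
    circle pixels are all brighter than P + t or all darker than P - t.
    Assumes (y, x) lies at least 3 pixels away from the image border."""
    p = int(gray[y, x])
    ring = [int(gray[y + dy, x + dx]) for dx, dy in CIRCLE]

    # Quick rejection using the four equally spaced pixels (positions 1, 5, 9, 13):
    # at least three of them must be clearly brighter or clearly darker than P.
    quick = [ring[i] for i in (0, 4, 8, 12)]
    if sum(v > p + t for v in quick) < 3 and sum(v < p - t for v in quick) < 3:
        return False

    # Full test: look for a run of n consecutive brighter or darker pixels
    # (the ring is circular, so it is scanned twice).
    doubled = ring + ring
    for sign in (+1, -1):
        run = 0
        for v in doubled:
            run = run + 1 if sign * (v - p) > t else 0
            if run >= n:
                return True
    return False
```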
In step S3, since the target template image may change in real time during the augmented reality presentation, especially in augmented reality applications based on mobile camera capture devices, feature extraction must be invariant to changes such as orientation and size, thereby enhancing the robustness of image feature recognition. On the basis of FAST, rotation and scale invariance are obtained as follows: an image pyramid is first constructed for the given template image to ensure a multi-scale resolution representation of the single image; key points are then extracted at the different pyramid levels, and the intensity centroid is calculated within a box centered on each key point; the orientation of a key point is the vector from the key point to the intensity centroid.
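For reference, the intensity-centroid orientation just described is conventionally written with image moments; the following is a hedged sketch of that standard formulation rather than a formula reproduced from this specification:

```latex
% Moments of the patch B centred on the key point, centroid C,
% and orientation theta (standard intensity-centroid formulation).
m_{pq} = \sum_{(x,\,y) \in B} x^{p} y^{q}\, I(x, y), \qquad
C = \left( \frac{m_{10}}{m_{00}},\ \frac{m_{01}}{m_{00}} \right), \qquad
\theta = \operatorname{atan2}\bigl(m_{01},\, m_{10}\bigr).
```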
Therefore, when applying the BRIEF method, a Gaussian kernel is used to smooth and blur the input template image, which prevents the binary descriptor from being overly sensitive to environmental factors of the acquisition device and to high-frequency image noise. Compared with the traditional approach of describing feature points with a regional gray-level histogram, this greatly accelerates the creation of feature descriptors, so that augmented reality feature matching can be performed even on mobile terminal devices with very limited computing resources.
At this point, the current target image is read first; the current target image is the template image or the identification image, each containing feature points. Then, taking each feature point extracted from the current target image as the center, a window of a certain size (for example S) is selected, N pairs of pixels are randomly selected within the selected window, and the pixel values of each selected pair are compared according to formula (1), so that the feature points in the current target image can be represented by binary codes:
where P(x1) is the pixel value of the random point x1 = (u1, v1); P(x2) is the pixel value of the random point x2 = (u2, v2); ui is the horizontal coordinate of pixel i in the current target image; and vi is the vertical coordinate of pixel i in the current target image.
Through this feature extraction algorithm, a 256-bit binary feature description code is obtained for each feature point of the template image and the identification image, so that two images with similar or overlapping parts can be registered.
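The following Python sketch illustrates, under stated assumptions, how such a 256-bit description code can be produced: the patch around a key point is first blurred with a Gaussian kernel and then subjected to N = 256 random point-pair comparisons. The window size S, the blur parameter sigma, and the sampling scheme are illustrative assumptions, not values specified in the text.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(0)
N, S = 256, 31                                   # number of point pairs, window size (assumed)
# One fixed set of random point pairs inside the S x S window, reused for every key point.
PAIRS = rng.integers(-(S // 2), S // 2 + 1, size=(N, 2, 2))

def brief_descriptor(gray, keypoint, sigma=2.0):
    """Return a 256-element 0/1 vector describing the patch around keypoint = (y, x).
    Assumes the key point lies at least S // 2 pixels from the image border."""
    y, x = keypoint
    half = S // 2
    patch = gray[y - half:y + half + 1, x - half:x + half + 1].astype(np.float32)
    patch = gaussian_filter(patch, sigma)        # smoothing against high-frequency noise
    desc = np.zeros(N, dtype=np.uint8)
    for k, ((u1, v1), (u2, v2)) in enumerate(PAIRS):
        # Binary test in the spirit of formula (1): compare the pixel values of the pair.
        desc[k] = 1 if patch[half + v1, half + u1] < patch[half + v2, half + u2] else 0
    return desc
```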
In step S4, feature registration is generally decided by the Hamming distance, namely: (1) if the number of identical elements at corresponding positions of two feature codes is less than 128, the two feature points are not a match; (2) the feature point on the identification image whose feature code shares the largest number of identical elements at corresponding positions with a template image feature point is matched with it as a pair. The smaller the Hamming distance, the higher the matching accuracy.
At this point, the Hamming distance between a feature point of the template image and the distance center point of the identification image is first calculated according to formula (2):
where Pn is a feature point of the template image, and Pc,i is the distance center point of the identification image;
then, if the calculated Hamming distance is less than or equal to a preset threshold, the distance center point of the identification image is taken as a cluster center point, and the Hamming distance is further computed against the other feature points under that cluster center point to obtain the best matching result;
and finally, if the calculated Hamming distance is greater than the preset threshold, the matching is ended, and the obtained matching result is recorded and output.
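A minimal Python sketch of this matching rule is given below; it assumes the 256-bit descriptors produced above and simplifies the cluster-center bookkeeping of the full method to a nearest-neighbour search with a distance cut-off of 128 bits. The function names and the threshold parameter are assumptions for illustration.

```python
import numpy as np

def hamming(a, b):
    """Hamming distance between two 0/1 descriptor vectors (in the spirit of formula (2))."""
    return int(np.count_nonzero(a != b))

def match_features(template_descs, ident_descs, max_dist=128):
    """For each template descriptor, keep the identification-image descriptor with the
    smallest Hamming distance, provided at least 128 of the 256 bits agree."""
    matches = []
    for n, dn in enumerate(template_descs):
        dists = [hamming(dn, dc) for dc in ident_descs]
        best = int(np.argmin(dists))
        if dists[best] <= max_dist:              # at least 128 identical bits => accepted
            matches.append((n, best, dists[best]))
    return matches
```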
Fig. 3 and fig. 4 show the feature matching results obtained in different scenarios: fig. 3 presents the feature matching results for different positions of the image, and fig. 4 presents the feature matching results in an actual augmented reality application scenario.
It can be understood that, in order to achieve the enhanced presentation of the dynamic image at the terminal, the spatial model to be presented may further be modeled, for example by generating a simple model with the geometric model tools embedded in Unity3D, or by creating a model with a third-party modeling tool such as 3ds Max, Maya or Blender. Key frames of the dynamic image to be displayed are then extracted and mapped onto the surface of the spatial model. The dynamic image can be created in Photoshop and exported as a GIF image; using a key-frame animation interpolation technique, key frame images with maximized features are extracted at even intervals from dynamic content such as videos and animations, and the dynamic images between key frames are supplemented and generated by a linear interpolation algorithm. Finally, by configuring the Vuforia SDK runtime environment in Unity3D, the project is packaged with the Vuforia SDK into an augmented reality application for the corresponding platform, for example an Android mobile application.
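As a hedged illustration of the linear key-frame interpolation mentioned above (not code from the original application), intermediate frames between two key frames can be generated by simple linear blending:

```python
import numpy as np

def interpolate_keyframes(frame_a, frame_b, steps):
    """Linearly blend two key frames (H x W x C uint8 arrays) into `steps`
    intermediate frames, approximating the in-between dynamic images."""
    frames = []
    for k in range(1, steps + 1):
        t = k / (steps + 1)                      # interpolation weight in (0, 1)
        blended = (1.0 - t) * frame_a.astype(np.float32) + t * frame_b.astype(np.float32)
        frames.append(blended.astype(np.uint8))
    return frames
```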
As shown in fig. 5, in an embodiment of the present invention, an image feature processing apparatus for use in a dynamic image enhanced presentation is provided, including:
an image acquisition unit 110 for acquiring a template image and an identification image;
an image feature extraction unit 120, configured to detect feature points of the template image and the identification image according to a preset FAST algorithm;
the image feature processing unit 130 is configured to perform smoothing and blurring on the template image and the identification image, both containing feature key points, according to a preset BRIEF method, so that the feature points of both images can be represented by binary codes;
and the image feature matching unit 140 is configured to calculate, based on the binary-coded feature points of the template image and the identification image and by means of the Hamming distance, a plurality of feature points whose degree of matching between the template image and the identification image meets a preset condition, and to output these feature points.
Wherein the image feature extraction unit includes:
the first image reading module is used for reading a current target image; wherein the current target image is the template image or the identification image;
the pixel-point comparison module is used for determining, in the current target image, a circle whose center is an arbitrary pixel P and whose radius lies within a range of 4 pixels, selecting the gray values of 16 pixels on the determined circle, and comparing the gray values of the 16 selected pixels against a preset gray threshold range;
and the feature key point extraction module is used for selecting pixel P as a key point of the current target image if the gray values of more than 8 consecutive pixels on the circle are all greater than or all less than the gray value of pixel P.
Wherein the image feature processing unit includes:
the second image reading module is used for reading the current target image; the current target image is a template image containing characteristic points or an identification image containing the characteristic points;
the feature point processing module is used for selecting a window of a certain size centered on a feature point extracted from the current target image, randomly selecting N pairs of pixels within the selected window, and comparing the pixel values of each selected pair according to formula (1), so that the feature points in the current target image can be represented by binary codes:
where P(x1) is the pixel value of the random point x1 = (u1, v1); P(x2) is the pixel value of the random point x2 = (u2, v2); ui is the horizontal coordinate of pixel i in the current target image; and vi is the vertical coordinate of pixel i in the current target image.
Wherein the image feature matching unit includes:
a Hamming distance calculation module, configured to calculate, according to formula (2), the Hamming distance between a feature point of the template image and the distance center point of the identification image:
where Pn is a feature point of the template image, and Pc,i is the distance center point of the identification image;
the feature matching module is used for taking the distance center point of the identification image as a cluster center point if the calculated Hamming distance is less than or equal to a preset threshold, and further performing the matching calculation against the other feature points under that cluster center point to obtain the best matching result;
and the matching result output module is used for ending the matching if the calculated Hamming distance is greater than the preset threshold, and recording and outputting the obtained matching result.
The embodiment of the invention has the following beneficial effects:
the invention is based on FAST (Features from Accelerated Segment Test) feature point detection and on the BRIEF (Binary Robust Independent Elementary Features) algorithm for creating feature description vectors. It ensures the rotation invariance of the feature points, significantly optimizes and improves the efficiency and accuracy of feature detection, greatly accelerates the creation of feature descriptors, and prevents the binary descriptor from being overly sensitive to environmental factors of the acquisition device and to high-frequency image noise. The invention thereby reduces the algorithmic difficulty of augmented reality template image feature extraction, fully meets the real-time requirement of feature extraction and matching, and is, to a certain extent, unaffected by image noise and transformations.
It should be noted that, in the above device embodiment, each included unit is only divided according to functional logic, but is not limited to the above division as long as the corresponding function can be achieved; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention.
It will be understood by those skilled in the art that all or part of the steps in the method for implementing the above embodiments may be implemented by relevant hardware instructed by a program, and the program may be stored in a computer-readable storage medium, such as ROM/RAM, magnetic disk, optical disk, etc.
The above disclosure describes only preferred embodiments of the present invention and certainly cannot be taken to limit the scope of the claims of the present invention; equivalent changes made according to the claims still fall within the scope of the present invention.