CN107784269A - A kind of method and system of 3D frame of video feature point extraction - Google Patents

Method and system for extracting feature points of a 3D video frame

Info

Publication number
CN107784269A
Authority
CN
China
Prior art keywords
image
matching
video
gray level
points
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710756592.3A
Other languages
Chinese (zh)
Inventor
王新宁
刘恩宝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Nestle Holdings Ltd
Original Assignee
Shenzhen Nestle Holdings Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Nestle Holdings Ltd filed Critical Shenzhen Nestle Holdings Ltd
Priority to CN201710756592.3A priority Critical patent/CN107784269A/en
Priority to PCT/CN2017/105475 priority patent/WO2019041447A1/en
Publication of CN107784269A publication Critical patent/CN107784269A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/50 Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • G06V10/507 Summing image-intensity values; Histogram projection analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method for extracting feature points from a 3D video frame, comprising the following steps: acquiring a frame of image from a video and segmenting the image along its center lines; counting the gray level histograms of the resulting left/right and top/bottom image pairs and performing gray level histogram matching calculations on each pair; comparing the matching results to determine the type of the image; if the image is 3D, segmenting it along its line of symmetry to obtain a first half image and a second half image; dividing each half image into N equal parts; and solving the feature points in each part of the first half image, from which the corresponding feature matching points in the second half image can be obtained. A system for extracting feature points from a 3D video frame is also disclosed, comprising a segmentation module, a calculation module, and a feature point solving module. The scheme alleviates the large computation load and limited accuracy of traditional feature extraction and matching, and is widely applicable in the image processing field.

Description

Method and system for extracting feature points of 3D video frame
Technical Field
The invention relates to the field of image processing, in particular to extraction of feature points of a 3D video frame.
Background
A stereoscopic (3D) video exhibits a stereoscopic effect when viewed with special equipment; common stereoscopic videos come in four layouts: left-right, right-left, up-down, and down-up.
Feature points are points where the image gray value changes sharply, or points of large curvature along image edges.
Feature matching is a method of finding similar image targets by analyzing the correspondence, similarity, and consistency of image content, features, structure, relationships, texture, gray levels, and so on.
SIFT (scale-invariant feature transform) is an algorithm for detecting local features. It obtains features and matches image feature points by solving for feature points in an image together with descriptors related to scale and orientation. SIFT features are not only scale-invariant; good detection results are obtained even when the rotation angle, image brightness, or shooting angle changes.
However, traditional feature point detection and matching such as SIFT performs global feature computation and global search matching over the whole picture. Existing solutions simply apply these traditional methods to 3D video frames without exploiting the features of 3D video that are known in advance. The main drawback is the huge computation load: to make feature points invariant to rotation, scaling, and brightness changes, the SIFT algorithm scales the original image into an image pyramid and extracts features on every layer, which is computationally expensive. SIFT is an excellent method for feature extraction and matching, but its advantages cannot be exploited in extracting and matching feature points of 3D video.
On the other hand, video frames vary widely, and traditional feature extraction does not assess the feature richness of a frame before extraction and matching; when a video frame with a single dominant color is encountered, traditional methods cannot extract sufficient feature points.
In summary, there is a need for improvement in this technology.
Disclosure of Invention
In order to solve the above technical problems, an object of the present invention is to provide a method and a system for quickly and accurately extracting feature points of a 3D video frame.
The technical scheme adopted by the invention is as follows:
the invention provides a method for extracting feature points of a 3D video frame, which comprises the following steps:
acquiring a frame of image from a video to be processed, and segmenting the image along a vertical central line and a horizontal central line respectively;
respectively counting the gray level histograms of the vertically divided left image and right image and the gray level histograms of the horizontally divided upper image and lower image, and respectively performing gray level histogram matching calculation;
obtaining the type of the image by comparing the gray histogram matching calculation results;
if the image type is 2D, acquiring a frame of image from the video to be processed again, and repeating the steps;
if the image type is 3D, segmenting the image along a symmetrical line of the image to obtain a first half image and a second half image;
dividing the first half image and the second half image into N equal parts respectively;
and solving the characteristic points of all equal parts of the first half image to further obtain corresponding characteristic matching points in the second half image.
Wherein the method further comprises: carrying out gray processing on the image, and counting the proportion of a pure color area in the image; when the proportion of the pure color area is larger than a preset threshold value, acquiring a frame of image from the video again; and when the proportion of the pure color area is smaller than a preset threshold value, segmenting the image along a vertical central line and a horizontal central line respectively.
As an improvement of the technical scheme, the range of the preset threshold is 0.1-0.9.
As an improvement of the technical solution, the gray histogram matching calculation adopts the following formula:
where H1 and H2 are the gray level histograms of the two images divided along the center line, and i is the gray level index.
Further, the step of obtaining the type of the image by comparing the gray histogram matching results includes comparing the matching result of the left and right images with the matching result of the upper and lower images; the smaller the matching result, the higher the similarity between the two halves.
Further, the step of obtaining the type of the image by comparing the gray histogram matching calculation result also comprises comparing the gray histogram matching calculation result with a preset matching threshold, and if the gray histogram matching calculation result is smaller than the preset matching threshold, outputting the corresponding image type.
Further, if the matching calculation result of the gray level histograms of the left image and the right image is greater than the matching calculation result of the gray level histograms of the upper image and the lower image, and the matching calculation result of the gray level histograms of the upper image and the lower image is less than a preset matching threshold, the image type is output as 3D up-down/3D down-up.
Further, the preset matching threshold is 0.1.
Further, the following formula is adopted for each point to solve the characteristic point:
g(x,y) = f(x-1,y-1) + f(x-1,y) + f(x-1,y+1) + f(x,y-1) + f(x,y+1)
       + f(x+1,y-1) + f(x+1,y) + f(x+1,y+1) - 8*f(x,y)
wherein f (x, y) represents a function of the input image; g (x, y) is the value of the image change rate at point (x, y).
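A minimal pure-Python sketch of this change-rate mask (the image is a list of rows; the function name is illustrative, not from the patent):

```python
def change_rate(f, x, y):
    """8-neighbor change rate g(x, y): the sum of the eight neighbors
    minus 8 times the center pixel (a Laplacian-style mask)."""
    return (f[x-1][y-1] + f[x-1][y] + f[x-1][y+1]
          + f[x][y-1]               + f[x][y+1]
          + f[x+1][y-1] + f[x+1][y] + f[x+1][y+1]
          - 8 * f[x][y])

# A flat region gives g = 0; an isolated bright pixel gives a large
# negative g, i.e. a strong change-rate response.
flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
spike = [[5, 5, 5], [5, 100, 5], [5, 5, 5]]
print(change_rate(flat, 1, 1))   # 0
print(change_rate(spike, 1, 1))  # 8*5 - 8*100 = -760
```
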
Further, the method comprises the step of judging whether the obtained image change rate extreme value is located at the edge of the image, and if so, rejecting the point and the matching point thereof.
Further, after the extreme points at the edge of the image are removed, the point corresponding to the maximum value of the image change rate at the moment and the corresponding matching point are selected, and the point is the feature point of the image.
In another aspect, the present invention further provides a system for extracting feature points of a 3D video frame, including:
the segmentation module is used for executing the steps to obtain a frame of image from the video to be processed and segmenting the image along a vertical central line and a horizontal central line respectively;
the calculation module is used for performing the steps of respectively counting the gray level histograms of the vertically divided left image and the vertically divided right image and the gray level histograms of the horizontally divided upper image and the horizontally divided lower image, and respectively performing gray level histogram matching calculation; obtaining the type of the image by comparing the gray histogram matching calculation results; if the image type is 2D, acquiring a frame of image from the video to be processed again, and repeating the steps;
the characteristic point solving module is used for executing the step, if the image type is 3D, segmenting the image along a symmetrical line of the image to obtain a first half image and a second half image; dividing the first half image and the second half image into N equal parts respectively; and solving the characteristic points of all equal parts of the first half image to further obtain corresponding characteristic matching points in the second half image.
Further, the system also comprises a gray level processing module which is used for executing the steps to carry out gray level processing on the image and counting the proportion of the pure color area in the image; when the proportion of the pure color area is larger than a preset threshold value, acquiring a frame of image from the video again; and when the proportion of the pure color area is smaller than a preset threshold value, segmenting the image along a vertical central line and a horizontal central line respectively.
The image processing device further comprises an edge characteristic point judging module, a matching module and a judging module, wherein the edge characteristic point judging module is used for executing the steps to judge whether the image change rate extreme value is positioned at the edge of the image, and if so, the point and the matching point are removed; and after eliminating the extreme points positioned at the edge of the image, selecting the point corresponding to the maximum value of the image change rate at the moment and the corresponding matching point thereof, namely the characteristic point of the image.
The invention has the beneficial effects that: according to the method and the system for extracting the feature points of the 3D video frame, the image is segmented by fully utilizing the symmetrical feature conditions of the 3D video, and the matching feature points of the other half of the image are quickly extracted by solving and extracting the feature points of one half of the image. By adopting the scheme, the feature points can be quickly and accurately extracted from the 3D video frame and feature matching is carried out, the problems of large calculation amount and insufficient accuracy of traditional feature extraction and matching are solved, and a technical basis is provided for the problems of video parallax calculation, video reverse-view calculation and the like.
Drawings
The following further describes embodiments of the present invention with reference to the accompanying drawings:
FIG. 1 is a control flow diagram of a first embodiment of the present invention;
fig. 2a and 2b are schematic diagrams of a 3D left-right video and a 3D up-down video, respectively, according to a second embodiment of the present invention;
FIG. 3 is a schematic diagram of left-right segmentation of a 3D image according to a third embodiment of the present invention;
FIG. 4 is a diagram illustrating image processing according to a fourth embodiment of the present invention;
FIG. 5 is a schematic diagram of the vertical segmentation of a 3D image according to a fifth embodiment of the present invention;
fig. 6 is a schematic diagram of module connection according to a sixth embodiment of the present invention.
Detailed Description
It should be noted that, in the present application, the embodiments and features of the embodiments may be combined with each other without conflict.
Fig. 1 is a control flow diagram of the first embodiment of the present invention. The invention provides a method for extracting feature points of a 3D video frame, which comprises the following steps:
acquiring a frame of image from a video to be processed, and segmenting the image along a vertical central line and a horizontal central line respectively;
respectively counting the gray level histograms of the vertically divided left image and right image and the gray level histograms of the horizontally divided upper image and lower image, and respectively performing gray level histogram matching calculation;
obtaining the type of the image by comparing the gray histogram matching calculation results; the image types include 2D, 3D up-down/down-up, 3D left-right/right-left, and so on.
If the image type is 2D, acquiring a frame of image from the video to be processed again, and repeating the steps;
if the image type is 3D, segmenting the image along its line of symmetry to obtain a first half image and a second half image; for a 3D up-down/down-up image, the segmentation is performed along the horizontal midline.
Dividing the first half image and the second half image into N equal parts respectively; the larger N is, the finer the segmentation.
By solving the feature points in each equally divided image of the first half image, the corresponding feature matching points in the second half image can be directly and quickly obtained according to the symmetry of the 3D image.
Wherein the method further comprises: carrying out gray processing on the image, and counting the proportion of a pure color area in the image; when the proportion of the pure color area is larger than a preset threshold value, acquiring a frame of image from the video again; and when the proportion of the pure color area is smaller than a preset threshold value, segmenting the image along a vertical central line and a horizontal central line respectively.
As an improvement of the technical scheme, the preset threshold ranges from 0.1 to 0.9, with 0.9 being the preferred value; this guarantees that the image to be processed is rich enough in color, and therefore rich enough in features.
As an improvement of the technical solution, the gray histogram matching calculation adopts the following formula:
where H1 and H2 are the gray level histograms of the two images divided along the center line, and i is the gray level index.
Further, the step of obtaining the type of the image by comparing the gray histogram matching results includes comparing the matching result of the left and right images with the matching result of the upper and lower images; the smaller the matching result, the higher the similarity between the two halves.
Further, the step of obtaining the type of the image also includes comparing the gray histogram matching calculation results with a preset matching threshold; if a matching result is smaller than the preset matching threshold, the corresponding image type is output. The image types include 2D, 3D up-down, 3D left-right, and so on.
Further, if the gray histogram matching calculation results of the left image and the right image are greater than the gray histogram matching calculation results of the upper image and the lower image, and the gray histogram matching calculation results of the upper image and the lower image are less than a preset matching threshold, the image type is output as 3D up-down/3D down-up.
Further, the preset matching threshold is 0.1. When both matching calculation results are larger than 0.1, the image is output as 2D, a new image is selected, and the steps are repeated. If at least one matching result is smaller than 0.1, the image is 3D; the up-down and left-right matching results are then compared, and the smaller one, indicating higher similarity between the corresponding two halves, determines which 3D image type is output.
Further, the following formula is adopted for each point to solve the characteristic point:
g(x,y) = f(x-1,y-1) + f(x-1,y) + f(x-1,y+1) + f(x,y-1) + f(x,y+1)
       + f(x+1,y-1) + f(x+1,y) + f(x+1,y+1) - 8*f(x,y)
wherein f (x, y) represents a function of the input image; g (x, y) is the value of the image change rate at point (x, y).
Further, the method comprises the step of judging whether the obtained image change rate extreme value is located at the edge of the image, and if so, rejecting the point and the matching point thereof.
Further, after the extreme points at the image edge are removed, the point corresponding to the maximum remaining image change rate and its matching point are selected; these are the feature points of the image.
Referring to fig. 2a and 2b, 3D video is a special kind of video with a distinctive picture layout. It falls into two categories, left-right/right-left and up-down/down-up: the picture of a 3D left-right/right-left video is bilaterally symmetrical, while that of a 3D up-down/down-up video is vertically symmetrical. This known symmetry is the basis for extracting and matching the feature points of 3D video.
As an embodiment, the steps include:
1. Decoding a frame of image from the video to be processed, converting it to grayscale, and counting the proportion of pure color (pure black, pure white, etc.) in the image; when the pure color proportion is greater than the threshold, a new image is acquired from the video.
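This solid-color check can be sketched in pure Python as follows (the function name and the choice of 0/255 as the "pure" values are illustrative assumptions; the patent only says pure black, pure white, and the like):

```python
def solid_color_ratio(gray, solid_values=(0, 255)):
    """Fraction of pixels that are pure black (0) or pure white (255).
    `gray` is a grayscale image given as a list of rows."""
    total = sum(len(row) for row in gray)
    solid = sum(1 for row in gray for p in row if p in solid_values)
    return solid / total

# Frames whose solid-color ratio exceeds the threshold are skipped and a
# new frame is decoded instead (0.9 is the patent's preferred threshold).
frame = [[0, 0, 128], [255, 64, 0]]
ratio = solid_color_ratio(frame)   # 4 of 6 pixels are pure black/white
needs_new_frame = ratio > 0.9
```
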
2. Segmenting the image to be processed: first divide it into left and right images along the vertical midline, count the gray level histograms of the two images, and match the histograms. A gray level histogram records the distribution of pixel gray values in the image, i.e. the frequency of each gray value. The histogram matching is calculated by formula 1, giving the matching result d_lr for the left and right images; the matching result d_td for the upper and lower images is calculated in the same way. The smaller the result, the more similar the two halves. N_d is the gray histogram matching threshold, which can be adjusted arbitrarily; preferably, the matching threshold is 0.1. The image is then processed according to Table 1:
TABLE 1
Calculation results                        Output image type
d_lr > N_d and d_td < N_d                  3D up-down
d_lr < N_d and d_td > N_d                  3D left-right
d_lr > N_d and d_td > N_d                  2D
d_lr < N_d, d_td < N_d and d_lr < d_td     3D left-right
d_lr < N_d, d_td < N_d and d_td < d_lr     3D up-down
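The decision logic of Table 1 can be sketched as a small Python function (`classify_frame` is an illustrative name; behavior at exact equality with the threshold is not specified by the table and is grouped here arbitrarily):

```python
def classify_frame(d_lr, d_td, n_d=0.1):
    """Map the two histogram-matching results to an image type per
    Table 1. Smaller d means the two halves are more similar."""
    if d_lr > n_d and d_td > n_d:
        return "2D"                       # neither half-pair matches
    if d_lr < n_d and d_td < n_d:
        # both match: the smaller (more similar) pair decides the type
        return "3D left-right" if d_lr < d_td else "3D up-down"
    return "3D left-right" if d_lr < n_d else "3D up-down"

print(classify_frame(0.5, 0.02))   # top/bottom halves match -> 3D up-down
print(classify_frame(0.03, 0.4))   # left/right halves match -> 3D left-right
print(classify_frame(0.5, 0.5))    # nothing matches -> 2D
```
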
And comparing the obtained gray level histogram matching calculation results of the left image and the right image with the gray level histogram matching calculation results of the upper image and the lower image, wherein the smaller the matching calculation result is, the higher the image similarity is.
The method also comprises the steps of comparing the gray level histogram matching calculation result with a preset matching threshold value, and outputting a corresponding image type if the gray level histogram matching calculation result is smaller than the preset matching threshold value.
And if the gray level histogram matching calculation results of the left image and the right image are greater than the gray level histogram matching calculation results of the upper image and the lower image, and the gray level histogram matching calculation results of the upper image and the lower image are less than a preset matching threshold, outputting the image type as 3D up-down/3D down-up.
If both gray level histogram matching results are less than the preset matching threshold and d_lr < d_td, the image is 3D left-right.
The gray level histogram matching calculation adopts the following formula:
where H1 and H2 are the gray level histograms of the two images divided along the center line, and i is the gray level index.
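Formula 1 itself does not survive in this text, so the histogram-matching measure below is a stand-in assumption (a normalized sum of absolute bin differences, which is 0 for identical distributions and at most 1), not the patent's own formula; the histogram counting matches the description above:

```python
def gray_histogram(gray, levels=256):
    """Frequency of each gray level in the image (list of rows)."""
    h = [0] * levels
    for row in gray:
        for p in row:
            h[p] += 1
    return h

def histogram_distance(h1, h2):
    """Stand-in matching measure: normalized sum of absolute bin
    differences, in [0, 1]; smaller means more similar halves.
    (The patent's formula 1 is not reproduced in this text.)"""
    n1, n2 = sum(h1), sum(h2)
    return sum(abs(a / n1 - b / n2) for a, b in zip(h1, h2)) / 2

# Identical left and right halves give a perfect match (distance 0).
left = [[10, 10], [200, 200]]
right = [[10, 10], [200, 200]]
d_lr = histogram_distance(gray_histogram(left), gray_histogram(right))
assert d_lr == 0.0
```
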
3. Calculating according to the result obtained in step 2. Taking 3D left-right as an example, the left and right images are each cut as shown in fig. 3 and equally divided into N parts; the larger N is, the more feature points are obtained (N = 9 in this example). Because a 3D left-right image is strictly bilaterally symmetrical, a feature point at a given position in partition A has its match within the corresponding local range of partition A1.
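The partitioning step can be sketched as below. The patent only says each half is divided into N equal parts; a 3x3 grid is assumed here for N = 9, and the helper name is illustrative:

```python
def partition(img, rows, cols):
    """Split an image (list of rows) into rows*cols equal blocks,
    returned in row-major order; assumes dimensions divide evenly."""
    h, w = len(img), len(img[0])
    bh, bw = h // rows, w // cols
    blocks = []
    for r in range(rows):
        for c in range(cols):
            blocks.append([row[c * bw:(c + 1) * bw]
                           for row in img[r * bh:(r + 1) * bh]])
    return blocks

# Thanks to the 3D symmetry, a feature point found in block i of one
# half only needs to be matched inside block i of the other half.
left_half = [[y * 6 + x for x in range(6)] for y in range(6)]
blocks = partition(left_half, 3, 3)
assert len(blocks) == 9 and blocks[0] == [[0, 1], [6, 7]]
```
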
Solving the characteristic points of each segmented module, and solving each point by using a formula 2:
g(x,y)=T[f(x,y)] (2)
where f(x, y) is the input image, g(x, y) is the output image, and T is an operator defined over the neighborhood of the point (x, y); here an 8-neighborhood matrix mask is used for the processing.
solving the maximum change value in each direction such as up-down, left-right, diagonal and the like, searching the image extreme point with the maximum change rate, and expanding the matrix to be expressed as the following formula 3:
wherein f (x, y) represents a function of the input image; g (x, y) is the image change rate value at point (x, y).
Fig. 4 is a diagram illustrating image processing according to a fourth embodiment of the present invention. Starting from the origin of image A and moving pixel by pixel, formula 3 is applied to each pixel and the result at that position is output, so that every input (x, y) produces a corresponding g(x, y). The maximum value g_max(A) of g(x, y) in the image is then found and the current position A(x, y) recorded; if that maximum lies on the image boundary, it is discarded and its position coordinate recorded. The same operation is performed in image A1 to find the corresponding maximum g_max(A1) and coordinate A1(x, y). If a point is discarded in A, the same discarding must be performed in A1.
After the elimination, the maximum value g_max(A) of g(x, y) in the image is searched again, and the current position A(x, y) and its matching point A1(x, y) are recorded. The resulting A(x, y) and A1(x, y) are the coordinates of the corresponding feature values; the remaining feature points and their matching points in the other divided images are obtained in the same way.
And solving the image change rate values g (x, y) of the points in each subarea, finding out an extreme value, deleting the extreme value points at the edge of the image partition line, comparing the image change rate values g (x, y) again, and determining the position corresponding to the found extreme value as the position of the feature point.
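A simplified sketch of the per-partition search: instead of the patent's find-then-reject-then-research loop, border points are skipped outright, which likewise removes extremes lying on the partition lines. The mask is formula 3; `best_feature_point` is an illustrative name:

```python
def change_rate(f, x, y):
    """8-neighbor mask from formula 3."""
    return (f[x-1][y-1] + f[x-1][y] + f[x-1][y+1]
          + f[x][y-1]               + f[x][y+1]
          + f[x+1][y-1] + f[x+1][y] + f[x+1][y+1]
          - 8 * f[x][y])

def best_feature_point(block):
    """Interior point of a block with the largest |change rate|.
    Border points are skipped, so partition-edge extremes never win.
    Returns ((x, y), g) or None for blocks with no interior."""
    best = None
    for x in range(1, len(block) - 1):
        for y in range(1, len(block[0]) - 1):
            g = change_rate(block, x, y)
            if best is None or abs(g) > abs(best[1]):
                best = ((x, y), g)
    return best

# The bright spike at (1, 1) dominates every other interior point.
block = [[5, 5, 5, 5], [5, 90, 5, 5], [5, 5, 5, 5], [5, 5, 5, 5]]
pos, g = best_feature_point(block)
assert pos == (1, 1)
```

By symmetry, the matching point is then searched only within the corresponding block of the other half image.
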
For the 3D up-down image type, the processing is similar: the corresponding division is performed top and bottom as shown in fig. 5, and A(x, y) and A1(x, y) are output in the corresponding matching areas by the same solving method.
Referring to fig. 6, in another aspect, the present invention further provides a system for extracting feature points of a 3D video frame, including:
the segmentation module is used for executing the steps to obtain a frame of image from the video to be processed and segmenting the image along a vertical central line and a horizontal central line respectively;
the calculation module is used for performing the steps of respectively counting the gray level histograms of the vertically divided left image and the vertically divided right image and the gray level histograms of the horizontally divided upper image and the horizontally divided lower image, and respectively performing gray level histogram matching calculation; obtaining the type of the image by comparing the gray histogram matching calculation results; if the image type is 2D, acquiring a frame of image from the video to be processed again, and repeating the steps;
the characteristic point solving module is used for executing the step, if the image type is 3D, segmenting the image along a symmetrical line of the image to obtain a first half image and a second half image; dividing the first half image and the second half image into N equal parts respectively; and solving the characteristic points of all equal parts of the first half image to further obtain corresponding characteristic matching points in the second half image.
Further, the system also comprises a gray level processing module which is used for executing the steps to carry out gray level processing on the image and counting the proportion of the pure color area in the image; when the proportion of the pure color area is larger than a preset threshold value, acquiring a frame of image from the video again; and when the proportion of the pure color area is smaller than a preset threshold value, segmenting the image along a vertical central line and a horizontal central line respectively.
The image processing device further comprises an edge characteristic point judging module, a matching point judging module and a judging module, wherein the edge characteristic point judging module is used for executing the steps to judge whether the image change rate extreme value is positioned at the edge of the image or not, and if yes, the point and the matching point are eliminated; and after eliminating the extreme points positioned at the edge of the image, selecting the point corresponding to the maximum value of the image change rate at the moment and the corresponding matching point thereof, namely the characteristic point of the image.
According to the method and the system for extracting the feature points of the 3D video frame, the image is segmented by fully utilizing the symmetrical feature conditions of the 3D video, and the matching feature points of the other half of the image are quickly extracted by solving and extracting the feature points of one half of the image. By adopting the scheme, the feature points can be quickly and accurately extracted from the 3D video frame and feature matching is carried out, the problems of large calculation amount and insufficient accuracy of traditional feature extraction and matching are solved, and a technical basis is provided for the problems of video parallax calculation, video reverse-view calculation and the like.
While the preferred embodiments of the present invention have been illustrated and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (14)

1. A method for extracting feature points of a 3D video frame is characterized by comprising the following steps:
acquiring a frame of image from a video to be processed, and segmenting the image along a vertical central line and a horizontal central line respectively;
respectively counting the gray level histograms of the vertically divided left image and right image and the gray level histograms of the horizontally divided upper image and lower image, and respectively performing gray level histogram matching calculation;
obtaining the type of the image by comparing the gray histogram matching calculation results;
if the image type is 2D, acquiring a frame of image from the video to be processed again, and repeating the steps;
if the image type is 3D, segmenting the image along a symmetrical line of the image to obtain a first half image and a second half image;
dividing the first half image and the second half image into N equal parts respectively;
and solving the characteristic points of all equal parts of the first half image to further obtain corresponding characteristic matching points in the second half image.
2. The method of 3D video frame feature point extraction according to claim 1, further comprising: performing graying processing on the image, and counting the proportion of the pure color area in the image; when the proportion of the pure color area is greater than a preset threshold, acquiring another frame of image from the video; and when the proportion of the pure color area is smaller than the preset threshold, segmenting the image along the vertical center line and the horizontal center line respectively.
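Claim 2 gates out frames that are dominated by a single color (for example black scene transitions), which carry too little texture for reliable matching. The claim does not define how the pure color proportion is measured; as an assumed proxy, the sketch below takes the share of the single most frequent gray level after graying.

```python
import numpy as np

def pure_color_ratio(gray):
    """Fraction of the image occupied by its most frequent gray level.
    A simple assumed proxy for the claim's 'pure color area' measure,
    whose exact definition is not given in the text."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    return hist.max() / gray.size

def frame_is_usable(gray, threshold=0.5):
    """Claim 2 gate: reject frames dominated by one color.
    The 0.5 default is illustrative; claim 3 allows 0.1-0.9."""
    return pure_color_ratio(gray) < threshold
```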
3. The method of claim 2, wherein the preset threshold is in a range of 0.1-0.9.
4. The method for extracting feature points of a 3D video frame according to any one of claims 1 to 3, wherein the gray level histogram matching calculation adopts the following formula:
wherein H1 and H2 are respectively the gray level histograms of the two images segmented along the center line, and i is the pixel gray level.
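The published text does not reproduce the formula image from claim 4. As an assumption consistent with claims 5 to 8 (a smaller result means higher similarity, compared against a threshold of 0.1), one distance of this kind is the normalized absolute histogram difference:

```latex
d(H_1, H_2) = \frac{\sum_i \left| H_1(i) - H_2(i) \right|}{\sum_i \left( H_1(i) + H_2(i) \right)}
```

This value lies in [0, 1] and equals 0 only when the two histograms are identical; it is a hypothetical stand-in, not the patent's actual formula.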
5. The method of claim 4, wherein the step of obtaining the type of the image by comparing the gray level histogram matching calculation results comprises: comparing the gray level histogram matching calculation results of the left and right images with those of the upper and lower images, wherein the smaller the matching calculation result, the higher the similarity between the two images.
6. The method as claimed in claim 5, wherein the step of obtaining the type of the image further comprises: comparing the gray level histogram matching calculation result with a preset matching threshold, and outputting the corresponding image type if the gray level histogram matching calculation result is smaller than the preset matching threshold.
7. The method of extracting feature points of a 3D video frame according to claim 6, wherein: and if the gray level histogram matching calculation results of the left image and the right image are greater than the gray level histogram matching calculation results of the upper image and the lower image, and the gray level histogram matching calculation results of the upper image and the lower image are less than a preset matching threshold, outputting the image type as 3D up-down/3D down-up.
8. The method of extracting feature points of a 3D video frame according to claim 7, wherein: the preset matching threshold is 0.1.
9. The method of extracting feature points of a 3D video frame according to claim 8, wherein: the feature points are solved for each point using the following formula:
g(x,y) = f(x-1,y-1) + f(x,y-1) + f(x+1,y-1) + f(x-1,y) + f(x+1,y) + f(x-1,y+1) + f(x,y+1) + f(x+1,y+1) - 8*f(x,y)
wherein f(x, y) is the gray value of the input image at point (x, y); g(x, y) is the value of the image change rate at point (x, y).
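The neighborhood sum in claim 9 is the sum of the eight neighbors of a pixel minus eight times the pixel itself, i.e. an 8-neighbor Laplacian. A direct Python sketch (the looped evaluation is illustrative, and only interior pixels are computed):

```python
import numpy as np

# 8-neighbor Laplacian kernel from claim 9: the eight neighbors each
# weighted +1, the center pixel weighted -8.
LAPLACIAN_8 = np.array([[1,  1, 1],
                        [1, -8, 1],
                        [1,  1, 1]], dtype=np.float64)

def change_rate(f):
    """Compute g(x, y) for every interior pixel of gray image f."""
    f = f.astype(np.float64)
    h, w = f.shape
    g = np.zeros_like(f)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Element-wise product of the 3x3 neighborhood and kernel.
            g[y, x] = np.sum(f[y - 1:y + 2, x - 1:x + 2] * LAPLACIAN_8)
    return g
```

A constant region yields g = 0 everywhere, while an isolated bright pixel produces a strong negative response at the pixel and positive responses at its neighbors, which is why local extrema of g mark candidate feature points.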
10. The method as claimed in claim 9, wherein the method further comprises determining whether the extreme value of the change rate of the image is located at an edge of the image, and if so, rejecting the point and its matching point.
11. The method of claim 10, wherein after the extreme points at the edge of the image are eliminated, the points corresponding to the maximum value of the image change rate at that time and the corresponding matching points are selected as the feature points of the image.
12. A system for feature point extraction of 3D video frames, comprising:
the segmentation module is used for acquiring a frame of image from the video to be processed and segmenting the image along the vertical center line and the horizontal center line respectively;
the calculation module is used for respectively counting the gray level histograms of the vertically segmented left and right images and of the horizontally segmented upper and lower images, and respectively performing gray level histogram matching calculation; obtaining the type of the image by comparing the gray level histogram matching calculation results; and, if the image type is 2D, acquiring another frame of image from the video to be processed and repeating the above steps;
the feature point solving module is used for segmenting the image along its line of symmetry to obtain a first half-image and a second half-image if the image type is 3D;
dividing the first half-image and the second half-image into N equal parts respectively; and solving the feature points of each equal part of the first half-image, thereby obtaining the corresponding matching feature points in the second half-image.
13. The system for extracting feature points of a 3D video frame according to claim 12, further comprising a gray processing module, configured to perform a graying process on the image, and count a proportion of a pure color region in the image; when the proportion of the pure color area is larger than a preset threshold value, acquiring a frame of image from the video again; and when the proportion of the pure color area is smaller than a preset threshold value, segmenting the image along a vertical central line and a horizontal central line respectively.
14. The system for extracting feature points of a 3D video frame according to claim 13, further comprising an edge feature point determining module, configured to determine whether an extreme value of the image change rate is located at an edge of the image and, if so, reject that feature point and its matching feature point; and, after the extreme points located at the edge of the image are eliminated, select the point corresponding to the maximum value of the image change rate and its corresponding matching point as the feature points of the image.
CN201710756592.3A 2017-08-29 2017-08-29 A kind of method and system of 3D frame of video feature point extraction Pending CN107784269A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201710756592.3A CN107784269A (en) 2017-08-29 2017-08-29 A kind of method and system of 3D frame of video feature point extraction
PCT/CN2017/105475 WO2019041447A1 (en) 2017-08-29 2017-10-10 3d video frame feature point extraction method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710756592.3A CN107784269A (en) 2017-08-29 2017-08-29 A kind of method and system of 3D frame of video feature point extraction

Publications (1)

Publication Number Publication Date
CN107784269A true CN107784269A (en) 2018-03-09

Family

ID=61438189

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710756592.3A Pending CN107784269A (en) 2017-08-29 2017-08-29 A kind of method and system of 3D frame of video feature point extraction

Country Status (2)

Country Link
CN (1) CN107784269A (en)
WO (1) WO2019041447A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113949928A (en) * 2021-10-15 2022-01-18 上海探寻信息技术有限公司 Opencv-based video type automatic identification method, apparatus, medium and device

Families Citing this family (2)

Publication number Priority date Publication date Assignee Title
CN112784710B (en) * 2021-01-07 2024-01-19 上海海事大学 Construction method of pleural effusion property analysis decision function and analysis method based on construction method
CN114926508B (en) * 2022-07-21 2022-11-25 深圳市海清视讯科技有限公司 Visual field boundary determining method, device, equipment and storage medium

Citations (5)

Publication number Priority date Publication date Assignee Title
CN101610422A (en) * 2009-07-21 2009-12-23 天津大学 Method for compressing three-dimensional image video sequence
CN101980545A (en) * 2010-11-29 2011-02-23 深圳市九洲电器有限公司 Method for automatically detecting 3DTV video program format
CN102395037A (en) * 2011-06-30 2012-03-28 深圳超多维光电子有限公司 Format recognition method and device
JP2012122816A (en) * 2010-12-07 2012-06-28 Nippon Telegr & Teleph Corp <Ntt> Method for acquiring three-dimensional information, apparatus for acquiring three-dimensional information and program for acquiring three-dimensional information
CN102665085A (en) * 2012-03-15 2012-09-12 广州嘉影软件有限公司 Automatic identification method and automatic identification device of 3D movie format



Also Published As

Publication number Publication date
WO2019041447A1 (en) 2019-03-07


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180309
