CN109740674B - Image processing method, device, equipment and storage medium - Google Patents

Image processing method, device, equipment and storage medium Download PDF

Info

Publication number
CN109740674B
CN109740674B
Authority
CN
China
Prior art keywords
image
current frame
frame image
feature
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910011494.6A
Other languages
Chinese (zh)
Other versions
CN109740674A (en)
Inventor
马福强
陈丽莉
楚明磊
吕耀宇
薛鸿臻
闫桂新
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BOE Technology Group Co Ltd
Beijing BOE Optoelectronics Technology Co Ltd
Original Assignee
BOE Technology Group Co Ltd
Beijing BOE Optoelectronics Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BOE Technology Group Co Ltd and Beijing BOE Optoelectronics Technology Co Ltd
Priority to CN201910011494.6A
Publication of CN109740674A
Application granted
Publication of CN109740674B
Legal status: Active
Anticipated expiration

Landscapes

  • Image Analysis (AREA)

Abstract

The application discloses an image processing method, apparatus, device and storage medium. The method comprises the following steps: acquiring a current frame image captured by a camera, and extracting visual features of the current frame image; generating a feature vector of the current frame image according to the visual features of the current frame image; dividing the feature vector of the current frame image into a plurality of sub-vectors and quantizing the sub-vectors to generate a feature index of the visual features of the current frame image; matching the feature index of the visual features of the current frame image with the feature index of the visual features of each training image, and determining matching feature pairs between the current frame image and each training image, wherein the feature index of the visual features of each training image is obtained based on sub-codebooks; and determining the training images whose number of matching feature pairs is greater than a first preset threshold as similar images of the current frame image. The technical scheme enables rapid image recognition.

Description

Image processing method, device, equipment and storage medium
Technical Field
The present disclosure relates generally to the field of computer technologies, and in particular, to an image processing method, an image processing apparatus, an image processing device, and a storage medium.
Background
In recent years, with the rapid development of semiconductor technology and the promotion of artificial intelligence wave, rapid image recognition and tracking algorithms become research hotspots in the fields of augmented reality, robot positioning and the like.
At present, image recognition is mainly realized based on a tree-structured bag-of-words (BoW) model. To achieve a good recognition effect, a large-scale tree-shaped visual dictionary needs to be established, which makes the image recognition process time-consuming and the memory occupancy of the tree-shaped visual dictionary high, limiting its use on memory-constrained platforms such as embedded platforms.
Disclosure of Invention
In view of the above-mentioned shortcomings or drawbacks of the prior art, it is desirable to provide a scheme capable of rapidly recognizing an image.
In a first aspect, an embodiment of the present application provides an image processing method, including:
acquiring a current frame image acquired by a camera, and extracting visual features of the current frame image;
generating a feature vector of the current frame image according to the visual features of the current frame image;
dividing the feature vector of the current frame image into a plurality of sub-vectors, quantizing the plurality of sub-vectors, and generating a feature index of the visual feature of the current frame image;
matching the characteristic index of the visual characteristic of the current frame image with the characteristic index of the visual characteristic of each training image in an image training set obtained by training in advance, and determining the matching characteristic pair of the current frame image and each training image; the feature index of the visual feature of each training image is obtained based on a sub-codebook, wherein the sub-codebook is a codebook obtained by dividing the space where the visual feature of each training image is located into a plurality of subspaces and training in each subspace;
and determining the training images with the number of the matching feature pairs larger than a first preset threshold value as similar images of the current frame image.
Optionally, the feature index of the visual feature of each training image is determined as follows:
acquiring an image training set, and extracting visual features of training images in the image training set;
dividing the visual features of the training images into M subspaces, and performing cluster analysis in each subspace to obtain M sub-codebooks each composed of k codewords;
and generating a feature index of the visual features of the training image according to at least one of the sub-codebooks.
Optionally, after generating the feature vector of the current frame image, the method further includes:
calculating the similarity between the feature vector of the current frame image and the feature vector of each training image obtained by pre-training, and determining the similarity between the current frame image and each training image;
determining the training images with the similarity greater than a second preset threshold as quasi-similar images; then
Matching the feature index of the visual feature of the current frame image with the feature index of the visual feature of each training image in an image training set obtained by training in advance, and determining the matching feature pair of the current frame image and each training image includes:
and matching the feature index of the visual feature of the current frame image with the visual feature of the quasi-similar image, and determining a matching feature pair of the current frame image and the quasi-similar image.
Optionally, after determining the training images with the number of the matching feature pairs larger than a first preset threshold as similar images of the current frame image, the method further includes:
determining a first camera pose from the similar image to the current frame image according to the matching feature pair of the current frame image and the similar image;
continuously acquiring a next frame image of the current frame image;
and determining the position of the similar image in the next frame of image according to the first camera pose.
Optionally, determining a first camera pose from the similar image to the current frame image according to the matching feature pair of the current frame image and the similar image, including:
determining matching feature point pairs of the current frame image and the similar image according to the matching feature pairs of the current frame image and the similar image;
and determining the first camera pose according to the 3D coordinates of the matched feature points in the similar image and the 2D coordinates of the matched feature points in the current frame image.
Optionally, determining a position of the similar image in the next frame of image according to the first camera pose includes:
projecting the 3D coordinates of the matched feature points in the similar image to the current frame image according to the first camera pose, and determining the 2D coordinates and the 3D coordinates of the matched feature points in the current frame image;
determining a second camera pose from the current frame image to the next frame image by using a least square method based on photometric errors according to the 2D coordinates and the 3D coordinates of the matched feature points in the current frame image;
projecting the 3D coordinates of the matched feature points in the current frame image according to the second camera pose to obtain the projected 2D coordinates of the matched feature points in the current frame image;
and determining the position of the similar image in the next frame image according to the projected 2D coordinates of the matched feature points in the current frame image.
Optionally, after obtaining the projected 2D coordinates of the matched feature points in the current frame image, the method further includes:
sequentially judging whether the projected 2D coordinates of each matched feature point in the current frame image are located in the image coordinate range of the next frame image;
determining the number of matched feature points of which the projected 2D coordinates in the current frame image are located in the image coordinate range of the next frame image according to the judgment result; then
Determining the position of the similar image in the next frame image according to the projected 2D coordinates of the matched feature points in the current frame image, including:
and when the number of the matching feature points of the projected 2D coordinates in the current frame image within the image coordinate range of the next frame image is larger than a third preset threshold, determining the position of the similar image in the next frame image according to the projected 2D coordinates of the matching feature points in the current frame image.
In a second aspect, an embodiment of the present application further provides an image recognition apparatus, including:
the characteristic extraction unit is used for acquiring a current frame image acquired by a camera and extracting visual characteristics of the current frame image;
the feature vector generating unit is used for generating a feature vector of the current frame image according to the visual features of the current frame image;
a feature index generating unit, configured to divide the feature vector of the current frame image into a plurality of sub-vectors, quantize the plurality of sub-vectors, and generate a feature index of the visual feature of the current frame image;
the matching unit is used for matching the characteristic index of the visual characteristic of the current frame image with the characteristic index of the visual characteristic of each training image in an image training set obtained by training in advance and determining the matching characteristic pair of the current frame image and each training image; the feature index of the visual feature of each training image is obtained based on a sub-codebook, wherein the sub-codebook is a codebook obtained by dividing the space where the visual feature of each training image is located into a plurality of subspaces and training in each subspace;
and the image identification unit is used for determining the training images with the number of the matching feature pairs larger than a first preset threshold value as the similar images of the current frame image.
In a third aspect, an embodiment of the present application further provides an apparatus, including: at least one processor, at least one memory, and computer program instructions stored in the memory, which when executed by the processor, implement the image processing method as described above.
In a fourth aspect, the present application further provides a computer-readable storage medium, on which computer program instructions are stored, wherein the computer program instructions, when executed by a processor, implement the image processing method as described above.
The image processing scheme provided by the embodiments of the application provides a new visual feature matching method: the feature index of the visual features of the current frame image is matched with the feature index of the visual features of each training image in a pre-trained image training set, wherein the feature index of the visual features of the current frame image is obtained by dividing the feature vector of the current frame image into a plurality of sub-vectors and quantizing those sub-vectors, the feature index of the visual features of each training image is obtained based on sub-codebooks, and a sub-codebook is a codebook obtained by dividing the space in which the visual features of the training images are located into a plurality of subspaces and training within each subspace. The feature index obtained in this manner greatly reduces the storage scale, thereby improving the matching speed and enabling rapid image recognition.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
fig. 1 illustrates an exemplary flowchart of an image processing method provided in an embodiment of the present application;
FIG. 2 illustrates a schematic diagram of training feature indices of visual features of training images;
FIG. 3 shows a schematic diagram of training a sub-codebook using all visual features of each training image;
FIG. 4 shows a schematic diagram of a feature index of visual features of a training image generated from M sub-codebooks;
FIG. 5 shows a schematic diagram of obtaining quasi-similar images of the current frame image;
FIG. 6 shows a schematic diagram of image tracking;
fig. 7 is a block diagram illustrating an exemplary structure of an image processing apparatus according to an embodiment of the present application;
FIG. 8 illustrates a schematic structural diagram of a computer system suitable for use in implementing a server according to embodiments of the present application.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the present invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict.
As mentioned in the background, the image recognition process is mainly implemented based on a tree-structured bag-of-words (BoW) model. To achieve a good recognition effect, a large-scale tree-shaped visual dictionary needs to be established, which makes the image recognition process time-consuming and the memory occupancy of the tree-shaped visual dictionary high, limiting its use on memory-constrained platforms such as embedded platforms.
In view of the above-mentioned drawbacks of the prior art, embodiments of the present application provide an image processing scheme. The scheme provides a new visual feature matching method: the feature index of the visual features of the current frame image is matched with the feature index of the visual features of each training image in a pre-trained image training set, wherein the feature index of the visual features of the current frame image is obtained by dividing the feature vector of the current frame image into a plurality of sub-vectors and quantizing those sub-vectors, the feature index of the visual features of each training image is obtained based on sub-codebooks, and a sub-codebook is a codebook obtained by dividing the space in which the visual features of the training images are located into a plurality of subspaces and training within each subspace. The feature index obtained in this manner greatly reduces the storage scale, thereby improving the matching speed and enabling rapid image recognition.
The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
Referring to fig. 1, an exemplary flowchart of an image processing method provided in an embodiment of the present application is shown.
The method comprises the following steps:
step 110, acquiring a current frame image acquired by the camera, and extracting visual features of the current frame image.
In the embodiment of the present application, the visual features of the current frame image may be extracted based on a Scale-Invariant Feature Transform (SIFT) algorithm, a Speeded-Up Robust Features (SURF) algorithm, or an Oriented FAST and Rotated BRIEF (ORB) algorithm, but the method for extracting the visual features of the current frame image is not limited thereto; for example, texture features, histogram of oriented gradients features, color histogram features, and the like of the current frame image may also be extracted.
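As an illustration only (not part of the claimed method), the following Python sketch shows how such visual features could be extracted with OpenCV; the choice of ORB and the parameter nfeatures=500 are assumptions made for the example.

```python
# Illustrative sketch: extracting visual features of the current frame image.
# cv2.SIFT_create() could be substituted for cv2.ORB_create() if SIFT features are wanted.
import cv2

def extract_visual_features(frame_bgr):
    """Return keypoints and descriptors of one camera frame."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    detector = cv2.ORB_create(nfeatures=500)
    keypoints, descriptors = detector.detectAndCompute(gray, None)
    # descriptors: N x 32 (ORB, binary) or N x 128 (SIFT, float)
    return keypoints, descriptors
```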
Step 120, dividing the visual features of the current frame image into a plurality of sub-vectors, and quantizing the plurality of sub-vectors to generate a feature index of the visual features of the current frame image.
Specifically, each visual feature in the current frame image may be divided into M subspaces according to its vector dimension. Assuming the visual feature of the current frame image is a SIFT feature, whose dimension is 128, the 128-dimensional SIFT feature is first divided into M sub-vectors, each of dimension 128/M; then each sub-vector is quantized in turn, and finally a feature index is generated from the quantization results of the sub-vectors.
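Purely for illustration, a minimal Python sketch of this splitting and quantization step is given below; it assumes M pre-trained sub-codebooks of k codewords each (see the training sketch later in this description) and float descriptors such as SIFT.

```python
# Illustrative sketch: product-quantization-style encoding of one 128-dimensional descriptor.
import numpy as np

def encode_descriptor(descriptor, sub_codebooks):
    """Split the descriptor into M sub-vectors and quantize each against its sub-codebook.

    sub_codebooks: list of M arrays, each of shape (k, 128 // M).
    Returns the feature index as the tuple (q1, q2, ..., qM).
    """
    M = len(sub_codebooks)
    sub_vectors = np.split(descriptor.astype(np.float32), M)
    index = []
    for sub_vec, codebook in zip(sub_vectors, sub_codebooks):
        q = int(np.argmin(np.linalg.norm(codebook - sub_vec, axis=1)))  # nearest codeword
        index.append(q)
    return tuple(index)
```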
And step 130, matching the feature index of the visual feature of the current frame image with the feature index of the visual feature of each training image in an image training set obtained by pre-training, and determining the matching feature pair of the current frame image and each training image.
Here, a matching feature pair refers to two visual features whose feature indices match. For example, if the feature index of a visual feature in the current frame image is 001 and the feature index of a visual feature in a training image is also 001, the two visual features form a matching feature pair.
It should be noted that if the feature index of a visual feature in the current frame image is 001 and several visual features in the training image also have the feature index 001, then one of those visual features is selected to form a matching feature pair with the visual feature in the current frame image.
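By way of example only, matching by feature index can be sketched as a lookup table keyed by the index; taking the first candidate when several training features share an index mirrors the selection described above. All names below are illustrative.

```python
# Illustrative sketch: determining matching feature pairs by identical feature index.
from collections import defaultdict

def match_by_index(current_codes, training_codes):
    """current_codes / training_codes: lists of (feature_id, feature_index) tuples."""
    lookup = defaultdict(list)
    for feat_id, code in training_codes:
        lookup[code].append(feat_id)

    matches = []
    for feat_id, code in current_codes:
        candidates = lookup.get(code)
        if candidates:
            # several training features may share this index; pick one of them
            matches.append((feat_id, candidates[0]))
    return matches  # matching feature pairs between the two images
```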
Wherein the feature index of the visual features of each training image is obtained based on the sub-codebooks.
The sub-codebook is a codebook obtained by dividing the space where the visual features of each training image are located into a plurality of subspaces and training in each subspace.
The codebook refers to k clustering centers obtained by clustering visual features by using a clustering algorithm, each clustering center is called a codeword, and a set of the k clustering centers is called a codebook.
Specifically, the feature index of the visual feature of each training image may be determined in the manner shown in fig. 2:
step 210, obtaining an image training set, and extracting visual features of training images in the image training set.
The method for extracting the visual features of each training image is the same as the method for extracting the visual features of the current frame image, and is not repeated here.
Step 220, dividing the visual features of each training image into M subspaces, and performing cluster analysis in each subspace to obtain M sub codebooks composed of k codewords.
The subspace is a space where the corresponding dimensional subvectors of all visual features of the training image are located.
FIG. 3 is a schematic diagram of training sub-codebooks using all the visual features of each training image. Taking M = 3 as an example, all visual features of each training image are divided into 3 subspaces, and cluster analysis is performed in each subspace to obtain 3 sub-codebooks each composed of k codewords.
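For illustration, the cluster analysis in each subspace could be sketched as follows; the use of scikit-learn's KMeans and the default values M = 3, k = 256 are assumptions of the example, not requirements of the scheme.

```python
# Illustrative sketch: training M sub-codebooks from all training-image descriptors.
import numpy as np
from sklearn.cluster import KMeans

def train_sub_codebooks(all_descriptors, M=3, k=256):
    """all_descriptors: array of shape (N, D), with D divisible by M.

    Returns M sub-codebooks, each an array of k codewords of dimension D // M.
    """
    sub_spaces = np.split(all_descriptors.astype(np.float32), M, axis=1)
    sub_codebooks = []
    for sub in sub_spaces:
        km = KMeans(n_clusters=k, n_init=4, random_state=0).fit(sub)
        sub_codebooks.append(km.cluster_centers_)  # k codewords of this subspace
    return sub_codebooks
```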
And step 230, generating a feature index of the visual features of the training image according to the at least one sub-codebook.
In the embodiment of the application, the feature index of the visual feature of the training image can be generated according to the M sub-codebooks. Fig. 4 is a schematic diagram of a feature index of visual features of a training image generated from M sub-codebooks.
Specifically, a sub-vector of each visual feature of the training image is quantized in each subspace, and the feature index is generated from the quantization results of the M sub-vectors of each visual feature, as shown in formula (1):
index = (q1, q2, ..., qM); (1)
wherein qi is the quantization result of the i-th sub-vector, and index is the feature index of the visual feature of the training image generated according to the M sub-codebooks.
Optionally, in order to further reduce the size of the feature index and improve the matching speed, M − 1 or M − 2 of the sub-codebooks may also be used to generate the feature index.
Step 140, determining the training images with the number of the matching feature pairs larger than a first preset threshold as similar images of the current frame image.
After the matching feature pairs of the current frame image and each training image are determined, the number of matching feature pairs is counted, and the training images for which this number is greater than a first preset threshold are determined as similar images of the current frame image.
The embodiments of the application thus provide an image processing scheme with a new visual feature matching method: the feature index of the visual features of the current frame image is matched with the feature index of the visual features of each training image in a pre-trained image training set, wherein the feature index of the visual features of the current frame image is obtained by dividing the feature vector of the current frame image into a plurality of sub-vectors and quantizing those sub-vectors, the feature index of the visual features of each training image is obtained based on sub-codebooks, and a sub-codebook is a codebook obtained by dividing the space in which the visual features of the training images are located into a plurality of subspaces and training within each subspace. The feature index obtained in this manner greatly reduces the storage scale, thereby improving the matching speed and enabling rapid image recognition.
Optionally, after the visual features of the current frame image are extracted in step 110, the training images may be pre-screened to obtain quasi-similar images of the current frame image.
Specifically, as shown in fig. 5, the method includes the following steps:
step 510, generating a bag-of-words vector of the current frame image according to the visual characteristics of the current frame image.
Specifically, the visual features of the current frame image are first extracted and feature descriptors are constructed for them; the feature descriptors are then clustered by a clustering algorithm (such as the k-means algorithm) to train a codebook. Next, the visual features are quantized by a KNN (K-nearest neighbor) algorithm, and finally an image histogram vector weighted by TF-IDF (term frequency-inverse document frequency), i.e. the BoW vector, is obtained.
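A compact Python sketch of the TF-IDF weighted BoW vector is given below for illustration; the codebook and the idf weights are assumed to have been computed from the training set beforehand.

```python
# Illustrative sketch: building a TF-IDF weighted bag-of-words vector for one image.
import numpy as np

def bow_vector(descriptors, codebook, idf):
    """descriptors: (N, D); codebook: (K, D) codewords; idf: (K,) inverse document frequencies."""
    # quantize each descriptor to its nearest codeword (1-nearest-neighbor assignment)
    dists = np.linalg.norm(descriptors[:, None, :] - codebook[None, :, :], axis=2)
    words = np.argmin(dists, axis=1)
    tf = np.bincount(words, minlength=codebook.shape[0]).astype(np.float32)
    tf /= max(float(tf.sum()), 1.0)          # term frequency
    vec = tf * idf                           # TF-IDF weighting
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec
```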
Step 520, calculating the similarity between the bag-of-words vector of the current frame image and the bag-of-words vector of each training image obtained by pre-training, and determining the similarity between the current frame image and each training image.
The bag-of-words vector of each training image is obtained in the same way as that of the current frame image, which is not repeated here.
In addition, the Euclidean distance or the cosine distance between the two BoW vectors may be calculated as the criterion of similarity between the current frame image and each training image.
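As a small illustration, cosine similarity between two such BoW vectors could be computed as below (Euclidean distance would work analogously).

```python
# Illustrative sketch: cosine similarity between two BoW vectors.
import numpy as np

def bow_similarity(v1, v2):
    denom = float(np.linalg.norm(v1) * np.linalg.norm(v2))
    return float(np.dot(v1, v2)) / denom if denom > 0 else 0.0
```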
In step 530, the training images with similarity greater than the second preset threshold are determined as quasi-similar images.
Therefore, a part of quasi-similar images can be screened out from a large number of training images, and the time consumption of the subsequent visual feature matching process is further shortened.
Based on the above steps 510 to 530, the step 130 may specifically include:
and matching the characteristic index of the visual characteristic of the current frame image with the visual characteristic of the quasi-similar image to determine a matching characteristic pair of the current frame image and the quasi-similar image.
The image processing method can be applied to the technical field of image recognition and tracking.
Optionally, after determining the training images with the number of the matching feature pairs larger than the first preset threshold as similar images of the current frame image, the embodiment of the present application may further include an image tracking step shown in fig. 6:
and step 610, determining a first camera pose from the similar image to the current frame image according to the matching feature pair of the current frame image and the similar image.
In the embodiment of the application, the matching feature point pair of the current frame image and the similar image can be determined according to the matching feature pair of the current frame image and the similar image; that is, each visual feature corresponds to a feature point, and thus a set of matching feature pairs corresponds to a set of matching feature point pairs.
After the matching feature point pairs of the current frame image and the similar image are determined, the first camera pose can be determined according to the 3D coordinates of the matching feature points in the similar image and the 2D coordinates of the matching feature points in the current frame image.
Specifically, the plane in which the matching feature points of the similar image lie is assumed to be the plane z = 0, so that a 2D pixel coordinate (u, v) becomes a 3D coordinate (u, v, 0). The first camera pose T = [R | t] is then calculated from the corresponding 3D-2D matching feature point pairs using a PnP algorithm, where R is the rotation matrix and t is the translation vector.
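For illustration only, the PnP step could be sketched with OpenCV's solvePnP as below; lifting the reference points onto the plane z = 0 follows the assumption above, and the function and variable names are those of the example, not of the claimed method.

```python
# Illustrative sketch: first camera pose T = [R | t] from 3D-2D matching feature point pairs.
import cv2
import numpy as np

def first_camera_pose(pts_2d_similar, pts_2d_current, K):
    """pts_2d_similar: (N, 2) pixel coordinates in the similar (reference) image,
    lifted to 3D by placing them on the plane z = 0; pts_2d_current: (N, 2)."""
    pts_3d = np.hstack([pts_2d_similar, np.zeros((len(pts_2d_similar), 1))])
    ok, rvec, tvec = cv2.solvePnP(pts_3d.astype(np.float32),
                                  pts_2d_current.astype(np.float32),
                                  K.astype(np.float32), None)
    R, _ = cv2.Rodrigues(rvec)   # rotation matrix R from the rotation vector
    return R, tvec               # first camera pose: rotation R and translation t
```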
Step 620, continue to acquire the next frame image of the current frame image.
And step 630, determining the position of the similar image in the next frame of image according to the pose of the first camera.
Step 630 may be implemented as follows:
firstly, projecting the 3D coordinates of the matched feature points in the similar images to the current frame image according to the first camera pose, and determining the 2D coordinates and the 3D coordinates of the matched feature points in the current frame image;
This can be determined according to the following formulas (2) and (3):
P′ = R·P + t; (2)
[u, v, 1]^T = (1/z)·K·P′; (3)
wherein P is the 3D coordinate of a matching feature point in the similar image, P′ is the 3D coordinate of the matching feature point in the current frame image, z is the depth of P′ (its third component), (u, v) is the 2D coordinate of the matching feature point in the current frame image, and K is the camera intrinsic matrix.
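A short sketch of this projection step, following formulas (2) and (3) above, is given for illustration; array shapes are assumptions of the example.

```python
# Illustrative sketch: project the matched 3D points of the similar image into the current frame.
import numpy as np

def project_to_current_frame(P, R, t, K):
    """P: (N, 3) 3D points in the similar image; R, t: first camera pose; K: camera intrinsics."""
    P_prime = (R @ P.T).T + t.reshape(1, 3)   # formula (2): P' = R * P + t
    uv_h = (K @ P_prime.T).T                  # formula (3): z * [u, v, 1]^T = K * P'
    uv = uv_h[:, :2] / uv_h[:, 2:3]           # divide by the depth z
    return uv, P_prime                        # 2D and 3D coordinates in the current frame image
```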
Secondly, determining a second camera pose from the current frame image to the next frame image by using a least square method based on photometric errors according to the 2D coordinates and the 3D coordinates of the matched feature points in the current frame image;
The second camera pose can be determined according to the following formula (4):
T* = argmin over T = (R, t) of Σ_{i=1..n} || I1(pi) − I2( (1/zi)·K·(R·Pi + t) ) ||^2; (4)
wherein T is the second camera pose, Pi is the 3D coordinate of the i-th matching feature point in the current frame image, pi is its 2D coordinate in the current frame image, n is the number of matching feature point pairs, K is the camera intrinsic matrix, R and t are the rotation and translation to be estimated, zi is the (known) depth value in the projection process, and I1(·) and I2(·) are the image gray values at the corresponding points in the current frame image and the next frame image, respectively.
And solving the above formula by using a Gauss Newton method or a Levenberg Marquardt method to obtain the second camera pose from the current frame image to the next frame image.
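Purely as a sketch of how such a minimization might be set up, the example below uses SciPy's Levenberg-Marquardt solver with a Rodrigues parameterization of the pose and nearest-pixel gray-value sampling; a practical implementation would use sub-pixel interpolation and robust handling of points that leave the image. All library choices and names here are assumptions, not the patent's prescription.

```python
# Illustrative sketch: second camera pose by minimizing the photometric error of formula (4).
import cv2
import numpy as np
from scipy.optimize import least_squares

def second_camera_pose(gray_cur, gray_next, pts_2d, pts_3d, K):
    """pts_2d, pts_3d: 2D and 3D coordinates of the matched feature points in the current frame."""
    ref_gray = np.array([float(gray_cur[int(v), int(u)]) for u, v in pts_2d])
    h, w = gray_next.shape

    def residuals(x):
        R, _ = cv2.Rodrigues(x[:3])
        t = x[3:].reshape(3, 1)
        uv_h = (K @ (R @ pts_3d.T + t)).T          # project into the next frame
        uv = uv_h[:, :2] / uv_h[:, 2:3]
        res = []
        for (u, v), g0 in zip(uv, ref_gray):
            ui, vi = int(round(u)), int(round(v))
            if 0 <= ui < w and 0 <= vi < h:
                res.append(float(gray_next[vi, ui]) - g0)  # photometric error of this point
            else:
                res.append(0.0)                            # point projected outside the image
        return np.array(res)

    x0 = np.zeros(6)                                       # start from the identity pose
    sol = least_squares(residuals, x0, method='lm')
    R, _ = cv2.Rodrigues(sol.x[:3])
    return R, sol.x[3:]                                    # second camera pose (R, t)
```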
And thirdly, projecting the 3D coordinates of the matched feature points in the current frame image according to the second camera pose to obtain the projected 2D coordinates of the matched feature points in the current frame image.
And fourthly, determining the position of the similar image in the next frame image according to the projected 2D coordinates of the matched feature points in the current frame image.
After obtaining the projected 2D coordinates of the matching feature points in the current frame image, it may be sequentially determined whether the projected 2D coordinates of each matching feature point in the current frame image are within the image coordinate range of the next frame image; and determining the number of the matching feature points of which the projected 2D coordinates in the current frame image are located in the image coordinate range of the next frame image according to the judgment result.
If the number of matching feature points whose projected 2D coordinates in the current frame image fall within the image coordinate range of the next frame image is too small, the similar image no longer exists in the next frame image and the tracking process ends. At this point, the method can return to acquiring the next frame of image and continue tracking.
And if the number of the matching feature points of the projected 2D coordinates in the current frame image in the image coordinate range of the next frame image is larger than a third preset threshold, indicating that the similar image still exists in the next frame image, and determining the position of the similar image in the next frame image according to the projected 2D coordinates of the matching feature points in the current frame image.
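A minimal sketch of this in-range check and decision is given below for illustration; the threshold value is an input of the example, not a fixed number.

```python
# Illustrative sketch: count projected points inside the next frame and decide whether tracking continues.
import numpy as np

def still_tracked(projected_uv, image_shape, third_preset_threshold):
    """projected_uv: (N, 2) projected 2D coordinates; image_shape: (height, width)."""
    h, w = image_shape
    inside = (projected_uv[:, 0] >= 0) & (projected_uv[:, 0] < w) & \
             (projected_uv[:, 1] >= 0) & (projected_uv[:, 1] < h)
    n_inside = int(np.count_nonzero(inside))
    return n_inside > third_preset_threshold, n_inside
```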
In the embodiment of the application, the tracking and positioning of the image are realized by a least square method.
It should be noted that while the operations of the method of the present invention are depicted in the drawings in a particular order, this does not require or imply that the operations must be performed in this particular order, or that all of the illustrated operations must be performed, to achieve desirable results. Rather, the steps depicted in the flowcharts may change the order of execution. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions.
Further referring to fig. 7, it shows an exemplary structural block diagram of an image processing apparatus provided in an embodiment of the present application.
The device includes:
a feature extraction unit 71, configured to obtain a current frame image acquired by a camera, and extract visual features of the current frame image;
a feature vector generating unit 72, configured to generate a feature vector of the current frame image according to the visual features of the current frame image;
a feature index generating unit 73, configured to divide the feature vector of the current frame image into a plurality of sub-vectors, quantize the plurality of sub-vectors, and generate a feature index of the visual feature of the current frame image;
a matching unit 74, configured to match the feature index of the visual feature of the current frame image with the feature index of the visual feature of each training image in an image training set obtained through pre-training, and determine a matching feature pair between the current frame image and each training image; the feature index of the visual feature of each training image is obtained based on a sub-codebook, wherein the sub-codebook is a codebook obtained by dividing the space where the visual feature of each training image is located into a plurality of subspaces and training in each subspace;
an image recognition unit 75, configured to determine the training images with the number of the matching feature pairs being greater than a first preset threshold as similar images of the current frame image.
Optionally, the apparatus may further include:
a training unit to:
acquiring an image training set, and extracting visual features of training images in the image training set;
dividing the visual features of the training images into M subspaces, and performing cluster analysis in each subspace to obtain the M sub-codebooks each consisting of k codewords;
and generating a feature index of the visual features of the training image according to at least one of the sub-codebooks.
Optionally, the apparatus may further include:
a quasi-similar image determination unit, configured to:
generating a bag-of-words vector of the current frame image according to the visual characteristics of the current frame image;
calculating the similarity between the bag-of-word vector of the current frame image and the bag-of-word vector of each training image obtained by pre-training, and determining the similarity between the current frame image and each training image;
and determining the training images with the similarity larger than a second preset threshold as quasi-similar images.
The matching unit 74 is specifically configured to:
and matching the feature index of the visual feature of the current frame image with the visual feature of the quasi-similar image, and determining a matching feature pair of the current frame image and the quasi-similar image.
Optionally, the apparatus may further include:
the first camera pose determining unit is used for determining a first camera pose from the similar image to the current frame image according to the matching feature pair of the current frame image and the similar image;
the acquisition unit is used for continuously acquiring a next frame image of the current frame image;
and the positioning unit is used for determining the position of the similar image in the next frame of image according to the first camera pose.
Optionally, the first camera pose determining unit is specifically configured to:
determining matching feature point pairs of the current frame image and the similar image according to the matching feature pairs of the current frame image and the similar image;
and determining the first camera pose according to the 3D coordinates of the matched feature points in the similar image and the 2D coordinates of the matched feature points in the current frame image.
Optionally, the positioning unit is specifically configured to:
projecting the 3D coordinates of the matched feature points in the similar image to the current frame image according to the first camera pose, and determining the 2D coordinates and the 3D coordinates of the matched feature points in the current frame image;
determining a second camera pose from the current frame image to the next frame image by using a least square method based on photometric errors according to the 2D coordinates and the 3D coordinates of the matched feature points in the current frame image;
projecting the 3D coordinates of the matched feature points in the current frame image according to the second camera pose to obtain the projected 2D coordinates of the matched feature points in the current frame image;
and determining the position of the similar image in the next frame image according to the projected 2D coordinates of the matched feature points in the current frame image.
Optionally, the method may further include:
a determination unit configured to:
sequentially judging whether the projected 2D coordinates of each matched feature point in the current frame image are located in the image coordinate range of the next frame image;
and determining the number of the matched feature points of which the projected 2D coordinates in the current frame image are located in the image coordinate range of the next frame image according to the judgment result.
The positioning unit is specifically configured to:
and when the number of the matching feature points of the projected 2D coordinates in the current frame image within the image coordinate range of the next frame image is larger than a third preset threshold, determining the position of the similar image in the next frame image according to the projected 2D coordinates of the matching feature points in the current frame image.
It should be understood that the subsystems or units recited in the apparatus 700 correspond to various steps in the method described with reference to fig. 1-6. Thus, the operations and features described above for the method are equally applicable to the apparatus 700 and the units included therein, and are not described in detail here.
Referring now to FIG. 8, shown is a block diagram of a computer system 800 suitable for use in implementing a server according to embodiments of the present application.
As shown in fig. 8, the computer system 800 includes a Central Processing Unit (CPU) 801 that can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 802 or a program loaded from a storage section 808 into a Random Access Memory (RAM) 803. In the RAM 803, various programs and data necessary for the operation of the system 800 are also stored. The CPU 801, ROM 802, and RAM 803 are connected to each other via a bus 804. An input/output (I/O) interface 805 is also connected to bus 804.
The following components are connected to the I/O interface 805: an input section 806 including a keyboard, a mouse, and the like; an output section 807 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), a speaker, and the like; a storage section 808 including a hard disk and the like; and a communication section 809 including a network interface card such as a LAN card, a modem, or the like. The communication section 809 performs communication processing via a network such as the Internet. A drive 810 is also connected to the I/O interface 805 as necessary. A removable medium 811, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 810 as necessary, so that a computer program read therefrom is installed into the storage section 808 as necessary.
In particular, the processes described above with reference to fig. 1-6 may be implemented as computer software programs, according to embodiments of the present disclosure. For example, embodiments of the present disclosure include a computer program product comprising a computer program tangibly embodied on a machine-readable medium, the computer program comprising program code for performing the methods of fig. 1-6. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 809 and/or installed from the removable medium 811.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units or modules described in the embodiments of the present application may be implemented by software or hardware. The described units or modules may also be provided in a processor. The names of these units or modules do not in some cases constitute a limitation of the unit or module itself.
As another aspect, the present application also provides a computer-readable storage medium, which may be the computer-readable storage medium included in the apparatus in the above-described embodiments; or it may be a separate computer readable storage medium not incorporated into the device. The computer readable storage medium stores one or more programs for use by one or more processors in performing the image processing methods described herein.
The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by a person skilled in the art that the scope of the invention as referred to in the present application is not limited to the embodiments with a specific combination of the above-mentioned features, but also covers other embodiments with any combination of the above-mentioned features or their equivalents without departing from the inventive concept. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.

Claims (9)

1. An image processing method, characterized in that the method comprises:
acquiring a current frame image acquired by a camera, and extracting visual features of the current frame image;
dividing the visual features of the current frame image into a plurality of sub-vectors, quantizing the sub-vectors, and generating a feature index of the visual features of the current frame image;
matching the characteristic index of the visual characteristic of the current frame image with the characteristic index of the visual characteristic of each training image in an image training set obtained by training in advance, and determining the matching characteristic pair of the current frame image and each training image; the feature index of the visual feature of each training image is obtained based on a sub-codebook, wherein the sub-codebook is a codebook obtained by dividing the space where the visual feature of each training image is located into a plurality of subspaces and training in each subspace;
determining the training images with the number of the matching feature pairs larger than a first preset threshold value as similar images of the current frame image;
determining a first camera pose from the similar image to the current frame image according to the matching feature pair of the current frame image and the similar image;
continuously acquiring a next frame image of the current frame image;
and determining the position of the similar image in the next frame of image according to the first camera pose.
2. The method of claim 1, wherein the feature index of the visual features of each training image is determined as follows:
acquiring an image training set, and extracting visual features of training images in the image training set;
dividing the visual features of the training images into M subspaces, and performing cluster analysis in each subspace to obtain the M sub-codebooks each consisting of k codewords;
and generating a feature index of the visual features of the training image according to at least one of the sub-codebooks.
3. The method of claim 1, wherein after extracting the visual features of the current frame image, the method further comprises:
generating a bag-of-words vector of the current frame image according to the visual characteristics of the current frame image;
calculating the similarity between the bag-of-word vector of the current frame image and the bag-of-word vector of each training image obtained by pre-training, and determining the similarity between the current frame image and each training image;
determining the training images with the similarity greater than a second preset threshold as quasi-similar images; then
Matching the feature index of the visual feature of the current frame image with the feature index of the visual feature of each training image in an image training set obtained by training in advance, and determining the matching feature pair of the current frame image and each training image:
and matching the feature index of the visual feature of the current frame image with the visual feature of the quasi-similar image, and determining a matching feature pair of the current frame image and the quasi-similar image.
4. The method of claim 1, wherein determining a first camera pose from the similar image to the current frame image based on the matching feature pair of the current frame image and the similar image comprises:
determining matching feature point pairs of the current frame image and the similar image according to the matching feature pairs of the current frame image and the similar image;
and determining the first camera pose according to the 3D coordinates of the matched feature points in the similar image and the 2D coordinates of the matched feature points in the current frame image.
5. The method of claim 1, wherein determining the position of the similar image in the next frame image according to the first camera pose comprises:
projecting the 3D coordinates of the matched feature points in the similar image to the current frame image according to the first camera pose, and determining the 2D coordinates and the 3D coordinates of the matched feature points in the current frame image;
determining a second camera pose from the current frame image to the next frame image by using a least square method based on photometric errors according to the 2D coordinates and the 3D coordinates of the matched feature points in the current frame image;
projecting the 3D coordinates of the matched feature points in the current frame image according to the second camera pose to obtain the projected 2D coordinates of the matched feature points in the current frame image;
and determining the position of the similar image in the next frame image according to the projected 2D coordinates of the matched feature points in the current frame image.
6. The method of claim 5, wherein after obtaining the projected 2D coordinates of the matching feature points in the current frame image, the method further comprises:
sequentially judging whether the projected 2D coordinates of each matched feature point in the current frame image are located in the image coordinate range of the next frame image;
determining the number of matched feature points of which the projected 2D coordinates in the current frame image are located in the image coordinate range of the next frame image according to the judgment result; then
Determining the position of the similar image in the next frame image according to the projected 2D coordinates of the matched feature points in the current frame image, including:
and when the number of the matching feature points of the projected 2D coordinates in the current frame image within the image coordinate range of the next frame image is larger than a third preset threshold, determining the position of the similar image in the next frame image according to the projected 2D coordinates of the matching feature points in the current frame image.
7. An image processing apparatus, characterized in that the apparatus comprises:
the characteristic extraction unit is used for acquiring a current frame image acquired by a camera and extracting visual characteristics of the current frame image;
the feature vector generating unit is used for generating a feature vector of the current frame image according to the visual features of the current frame image;
a feature index generating unit, configured to divide the feature vector of the current frame image into a plurality of sub-vectors, quantize the plurality of sub-vectors, and generate a feature index of the visual feature of the current frame image;
the matching unit is used for matching the characteristic index of the visual characteristic of the current frame image with the characteristic index of the visual characteristic of each training image in an image training set obtained by training in advance and determining the matching characteristic pair of the current frame image and each training image; the feature index of the visual feature of each training image is obtained based on a sub-codebook, wherein the sub-codebook is a codebook obtained by dividing the space where the visual feature of each training image is located into a plurality of subspaces and training in each subspace;
the image identification unit is used for determining the training images with the number of the matching feature pairs larger than a first preset threshold value as similar images of the current frame image;
the first camera pose determining unit is used for determining a first camera pose from the similar image to the current frame image according to the matching feature pair of the current frame image and the similar image;
the acquisition unit is used for continuously acquiring a next frame image of the current frame image;
and the positioning unit is used for determining the position of the similar image in the next frame of image according to the first camera pose.
8. An apparatus, comprising: at least one processor, at least one memory, and computer program instructions stored in the memory that, when executed by the processor, implement the method of any of claims 1-6.
9. A computer-readable storage medium having computer program instructions stored thereon, which when executed by a processor implement the method of any one of claims 1-6.
CN201910011494.6A 2019-01-07 2019-01-07 Image processing method, device, equipment and storage medium Active CN109740674B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910011494.6A CN109740674B (en) 2019-01-07 2019-01-07 Image processing method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910011494.6A CN109740674B (en) 2019-01-07 2019-01-07 Image processing method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN109740674A CN109740674A (en) 2019-05-10
CN109740674B true CN109740674B (en) 2021-01-22

Family

ID=66363613

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910011494.6A Active CN109740674B (en) 2019-01-07 2019-01-07 Image processing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN109740674B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111242230A (en) * 2020-01-17 2020-06-05 腾讯科技(深圳)有限公司 Image processing method and image classification model training method based on artificial intelligence
CN111311758A (en) * 2020-02-24 2020-06-19 Oppo广东移动通信有限公司 Augmented reality processing method and device, storage medium and electronic equipment
CN111703656A (en) * 2020-05-19 2020-09-25 河南中烟工业有限责任公司 Method for correcting orientation of circulating smoke box skin
CN112668632B (en) * 2020-12-25 2022-04-08 浙江大华技术股份有限公司 Data processing method and device, computer equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103440348A (en) * 2013-09-16 2013-12-11 重庆邮电大学 Vector-quantization-based overall and local color image searching method
CN104199923A (en) * 2014-09-01 2014-12-10 中国科学院自动化研究所 Massive image library retrieving method based on optimal K mean value Hash algorithm
CN105426533A (en) * 2015-12-17 2016-03-23 电子科技大学 Image retrieving method integrating spatial constraint information
CN108984642A (en) * 2018-06-22 2018-12-11 西安工程大学 A kind of PRINTED FABRIC image search method based on Hash coding


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"基于SIFT的图像检索技术研究";朱玉滨;《中国优秀硕士论文全文数据库》;20140915;第4.2节 *

Also Published As

Publication number Publication date
CN109740674A (en) 2019-05-10

Similar Documents

Publication Publication Date Title
CN109740674B (en) Image processing method, device, equipment and storage medium
CN107992842B (en) Living body detection method, computer device, and computer-readable storage medium
CN110659582A (en) Image conversion model training method, heterogeneous face recognition method, device and equipment
CN107273458B (en) Depth model training method and device, and image retrieval method and device
US8428397B1 (en) Systems and methods for large scale, high-dimensional searches
US8538164B2 (en) Image patch descriptors
CN112036292B (en) Word recognition method and device based on neural network and readable storage medium
CN108229532B (en) Image recognition method and device and electronic equipment
US8571306B2 (en) Coding of feature location information
US8687892B2 (en) Generating a binary descriptor representing an image patch
JP5591178B2 (en) Method for classifying objects in test images
EP2791869A1 (en) Image classification
WO2016037844A1 (en) Method and apparatus for image retrieval with feature learning
Liu et al. An improved InceptionV3 network for obscured ship classification in remote sensing images
CN114973222B (en) Scene text recognition method based on explicit supervision attention mechanism
CN113095333B (en) Unsupervised feature point detection method and unsupervised feature point detection device
CN108875487A (en) Pedestrian is identified the training of network again and is identified again based on its pedestrian
CN112163114B (en) Image retrieval method based on feature fusion
CN111223128A (en) Target tracking method, device, equipment and storage medium
CN111488810A (en) Face recognition method and device, terminal equipment and computer readable medium
US20170309004A1 (en) Image recognition using descriptor pruning
Mantecón et al. Enhanced gesture-based human-computer interaction through a Compressive Sensing reduction scheme of very large and efficient depth feature descriptors
CN115937567A (en) Image classification method based on wavelet scattering network and ViT
CN108154107A (en) A kind of method of the scene type of determining remote sensing images ownership
DONG et al. Maritime background infrared imagery classification based on histogram of oriented gradient and local contrast features

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant