CN111612063A

CN111612063A - Image matching method, device and equipment and computer readable storage medium

Info

Publication number: CN111612063A
Application number: CN202010429910.7A
Authority: CN
Inventors: 段强; 李锐; 金长新; 王芳
Original assignee: Jinan Inspur Hi Tech Investment and Development Co Ltd
Current assignee: Jinan Inspur Hi Tech Investment and Development Co Ltd
Priority date: 2020-05-20
Filing date: 2020-05-20
Publication date: 2020-09-01

Abstract

The invention discloses an image matching method comprising the following steps: receiving a source image and a target image to be matched; inputting the source image and the target image into an image matching model trained using a geometric constraint algorithm; and performing a similarity operation on the source image and the target image using the image matching model to obtain an image matching result. In the technical scheme provided by the embodiment of the invention, the image matching model is trained using the geometric constraint algorithm, which makes full use of a massive training data set, so that a high-quality, high-efficiency image matching model is obtained that is suitable for image matching in complex and variable application scenarios and tasks and greatly improves image matching efficiency. The invention also discloses an image matching device, apparatus, and storage medium with corresponding technical effects.

Description

Image matching method, device and equipment and computer readable storage medium

Technical Field

The present invention relates to the field of image processing technologies, and in particular, to an image matching method, an image matching device, an image matching apparatus, and a computer-readable storage medium.

Background

The image matching technology is widely applied to various fields in daily life, such as analysis and identification of various medical pictures, remote sensing picture identification, satellite cloud picture identification in weather forecast, fingerprint identification, face identification and the like.

In the traditional image matching approach, rules must be manually designed to extract feature points and to design feature descriptors for image feature extraction and description, as in SIFT, SURF, ORB and the like. The commonly used SIFT, for example, searches for extreme points in scale space and extracts features invariant to viewing angle, scale and rotation to generate a 128-dimensional feature vector, which makes it highly robust. However, as application scenarios and tasks become more complex, the capability bottleneck of such methods becomes more obvious and the room for improvement is small, so they are not suitable for complex and variable application scenarios and tasks.

In summary, how to effectively solve the problems that existing image matching approaches have an obvious capability bottleneck, offer little room for improvement, and cannot be applied to complex and variable application scenarios and tasks is a problem that those skilled in the art currently need to solve.

Disclosure of Invention

The invention aims to provide an image matching method that obtains a high-quality, high-efficiency image matching model, is suitable for image matching in complex and variable application scenarios and tasks, and greatly improves image matching efficiency. Another object of the present invention is to provide an image matching device, an image matching apparatus, and a computer-readable storage medium.

In order to solve the technical problems, the invention provides the following technical scheme:

an image matching method, comprising:

receiving a source image and a target image to be matched;

inputting the source image and the target image into an image matching model trained using a geometric constraint algorithm;

and carrying out similarity operation on the source image and the target image by using the image matching model to obtain an image matching result.

In an embodiment of the present invention, the training process of the image matching model includes:

projecting the pre-collected 3D point cloud data to different 2D planes through different camera view angles to generate a training set containing a plurality of 2D images;

sampling each 2D image in the training set to obtain each sampling point;

generating a corresponding source image block, a target image block and a non-target image block aiming at each sampling point, and constructing a triple comprising the source image block, the target image block and the non-target image block;

minimizing the distance between the source image blocks and the target image blocks and maximizing the distance between the source image blocks and the non-target image blocks for each triplet.

In one embodiment of the present invention, projecting pre-collected 3D point cloud data to different 2D planes through different camera view angles includes:

recovering the 3D point cloud data from pre-acquired 2D image samples by using an SFM algorithm;

and projecting the 3D point cloud data to different 2D planes through different camera view angles.

In a specific embodiment of the present invention, after constructing the triples including the source image blocks, the target image blocks, and the non-target image blocks, before minimizing the distance between the source image blocks and the target image blocks and maximizing the distance between the source image blocks and the non-target image blocks for each of the triples, the method further includes:

respectively calculating the image block similarity of the source image block and the target image block in each triple;

and filtering redundant triples with image block similarity higher than a first preset value.

In a specific embodiment of the present invention, after the calculating the image block similarity of the source image block and the target image block in each triplet, the method further includes:

carrying out mean value calculation on the image block similarity of each group of source image blocks and target image blocks respectively corresponding to each 2D image pair to obtain the image similarity of each 2D image pair;

filtering redundant 2D image pairs with image similarity higher than a second preset value; wherein the 2D image pair comprises a projected source 2D image and a generated target 2D image.

In a specific embodiment of the present invention, after filtering the redundant triple whose image block similarity is higher than the first preset value, the method further includes:

selecting target triples whose image block similarity is higher than a third preset value;

minimizing the distance between the source image block and the target image block, maximizing the distance between the source image block and the non-target image block, for each triplet, comprising:

for each target triplet, the distance between the source image block and the target image block is minimized, and the distance between the source image block and the non-target image block is maximized.

In a specific embodiment of the present invention, after selecting the target triple whose image block similarity is higher than a third preset value, the method further includes:

respectively carrying out multi-scale feature extraction on the source image blocks and the target image blocks in each target triple to obtain each sub-source image block and each sub-target image block;

generating corresponding sub non-target image blocks, and constructing sub triples comprising corresponding sub-source image blocks, sub-target image blocks and sub non-target image blocks;

minimizing the distance between the source image blocks and the target image blocks, maximizing the distance between the source image blocks and the non-target image blocks for each target triplet, comprising:

for each sub-triple, minimizing the distance between the sub-source image block and the sub-target image block, and maximizing the distance between the sub-source image block and the sub-non-target image block.

An image matching device comprising:

the image receiving module is used for receiving a source image and a target image to be matched;

the image input module is used for inputting the source image and the target image into an image matching model trained using a geometric constraint algorithm;

and the matching result obtaining module is used for carrying out similarity operation on the source image and the target image by utilizing the image matching model to obtain an image matching result.

An image matching apparatus comprising:

a memory for storing a computer program;

a processor for implementing the steps of the image matching method as described above when executing the computer program.

A computer-readable storage medium, having stored thereon a computer program which, when being executed by a processor, carries out the steps of the image matching method as set forth above.

By applying the method provided by the embodiment of the invention, a source image and a target image to be matched are received; the source image and the target image are input into an image matching model trained using a geometric constraint algorithm; and a similarity operation is performed on the source image and the target image using the image matching model to obtain an image matching result. Because the image matching model is trained using the geometric constraint algorithm, which makes full use of a massive training data set, a high-quality, high-efficiency image matching model is obtained that is suitable for image matching in complex and variable application scenarios and tasks and greatly improves image matching efficiency.

Correspondingly, the embodiment of the invention also provides an image matching device, apparatus, and computer-readable storage medium corresponding to the image matching method, which have the same technical effects and are not described again here.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is a flow chart of an implementation of an image matching method according to an embodiment of the present invention;

FIG. 2 is a flow chart of another embodiment of an image matching method according to the present invention;

FIG. 3 is a schematic diagram illustrating a similarity calculation vector of an image block according to an embodiment of the present invention;

FIG. 4 is a block diagram of an image matching device according to an embodiment of the present invention;

FIG. 5 is a block diagram of an image matching apparatus according to an embodiment of the present invention.

Detailed Description

In order that those skilled in the art will better understand the disclosure, the invention will be described in further detail with reference to the accompanying drawings and specific embodiments. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Embodiment one:

Referring to FIG. 1, FIG. 1 is a flowchart of an implementation of an image matching method according to an embodiment of the present invention. The method may include the following steps:

S101: Receiving a source image and a target image to be matched.

When a pre-stored source image needs to be matched against a currently acquired target image, the source image and the target image to be matched are sent to an image matching center, which receives them. The source image may be, for example, a pre-stored fingerprint image or a pre-stored face image, and the corresponding target image may be a currently acquired fingerprint image or a currently acquired face image.

S102: Inputting the source image and the target image into an image matching model trained using a geometric constraint algorithm.

The image matching model is trained in advance using the geometric constraint algorithm. That is, 3D point cloud data is obtained and projected onto different planes by changing the camera view angle to obtain a training set containing a plurality of 2D images; each 2D image is sampled; a source image block (patch) and a target image block are generated for each sampling point; the source image blocks and target image blocks are filtered using the geometric constraint algorithm; and the image matching model is obtained by training. After the source image and the target image to be matched are received, they are input into the image matching model trained using the geometric constraint algorithm.

S103: Performing a similarity operation on the source image and the target image using the image matching model to obtain an image matching result.

After the source image and the target image are input into the image matching model trained using the geometric constraint algorithm, a similarity operation is performed on them using the image matching model to obtain an image matching result. Because the image matching model is trained using the geometric constraint algorithm, which makes full use of a massive training data set, a high-quality, high-efficiency image matching model is obtained that is suitable for image matching in complex and variable application scenarios and tasks and greatly improves image matching efficiency.

It should be noted that, based on the first embodiment, the embodiment of the present invention further provides a corresponding improvement scheme. In the following embodiments, steps that are the same as or correspond to those in the first embodiment may be referred to each other, and corresponding advantageous effects may also be referred to each other, which are not described in detail in the following modified embodiments.

Embodiment two:

Referring to FIG. 2, FIG. 2 is a flowchart of another implementation of an image matching method according to an embodiment of the present invention. The method may include the following steps:

S201: Recovering 3D point cloud data from pre-acquired 2D image samples using an SFM algorithm.

When the image matching model is trained, 3D point cloud data is recovered from pre-acquired 2D image samples using an SFM algorithm. That is, when the 3D projection matrix is calculated from points sampled in the 2D image samples, the recovered 3D point cloud data is retained only when the error between the 2D back-projection of the non-sampled matching points and the original image is less than a certain threshold. Alternatively, the 3D point cloud data can be obtained through 3D reconstruction.

S202: Projecting the 3D point cloud data to different 2D planes through different camera view angles to generate a training set containing a plurality of 2D images.

The 3D point cloud data is projected to different 2D planes through different camera view angles to generate a training set containing a plurality of 2D images, that is, a large number of 2D candidate training images are obtained.
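The projection in step S202 can be illustrated with a minimal sketch. The patent does not specify a camera model, so a simple pinhole camera is assumed here; the function names `rotate_y` and `project_points`, the focal length, the camera distance, and the use of a rotation about the y-axis to stand in for "different camera view angles" are all illustrative assumptions.

```python
import math

def rotate_y(point, angle_rad):
    """Rotate a 3D point about the y-axis (stands in for changing the view angle)."""
    x, y, z = point
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return (c * x + s * z, y, -s * x + c * z)

def project_points(points_3d, angle_rad, focal=1.0, camera_distance=5.0):
    """Project 3D points onto the 2D image plane of an assumed pinhole camera.

    The camera looks along +z from `camera_distance` units away; points that
    end up at or behind the camera plane are skipped.
    """
    projected = []
    for p in points_3d:
        x, y, z = rotate_y(p, angle_rad)
        depth = z + camera_distance
        if depth <= 0:
            continue
        projected.append((focal * x / depth, focal * y / depth))
    return projected

# Toy point cloud rendered from three assumed view angles.
cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 1.0)]
views = {deg: project_points(cloud, math.radians(deg)) for deg in (0, 30, 60)}
```

Each entry of `views` plays the role of one 2D candidate training image; a real pipeline would rasterize textured points rather than keep bare coordinates.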

S203: Sampling each 2D image in the training set to obtain sampling points.

S204: Generating, for each sampling point, a corresponding source image block, target image block and non-target image block, and constructing a triple comprising the source image block, the target image block and the non-target image block.

S205: Calculating, for each triple, the image block similarity of the source image block and the target image block.

The image block similarity of the source image block and the target image block in each triple is calculated. Referring to FIG. 3, the image block similarity of each group of source image blocks and target image blocks may be calculated by the following formula:

$$S_{patch} = S_1 S_2 = g(\angle C_i P C_j, \sigma_1)\, g(\angle C_i P P_n - \angle C_j P P_n, \sigma_2)$$

where $S_1$ represents a source image block, $S_2$ represents a target image block, $C_i$ and $C_j$ respectively represent the centers of the two cameras, $P_n$ represents the normal vector of the 3D point $P$ on the reconstruction plane, $\angle$ represents the included angle at the middle point between the head and tail vectors, and $\sigma_1$ and $\sigma_2$ are set according to empirical values.
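As a rough illustration of how this similarity could be evaluated, the sketch below computes both angle terms from the camera centers, the 3D point, and its normal. The patent does not define the function g; a Gaussian-shaped weighting g(α, σ) = exp(−α² / (2σ²)) is assumed here, and the default σ values and all function names are hypothetical.

```python
import math

def _sub(a, b):
    """Component-wise difference a - b of two 3D tuples."""
    return tuple(x - y for x, y in zip(a, b))

def angle_between(u, v):
    """Angle in radians between two 3D vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    cos_angle = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cos_angle)

def g(alpha, sigma):
    """Assumed Gaussian-shaped weighting; not defined in the source text."""
    return math.exp(-(alpha ** 2) / (2 * sigma ** 2))

def patch_similarity(c_i, c_j, p, normal,
                     sigma1=math.radians(15), sigma2=math.radians(20)):
    """S_patch = g(angle C_i P C_j, sigma1) * g(angle C_i P P_n - angle C_j P P_n, sigma2)."""
    ray_i = _sub(c_i, p)  # vector from the 3D point P to camera center C_i
    ray_j = _sub(c_j, p)  # vector from the 3D point P to camera center C_j
    term1 = g(angle_between(ray_i, ray_j), sigma1)
    term2 = g(angle_between(ray_i, normal) - angle_between(ray_j, normal), sigma2)
    return term1 * term2

same_view = patch_similarity((0, 0, 5), (0, 0, 5), (0, 0, 0), (0, 0, 1))
wide_baseline = patch_similarity((0, 0, 5), (5, 0, 5), (0, 0, 0), (0, 0, 1))
```

Under this assumed g, two identical viewpoints give the maximum similarity, and the value decays as the viewpoints diverge, which is the behavior the later filtering steps rely on.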

S206: Filtering out redundant triples whose image block similarity is higher than a first preset value.

A first preset value for the image block similarity can be set in advance, and redundant source image blocks and target image blocks whose image block similarity is higher than the first preset value are filtered out, which ensures the effectiveness of the model training samples.

S207: Averaging the image block similarities of the groups of source image blocks and target image blocks corresponding to each 2D image pair to obtain the image similarity of each 2D image pair, and filtering out redundant 2D image pairs whose image similarity is higher than a second preset value.

The 2D image pair comprises the projected source 2D image and the generated target 2D image.

After the image block similarity of each group of source image blocks and target image blocks is obtained, the image block similarities of the groups corresponding to each 2D image pair are averaged to obtain the image similarity S_image of that 2D image pair, where each 2D image pair comprises the projected source 2D image and the generated target 2D image. A second preset value for the image similarity, for example 85%, can be set in advance, and redundant 2D image pairs whose image similarity is higher than the second preset value are filtered out.
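A minimal sketch of this image-level filter follows. The 85% threshold is the example value given in the text; the dictionary layout, the pair identifiers, and the function names are illustrative assumptions.

```python
from statistics import mean

def image_similarity(patch_similarities):
    """S_image: mean of the patch similarities belonging to one 2D image pair."""
    return mean(patch_similarities)

def filter_redundant_pairs(pairs, second_preset=0.85):
    """Keep only image pairs whose mean patch similarity does not exceed the threshold.

    `pairs` maps a pair identifier to the list of patch similarities for that pair.
    """
    return {
        pair_id: sims
        for pair_id, sims in pairs.items()
        if image_similarity(sims) <= second_preset
    }

pairs = {
    "pair_a": [0.90, 0.95, 0.88],  # nearly identical views, treated as redundant
    "pair_b": [0.40, 0.70, 0.60],  # sufficiently different views, kept
}
kept = filter_redundant_pairs(pairs)
```

Here `pair_a` averages above 0.85 and is dropped, while `pair_b` remains in the training set.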

S208: Selecting target triples whose image block similarity is higher than a third preset value.

A third preset value for the image block similarity is set in advance, and target triples whose image block similarity is higher than the third preset value are selected, which ensures the validity of the sampling points corresponding to each target triple and speeds up the convergence of model training.
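Steps S206 and S208 together keep only triples whose image block similarity lies in a band: above the third preset value but not above the first. The sketch below shows that selection; the `Triple` container, the string placeholders for image blocks, and the threshold values 0.9 and 0.3 are hypothetical and not taken from the patent.

```python
from dataclasses import dataclass

@dataclass
class Triple:
    source: str        # placeholder for the source image block
    target: str        # placeholder for the target image block
    non_target: str    # placeholder for the non-target image block
    similarity: float  # image block similarity of source and target

def select_target_triples(triples, first_preset, third_preset):
    """Drop redundant triples (similarity > first_preset), then keep those
    whose similarity is higher than third_preset."""
    return [t for t in triples if third_preset < t.similarity <= first_preset]

triples = [
    Triple("s0", "t0", "n0", 0.97),  # too similar: filtered as redundant (S206)
    Triple("s1", "t1", "n1", 0.65),  # within the band: selected (S208)
    Triple("s2", "t2", "n2", 0.10),  # too dissimilar: not selected (S208)
]
targets = select_target_triples(triples, first_preset=0.9, third_preset=0.3)
```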

S209: For each target triple, minimizing the distance between the source image block and the target image block, and maximizing the distance between the source image block and the non-target image block.

For each target triple, the distance between the source image block and the target image block is minimized, and the distance between the source image block and the non-target image block is maximized.
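One common way to express this objective is a margin-based triplet loss. The patent does not name a specific loss function, so the hinge form and the `margin` parameter below are assumptions, and the short vectors stand in for the descriptors a network would produce for each image block.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two descriptor vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(source, target, non_target, margin=1.0):
    """Hinge-style triplet loss (assumed form).

    The loss is zero once the source/non-target distance exceeds the
    source/target distance by at least `margin`; otherwise it is positive,
    pushing the optimizer to shrink the first distance and grow the second.
    """
    d_pos = euclidean(source, target)
    d_neg = euclidean(source, non_target)
    return max(0.0, d_pos - d_neg + margin)

easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])  # negative already far away
hard = triplet_loss([0.0, 0.0], [0.0, 1.0], [0.0, 1.5])  # negative too close
```

In this toy case `easy` contributes no gradient, while `hard` is positive and would drive the descriptors apart during training.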

S210: Receiving a source image and a target image to be matched.

S211: Inputting the source image and the target image into an image matching model trained using a geometric constraint algorithm.

S212: Performing a similarity operation on the source image and the target image using the image matching model to obtain an image matching result.

A similarity operation is performed on the source image and the target image using the image matching model to obtain an image matching result. Specifically, the source image and the target image are input into the image matching model trained using the geometric constraint algorithm to obtain two sets of 128-dimensional feature vectors, and whether the source image and the target image are the same is determined by comparing the Euclidean distance or cosine distance between the two sets of vectors against a set threshold.
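The threshold comparison can be sketched as follows. The descriptors would be 128-dimensional outputs of the trained model; short toy vectors are used here for readability, and the function names and threshold values are illustrative assumptions.

```python
import math

def l2_distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    """Cosine distance, defined here as 1 minus the cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def is_match(desc_a, desc_b, threshold, metric="euclidean"):
    """Declare a match when the chosen distance is below the threshold."""
    if metric == "euclidean":
        distance = l2_distance(desc_a, desc_b)
    elif metric == "cosine":
        distance = cosine_distance(desc_a, desc_b)
    else:
        raise ValueError(f"unknown metric: {metric}")
    return distance < threshold

close = is_match([1.0, 0.0], [1.0, 0.1], threshold=0.5)
far = is_match([1.0, 0.0], [0.0, 1.0], threshold=0.5)
same_direction = is_match([1.0, 0.0], [2.0, 0.0], threshold=0.1, metric="cosine")
```

Cosine distance ignores vector magnitude, so it behaves like Euclidean distance on normalized descriptors; which metric and threshold work best would have to be tuned on validation data.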

The trained image matching model has a network structure based on L2-Net. L2-Net is a convolutional neural network (CNN) comprising six 3x3 convolutional layers with batch normalization and one 8x8 convolutional layer with batch normalization, and a local response normalization (LRN) layer outputs a 128-dimensional feature vector. Convolutional neural networks for local feature extraction mainly comprise a detector and a descriptor; each subtask is an independent convolutional neural network, and these networks can be unified into an end-to-end framework for training, supplemented by loss calculation during model training.

In an embodiment of the present invention, after step S208, the following steps may be further included:

Step one: performing multi-scale feature extraction on the source image block and the target image block in each target triple to obtain sub-source image blocks and sub-target image blocks.

Step two: generating corresponding sub-non-target image blocks, and constructing sub-triples each comprising a corresponding sub-source image block, sub-target image block and sub-non-target image block.

Correspondingly, step S209 may include the following step:

for each sub-triple, minimizing the distance between the sub-source image block and the sub-target image block, and maximizing the distance between the sub-source image block and the sub-non-target image block.

For convenience of description, the above three steps may be combined for illustration.

After the triples are filtered to obtain the target triples, multi-scale feature extraction is performed on each generated source image block and target image block to obtain the sub-source image blocks and sub-target image blocks, the corresponding sub-non-target image blocks are generated, and sub-triples each comprising a corresponding sub-source image block, sub-target image block and sub-non-target image block are constructed. For each sub-triple, the distance between the sub-source image block and the sub-target image block is minimized and the distance between the sub-source image block and the sub-non-target image block is maximized, and the image matching model is obtained by training. Specifically, a cascaded structure can be used: the image block input to the first network is cropped from its center into a smaller image block, which is scaled to the same size and input to the second network, and the outputs of the multiple cascaded networks are combined into one feature vector, thereby implementing multi-scale feature extraction.
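The crop-and-rescale idea can be sketched on a small grayscale patch stored as a list of rows. Nearest-neighbor resizing and the `toy_descriptor` function, which merely stands in for one network branch, are assumptions made for illustration; a real implementation would feed both inputs to trained CNN branches.

```python
def center_crop(patch, size):
    """Cut a size x size block out of the center of a 2D patch."""
    height, width = len(patch), len(patch[0])
    top, left = (height - size) // 2, (width - size) // 2
    return [row[left:left + size] for row in patch[top:top + size]]

def resize_nearest(patch, size):
    """Scale a 2D patch to size x size with nearest-neighbor sampling."""
    height, width = len(patch), len(patch[0])
    return [
        [patch[r * height // size][c * width // size] for c in range(size)]
        for r in range(size)
    ]

def toy_descriptor(patch):
    """Hypothetical stand-in for a network branch: mean and value range."""
    flat = [value for row in patch for value in row]
    return [sum(flat) / len(flat), max(flat) - min(flat)]

def multi_scale_descriptor(patch, crop_size):
    """Describe the full patch and its rescaled central crop, then concatenate."""
    inner = resize_nearest(center_crop(patch, crop_size), len(patch))
    return toy_descriptor(patch) + toy_descriptor(inner)

patch = [[r * 4 + c for c in range(4)] for r in range(4)]  # 4x4 test patch, values 0..15
inner = resize_nearest(center_crop(patch, 2), 4)
descriptor = multi_scale_descriptor(patch, crop_size=2)
```

The concatenated vector carries information from two scales of the same location, which is the purpose of combining the outputs of the cascaded networks.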

Corresponding to the above method embodiments, the embodiments of the present invention further provide an image matching device. The image matching device described below and the image matching method described above may be referred to in correspondence with each other.

Referring to FIG. 4, FIG. 4 is a block diagram of an image matching device according to an embodiment of the present invention. The device may include:

an image receiving module 41, configured to receive a source image and a target image to be matched;

an image input module 42, configured to input the source image and the target image into an image matching model trained by using a geometric constraint algorithm;

and a matching result obtaining module 43, configured to perform similarity calculation on the source image and the target image by using the image matching model, so as to obtain an image matching result.

By applying the device provided by the embodiment of the invention, a source image and a target image to be matched are received; the source image and the target image are input into an image matching model trained using a geometric constraint algorithm; and a similarity operation is performed on the source image and the target image using the image matching model to obtain an image matching result. Because the image matching model is trained using the geometric constraint algorithm, which makes full use of a massive training data set, a high-quality, high-efficiency image matching model is obtained that is suitable for image matching in complex and variable application scenarios and tasks and greatly improves image matching efficiency.

In one embodiment of the present invention, the device comprises a model training module, the model training module comprising:

the training set generating unit is used for projecting the pre-collected 3D point cloud data to different 2D planes through different camera view angles to generate a training set containing a plurality of 2D images;

the sampling point obtaining unit is used for sampling each 2D image in the training set to obtain each sampling point;

the triple construction unit is used for generating a corresponding source image block, a target image block and a non-target image block aiming at each sampling point and constructing a triple comprising the source image block, the target image block and the non-target image block;

and the training unit is used for minimizing the distance between the source image block and the target image block and maximizing the distance between the source image block and the non-target image block for each triplet.

In one embodiment of the present invention, the training set generating unit includes:

the point cloud data acquisition subunit is used for recovering the pre-acquired 2D image sample to obtain 3D point cloud data by utilizing an SFM algorithm;

and the projection subunit is used for projecting the 3D point cloud data to different 2D planes through different camera view angles.

In an embodiment of the present invention, the model training module may further include:

the image block similarity calculation unit is used for respectively calculating the image block similarity of the source image block and the target image block in each triple, after the triples comprising the source image block, the target image block and the non-target image block are constructed and before, for each triple, the distance between the source image block and the target image block is minimized and the distance between the source image block and the non-target image block is maximized;

and the triple filtering unit is used for filtering the redundant triples of which the image block similarity is higher than a first preset value.

the image similarity calculation unit is used for performing mean value calculation on the image block similarity of each group of source image blocks and target image blocks corresponding to each 2D image pair respectively to obtain the image similarity of each 2D image pair;

the image filtering unit is used for filtering the redundant 2D image pair with the image similarity higher than a second preset value; wherein the 2D image pair comprises the projected source 2D image and the generated target 2D image.

the triple selecting unit is used for selecting a target triple with the image block similarity higher than a third preset value after filtering the redundant triple with the image block similarity higher than the first preset value;

the training unit is specifically a unit that minimizes the distance between the source image blocks and the target image blocks and maximizes the distance between the source image blocks and the non-target image blocks for each target triplet.

the characteristic extraction unit is used for respectively carrying out multi-scale characteristic extraction on the source image blocks and the target image blocks in each target triple after selecting the target triple of which the image block similarity is higher than a third preset value to obtain each sub-source image block and each sub-target image block;

the sub-triple constructing unit is used for generating corresponding sub non-target image blocks and constructing each sub-triple comprising corresponding sub-source image blocks, sub-target image blocks and sub non-target image blocks;

the training unit is specifically a unit for minimizing the distance between the image block of the sub-source and the image block of the sub-target and maximizing the distance between the image block of the sub-source and the image block of the sub-non-target for each sub-triple.

Corresponding to the above method embodiment, referring to FIG. 5, FIG. 5 is a schematic diagram of an image matching apparatus provided by the present invention, which may include:

a memory 51 for storing a computer program;

a processor 52 which, when executing the computer program stored in the memory 51, may implement the following steps:

receiving a source image and a target image to be matched; inputting a source image and a target image into an image matching model obtained by utilizing a geometric constraint algorithm training; and performing similarity operation on the source image and the target image by using the image matching model to obtain an image matching result.

For the introduction of the device provided by the present invention, please refer to the above method embodiment, which is not described herein again.

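Corresponding to the above method embodiment, the present invention further provides a computer-readable storage medium having a computer program stored thereon, the computer program, when executed by a processor, implementing the steps of:

receiving a source image and a target image to be matched; inputting the source image and the target image into an image matching model trained using a geometric constraint algorithm; and performing a similarity operation on the source image and the target image using the image matching model to obtain an image matching result.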

The computer-readable storage medium may include various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.

For the introduction of the computer-readable storage medium provided by the present invention, please refer to the above method embodiments, which are not described herein again.

The embodiments are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same or similar parts among the embodiments are referred to each other. The device, the apparatus and the computer-readable storage medium disclosed in the embodiments correspond to the method disclosed in the embodiments, so that the description is simple, and the relevant points can be referred to the description of the method.

The principle and the implementation of the present invention are explained in the present application by using specific examples, and the above description of the embodiments is only used to help understanding the technical solution and the core idea of the present invention. It should be noted that, for those skilled in the art, it is possible to make various improvements and modifications to the present invention without departing from the principle of the present invention, and those improvements and modifications also fall within the scope of the claims of the present invention.

Claims

1. An image matching method, comprising:

receiving a source image and a target image to be matched;

2. The image matching method according to claim 1, wherein the training process of the image matching model comprises:

sampling each 2D image in the training set to obtain each sampling point;

3. The image matching method of claim 2, wherein projecting pre-collected 3D point cloud data through different camera views to different 2D planes comprises:

restoring the pre-acquired 2D image sample by utilizing an SFM algorithm to obtain the 3D point cloud data;

4. The image matching method of claim 2, wherein after constructing the triples comprising the source image blocks, the target image blocks, and the non-target image blocks, before minimizing the distance between the source image blocks and the target image blocks and maximizing the distance between the source image blocks and the non-target image blocks for each of the triples, further comprising:

5. The image matching method according to claim 4, further comprising, after separately calculating the image block similarities of the source image block and the target image block in each of the triples:

6. The image matching method according to claim 4 or 5, wherein after filtering the redundant triples with image block similarity higher than the first preset value, the method further comprises:

7. The image matching method according to claim 6, wherein after selecting the target triple whose image block similarity is higher than a third preset value, the method further comprises:

8. An image matching apparatus, characterized by comprising:

9. An image matching apparatus characterized by comprising:

a memory for storing a computer program;

a processor for implementing the steps of the image matching method as claimed in any one of claims 1 to 7 when executing the computer program.

10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, which computer program, when being executed by a processor, carries out the steps of the image matching method according to any one of claims 1 to 7.