CN111260794A - Outdoor augmented reality application method based on cross-source image matching - Google Patents

Outdoor augmented reality application method based on cross-source image matching

Info

Publication number
CN111260794A
Authority
CN
China
Prior art keywords
image, local, camera image, cross, augmented reality
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010034538.XA
Other languages
Chinese (zh)
Other versions
CN111260794B (en)
Inventor
王程
刘伟权
卞学胜
沈雪仑
赖柏锜
李渊
李永川
贾宏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen University
Original Assignee
Xiamen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen University
Priority to CN202010034538.XA
Publication of CN111260794A
Application granted
Publication of CN111260794B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00Manipulating 3D models or images for computer graphics
    • G06T19/006Mixed reality
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/003D [Three Dimensional] image rendering
    • G06T15/005General purpose rendering architectures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/20Finite element generation, e.g. wire-frame surface description, tesselation

Abstract

The invention provides an outdoor augmented reality application method based on cross-source image matching, which comprises the following steps: acquiring a camera image and a rendering image correspondingly matched with the camera image, and processing the camera image and the rendering image to acquire a local camera image block and a local rendering image block which are matched in pairs; constructing a deep learning model according to the automatic coding machine and the twin network, and training the deep learning model; extracting feature descriptors of local camera image blocks and local rendering image blocks to be matched based on a trained deep learning model, and performing cross-source image matching on the local camera image blocks and the local rendering image blocks to be matched according to the extracted feature descriptors to obtain a cross-source image matching result; acquiring a corresponding relation of the cross-source images according to the cross-source image matching result, and calculating a virtual-real registration transformation relation according to the corresponding relation; and the application of outdoor augmented reality is realized according to the virtual-real registration transformation relation, so that the augmented reality effect is improved.

Description

Outdoor augmented reality application method based on cross-source image matching
Technical Field
The invention relates to the technical field of outdoor augmented reality, in particular to an outdoor augmented reality application method based on cross-source image matching, a computer readable storage medium and computer equipment.
Background
In the related art, augmented reality applications mainly focus on indoor scenes, where virtual-real registration is assisted by pre-placed markers. In outdoor scenes, however, pre-placing markers is impractical because of the increased scale and complexity of the scene. Most outdoor augmented reality applications are therefore based on sensor positioning and vision methods and are mainly applied to static scenes, and the fusion accuracy of multiple sensors is not robust to illumination changes and occlusion, which degrades the augmented reality effect.
Disclosure of Invention
The present invention is directed to solving, at least to some extent, one of the technical problems in the art described above. Therefore, one objective of the present invention is to provide an outdoor augmented reality application method based on cross-source image matching, in which a corresponding relationship is obtained by matching the cross-source images, and a virtual-real registration transformation relationship is obtained according to the corresponding relationship, so as to improve an augmented reality effect.
A second object of the invention is to propose a computer-readable storage medium.
A third object of the invention is to propose a computer device.
In order to achieve the above object, an embodiment of a first aspect of the present invention provides an outdoor augmented reality application method based on cross-source image matching, including the following steps: acquiring a camera image and a rendering image correspondingly matched with the camera image, and processing the camera image and the rendering image to acquire a local camera image block and a local rendering image block which are matched in pairs; constructing a deep learning model according to an automatic coding machine and a twin network, and training the deep learning model according to paired matched local camera image blocks and local rendering image blocks; extracting feature descriptors of local camera image blocks and local rendering image blocks to be matched based on a trained deep learning model, and performing cross-source image matching on the local camera image blocks and the local rendering image blocks to be matched according to the extracted feature descriptors to obtain a cross-source image matching result; acquiring a corresponding relation of the cross-source images according to the cross-source image matching result, and calculating a virtual-real registration transformation relation according to the corresponding relation; and realizing the application of outdoor augmented reality according to the virtual-real registration transformation relation.
According to the outdoor augmented reality application method based on cross-source image matching of the embodiment of the invention, a camera image and a rendering image correspondingly matched with the camera image are first acquired, and the camera image and the rendering image are processed to obtain pair-matched local camera image blocks and local rendering image blocks. A deep learning model is then constructed according to an automatic coding machine and a twin network and trained on the pair-matched local camera image blocks and local rendering image blocks. Feature descriptors of the local camera image blocks and local rendering image blocks to be matched are extracted with the trained deep learning model, and cross-source image matching is performed on them according to the extracted feature descriptors to obtain a cross-source image matching result. The corresponding relation of the cross-source images is then obtained according to the cross-source image matching result, the virtual-real registration transformation relation is calculated according to the corresponding relation, and finally the application to outdoor augmented reality is realized according to the virtual-real registration transformation relation. In this way, the corresponding relation is obtained by matching the cross-source images and the virtual-real registration transformation relation is derived from it, so that the augmented reality effect is improved.
In addition, the outdoor augmented reality application method based on cross-source image matching proposed according to the above embodiment of the present invention may further have the following additional technical features:
optionally, acquiring a camera image and a rendered image correspondingly matched with the camera image includes: acquiring a camera image; acquiring an aerial image, and performing three-dimensional reconstruction on the aerial image by adopting an SFM algorithm to obtain a three-dimensional image point cloud of an outdoor scene; and acquiring image information according to the camera image, and rendering a rendering image which is correspondingly matched with the camera image in the three-dimensional image point cloud according to the image information.
Optionally, processing the camera image and the rendered image to obtain pairs of matched local camera tiles and local rendered tiles includes: acquiring a perspective transformation matrix of the camera image and the rendered image; labeling the segmented sample in the camera image with a LabelMe toolkit; constructing a segmentation network, and training the segmentation network according to the marked camera image; segmenting the camera image based on the trained segmentation network to segment a segmentation sample of the camera image; extracting all key points of the segmentation sample by using a detector with scale-invariant feature transformation, selecting a plurality of key points from all key points so that the distance between each selected key point is greater than a first preset threshold value, and deleting other unselected key points; and taking the selected multiple key points as a center, acquiring corresponding local camera image blocks according to a preset size, and mapping the local camera image blocks onto the rendered image according to the perspective transformation matrix to acquire the corresponding local rendered image blocks.
Optionally, the deep learning model comprises: an encoder, a decoder and an STN block.
Optionally, when the deep learning model is trained according to the pair-matched local camera image block and local rendering image block, the method further includes: and adjusting the optimizer and the hyper-parameters according to the training requirements of the deep learning model, wherein the hyper-parameters comprise a learning step length, a learning rate and a batch size.
Optionally, performing cross-source image matching on the local camera image block to be matched and the local rendering image block according to the extracted feature descriptors, including: acquiring a feature descriptor of a corresponding local rendering image block meeting a first preset condition by using a nearest neighbor retrieval method and taking the feature descriptor of the local camera image block as a reference; and filtering error matching by adopting a RANSAC algorithm according to the retrieved feature descriptors of the matched local camera image blocks and the feature descriptors of the local rendering image blocks, and calculating the central points of the remaining paired matched local camera image blocks and local rendering image blocks to obtain the matching relationship between the local camera image blocks and the local rendering image blocks and obtain the cross-source image matching result.
Optionally, calculating a virtual-real registration transformation relationship according to the correspondence includes: acquiring, according to the image information, the projection matrix P from the three-dimensional image point cloud M to the rendered image R_I corresponding to the camera image C_I, namely P·M → R_I; acquiring the transformation relation T between the camera image C_I and the corresponding rendered image R_I, namely T·R_I → C_I; and obtaining, according to the projection relation and the matching relation, the virtual-real registration transformation relation from the three-dimensional image point cloud M to the camera image C_I, namely the transformation relation between the three-dimensional space and the two-dimensional space, T·(P·M) → C_I.
Optionally, the implementing of the application to outdoor augmented reality according to the virtual-real registration transformation relationship includes: acquiring the position of a three-dimensional virtual target to be superposed in an outdoor scene; placing the three-dimensional virtual target into a three-dimensional image point cloud; and mapping the three-dimensional virtual target to the camera image according to the virtual-real registration transformation relation.
To achieve the above object, a second embodiment of the present invention provides a computer-readable storage medium, on which an outdoor augmented reality application based on cross-source image matching is stored, and when executed by a processor, the outdoor augmented reality application based on cross-source image matching implements the outdoor augmented reality application method based on cross-source image matching as described above.
According to the computer-readable storage medium of the embodiment of the invention, the outdoor augmented reality application program based on cross-source image matching is stored, so that the processor realizes the outdoor augmented reality application method based on cross-source image matching when the outdoor augmented reality application program based on cross-source image matching is executed, and the effect of augmented reality is improved.
In order to achieve the above object, a third embodiment of the present invention provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the method for applying outdoor augmented reality based on cross-source image matching as described above is implemented.
According to the computer device of the embodiment of the invention, the computer program which can run on the processor is stored through the memory, so that the processor can realize the outdoor augmented reality application method based on cross-source image matching when executing the computer program, and the augmented reality effect is improved.
Drawings
Fig. 1 is a schematic flowchart of an outdoor augmented reality application method based on cross-source image matching according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a three-dimensional image point cloud result of an outdoor scene obtained after an aerial image is three-dimensionally reconstructed by an SFM algorithm according to an embodiment of the invention;
FIG. 3 is a schematic diagram of acquiring a rendered image corresponding to a match of a camera image according to one embodiment of the invention;
FIG. 4 is a schematic diagram of a partitioned network according to one embodiment of the present invention;
FIG. 5 shows pair-matched cross-source image blocks according to one embodiment of the invention;
FIG. 6 is a schematic structural diagram of a deep learning model according to an embodiment of the present invention;
FIG. 7 is a block diagram of deep learning model branch 1 according to an embodiment of the present invention;
FIG. 8 is a block diagram of deep learning model branch 2 according to an embodiment of the present invention;
FIG. 9 is a cross-source image matching result according to one embodiment of the invention;
FIG. 10 shows the cross-source image matching result with the image block center points connected by lines according to one embodiment of the present invention;
fig. 11 is a diagram illustrating an effect of outdoor augmented reality according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or to elements having the same or similar functions throughout. The embodiments described below with reference to the drawings are exemplary, are intended to explain the invention, and are not to be construed as limiting the invention.
In order to better understand the above technical solutions, exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention can be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
In order to better understand the technical solution, the technical solution will be described in detail with reference to the drawings and the specific embodiments.
Fig. 1 is a schematic flow diagram of an outdoor augmented reality application method based on cross-source image matching according to an embodiment of the present invention, and as shown in fig. 1, the outdoor augmented reality application method based on cross-source image matching according to the embodiment of the present invention includes the following steps:
step 101, acquiring a camera image and a rendering image corresponding to the camera image, and processing the camera image and the rendering image to acquire a local camera image block and a local rendering image block which are matched in pairs.
As one embodiment, acquiring a camera image and a rendered image corresponding to the camera image includes: acquiring a camera image; acquiring an aerial image, and performing three-dimensional reconstruction on the aerial image by adopting an SFM algorithm to obtain a three-dimensional image point cloud of an outdoor scene; and acquiring image information according to the camera image, and rendering a rendering image which is correspondingly matched with the camera image in the three-dimensional image point cloud according to the image information.
As a specific example, the camera image may be obtained by shooting with a mobile phone.
As a specific example, the aerial images may be captured by a drone.
As a specific example, as shown in fig. 2, an outdoor scene is obliquely photographed by an unmanned aerial vehicle to obtain a large number of aerial images I_i, i = 1, 2, …, N, where N is the number of aerial images, and three-dimensional reconstruction is performed on the aerial images by the SfM algorithm to obtain a three-dimensional image point cloud M of the outdoor scene.
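In practice the SfM reconstruction is run with a full incremental pipeline over all N aerial images. Purely as an illustration of the geometric core of that process (feature matching, relative pose recovery and triangulation), a minimal two-view sketch in Python/OpenCV might look as follows; the image file names and the intrinsic matrix K are assumptions, and a real reconstruction would iterate this over many views with bundle adjustment.

```python
import cv2
import numpy as np

# Hypothetical inputs: two overlapping aerial images and an assumed pinhole intrinsic matrix K.
img1 = cv2.imread("aerial_001.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("aerial_002.jpg", cv2.IMREAD_GRAYSCALE)
K = np.array([[2000.0, 0.0, 960.0],
              [0.0, 2000.0, 540.0],
              [0.0, 0.0, 1.0]])

# Detect and match SIFT features between the two views (Lowe ratio test).
sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)
matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]
pts1 = np.float32([kp1[m.queryIdx].pt for m in good])
pts2 = np.float32([kp2[m.trainIdx].pt for m in good])

# Recover the relative camera pose from the essential matrix.
E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)
_, R, t, mask = cv2.recoverPose(E, pts1, pts2, K, mask=mask)
inliers = mask.ravel() > 0

# Triangulate the inlier correspondences into a sparse 3D point cloud.
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([R, t])
pts4d = cv2.triangulatePoints(P1, P2, pts1[inliers].T, pts2[inliers].T)
points3d = (pts4d[:3] / pts4d[3]).T
print("Reconstructed", len(points3d), "sparse 3D points from one image pair")
```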
As a specific embodiment, as shown in fig. 3, an outdoor scene is photographed by a mobile phone to obtain a camera image C_I; virtual positioning is performed in the reconstructed three-dimensional image point cloud scene M by using the positioning information of the mobile phone, the shooting direction determined by the external parameters of the mobile phone is used to obtain the projection matrix P from the three-dimensional image point cloud M to the camera image, and a rendered image R_I of the same size is rendered in the reconstructed three-dimensional image point cloud M by taking the size of the image shot by the mobile phone as the standard.
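The text does not spell out how the rendered image R_I is produced from the point cloud; one simple reading is that each 3D point is projected through the projection matrix P derived from the phone's positioning and orientation and splatted into an image of the same size as the phone photo. A minimal sketch under that assumption (all variable names and numeric values are illustrative):

```python
import numpy as np

def render_point_cloud(points_xyz, colors, P, height, width):
    """Project a colored point cloud through a 3x4 projection matrix P and
    splat each visible point into an image of the given size."""
    rendered = np.zeros((height, width, 3), dtype=np.uint8)
    pts_h = np.hstack([points_xyz, np.ones((len(points_xyz), 1))])  # N x 4 homogeneous
    proj = (P @ pts_h.T).T                                          # N x 3 image points
    valid = proj[:, 2] > 0                                          # in front of the camera
    uv = np.round(proj[valid, :2] / proj[valid, 2:3]).astype(int)   # perspective division
    inside = (uv[:, 0] >= 0) & (uv[:, 0] < width) & (uv[:, 1] >= 0) & (uv[:, 1] < height)
    rendered[uv[inside, 1], uv[inside, 0]] = colors[valid][inside]
    return rendered

# Illustrative usage with a random point cloud and an assumed P = K [R | t]
# built from the phone's intrinsics and the pose obtained by virtual positioning.
points = np.random.rand(100000, 3) * 50.0
colors = (np.random.rand(100000, 3) * 255).astype(np.uint8)
K = np.array([[1500.0, 0.0, 540.0], [0.0, 1500.0, 960.0], [0.0, 0.0, 1.0]])
Rt = np.hstack([np.eye(3), np.array([[0.0], [0.0], [60.0]])])
rendered = render_point_cloud(points, colors, K @ Rt, height=1920, width=1080)
```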
In addition, taking the camera image C_I as a reference, the rendered image R_I rendered from the positioning information is matched with the camera image C_I, and the camera image C_I and the rendered image R_I are together referred to as cross-source images.
As one embodiment, processing a camera image and a rendered image to obtain pairs of matched local camera tiles and local rendered tiles includes: acquiring a perspective transformation matrix of a camera image and a rendering image; marking the segmentation samples in the camera image by adopting a LabelMe toolkit; constructing a segmentation network, and training the segmentation network according to the marked camera image; segmenting the camera image based on the trained segmentation network to segment a segmentation sample of the camera image; extracting all key points of the segmentation sample by using a detector with scale-invariant feature transformation, selecting a plurality of key points from all the key points so that the distance between each selected key point is greater than a first preset threshold value, and deleting other unselected key points; and taking the selected multiple key points as a center, acquiring corresponding local camera image blocks according to a preset size, and mapping the local camera image blocks onto the rendered image according to a perspective transformation matrix to acquire the corresponding local rendered image blocks.
Note that the transformation relation between the matched camera image C_I and rendered image R_I is preset as a perspective transformation.
As a specific example, at least 4 groups of matching corresponding points between the camera image C_I and the rendered image R_I are first selected manually, and the perspective transformation matrix T between the two cross-source images is calculated from them. A segmentation network is then constructed using the U-Net framework, 200 camera images C_I are labeled with the LabelMe toolkit, taking buildings as the target segmentation samples, and the 200 labeled samples are input into the constructed segmentation network for training; the constructed segmentation network is shown in fig. 4. Next, the camera image C_I is input into the trained segmentation network to segment the building, a SIFT (Scale Invariant Feature Transform) detector is used to extract the SIFT key points of the building segmented from the camera image C_I, a plurality of key points are selected from all SIFT key points so that the distance between any two selected SIFT key points is greater than 30 pixels, and the other unselected SIFT key points are deleted. Taking the selected SIFT key points as centers, a local camera image block of a certain size is acquired for each SIFT key point, and the local camera image blocks are mapped onto the rendered image matched with the camera image according to the calculated perspective transformation matrix T to obtain the correspondingly matched local rendering image blocks. The matched pairs of local camera image blocks and local rendering image blocks are shown in fig. 5, where the first row in fig. 5 contains local camera image blocks and the second row contains local rendering image blocks.
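As a concrete illustration of this patch-generation step, the following sketch detects SIFT key points inside the segmented building, thins them so that the selected key points are more than 30 pixels apart, crops fixed-size local camera image blocks, and maps their centers into the rendered image with the perspective transformation matrix T. The file names, the patch size of 128 pixels, and the loading of the mask and of T are assumptions; only the 30-pixel spacing and the perspective mapping follow the description.

```python
import cv2
import numpy as np

PATCH = 128        # assumed local patch size (the text only says "a certain size")
MIN_DIST = 30      # minimum spacing between selected key points, in pixels

camera_img = cv2.imread("camera_image.jpg")
rendered_img = cv2.imread("rendered_image.jpg")
building_mask = cv2.imread("building_mask.png", cv2.IMREAD_GRAYSCALE)  # from the segmentation network
T = np.load("perspective_T.npy")  # 3x3 matrix estimated from >= 4 manual correspondences

# 1. SIFT key points restricted to the segmented building.
sift = cv2.SIFT_create()
keypoints = sift.detect(camera_img, building_mask)

# 2. Greedy thinning: keep a key point only if it lies farther than MIN_DIST from all kept ones.
kept = []
for kp in sorted(keypoints, key=lambda k: -k.response):
    if all(np.hypot(kp.pt[0] - q.pt[0], kp.pt[1] - q.pt[1]) > MIN_DIST for q in kept):
        kept.append(kp)

# 3. Crop camera patches and map their centers into the rendered image via T.
half = PATCH // 2
centers = np.float32([kp.pt for kp in kept]).reshape(-1, 1, 2)
mapped = cv2.perspectiveTransform(centers, T).reshape(-1, 2)
camera_patches, rendered_patches = [], []
for (x, y), (u, v) in zip(centers.reshape(-1, 2), mapped):
    x, y, u, v = int(x), int(y), int(u), int(v)
    cam = camera_img[y - half:y + half, x - half:x + half]
    ren = rendered_img[v - half:v + half, u - half:u + half]
    if cam.shape[:2] == (PATCH, PATCH) and ren.shape[:2] == (PATCH, PATCH):
        camera_patches.append(cam)
        rendered_patches.append(ren)
```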
And 102, constructing a deep learning model according to the automatic coding machine and the twin network, and training the deep learning model according to the pair-matched local camera image blocks and the local rendering image blocks.
As one example, as shown in FIG. 6, the deep learning model Y-Net is shaped like the letter Y, and includes: an encoder, a decoder and an STN block.
There are two such encoders, each with the structure: C(32,5,2)-BN-SeLU-C(64,5,2)-BN-SeLU-P(3,2)-C(96,3,1)-BN-SeLU-C(256,3,1)-BN-SeLU-P(3,2)-C(384,3,1)-BN-SeLU-C(384,3,1)-BN-SeLU-C(256,3,1)-BN-SeLU-P(3,2)-C(128,7,1)-BN-SeLU; where C(n, k, s) denotes a convolutional layer containing n convolution kernels of size k with stride s; P(k, s) denotes a max pooling layer with a sliding window of k and stride s; BN is batch normalization; and SeLU is the activation function.
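Read literally, this shorthand corresponds to a convolutional stack such as the following PyTorch sketch. It is a non-authoritative reading: the padding of each convolution and the spatial pooling of the final feature map into a 128-dimensional descriptor are assumptions that the text does not specify.

```python
import torch
import torch.nn as nn

def conv_bn_selu(in_ch, out_ch, k, s):
    """C(n, k, s)-BN-SeLU block; 'same'-style padding of k // 2 is an assumption."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=k, stride=s, padding=k // 2),
        nn.BatchNorm2d(out_ch),
        nn.SELU(inplace=True),
    )

class Encoder(nn.Module):
    """One Y-Net encoder branch, following the stated C/P layer sequence."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            conv_bn_selu(3, 32, 5, 2),
            conv_bn_selu(32, 64, 5, 2),
            nn.MaxPool2d(kernel_size=3, stride=2),   # P(3, 2)
            conv_bn_selu(64, 96, 3, 1),
            conv_bn_selu(96, 256, 3, 1),
            nn.MaxPool2d(kernel_size=3, stride=2),
            conv_bn_selu(256, 384, 3, 1),
            conv_bn_selu(384, 384, 3, 1),
            conv_bn_selu(384, 256, 3, 1),
            nn.MaxPool2d(kernel_size=3, stride=2),
            conv_bn_selu(256, 128, 7, 1),
        )

    def forward(self, x):
        f = self.features(x)        # B x 128 x h x w feature map
        return f.mean(dim=[2, 3])   # global average pooling to a 128-d descriptor (an assumption)

# e.g. Encoder()(torch.rand(4, 3, 256, 256)) -> tensor of shape (4, 128)
```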
The decoder is a shared decoder with the structure: FC(128,1024)-TC(128,4,2)-SeLU-TC(64,4,2)-SeLU-TC(32,4,2)-SeLU-TC(16,4,2)-SeLU-TC(8,4,2)-SeLU-TC(4,4,2)-SeLU-TC(3,4,2)-Sigmoid; where FC(p, q) denotes a fully connected layer mapping a p-dimensional vector to a q-dimensional vector; TC(n, k, s) denotes a deconvolution (transposed convolution) layer with output depth n, convolution kernel size k × k and stride s; and SeLU and Sigmoid are activation functions.
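The shared decoder can be read analogously; in the sketch below, the reshaping of the 1024-dimensional fully connected output into a 256 × 2 × 2 feature map (so that seven stride-2 deconvolutions recover a 256 × 256 × 3 reconstruction) is an assumption, as is the padding of each transposed convolution.

```python
import torch
import torch.nn as nn

class Decoder(nn.Module):
    """Shared decoder G_De: FC(128, 1024) followed by seven TC(n, 4, 2) stages."""
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(128, 1024)

        def tc(in_ch, out_ch):
            # TC(n, 4, 2): transposed convolution, kernel 4, stride 2;
            # padding=1 (doubling the spatial size each stage) is an assumption.
            return nn.ConvTranspose2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1)

        self.net = nn.Sequential(
            tc(256, 128), nn.SELU(inplace=True),
            tc(128, 64), nn.SELU(inplace=True),
            tc(64, 32), nn.SELU(inplace=True),
            tc(32, 16), nn.SELU(inplace=True),
            tc(16, 8), nn.SELU(inplace=True),
            tc(8, 4), nn.SELU(inplace=True),
            tc(4, 3), nn.Sigmoid(),
        )

    def forward(self, descriptor):                   # descriptor: B x 128
        x = self.fc(descriptor).view(-1, 256, 2, 2)  # reshape of the 1024-d vector (assumed)
        return self.net(x)                           # B x 3 x 256 x 256 under these assumptions
```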
It should be noted that the inputs of the deep learning model Y-Net are pair-matched local camera image blocks and local rendering image blocks, where the input of one branch is a local camera image block and the input of the other branch is a local rendering image block; these image blocks are first resized to 256 × 256 × 3 before being input into the deep learning model Y-Net. The output of the deep learning model Y-Net is two 128-dimensional feature vectors, namely the feature descriptors. The decomposition diagrams of the two branches of the deep learning model Y-Net are shown in FIGS. 7 and 8.
In FIG. 7, the input of branch 1 of the deep learning model Y-Net is a local rendering image block R; the encoder F_En1 extracts the feature f_AE1, and the feature f_AE1 is passed through the decoder G_De1 to recover an image denoted as R'. In FIG. 8, the input of branch 2 of the deep learning model Y-Net is a local camera image block C; the STN module is combined with the encoder into one large encoder F_En2, the local camera image block C passes through the encoder F_En2 to extract the feature f_AE2, and the feature f_AE2 is passed through the decoder G_De2 to recover an image denoted as C'.
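Putting the pieces together, the two branches might be wired as in the sketch below, reusing the Encoder and Decoder sketches above. A minimal spatial transformer is placed in front of the camera-patch encoder as described; its internal layer sizes are assumptions, since the text only names the STN block.

```python
import torch
import torch.nn as nn

class STN(nn.Module):
    """Minimal spatial transformer: predicts a 2x3 affine warp and resamples the input.
    Layer sizes are illustrative; the text only names the STN block."""
    def __init__(self):
        super().__init__()
        self.loc = nn.Sequential(
            nn.Conv2d(3, 8, 7, stride=2), nn.ReLU(),
            nn.Conv2d(8, 10, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(10 * 4 * 4, 32), nn.ReLU(),
            nn.Linear(32, 6),
        )
        self.loc[-1].weight.data.zero_()   # start from the identity transform
        self.loc[-1].bias.data.copy_(torch.tensor([1.0, 0.0, 0.0, 0.0, 1.0, 0.0]))

    def forward(self, x):
        theta = self.loc(x).view(-1, 2, 3)
        grid = nn.functional.affine_grid(theta, x.size(), align_corners=False)
        return nn.functional.grid_sample(x, grid, align_corners=False)

class YNet(nn.Module):
    """Branch 1: rendering patch R -> F_En1 -> f_AE1 -> shared decoder -> R'.
       Branch 2: camera patch C -> STN -> F_En2 -> f_AE2 -> shared decoder -> C'."""
    def __init__(self):
        super().__init__()
        self.stn = STN()
        self.enc_render = Encoder()    # Encoder / Decoder from the sketches above
        self.enc_camera = Encoder()
        self.decoder = Decoder()       # shared between the two branches

    def forward(self, render_patch, camera_patch):
        f_ae1 = self.enc_render(render_patch)
        f_ae2 = self.enc_camera(self.stn(camera_patch))
        return f_ae1, f_ae2, self.decoder(f_ae1), self.decoder(f_ae2)
```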
As a specific embodiment, when the deep learning model is constructed according to the automatic coding machine and the twin network, a cross-source constraint loss function is further designed to optimize the deep learning model Y-Net, wherein the cross-source constraint loss function comprises a content loss and a feature-consistency loss.
Firstly, the Mean Squared Error (MSE) is applied to the image blocks input into the deep learning model Y-Net and their reconstructions, specifically as follows:
L_AE1(R, R′) = (1/(W·H·N)) Σ_i (R_i − R′_i)²
L_AE2(R, C′) = (1/(W·H·N)) Σ_i (R_i − C′_i)²
L_GEN(R′, C′) = (1/(W·H·N)) Σ_i (R′_i − C′_i)²
wherein, W × H is the size of the input image block, and N is the number of channels of the image; combining these three MSE losses yields a content loss as follows:
L_Content = L_AE1(R, R′) + L_AE2(R, C′) + L_GEN(R′, C′)
Secondly, the feature-consistency loss constrains the features f_AE1 and f_AE2 extracted by the two branches of the deep learning model Y-Net, using the Euclidean distance, specifically as follows:
L_Feature = √( Σ_{k=1}^{K} ( f_AE1(k) − f_AE2(k) )² )
wherein K is the dimension of the features f_AE1 and f_AE2; in the present invention, K = 128.
And finally, the content loss and the feature-consistency loss are combined to obtain the cross-source constraint loss function, as follows:
L_Y-Net = L_Content + λ · L_Feature
wherein λ is a weight parameter.
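In code, the content loss and the feature-consistency loss reduce to mean squared errors plus a Euclidean distance between the two descriptors; a sketch consistent with the formulas above (with λ exposed as a tunable weight) could be:

```python
import torch

def ynet_loss(R, R_rec, C_rec, f_ae1, f_ae2, lam=1.0):
    """Cross-source constraint loss L_Y-Net = L_Content + lambda * L_Feature.
    R: input rendering patch; R_rec, C_rec: reconstructions of the two branches;
    f_ae1, f_ae2: the two 128-d descriptors; lam: the weight parameter lambda."""
    mse = torch.nn.functional.mse_loss
    l_content = mse(R_rec, R) + mse(C_rec, R) + mse(C_rec, R_rec)   # L_AE1 + L_AE2 + L_GEN
    l_feature = (f_ae1 - f_ae2).pow(2).sum(dim=1).sqrt().mean()     # Euclidean distance over K = 128
    return l_content + lam * l_feature
```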
As an embodiment, when the deep learning model is trained according to the pair-matched local camera image block and local rendering image block, the method further includes: and adjusting the optimizer and the hyper-parameters according to the training requirements of the deep learning model, wherein the hyper-parameters comprise a learning step length, a learning rate and a batch size.
As a specific example, the deep learning model Y-Net is implemented in PyTorch, using the RMSprop optimizer with an initial learning rate of 0.001, decayed by a factor of 0.99 every 4 epochs.
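A training setup matching these hyper-parameters could be sketched as follows; the batch size, the number of epochs and the placeholder dataset are assumptions, and YNet and ynet_loss refer to the sketches above.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

model = YNet()                                                    # from the sketch above
optimizer = torch.optim.RMSprop(model.parameters(), lr=0.001)
# Multiply the learning rate by 0.99 every 4 epochs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=4, gamma=0.99)

# Placeholder tensors; real training would load the pair-matched patch dataset.
render_patches = torch.rand(64, 3, 256, 256)
camera_patches = torch.rand(64, 3, 256, 256)
loader = DataLoader(TensorDataset(render_patches, camera_patches), batch_size=16, shuffle=True)

for epoch in range(20):                                           # number of epochs is illustrative
    for R, C in loader:
        f1, f2, R_rec, C_rec = model(R, C)
        loss = ynet_loss(R, R_rec, C_rec, f1, f2, lam=1.0)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    scheduler.step()
```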
103, extracting feature descriptors of the local camera image blocks and the local rendering image blocks to be matched based on the trained deep learning model, and performing cross-source image matching on the local camera image blocks and the local rendering image blocks to be matched according to the extracted feature descriptors to obtain a cross-source image matching result.
As an embodiment, before extracting the feature descriptors of the local camera image blocks and the local rendering image blocks to be matched, the building of the camera image is further segmented by the trained segmentation network; extracting SIFT key points of a building segmented from a camera image by using an SIFT detector, selecting a plurality of key points from all the key points so that the distance between each selected key point is more than 30 pixels, deleting other unselected key points, and taking the SIFT key points as the center to obtain a local camera image block; randomly selecting 3000 points on a corresponding rendering image, and taking the random points as a center to obtain a local rendering image block; thereby obtaining the same number of local camera image blocks and local rendering image blocks to be matched.
As a specific embodiment, the obtained local camera image blocks and local rendering image blocks to be matched, equal in number, are input into the trained deep learning model Y-Net to extract their feature descriptors. Taking the feature descriptors of the local camera image blocks as the reference, a nearest neighbor retrieval method is adopted to acquire the feature descriptors of the local rendering image blocks that meet the following two conditions: 1) the feature descriptor of the local rendering image block closest to the feature descriptor of the local camera image block; 2) the feature descriptor of the local rendering image block whose similarity to the feature descriptor of the local camera image block is greater than 0.92. The RANSAC algorithm is then adopted to filter out erroneous matches from the retrieved matched feature descriptors of the local camera image blocks and local rendering image blocks, and the centers of the remaining matched local camera image blocks and local rendering image blocks are used to calculate the transformation relation between the two images, completing the cross-source image matching. The final cross-source image block matching result is shown in fig. 9, and the center points of these image blocks are connected as shown in fig. 10.
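The retrieval-and-filtering step can be sketched as below: for each camera-patch descriptor the nearest rendering-patch descriptor is kept only if its similarity exceeds 0.92, and the surviving patch-center correspondences are filtered with RANSAC while the cross-source transformation is estimated. Treating the similarity as a cosine similarity over L2-normalised descriptors and modelling the transformation as a homography are illustrative readings of the text, not the only possible ones.

```python
import cv2
import numpy as np

def match_patches(cam_desc, ren_desc, cam_centers, ren_centers, sim_thresh=0.92):
    """cam_desc: N x 128, ren_desc: M x 128 (assumed L2-normalised) descriptors;
    cam_centers, ren_centers: the corresponding patch centers in image coordinates."""
    # Nearest-neighbour retrieval with a similarity threshold (cosine similarity assumed).
    sim = cam_desc @ ren_desc.T                                    # N x M similarity matrix
    nn_idx = sim.argmax(axis=1)
    keep = sim[np.arange(len(cam_desc)), nn_idx] > sim_thresh

    src = np.float32(cam_centers[keep])
    dst = np.float32(ren_centers[nn_idx[keep]])

    # RANSAC rejects erroneous matches while estimating the cross-source transformation
    # (modelled here as a homography between the patch centers).
    H, inlier_mask = cv2.findHomography(src, dst, cv2.RANSAC, ransacReprojThreshold=5.0)
    inliers = inlier_mask.ravel().astype(bool)
    return H, src[inliers], dst[inliers]
```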
And 104, acquiring the corresponding relation of the cross-source images according to the cross-source image matching result, and calculating a virtual-real registration transformation relation according to the corresponding relation.
As an embodiment, calculating the virtual-real registration transformation relation according to the corresponding relation includes: obtaining, according to the image information, the projection matrix P from the three-dimensional image point cloud M to the rendered image R_I corresponding to the camera image C_I, namely P·M → R_I; acquiring the perspective transformation matrix T between the camera image C_I and the corresponding rendered image R_I, namely T·R_I → C_I; and obtaining, according to the projection matrix P and the perspective transformation matrix T, the virtual-real registration transformation relation from the three-dimensional image point cloud M to the camera image C_I, namely the transformation relation between the three-dimensional space and the two-dimensional space, T·(P·M) → C_I.
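Composed, the projection matrix P and the perspective transformation matrix T give a direct mapping from a 3D point of the point cloud M into the camera image C_I; a minimal sketch in homogeneous coordinates (variable names illustrative):

```python
import numpy as np

def project_to_camera_image(points_xyz, P, T):
    """Map 3D points of the point cloud M into the camera image C_I via T · (P · M).
    P: 3x4 projection onto the rendered image; T: 3x3 perspective matrix R_I -> C_I."""
    pts_h = np.hstack([points_xyz, np.ones((len(points_xyz), 1))])   # N x 4 homogeneous
    in_rendered = P @ pts_h.T                                        # 3 x N points in R_I
    in_camera = T @ in_rendered                                      # 3 x N points in C_I
    return (in_camera[:2] / in_camera[2]).T                          # N x 2 pixel coordinates
```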
And 105, realizing application of outdoor augmented reality according to the virtual-real registration transformation relation.
As an embodiment, the application of outdoor augmented reality is realized according to the virtual-real registration transformation relation, including: acquiring the position at which a three-dimensional virtual target is to be superimposed in the outdoor scene; placing the three-dimensional virtual target into the three-dimensional image point cloud; and mapping the three-dimensional virtual target onto the camera image according to the virtual-real registration transformation relation. Fig. 11 shows several effects of the outdoor augmented reality application based on the present invention, where the superimposed virtual content is real-time information of a library.
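The last step, placing a virtual target into the point cloud and drawing it on the camera image, then amounts to running the vertices of the virtual model through the same composed transformation. A hedged sketch (the anchor position, the cube geometry, the stored P and T files and the drawing style are all purely illustrative):

```python
import cv2
import numpy as np

# Assumed inputs: the camera frame, the matrices P and T obtained earlier, and a virtual
# target anchored at a chosen 3D position in the point cloud (all values illustrative).
camera_frame = cv2.imread("camera_image.jpg")
P = np.load("projection_P.npy")         # 3x4, point cloud -> rendered image
T = np.load("perspective_T.npy")        # 3x3, rendered image -> camera image

anchor = np.array([12.0, 4.5, 30.0])    # where the virtual target is placed in the point cloud
cube = anchor + np.array([[x, y, z] for x in (0, 2) for y in (0, 2) for z in (0, 2)], dtype=float)

# Map the virtual target's vertices into the camera image and draw them.
pts2d = project_to_camera_image(cube, P, T)    # from the sketch above
for u, v in pts2d:
    cv2.circle(camera_frame, (int(u), int(v)), 6, (0, 0, 255), -1)
cv2.imwrite("augmented_frame.jpg", camera_frame)
```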
In summary, according to the outdoor augmented reality application method based on cross-source image matching provided by the invention, a camera image and a rendering image correspondingly matched with the camera image are first acquired, and the camera image and the rendering image are processed to obtain pair-matched local camera image blocks and local rendering image blocks. A deep learning model is then constructed according to an automatic coding machine and a twin network and trained on the pair-matched local camera image blocks and local rendering image blocks. Feature descriptors of the local camera image blocks and local rendering image blocks to be matched are extracted with the trained deep learning model, and cross-source image matching is performed on them according to the extracted feature descriptors to obtain a cross-source image matching result. The corresponding relation of the cross-source images is then obtained according to the cross-source image matching result, the virtual-real registration transformation relation is calculated according to the corresponding relation, and finally the application to outdoor augmented reality is realized according to the virtual-real registration transformation relation. In this way, the corresponding relation is obtained by matching the cross-source images, and the virtual-real registration transformation relation is derived from it, so that the augmented reality effect is improved.
In addition, the embodiment of the present invention further provides a computer-readable storage medium, on which an outdoor augmented reality application based on cross-source image matching is stored, and when being executed by a processor, the outdoor augmented reality application based on cross-source image matching implements the above outdoor augmented reality application method based on cross-source image matching.
According to the computer-readable storage medium of the embodiment of the invention, the outdoor augmented reality application program based on cross-source image matching is stored, so that the processor realizes the outdoor augmented reality application method based on cross-source image matching when the outdoor augmented reality application program based on cross-source image matching is executed, and the effect of augmented reality is improved.
In addition, the embodiment of the present invention further provides a computer device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and when the processor executes the computer program, the above outdoor augmented reality application method based on cross-source image matching is implemented.
According to the computer device of the embodiment of the invention, the computer program which can run on the processor is stored through the memory, so that the processor can realize the outdoor augmented reality application method based on cross-source image matching when executing the computer program, and the augmented reality effect is improved.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should be noted that in the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not indicate any ordering; these words may be interpreted as names.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.
In the description of the present invention, it is to be understood that the terms "first", "second" and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implying any number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present invention, "a plurality" means two or more unless specifically defined otherwise.
In the present invention, unless otherwise expressly stated or limited, the terms "mounted," "connected," "secured," and the like are to be construed broadly and can, for example, be fixedly connected, detachably connected, or integrally formed; can be mechanically or electrically connected; either directly or indirectly through intervening media, either internally or in any other relationship. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to specific situations.
In the present invention, unless otherwise expressly stated or limited, the first feature "on" or "under" the second feature may be directly contacting the first and second features or indirectly contacting the first and second features through an intermediate. Also, a first feature "on," "over," and "above" a second feature may be directly or diagonally above the second feature, or may simply indicate that the first feature is at a higher level than the second feature. A first feature being "under," "below," and "beneath" a second feature may be directly under or obliquely under the first feature, or may simply mean that the first feature is at a lesser elevation than the second feature.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above should not be understood to necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims (10)

1. An outdoor augmented reality application method based on cross-source image matching is characterized by comprising the following steps:
acquiring a camera image and a rendering image correspondingly matched with the camera image, and processing the camera image and the rendering image to acquire a local camera image block and a local rendering image block which are matched in pairs;
constructing a deep learning model according to an automatic coding machine and a twin network, and training the deep learning model according to paired matched local camera image blocks and local rendering image blocks;
extracting feature descriptors of local camera image blocks and local rendering image blocks to be matched based on a trained deep learning model, and performing cross-source image matching on the local camera image blocks and the local rendering image blocks to be matched according to the extracted feature descriptors to obtain a cross-source image matching result;
acquiring a corresponding relation of the cross-source images according to the cross-source image matching result, and calculating a virtual-real registration transformation relation according to the corresponding relation;
and realizing the application of outdoor augmented reality according to the virtual-real registration transformation relation.
2. The outdoor augmented reality application method based on cross-source image matching of claim 1, wherein acquiring a camera image and a rendered image corresponding matching the camera image comprises:
acquiring a camera image;
acquiring an aerial image, and performing three-dimensional reconstruction on the aerial image by adopting an SFM algorithm to obtain a three-dimensional image point cloud of an outdoor scene;
and acquiring image information according to the camera image, and rendering a rendering image which is correspondingly matched with the camera image in the three-dimensional image point cloud according to the image information.
3. The method for outdoor augmented reality application based on cross-source image matching according to claim 1, wherein processing the camera image and the rendered image to obtain pairs of matched local camera tiles and local rendered tiles comprises:
acquiring a perspective transformation matrix of the camera image and the rendered image;
labeling the segmented sample in the camera image with a LabelMe toolkit;
constructing a segmentation network, and training the segmentation network according to the marked camera image;
segmenting the camera image based on the trained segmentation network to segment a segmentation sample of the camera image;
extracting all key points of the segmentation sample by using a detector with scale-invariant feature transformation, selecting a plurality of key points from all key points so that the distance between each selected key point is greater than a first preset threshold value, and deleting other unselected key points;
and taking the selected multiple key points as a center, acquiring corresponding local camera image blocks according to a preset size, and mapping the local camera image blocks onto the rendered image according to the perspective transformation matrix to acquire the corresponding local rendered image blocks.
4. The outdoor augmented reality application method based on cross-source image matching of claim 1 wherein the deep learning model comprises: an encoder, a decoder and an STN block.
5. The outdoor augmented reality application method based on cross-source image matching of claim 1, wherein when training the deep learning model from pairs of matched local camera patch and local rendering patch, further comprising:
and adjusting the optimizer and the hyper-parameters according to the training requirements of the deep learning model, wherein the hyper-parameters comprise a learning step length, a learning rate and a batch size.
6. The outdoor augmented reality application method based on cross-source image matching of claim 1 wherein cross-source image matching of the local camera image block to be matched and the local rendering image block according to the extracted feature descriptors comprises:
acquiring a feature descriptor of a corresponding local rendering image block meeting a first preset condition by using a nearest neighbor retrieval method and taking the feature descriptor of the local camera image block as a reference;
and filtering error matching by adopting a RANSAC algorithm according to the retrieved feature descriptors of the matched local camera image blocks and the feature descriptors of the local rendering image blocks, and calculating the central points of the remaining paired matched local camera image blocks and local rendering image blocks to obtain perspective transformation matrixes of the local camera image blocks and the local rendering image blocks and obtain a cross-source image matching result.
7. The outdoor augmented reality application method based on cross-source image matching as claimed in claim 2, wherein calculating a virtual-real registration transformation relation according to the correspondence comprises:
acquiring, according to the image information, the projection matrix P from the three-dimensional image point cloud M to the rendered image R_I corresponding to the camera image C_I, namely P·M → R_I;
acquiring the perspective transformation matrix T between the camera image C_I and the corresponding rendered image R_I, namely T·R_I → C_I;
and obtaining, according to the projection matrix P and the perspective transformation matrix T, the virtual-real registration transformation relation from the three-dimensional image point cloud M to the camera image C_I, namely the transformation relation between the three-dimensional space and the two-dimensional space, T·(P·M) → C_I.
8. The outdoor augmented reality application method based on cross-source image matching according to claim 1, wherein the application of outdoor augmented reality is realized according to the virtual-real registration transformation relation, and comprises the following steps:
acquiring the position of a three-dimensional virtual target to be superposed in an outdoor scene;
placing the three-dimensional virtual target into a three-dimensional image point cloud;
and mapping the three-dimensional virtual target to the camera image according to the virtual-real registration transformation relation.
9. A computer-readable storage medium having stored thereon a cross-source image matching based outdoor augmented reality application that, when executed by a processor, implements a cross-source image matching based outdoor augmented reality application method according to any one of claims 1-8.
10. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the computer program implements the cross-source image matching based outdoor augmented reality application method of any one of claims 1-8.
CN202010034538.XA 2020-01-14 2020-01-14 Outdoor augmented reality application method based on cross-source image matching Active CN111260794B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010034538.XA CN111260794B (en) 2020-01-14 2020-01-14 Outdoor augmented reality application method based on cross-source image matching

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010034538.XA CN111260794B (en) 2020-01-14 2020-01-14 Outdoor augmented reality application method based on cross-source image matching

Publications (2)

Publication Number Publication Date
CN111260794A true CN111260794A (en) 2020-06-09
CN111260794B CN111260794B (en) 2022-07-08

Family

ID=70950401

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010034538.XA Active CN111260794B (en) 2020-01-14 2020-01-14 Outdoor augmented reality application method based on cross-source image matching

Country Status (1)

Country Link
CN (1) CN111260794B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102426705A (en) * 2011-09-30 2012-04-25 北京航空航天大学 Behavior splicing method of video scene
WO2016071896A1 (en) * 2014-11-09 2016-05-12 L.M.Y. Research & Development Ltd. Methods and systems for accurate localization and virtual object overlay in geospatial augmented reality applications
CN107292965A (en) * 2017-08-03 2017-10-24 北京航空航天大学青岛研究院 A kind of mutual occlusion processing method based on depth image data stream
CN110021029A (en) * 2019-03-22 2019-07-16 南京华捷艾米软件科技有限公司 A kind of real-time dynamic registration method and storage medium suitable for RGBD-SLAM
CN110390302A (en) * 2019-07-24 2019-10-29 厦门大学 A kind of objective detection method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
WEIQUAN LIU et al.: "Ground Camera Images and UAV 3D Model Registration for Outdoor Augmented Reality", 2019 IEEE Conference on Virtual Reality and 3D User Interfaces (VR), 15 August 2019 (2019-08-15), pages 1050-1051 *
黄碧辉 et al.: "An improved three-dimensional registration method for outdoor mobile augmented reality", Geomatics and Information Science of Wuhan University (武汉大学学报(信息科学版)), 5 December 2019 (2019-12-05), pages 1865-1873 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112785687A (en) * 2021-01-25 2021-05-11 Oppo广东移动通信有限公司 Image processing method, image processing device, electronic equipment and readable storage medium
CN112861952A (en) * 2021-01-29 2021-05-28 云南电网有限责任公司电力科学研究院 Partial discharge image matching deep learning method
CN112861952B (en) * 2021-01-29 2023-04-28 云南电网有限责任公司电力科学研究院 Partial discharge image matching deep learning method
CN117078975A (en) * 2023-10-10 2023-11-17 四川易利数字城市科技有限公司 AR space-time scene pattern matching method based on evolutionary algorithm
CN117078975B (en) * 2023-10-10 2024-01-02 四川易利数字城市科技有限公司 AR space-time scene pattern matching method based on evolutionary algorithm

Also Published As

Publication number Publication date
CN111260794B (en) 2022-07-08

Similar Documents

Publication Publication Date Title
Melekhov et al. Dgc-net: Dense geometric correspondence network
US11200424B2 (en) Space-time memory network for locating target object in video content
Zhang et al. Densely connected pyramid dehazing network
Wang et al. 360sd-net: 360 stereo depth estimation with learnable cost volume
CN111260794B (en) Outdoor augmented reality application method based on cross-source image matching
Truong et al. Pdc-net+: Enhanced probabilistic dense correspondence network
CN109416727A (en) Glasses minimizing technology and device in a kind of facial image
AU2019268184B2 (en) Precise and robust camera calibration
CN115797350B (en) Bridge disease detection method, device, computer equipment and storage medium
CN114581571A (en) Monocular human body reconstruction method and device based on IMU and forward deformation field
Chelani et al. How privacy-preserving are line clouds? recovering scene details from 3d lines
Malav et al. DHSGAN: An end to end dehazing network for fog and smoke
CN116012432A (en) Stereoscopic panoramic image generation method and device and computer equipment
Ali et al. Single image Façade segmentation and computational rephotography of House images using deep learning
Basak et al. Monocular depth estimation using encoder-decoder architecture and transfer learning from single RGB image
Qin et al. Depth estimation by parameter transfer with a lightweight model for single still images
CN114241141A (en) Smooth object three-dimensional reconstruction method and device, computer equipment and storage medium
CN112465796B (en) Light field feature extraction method integrating focal stack and full-focus image
CN113744280A (en) Image processing method, apparatus, device and medium
Maiwald A window to the past through modern urban environments: Developing a photogrammetric workflow for the orientation parameter estimation of historical images
CN112070181A (en) Image stream-based cooperative detection method and device and storage medium
CN116311218A (en) Noise plant point cloud semantic segmentation method and system based on self-attention feature fusion
CN115953471A (en) Indoor scene multi-scale vector image retrieval and positioning method, system and medium
JP2022036075A (en) Method for training neural network to deliver viewpoints of objects using unlabeled pairs of images, and corresponding system
Li SuperGlue-Based Deep Learning Method for Image Matching from Multiple Viewpoints

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant