NZ628782B2 - AR image processing apparatus and method - Google Patents
- Publication number
- NZ628782B2 NZ628782A NZ62878212A
- Authority
- NZ
- New Zealand
- Prior art keywords
- image
- marker
- captured image
- posture
- analyzer
- Prior art date
Classifications
- G06K2009/3225—
- G06K9/00671—
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T19/00—Manipulating 3D models or images for computer graphics
- G06T19/006—Mixed reality
Abstract
AR image processing apparatus comprises a camera (1), a storage, an AR marker recognition based first AR analyser (3A), a natural feature tracking based second AR analyser (3B), a CG rendering unit (5) and a display unit (7). The camera (1) captures a first captured image of a scene in a first field of view which includes an AR marker and its surroundings, and a second captured image of a scene in a second field of view which includes at least some of the surroundings included in the first captured image. The storage stores image data of a CG object. The AR marker recognition based first AR analyser (3A) obtains the first captured image, carries out an AR marker recognition process to find an AR marker image in the first captured image, and determines a position, posture, and scale of the AR marker image in a first view volume space defined in a first coordinate system of this first AR analyser (3A). The natural feature tracking based second AR analyser (3B) obtains the first captured image and data of the determined position, posture, and scale of the AR marker image in the first view volume space, calculates a position, posture, and scale of the AR marker in a second view volume space defined in a second coordinate system of the second AR analyser (3B) corresponding to a current position, center axis direction and field of view of the camera (1), obtains the second captured image, and carries out a natural feature tracking process between the first and second captured images to determine a position, posture, and scale of the AR marker image in the second captured image in the second view volume space of the second AR analyser (3B).
The CG rendering unit (5) reads out the image data of the CG object corresponding to the AR marker from the storage, reproduces an image of the CG object corresponding to the calculated position, posture, and scale in the second view volume space, and composites the image of the CG object with the second captured image of the camera (1). The display unit (7) displays the composite image.
Description
TITLE OF THE INVENTION: AR IMAGE PROCESSING APPARATUS AND METHOD
TECHNICAL FIELD
The present invention relates to an AR image processing
apparatus and method which employ a combination of an AR marker
and a natural feature tracking method.
BACKGROUND ART
In many fields, there have been already used AR image
processing apparatuses configured to composite a CG object on
a target object image such as an AR marker image in real time
by using augmented reality (AR) techniques, the target object
image being captured by a camera which is an image capturing
device such as a web camera or a digital video camera.
A marker based AR technique involves: registering in
advance feature points forming a group having a certain shape
in a digital image; detecting the registered feature points from
a digital image captured by the image capturing device by using
homography or the like; estimating the position, the posture,
and the like of the group; and compositing and displaying a CG
object at the position of an AR marker image corresponding to
the position, the posture, and the like of the group.
In this AR technique, the feature points registered in
advance and having the certain shape are referred to as an AR marker
(or simply a "marker"). By adding additional information
indicating the size and posture of the marker in the real world
in the registration of the marker, the size of and the distance
to the AR marker in a digital image obtained from the image
capturing device can be accurately estimated to some extent.
Meanwhile, when no recognizable feature points exist in the
digital image, the position and posture of the marker cannot
be estimated as a matter of course.
A natural feature tracking based AR technique as typified
by PTAM ("Parallel Tracking and Mapping for Small AR Workspaces",
Oxford University) is an excellent method which requires no
prior registration of the feature points in the digital image
and which allows the image capturing device to be moved in any
direction and to any position as long as the feature points can
be tracked even when the position of the image capturing device
is continuously moved.
However, a base position needs to be designated first:
the image capturing device has to be moved in a special way so
that the base position can be determined from the movement
amounts of the feature points in multiple images captured along
with the movement of the camera, and position and posture
information needs to be provided separately. In this process,
the base plane cannot be accurately determined unless the image
capturing device is correctly moved. Moreover, in the natural
feature tracking based AR technique, since no prior
registration of feature points is generally performed due to
the nature of the technique, information on the distance among
and the size of feature points in a captured digital image cannot
be accurately known. Hence, there is generally used a method
of manually setting the size, direction and position of the CG
object with respect to the base plane.
PRIOR ART DOCUMENTS
PATENT DOCUMENT
PATENT DOCUMENT 1: Japanese Patent Application Publication No.
2011-141828
PATENT DOCUMENT 2: Japanese Patent Application Publication No.
2012-003598
SUMMARY OF THE INVENTION
An embodiment of the present invention seeks to provide
an AR image processing method and apparatus which incorporate
advantages of both of the conventional marker based AR technique
and the conventional natural feature tracking based AR
technique and which appropriately composite and display a CG
object on a digital image of a natural landscape or the like
captured by a camera.
[0008a]
Alternatively or additionally, an embodiment of the
present invention seeks to at least provide the public with a
useful choice.
Alternatively or additionally, an embodiment of the
present invention seeks to provide an AR image processing method
and apparatus which can composite and display a CG object in
real time on a digital image of a natural landscape or the like
captured by a camera, at an accurate position, size, and posture
without requiring a manual positioning operation and which can
achieve realistic representation even when the camera is moved
to various positions and in various directions.
[0009a]
The present invention provides an AR image processing
method comprising the steps: storing in a storage an image data
of a CG object corresponding to an AR marker; causing a camera
to capture from a first capturing position a first captured
image of a first scene which includes the AR marker and its
surroundings; causing an AR marker recognition based first AR
analyzer to obtain the first captured image, carry out an AR
marker recognition process to find out an AR marker image from
the first captured image and determine a position, posture, and
scale of the AR marker image in a first view volume space defined
in a first coordinate system of the first AR analyzer; causing
the camera at a second position to capture a second captured
image of a second scene; causing a natural feature tracking
based second AR analyzer to obtain the first captured image and
data of the determined position, posture, and scale of the AR
marker image in the first view volume space corresponding to
the first captured image, calculate a position, posture, and
scale of the AR marker image in a second view volume space defined
in a second coordinate system of the second AR analyzer
corresponding to a current position, center axis direction and
field of view of the camera and carry out a natural feature
tracking process between the first and the second captured
images to determine a position, posture, and scale of the AR
marker image in the second view volume space of the second AR
analyzer; causing a CG rendering unit to read out the image data
of the CG object corresponding to the AR marker from the storage,
reproduce an image of the CG object corresponding to the
calculated position, posture, and scale in the second view
volume space and composite the image of the CG object with the
second captured image of the camera; and causing a display unit
to display the composite image.
[0009b]
The term ‘comprising’ as used in this specification and
claims means ‘consisting at least in part of’. When
interpreting statements in this specification and claims which
include the term ‘comprising’, other features besides the
features prefaced by this term in each statement can also be
present. Related terms such as ‘comprise’ and ‘comprised’ are
to be interpreted in similar manner.
[0009c]
The present invention further provides an AR image
processing apparatus comprising: a camera capturing a first
captured image of a scene in a first field of view which includes
an AR marker and its surroundings and a second captured image
of a scene in a second field of view which includes at least
some of the surroundings which are included in the first captured
image; a storage storing an image data of a CG object; an AR
marker recognition based first AR analyzer obtaining the first
captured image, carrying out an AR marker recognition process
to find out an AR marker image from the first captured image
and determine a position, posture, and scale of the AR marker
image in a first view volume space defined in a first coordinate
system of this first AR analyzer; a natural feature tracking
based second AR analyzer obtaining the first captured image and
data of the determined position, posture, and scale of the AR
marker image in the first view volume space, calculating a
position, posture, and scale of the AR marker in a second view
volume space defined in a second coordinate system of the second
AR analyzer corresponding to a current position, center axis
direction and field of view of the camera, obtaining the second
captured image, and carrying out a natural feature tracking
process between the first and the second captured images to
determine a position, posture, and scale of the AR marker image
in the second captured image in the second view volume space
of the second AR analyzer; a CG rendering unit reading out the
image data of the CG object corresponding to the AR marker from
the storage, reproducing an image of the CG object corresponding
to the calculated position, posture, and scale in the second
view volume space and compositing the image of the CG object
with the second captured image of the camera; and a display unit
displaying the composite image.
There is disclosed herein an AR image processing method
comprising the steps: obtaining a first captured image of a
scene in a first field of view which is captured by a camera
and which includes an AR marker and its surroundings; causing
a first AR analyzer to analyze the first captured image of the
scene which is captured by the camera and which includes an AR
marker image and its surroundings, determine a position,
posture, and scale of the AR marker image in the first field
of view, and virtually place a corresponding CG object at an
appropriate position in the first field of view corresponding
to the position, posture, and scale of the AR marker image;
causing a second AR analyzer to calculate, for the CG object
virtually placed at the appropriate position in the first field
of view, appearance of the CG object in a second field of view
of the camera in a second captured image subsequently captured
in the second field of view by the camera; causing a CG rendering
unit to composite an image of the CG object in the appearance
in the second field of view, at an appropriate position in the
second captured image of the camera; and causing a display unit
to display the composite image.
Moreover, there is disclosed herein an AR image
processing apparatus comprising: a camera; a first AR analyzer
configured to analyze a first captured image of a scene in a
first field of view which is captured by the camera and which
includes an AR marker and its surroundings, determine a position,
posture, and scale of an AR marker image in the first field of
view, and virtually place a corresponding CG object at an
appropriate position in the first field of view corresponding
to the position, posture, and scale of the AR marker image; a
second AR analyzer configured to calculate, for the CG object
virtually placed at the appropriate position in the first field
of view, appearance of the CG object in a second field of view
of the camera in a second captured image subsequently captured
in the second field of view by the camera; a CG rendering unit
configured to composite an image of the CG object in the
appearance in the second field of view, at an appropriate
position in the second captured image of the camera which is
obtained by the second AR analyzer; and a display unit
configured to display an image composited by the CG rendering
unit.
The AR image processing technique according to an
embodiment of the present disclosure can composite and display
a CG object in real time on a digital image of a natural landscape
or the like captured by a camera, at an accurate position in
an accurate size and posture, without requiring a manual
positioning operation, and can achieve realistic
representation even when the camera is moved to various
positions and in various directions.
[0012a]
In the description in this specification reference may
be made to subject matter which is not within the scope of the
appended claims. That subject matter should be readily
identifiable by a person skilled in the art and may assist in
putting into practice the invention as defined in the presently
appended claims.
BRIEF DESCRIPTION OF THE DRAWINGS
The present invention will now be described, by way of
non-limiting example only, with reference to the accompanying
drawings, in which:
[Fig. 1] Fig. 1 is an explanatory view showing a view volume
space of a first AR analyzer and a view volume space of a second
AR analyzer in the present invention.
[Fig. 2] Fig. 2 is an explanatory view showing a relationship
between the view volume space of the first AR analyzer in the
present invention and coordinates with the position of a marker
image detected in the view volume space as an origin.
[Fig. 3] Fig. 3 is an explanatory view of the AR marker used
in the present invention and a CG object image corresponding
to the AR marker.
[Fig. 4] Fig. 4 is an explanatory view showing the marker
image detected in the view volume space of the first AR analyzer
in the present invention and the CG object corresponding to the
marker image.
[Fig. 5] Fig. 5 is an explanatory view of definition of a
view volume in a general pin-hole camera model.
[Fig. 6] Fig. 6 is a block diagram of an AR image processing
apparatus in one embodiment of the present invention.
[Fig. 7] Fig. 7 is a flowchart of an AR image processing
method in the one embodiment of the present invention.
[Fig. 8A] Fig. 8A is an AR composite image in the embodiment
and is an AR composite image for an image captured at such an
angle that a camera can capture the entire AR marker.
[Fig. 8B] Fig. 8B is an AR composite image in the embodiment
and is an AR composite image for an image captured at such an
upward angle that the camera cannot capture the AR marker.
MODE FOR CARRYING OUT THE INVENTION
An embodiment of the present invention is described below
in detail based on the drawings.
First, principles of the present invention are described.
Generally, in order to analyze a digital image captured by a
camera such as a web camera or a digital video camera with an
AR analyzer and then composite and display a CG object on the
digital image on the basis of position information on an image
of a specific target object in the digital image, the CG object
in a space needs to be subjected to projective transformation
to the digital image. In the AR analyzer which performs such
projective transformation, a 4×4 projection matrix P and a 4×4
model view matrix M need to be created. Projective
transformation of a first AR analyzer A configured to detect
the position of the target object image in the digital image
captured by the camera is expressed as follows.
[Math 1]
Ma'=Sa⋅Pa⋅Ma
Meanwhile, projective transformation of a second AR
analyzer B configured to detect the position of the target
object image in the digital image by tracking natural features
is expressed as follows.
[Math 2]
Mb'= Sb⋅Pb⋅Mb
Here, Sa and Sb are constants and are appropriate scaling
parameters for the digital image onto which the CG object is
projected. The projection matrices Pa and Pb are determined
by performing camera calibration in advance, as camera
parameters of the camera used for image capturing.
The matrices Pa, Pb may take values different from each other
depending on the characteristics of the first AR analyzer A and
the second AR analyzer B. This is one of the characteristics
of the present invention.
Considering the view volumes 11A, 11B of the AR analyzers
A, B, which are geometric schematic representations of the
projective transformations shown in Fig. 1, these two different
projective transformation matrices Pa, Pb can be considered to
share the same normalized screen plane SCR-A, i.e. the same
projection plane, when the same image capturing device
(camera) is used.
Initialization processing of the second AR analyzer B is
performed first in the present invention. Specifically, the
second AR analyzer B which performs natural feature tracking
assumes that the digital image captured by the camera is
projected on the screen plane SCR-A, and determines an initial
model view matrix Mb from the known projection matrix Pb. This
operation uses, for example, a well-known method in
which an image capturing position of the camera capturing the
image is changed and the position of the camera is estimated
from movement amounts of feature points by using epipolar
geometry.
This initial model view matrix Mb determines the position
and posture of the camera in a coordinate system of the second
AR analyzer B, and the natural feature tracking based AR
analyzer estimates the image capturing position of the camera,
i.e. the model view matrix Mb from the thus-determined initial
position, according to the movement amounts of the captured
feature points.
The model view matrix Mb includes scaling elements.
However, the distance among and the size of natural feature
points observed in the digital image cannot be obtained from
information on the natural feature points. Accordingly, in the
conventional technique, manual correction needs to be performed
while the CG image is composited, so that given values are
represented on the digital image.
However, in the present invention, the following
processing is performed as a subsequent step to solve this
problem. In the aforementioned initialization step of the
second AR analyzer B, the first AR analyzer A uses an AR marker
whose scale, posture, and position are known in advance, to
determine the view volume, i.e. the model view matrix Ma
obtained by the projective transformation Pa, forming the
normalized screen plane SCR-A of the digital image captured by
the camera.
As shown in Fig. 2, this model view matrix Ma has
information on a direction, a size, and marker position
coordinates in a space corresponding to the position of a marker
image MRK detected in the digital image captured in the
projective transformation of the first AR analyzer A, and allows
the image capturing position in the view volume space 11A of
the first AR analyzer A to be determined relative to an origin
O3, where the position of the marker image MRK in the view volume
space is set as the origin O3.
In the present invention, the image capturing position
is determined in terms of only appearance. It is only necessary
that a positional relationship in the digital image is correctly
represented and there is no need to represent a
geometrically-precise position.
From the aforementioned processing, the position,
posture, and scale of the marker image MRK projected on the
screen plane SCR-A are estimated in the coordinate system of
the first AR analyzer A and the initial model view matrix Mb
in the coordinate system of the second AR analyzer B is obtained.
However, generally, the coordinate system (origin O1) of the
first AR analyzer A and the coordinate system (origin O2) of
the second AR analyzer B are interpreted totally differently
and, as shown in Fig. 1, the respective configurations of the
view volumes 11A, 11B including optical center axes are also
different from each other.
In the present invention, the normalized screen planes
SCR-A of the view volumes 11A, 11B are considered to be at the
same position, and conversion between both coordinate systems
is performed by using spatial position information on the
screen planes SCR-A as a clue. Mappings projected on the screen
planes SCR-A are thereby matched in terms of appearance. This
means that the position, posture, and size of the actual marker
image MRK which are estimated by the first AR analyzer A
determine parameters of the appropriate position, posture, and
scale for the position information on the natural feature points
mapped on the screen plane SCR-A by the second AR analyzer B.
A translation component of the model view matrix Ma in
the coordinate system of the first AR analyzer A is considered
to represent the origin O3 in spatial coordinates of the AR
marker image MRK while scaling and rotation components thereof
are considered to represent the size and posture of the marker
image MRK in the coordinate space of the first AR analyzer A.
The 4×4 projection matrix of the coordinate system of the
first AR analyzer A is expressed as Pa while the 4×4 model view
matrix is expressed as Ma, and Pa and Ma are assumed to be
determined as follows.
[Math 3]

$$P_a=\begin{pmatrix}a_0&0&a_1&0\\0&b_0&b_1&0\\0&0&c_0&c_1\\0&0&-1&0\end{pmatrix},\qquad M_a=\begin{pmatrix}e_0&e_4&e_8&e_{12}\\e_1&e_5&e_9&e_{13}\\e_2&e_6&e_{10}&e_{14}\\e_3&e_7&e_{11}&e_{15}\end{pmatrix}$$

where
a0 = 2n/(r−l), b0 = 2n/(t−b),
a1 = (r+l)/(r−l), b1 = (t+b)/(t−b),
c0 = −(f+n)/(f−n), c1 = −2fn/(f−n)
As shown in Fig. 5, the coordinates of an upper left vertex
of a projection plane PJ-A1 on a near side of the view volume
frustum 11A from the origin O1 in the camera coordinate system
(X, Y, Z) of the first AR analyzer A are (l, t, -n), the
coordinates of a lower left vertex are (l, b, -n), the
coordinates of an upper right vertex are (r, t, -n), coordinates
of a lower right vertex are (r, b, -n), and the distance to a
far-side plane PJ-A2 is expressed as f.
Consideration is given to a case where arbitrary spatial
coordinates
[Math 4]
M[X,Y,Z,1]
in the coordinate system of the first AR analyzer A are
affine-converted to an AR marker observed position in the
digital image which corresponds to the screen plane SCR-A. This
is calculated as follows.
First, a translation vector Tr moving through
[Math 5]
Ma ⋅M[X,Y,Z,1]
to the position of the screen plane SCR-A is expressed as follows
by using the model view matrix Ma and n.
[Math 6]
Tr(−e12,−e13,−e14,+n)
A scaling parameter s in consideration of the projective
transformation by the projection matrix Pa is expressed as
follows.
[Math 7]
s = −(1/e14⋅Vb)/(t −b)
Here, Vb is a constant and is a height scale of the screen
plane SCR-A.
A movement amount Tp at the position of the screen plane
SCR-A in consideration of a deflection component of the optical
center axis is expressed as follows.
[Math 8]
Tp(sx/sz·Vb·Ax, sy/sz·Vb, 0)
where sx = a0·e12 + a1·e14
sy = b0·e13 + b1·e14
sz = c0·e14 + e14
Here, Ax is a constant expressing an aspect ratio of the
screen plane SCR-A in a horizontal direction. Ax takes a value
of 16/9 if the digital image is a 16:9 image, and takes a value
of 4/3 if the digital image is a 4:3 image.
Mp represents a 4×4 matrix which is used to affine-convert
the arbitrary spatial coordinates
[Math 9]
M[X,Y,Z,1]
in the coordinate system of the first AR analyzer A to the AR
marker observed position in the digital image which corresponds
to the screen plane SCR-A, in homogeneous coordinate
representation by using the parameters described above. By
using [Tp] and [Tr], which are 4×4 homogeneous coordinate
representations of the translation vectors, Mp is expressed as
follows.
[Math 10]
Mp =[Tp]⋅s⋅[Tr]⋅Ma
Accordingly, in the coordinate system of the first AR
analyzer A, ma' expressing a mapping of
[Math 11]
M[X,Y,Z,1]
to the screen plane SCR-A can be calculated as follows.
[Math 12]
ma'=Mq⋅Mp⋅M[X,Y,Z,1]
Focusing only on the origin O3 of the marker coordinates,
ma' is calculated as follows.
[Math 13]
ma'=Mq⋅Mp⋅[0,0,0,1]
Here, it is considered that the mapping ma' to the screen
plane SCR-A can be observed in the same fashion also in the
coordinate system of the second AR analyzer B. In this case,
like Pa, the projective transformation matrix Pb of the second
AR analyzer B is defined as follows.
[Math 14]

$$P_b=\begin{pmatrix}a_0&0&a_1&0\\0&b_0&b_1&0\\0&0&c_0&c_1\\0&0&-1&0\end{pmatrix}$$
Moreover, as in the case of Pa, vertex parameters of the
view volume 11B of the second AR analyzer B can be calculated
as follows.
[Math 15]
r = n(a1+1)/a0
l = n(a1−1)/a0
t = n(b1+1)/b0
b = n(b1−1)/b0
In a case where the first AR analyzer A and the second
AR analyzer B respectively use the digital images of the same
aspect ratio, projection planes PJ-A, PJ-B respectively of the
view volumes 11A, 11B also have the same aspect ratio.
Accordingly, if S' represents a ratio of scaling interpretation
between the first AR analyzer A and the second AR analyzer B,
it is possible to consider as follows.
[Math 16]
S'= Pb[n(b1+1)/b0]/Pa[n(b1+1)/b0]
Note that Pb[n(b1+1)/b0] represents parameters of Pb in
the coordinate system of the second AR analyzer B while
Pa[n(b1+1)/b0] represents parameters of Pa in the coordinate
system of the first AR analyzer A.
This is directly considered to be the difference in the
scaling interpretation between the first AR analyzer A and the
second AR analyzer B.
When the position of the marker image MRK estimated in
the coordinate system of the first AR analyzer A is considered
to represent the origin position O3 of the spatial coordinates
in the coordinate system of the second AR analyzer B, the origin
position [0,0,0,1] of the coordinate system of the second AR
analyzer B can be observed as ma' by the projective
transformation of the second AR analyzer B. Accordingly,
[Math 17]
S'⋅ma'=Mo⋅Mb[0,0,0,1]
is set. Here, Mo is a 4×4 constant matrix.
Since ma' is known in the formula described above, the
constant matrix Mo can be determined from the following formula.
[Math 18]
Mo = S'·ma' / (Mb·[0,0,0,1])
When the offset matrix Mo is applied to the projective
transformation of the second AR analyzer B, the following
formula can be determined.
[Math 19]
Mb'=Sb⋅Pb⋅Mo⋅Mb
The constant matrix Mo determined as described above is
an offset matrix which represents the posture and the size of
the marker image MRK at an origin in the projective
transformation Mb of the second AR analyzer B, where the
position of the marker image MRK analyzed by the first AR
analyzer A is set as the origin. In the second AR analyzer B
which performs conventional natural feature tracking, a user
manually determines this offset matrix while viewing a
composite screen.
Next, an AR image processing apparatus of one embodiment
of the present invention and an AR image processing method
performed by this apparatus are described by using Figs. 6 and
7. Fig. 6 shows a configuration of the AR image processing
apparatus of the embodiment. The AR image processing apparatus
is mainly formed of a camera 1, an AR marker recognition based
first AR analyzer 3A, a natural feature tracking based second
AR analyzer 3B, a CG rendering unit 5, and a display unit 7.
The AR marker recognition based first AR analyzer 3A
analyzes a captured image of a scene in a field of view which
is captured by the camera 1 and which includes the AR marker
image MRK, determines the position, posture, and scale of the
AR marker image MRK in the field of view, reproduces a
corresponding CG object OBJ at an appropriate position in the
view volume 11A of the camera 1 corresponding to the position,
posture, and scale of the AR marker image, and determines the
coordinates of the AR marker image MRK. The first AR analyzer
3A includes a storage part 3A1 configured to store pieces of
data required for this processing, a
camera calibration part 3A3, an AR marker image analyzing part
3A5, an affine conversion matrix determination part 3A7, a
mapping processing part 3A9, and a projective transformation
processing part 3A11. Spatial coordinate data of the AR marker
image in the view volume space 11A of the first AR analyzer 3A
which is figured out by the projective transformation
processing part 3A11 is outputted to the second AR analyzer 3B.
The second AR analyzer 3B is a natural feature tracking
based AR analyzer and includes a storage part 3B1 configured
to store pieces of data, a camera calibration part 3B3, an
initialization processing part 3B5 configured to perform
initialization processing of the second AR analyzer 3B, a model
view matrix estimation part 3B7, a projective transformation
processing part 3B9, and an offset matrix determination part
3B11.
The CG rendering unit 5 includes a storage part 51
configured to store pieces of data, a camera image input part
53 configured to take in the image captured by the camera 1, a CG
object image generation part 55 configured to generate a CG
object image by using the offset matrix Mo of the second AR
analyzer 3B, and a CG image composition part 57. The CG image
composition part 57 of the CG rendering unit 5 composites the
camera captured image of the camera image input part 53 and the
object image of the CG object image generation part 55 with each
other and outputs a composite image to the display unit 7.
As shown in Fig. 8B, the display unit 7 displays an image
in which the CG object OBJ is composited on the image captured
in the current field of view of the camera 1 at a corresponding
position in a corresponding posture.
Next, the AR image processing method performed by the
aforementioned AR image processing apparatus is described by
using Fig. 7. In summary, the AR image processing method of
the embodiment is characterized in that the method includes:
causing the camera 1 to capture a scene in the field of view
which includes the AR marker MRK and its surroundings; causing
the first AR analyzer 3A to analyze the captured image of the
scene which is captured by the camera 1 and which includes the
AR marker image MRK and its surroundings, determine the position,
posture, and scale of the AR marker image MRK in the view volume
11A, virtually place the corresponding CG object OBJ at an
appropriate position in the view volume space corresponding to
the position, posture, and scale of the AR marker image MRK;
causing the second AR analyzer 3B to calculate the appearance
of the CG object OBJ in the field of view of the camera for the
image currently being captured by the camera 1; compositing the
CG object OBJ in appropriate appearance at an appropriate
position in the image captured by the camera 1; and displaying
the composite image on the display 7.
To be more specific, the following steps are performed.
STEP 1: The CG object corresponding to the AR marker is
stored.
STEP 3: The camera parameters Pa, Pb are calculated
through camera calibration respectively in the first AR
analyzer 3A and the second AR analyzer 3B, and are stored
respectively in the storage parts 3A1, 3B1.
STEP 5: In the second AR analyzer 3B, the initialization
processing is performed to determine the model view matrix Mb,
and the determined model view matrix Mb is stored.
The steps described above are included in preprocessing.
STEP 7: A scene including the AR marker MRK is captured
by the camera 1 and the captured image is inputted to the first
AR analyzer 3A.
STEPS 9, 11: In the first AR analyzer 3A, the AR marker
image MRK is found from the captured image, the position,
posture, and scale of the AR marker image MRK are figured out,
and the model view matrix Ma is determined.
STEP 13: In the first AR analyzer 3A, the AR marker image
MRK is projected onto the screen SCR-A by using the matrices
Pa, Ma and a result of the projection is outputted to the second
AR analyzer 3B.
STEP 15: In the second AR analyzer 3B, the offset matrix
Mo of the marker image MRK is determined.
STEP 17: In the second AR analyzer, the appearance
(position, posture, and scale) of the CG object corresponding
to the current position and center axis direction of the camera
is determined, the CG object is projected onto the screen plane
SCR-A, and a result of the projection is outputted to the CG
rendering unit 5.
STEP 19: Image data of the CG object OBJ is read from the
storage part 51, an image of the shape of the CG object as viewed
at the current camera angle is generated by
using data of the projective transformation matrix from the
second AR analyzer 3B, and this image is CG composited at a
corresponding spatial coordinate position in the image
currently captured by the camera 1.
STEP 21: The composite image is displayed on the display
unit 7.
In the embodiment of the present invention, the marker
recognition based first AR analyzer 3A can automatically
determine the position, posture, and size of the target marker image
MRK, and the natural feature tracking based second AR analyzer
3B can continue position estimation even when the marker image
MRK is out of the screen. Accordingly, as shown in Figs. 8A and 8B, it
is possible to composite and display, in real time, the CG object
OBJ on a natural landscape in a digital image captured by the
camera 1, at a correct position in a correct size and a correct
posture without requiring a manual positioning operation, and
to move the camera 1 to various positions and in various
directions. In Fig. 8A, almost the entire AR marker is captured,
and the CG object OBJ corresponding to the marker image MRK is
composited and displayed for this marker image. In an
upper right portion of the screen, a small portion of a lower
section of a front bumper of a car CAR is also captured. In
this case, even when the camera is moved upward and set to a
camera angle in which no AR marker is included in the screen,
as shown in Fig. 8B, the CG object OBJ can be composited and
displayed on the camera captured image at a position and in a
posture as viewed from the moved camera. Specifically, the CG
object OBJ shown in the image of Fig. 8A is displayed in a manner
viewed in a line of sight from a higher position, in the CG
composite image of Fig. 8B. Moreover, in Fig. 8B, it is also
notable that almost the entire car CAR is captured in the image
due to the upward movement of the camera 1.
EXPLANATION OF THE REFERENCE NUMERALS
MRK AR marker (image)
OBJ CG object
1 Camera
3A First AR analyzer
3B Second AR analyzer
5 CG rendering unit
7 Display unit
Claims (2)
- [Claim 1] An AR image processing method comprising the steps: storing in a storage an image data of a CG object corresponding to an AR marker; causing a camera to capture from a first capturing position a first captured image of a first scene which includes the AR marker and its surroundings; causing an AR marker recognition based first AR analyzer to obtain the first captured image, carry out an AR marker recognition process to find out an AR marker image from the first captured image and determine a position, posture, and scale of the AR marker image in a first view volume space defined in a first coordinate system of the first AR analyzer; causing the camera at a second position to capture a second captured image of a second scene; causing a natural feature tracking based second AR analyzer to obtain the first captured image and data of the determined position, posture, and scale of the AR marker image in the first view volume space corresponding to the first captured image, calculate a position, posture, and scale of the AR marker image in a second view volume space defined in a second coordinate system of the second AR analyzer corresponding to a current position, center axis direction and field of view of the camera and carry out a natural feature tracking process between the first and the second captured images to determine a position, posture, and scale of the AR marker image in the second view volume space of the second AR analyzer; causing a CG rendering unit to read out the image data of the CG object corresponding to the AR marker from the storage, reproduce an image of the CG object corresponding to the calculated position, posture, and scale in the second view volume space and composite the image of the CG object with the second captured image of the camera; and causing a display unit to display the composite image.
- [Claim 2] An AR image processing apparatus comprising: a camera capturing a first captured image of a scene in a first field of view which includes an AR marker and its surroundings and a second captured image of a scene in a second field of view which includes at least some of the surroundings which are included in the first captured image; a storage storing an image data of a CG object; an AR marker recognition based first AR analyzer obtaining the first captured image, carrying out an AR marker recognition process to find out an AR marker image from the first captured image and determine a position, posture, and scale of the AR marker image in a first view volume space defined in a first coordinate system of this first AR analyzer; a natural feature tracking based second AR analyzer obtaining the first captured image and data of the determined position, posture, and scale of the AR marker image in the first view volume space, calculating a position, posture, and scale of the AR marker in a second view volume space defined in a second coordinate system of the second AR analyzer corresponding to a current position, center axis direction and field of view of the camera, obtaining the second captured image, and carrying out a natural feature tracking process between the first and the second captured images to determine a position, posture, and scale of the AR marker image in the second captured image in the second view volume space of the second AR analyzer; a CG rendering unit reading out the image data of the CG object corresponding to the AR marker from the storage, reproducing an image of the CG object corresponding to the calculated position, posture, and scale in the second view volume space and compositing the image of the CG object with the second captured image of the camera; and a display unit displaying the composite image.
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2012036628A JP5872923B2 (en) | 2012-02-22 | 2012-02-22 | AR image processing apparatus and method |
JP2012-036628 | 2012-02-22 | ||
PCT/JP2012/078177 WO2013125099A1 (en) | 2012-02-22 | 2012-10-31 | Augmented reality image processing device and method |
Publications (2)
Publication Number | Publication Date |
---|---|
NZ628782A NZ628782A (en) | 2016-04-29 |
NZ628782B2 true NZ628782B2 (en) | 2016-08-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CA2864988C (en) | Ar image processing apparatus and method | |
US9965870B2 (en) | Camera calibration method using a calibration target | |
US8970690B2 (en) | Methods and systems for determining the pose of a camera with respect to at least one object of a real environment | |
US10068344B2 (en) | Method and system for 3D capture based on structure from motion with simplified pose detection | |
Tian et al. | Handling occlusions in augmented reality based on 3D reconstruction method | |
KR20120002261A (en) | Apparatus and method for providing 3d augmented reality | |
TW201715476A (en) | Navigation system based on augmented reality technique analyzes direction of users' moving by analyzing optical flow through the planar images captured by the image unit | |
JP6589636B2 (en) | 3D shape measuring apparatus, 3D shape measuring method, and 3D shape measuring program | |
US10460466B2 (en) | Line-of-sight measurement system, line-of-sight measurement method and program thereof | |
Marto et al. | DinofelisAR demo augmented reality based on natural features | |
CN112116631A (en) | Industrial augmented reality combined positioning system | |
JP6061334B2 (en) | AR system using optical see-through HMD | |
WO2013125098A1 (en) | System and method for computer graphics image processing using augmented reality technology | |
Deng et al. | Registration of multiple rgbd cameras via local rigid transformations | |
KR101522842B1 (en) | Augmented reality system having simple frame marker for recognizing image and character, and apparatus thereof, and method of implementing augmented reality using the said system or the said apparatus | |
KR20120091749A (en) | Visualization system for augment reality and method thereof | |
JP6166631B2 (en) | 3D shape measurement system | |
NZ628782B2 (en) | Ar image processing apparatus and method | |
Schweighofer et al. | Online/realtime structure and motion for general camera models | |
Miezal et al. | Towards practical inside-out head tracking for mobile seating bucks | |
JP5168313B2 (en) | Image display device | |
Purnomo et al. | Improved Tracking Capabilities With Collaboration Multimarker Augmented Reality | |
JP2013222190A (en) | Method for rendering solid on screen by camera and face recognition system |