WO2020063100A1 - Augmented reality image display method, apparatus, and device - Google Patents

Augmented reality image display method, apparatus, and device

Info

Publication number
WO2020063100A1
WO2020063100A1 · PCT/CN2019/098557 · CN2019098557W
Authority
WO
WIPO (PCT)
Prior art keywords
camera module
human eye
scene
person
position information
Prior art date
Application number
PCT/CN2019/098557
Other languages
English (en)
French (fr)
Inventor
Zhou Yuefeng (周岳峰)
Original Assignee
Alibaba Group Holding Limited (阿里巴巴集团控股有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Limited
Publication of WO2020063100A1 publication Critical patent/WO2020063100A1/zh

Links

Images

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00 — Manipulating 3D models or images for computer graphics
    • G06T19/006 — Mixed reality
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2200/00 — Indexing scheme for image data processing or generation, in general
    • G06T2200/08 — Indexing scheme for image data processing or generation, in general, involving all processing steps from image acquisition to 3D model generation

Definitions

  • This specification relates to the field of image processing technology, and in particular, to an image display method, device, and device for augmented reality.
  • Augmented Reality (AR) refers to technology that combines the virtual world with real-world scenes and lets them interact, using the camera's position and angle together with image analysis. It superimposes the real environment and virtual objects on the same picture so that they coexist, giving users a sensory experience beyond reality.
  • In related art, the camera position can be used as one of the rendering parameters to render a three-dimensional model, obtained from a virtual object and a real scene, into a projected image.
  • As a result, the projected image is tied to the pose of the mobile device, and the display cannot respond when the device is stationary but the photographer is moving.
  • this specification provides an image display method, device, and device for augmented reality.
  • an image display method for augmented reality includes:
  • determining human eye position information based at least on the relative position of the human eye and the person camera module and the position information of the scene camera module; and
  • using the human eye position information as the position information of the scene camera module in rendering parameters to render a three-dimensional model to obtain a projection image projected on a display screen, the three-dimensional model being obtained by combining a virtual object with the real scene scanned by the scene camera module.
  • the method is applied to an electronic device, the person camera module includes a front camera of the electronic device, and the scene camera module includes a rear camera of the electronic device.
  • the step of constructing the three-dimensional model includes:
  • performing three-dimensional reconstruction on the real scene scanned by the scene camera module to obtain a scene model; and
  • superimposing the virtual object on the scene model to obtain the three-dimensional model.
  • the determining human eye position information based on at least a relative position of a human eye and a person camera module and position information of a scene camera module includes:
  • the position information of the human eye is calculated by combining the relative position of the human eye and the scene camera module and the position information of the scene camera module.
  • the method further includes:
  • whether the relative position of the human eye and the person camera module has changed is determined according to the current relative position of the human eye and the person camera module and the relative position obtained last time.
  • an image display device for augmented reality including:
  • a relative position determining module configured to: obtain a person image collected by a person camera module, and determine a relative position between the human eye and the person camera module by using a relationship between a human eye area in the person image and the person image;
  • the human eye position determining module is configured to determine human eye position information based on at least the relative position of the human eye and the person camera module and the position information of the scene camera module;
  • An image rendering module is configured to: use the human eye position information as the position information of the scene camera module in the rendering parameters to render a three-dimensional model to obtain a projection image projected on a display screen, the three-dimensional model being obtained by combining a virtual object with the real scene scanned by the scene camera module.
  • the device is provided in an electronic device, the person camera module includes a front camera of the electronic device, and the scene camera module includes a rear camera of the electronic device.
  • the apparatus further includes a three-dimensional model construction module, configured to:
  • performing three-dimensional reconstruction on the real scene scanned by the scene camera module to obtain a scene model; and
  • superimposing the virtual object on the scene model to obtain the three-dimensional model.
  • the human eye position determination module is specifically configured to:
  • the position information of the human eye is calculated by combining the relative position of the human eye and the scene camera module and the position information of the scene camera module.
  • the device further includes a location judgment module, configured to:
  • whether the relative position of the human eye and the person camera module has changed is determined according to the current relative position of the human eye and the person camera module and the relative position obtained last time.
  • a computer device including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements any of the methods above.
  • The embodiments of this specification obtain a person image collected by a person camera module and use the relationship between the human eye area in the person image and the person image to determine the relative position of the human eye and the person camera module; human eye position information is then determined based at least on that relative position and the position information of the scene camera module.
  • The human eye position information is used as the position information of the scene camera module in the rendering parameters, and the three-dimensional model is rendered together with the other rendering parameters to obtain a projection image on the display screen.
  • In this way, the augmented reality display content changes from a projected image from the camera module's perspective to a projected image from the human eye's perspective, so that the projected image follows changes in the position of the human eye.
  • FIG. 1 is a schematic diagram of shooting an AR scene according to an exemplary embodiment of the present specification.
  • Fig. 2 is a flowchart of an augmented reality image display method according to an exemplary embodiment of the present specification.
  • Fig. 3A is a flowchart illustrating another method for displaying an augmented reality image according to an exemplary embodiment of the present specification.
  • FIG. 3B is a schematic diagram illustrating a display position comparison of a virtual object according to an exemplary embodiment of the present specification.
  • Fig. 4 is a hardware structural diagram of a computer device in which an image display apparatus for augmented reality is shown according to an exemplary embodiment of the present specification.
  • Fig. 5 is a block diagram of an augmented reality image display device according to an exemplary embodiment of the present specification.
  • Although the terms first, second, third, etc. may be used in this specification to describe various information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from each other.
  • For example, without departing from the scope of this specification, the first information may also be referred to as the second information, and similarly, the second information may also be referred to as the first information.
  • Depending on the context, the word "if" as used herein can be interpreted as "when", "while", or "in response to determining".
  • Augmented Reality (AR) technology is a new technology that seamlessly integrates real-world information and virtual-world information. Through computer technology, it applies virtual information to the real world, superimposing the real environment and virtual objects on the same picture or in the same space in real time.
  • A common application scenario of AR technology is that a user captures the real environment through a camera module in a handheld or wearable mobile device, and software providing AR services renders one or more virtual objects onto the captured initial image data.
  • the key to realizing the above scenarios is how to combine virtual objects with the actual scenes that were actually shot.
  • Software providing AR services can pre-configure one or more models of corresponding virtual objects; each virtual object model specifies the virtual object's state evolution rules, which determine the different motion states of the virtual object.
  • The software can also determine the position of the virtual object in the real scene according to the image data captured by the device, and thereby determine where on the image data the virtual object is rendered. After successful rendering, the user sees a picture in which virtual objects are superimposed on the real environment.
  • rendering is performed from the perspective of a camera module.
  • This augmented-display solution relies on the device's gyroscope, accelerometer, and gravity sensors to sense changes in device pose. Therefore, if the camera module does not move while the photographer/viewer moves, the displayed image does not respond accordingly, and the sense of immersion and stereoscopic effect is poor.
  • FIG. 1 is a schematic diagram of AR scene shooting provided by this specification according to an exemplary embodiment.
  • In this example, the virtual object is a puppy, and the AR system uses an ordinary display screen.
  • the user can see the fusion effect of the real environment and the virtual object from the display screen without wearing any display device.
  • the photographer / viewer uses the rear camera of the mobile phone to capture the realistic scene, and displays the projected image including the puppy on the screen of the mobile phone.
  • If the photographer/viewer keeps the phone still while the relative position of the eyes and the phone changes, the picture displayed on the phone screen does not change.
  • To address this, this specification provides an augmented reality image display method that changes the device's augmented reality display content from a composite image from the camera perspective to a composite image from the human eye perspective, making the displayed image closer to the effect of the human eye's viewpoint and enhancing the sense of three-dimensionality and immersion.
  • A camera module can take photographs and video mainly because the lens forms an image of the subject and projects it onto the imaging surface of the camera tube or solid-state image sensor.
  • The camera lens can cover a certain range of the scene; this range is usually expressed as an angle, called the lens's angle of view.
  • the viewing angles of the human eyes in the embodiments of the present specification do not refer to all the viewing angles that human eyes can see, but may be the viewing angles that can be seen through a display screen.
  • FIG. 2 is a flowchart of an augmented reality image display method shown in this specification according to an exemplary embodiment.
  • the method includes:
  • In step 202, a person image collected by a person camera module is acquired, and the relationship between the human eye area in the person image and the person image is used to determine the relative position of the human eye and the person camera module;
  • in step 204, human eye position information is determined based at least on the relative position of the human eye and the person camera module and the position information of the scene camera module;
  • in step 206, the human eye position information is used as the position information of the scene camera module in the rendering parameters to render the three-dimensional model and obtain a projection image projected on the display screen.
  • The three-dimensional model is obtained by combining the virtual object with the real scene scanned by the scene camera module.
  • the person camera module and the scene camera module are different shooting modules, and the shooting areas of the two shooting modules are different.
  • the shooting direction of the person camera module and the scene camera module are opposite.
  • the camera of the person camera module and the display screen are on the same side of the electronic device, and even the camera lens of the person camera module is on the same plane as the display screen.
  • In an example, the two camera modules are located in the same electronic device.
  • the person camera module can be a front camera
  • the scene camera module can be a rear camera, so that the rear camera is used for augmented reality applications, and the front camera is used to assist AR enhancement.
  • the person shooting module and the scene shooting module are both shooting modules, but they are named differently to distinguish different shooting modules.
  • For example, some terminals may have display screens on both the front and the back, so the rear camera can serve as the person camera module and the front camera as the scene camera module; alternatively, the person camera module and the scene camera module may be camera modules on different devices.
  • the human eye position information is used to indicate the position of the human eye of the photographer / viewer in space, and may be three-dimensional coordinates of the human eye in the world coordinate system or the scene camera module coordinate system.
  • Step 202 and step 204 describe how to determine human eye position information.
  • the relative position of the human eye and the person camera module may be determined first, and then the position information of the human eye may be determined according to the relative position of the human eye and the person camera module and the position information of the scene camera module.
  • the person camera module can collect person images, especially images of photographers within a range that can be captured by the person camera module.
  • the relative position of the human eye and the person camera module may be a relative pose, including a relative distance and a relative direction.
  • the relative position can be represented by a vector with a direction.
  • In an example, the relative position of the human eye and the person camera module can be obtained by performing face detection on the person image using a face detection algorithm. For example, the face area in the person image may be detected first, the human eye area may then be determined from the face area according to the relationship between the human eye and the human face, and the relative position of the human eye and the person camera module may be determined according to the relationship between the human eye area and the image.
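As an illustration of the geometric relationship just described, the following is a minimal sketch (not from the specification) of turning a detected eye region into a position relative to the front camera. It assumes a pinhole camera model, a known focal length in pixels, and a population-average interpupillary distance; the function name and parameters are hypothetical.

```python
import numpy as np

def eye_relative_position(eye_center_px, eye_span_px, img_size,
                          focal_px, ipd_mm=63.0):
    """Estimate the eye midpoint's position relative to the front camera.

    eye_center_px : (u, v) pixel midpoint between the two detected eye centres
    eye_span_px   : pixel distance between the two eye centres
    img_size      : (width, height) of the person image in pixels
    focal_px      : front-camera focal length in pixels (from calibration)
    ipd_mm        : assumed interpupillary distance (population average)
    """
    # Similar triangles: real-world IPD over pixel IPD scales the focal length
    # into a depth estimate.
    z_mm = focal_px * ipd_mm / eye_span_px
    # Offset from the optical axis, assuming the principal point is the image centre.
    cx, cy = img_size[0] / 2.0, img_size[1] / 2.0
    x_mm = (eye_center_px[0] - cx) * z_mm / focal_px
    y_mm = (eye_center_px[1] - cy) * z_mm / focal_px
    return np.array([x_mm, y_mm, z_mm])  # front-camera coordinates, millimetres
```

Under these assumptions, eyes detected 100 px apart with a 500 px focal length place the viewer roughly 315 mm from the camera.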
  • a deep learning training model may be used to determine the relative position of the human eye and the person camera module.
  • For example, training samples may be constructed from person images labeled with the relative position of the human eye and the camera module, and a preset initial model is trained on these samples to obtain a detection model for detecting the relative position of the human eye and the camera module.
  • the detection model is used to detect the image to be detected to obtain the relative position of the human eye and the camera module.
  • each group of training samples may also include other sample features that are helpful for improving the relative position detection result, such as a face region frame.
  • other methods may also be adopted to obtain the relative positions of the human eyes and the person camera module through recognition of the person image, which are not described in detail here.
  • The position information of the scene camera module indicates the position of the scene camera module in space, and may be three-dimensional coordinates of the scene camera module in the world coordinate system or the scene camera module coordinate system. For example, the position information of the scene camera module can be obtained by performing camera calibration on the scene camera module. It can be understood that the position of the human eye and the position of the scene camera module are coordinates in the same coordinate system.
  • a geometric model of the camera imaging can be established, and the geometric model parameters are camera parameters.
  • Camera parameters can include internal parameters, external parameters, distortion parameters, and so on.
  • the calibration methods in related technologies can be used to calibrate the camera, such as linear calibration method, nonlinear optimization calibration method, Tsai's classic two-step calibration method, etc., which are not limited here.
  • the human eye position information may be determined based on at least the relative position of the human eye and the person camera module and the position information of the scene camera module.
  • In an example, the person camera module is set close to the scene camera module, so the relative position between the two modules can be ignored; this holds especially when the person camera module and the scene camera module face away from each other. In that case, the human eye position information can be determined directly from the relative position of the human eye and the person camera module and the position information of the scene camera module. For example, assuming the position of the rear camera in the scene is X and the position of the human eye relative to the front camera is Y, the position of the human eye may be X + Y and its orientation may be -Y.
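The X + Y arithmetic above can be illustrated numerically; the coordinate values below are invented purely for illustration.

```python
import numpy as np

# Position of the rear (scene) camera in the scene coordinate system,
# e.g. obtained from calibration/SLAM (the X in the text; values illustrative).
X = np.array([0.10, 0.05, 0.00])          # metres
# Position of the human eye relative to the front camera (the Y in the text),
# treated as relative to the device since the front/rear offset is ignored.
Y = np.array([0.02, -0.01, 0.35])         # metres

eye_position = X + Y                       # where the eye sits in the scene frame
eye_direction = -Y / np.linalg.norm(Y)     # the eye looks back toward the device
```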
  • the human eye position information may also be determined by combining the relative positions of the person camera module and the scene camera module.
  • the relative positions of the person camera module and the scene camera module are fixed, and can be determined based on the device information of the device where the person camera module and the scene camera module are located.
  • determining the position information of the human eye based on at least the relative position of the human eye and the person camera module and the position information of the scene camera module may include:
  • obtaining the relative position of the human eye and the scene camera module according to the relative position of the human eye and the person camera module and the relative position of the person camera module and the scene camera module; and
  • calculating the position information of the human eye by combining the relative position of the human eye and the scene camera module with the position information of the scene camera module.
  • the relative positions of the human eye and the scene camera module can be obtained through the relative positions of the person camera module and the scene camera module, thereby improving the accuracy of the human eye position information.
  • the embodiment of the present specification intends to replace the camera perspective with the perspective of the human eye, so as to dynamically render the background scene (real scene) and virtual objects with the perspective of the human eye to enhance the sense of three-dimensionality and substitution. Therefore, the human eye position information is used as the position information of the scene camera module in the rendering parameters to render a three-dimensional model to obtain a projection image projected on the display screen.
  • The three-dimensional model is obtained by combining the virtual object with the real scene scanned by the scene camera module, and the rendering parameters are the parameters needed to render the 3D model.
  • The most important rendering parameters include the camera position and projection surface information; this embodiment mainly adjusts the camera position in the rendering parameters, changing the camera perspective into the human eye perspective.
  • That is, the human eye position information can be used as the position information of the scene camera module in the rendering parameters, replacing the perspective of the scene camera module with the human eye perspective, and the three-dimensional model can be rendered using the adjusted rendering parameters.
  • the projection surface information in the rendering parameters may be determined according to the display screen information.
  • the rendering parameters also include other parameters required during rendering, such as lighting parameters, which are not listed here one by one.
  • In short, the rendering parameters are adjusted according to the position of the human eye; the rendering parameters and the three-dimensional model are then input to the rendering module, which renders the projection image.
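One generic way a rendering module could consume the human eye position as its camera position is through a standard look-at view matrix; the sketch below is a textbook construction, not the specification's implementation.

```python
import numpy as np

def look_at(eye, target, up=np.array([0.0, 1.0, 0.0])):
    """Build a right-handed view matrix that places the render camera at `eye`."""
    f = target - eye
    f = f / np.linalg.norm(f)                     # forward axis
    s = np.cross(f, up)
    s = s / np.linalg.norm(s)                     # right axis
    u = np.cross(s, f)                            # corrected up axis
    m = np.eye(4)
    m[0, :3], m[1, :3], m[2, :3] = s, u, -f       # rotation rows
    m[:3, 3] = -m[:3, :3] @ eye                   # translate so the eye is the origin
    return m
```

Usage would be something like `view = look_at(eye_position, scene_centre)`, handing `view` to the renderer in place of the scene camera's own pose.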
  • the three-dimensional model in this specification may be a model obtained based on a combination of a virtual object and a scene scanned and displayed by the scene camera module.
  • the three-dimensional model is obtained based on scene modeling and virtual object overlay.
  • One of the three-dimensional model construction methods is listed below.
  • the three-dimensional model construction steps may include:
  • performing three-dimensional reconstruction on the real scene scanned by the scene camera module to obtain a scene model; and
  • superimposing the virtual object on the scene model to obtain the three-dimensional model.
  • the scene model is also called a space model, and includes but is not limited to an initial scene model for implementing augmented reality.
  • a scene model may be obtained by performing three-dimensional reconstruction on a real scene.
  • Three-dimensional reconstruction (3D Reconstruction) is to build a 3D model of an object in a real scene from the input data.
  • Vision-based 3D reconstruction can refer to acquiring data images of scene objects through a camera, analyzing and processing the images, and then deriving 3D information about objects in the real environment with computer vision knowledge.
  • a two-dimensional image can be used as an input to reconstruct a three-dimensional scene model in the scene.
  • a three-dimensional model of the object can be reconstructed.
  • the scene camera module may be a depth camera.
  • each frame of data scanned by the depth camera includes not only the color RGB images of the points in the scene, but also the distance value of each point from the vertical plane where the depth camera is located.
  • the distance value may be called a depth value, and the depth values together form a depth image of the frame.
  • a depth image can be understood as a grayscale image, where the grayscale value of each point in the image represents the depth value of the point, that is, the true distance of the point in the real world to the vertical plane where the camera is located. Therefore, the RGB image and depth image collected by the depth camera can be used as input to reconstruct a three-dimensional scene model in the scene.
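The back-projection of an RGB-D frame described above can be sketched as follows, assuming known intrinsics (focal lengths fx, fy and principal point cx, cy). This is a generic pinhole formulation rather than the specification's own reconstruction method.

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth image (H x W, metres) into 3-D camera coordinates."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))   # per-pixel column/row indices
    z = depth
    x = (u - cx) * z / fx                            # pinhole inverse projection
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1)              # H x W x 3 point map
```

Fusing such point maps across frames (e.g. with SLAM-estimated camera poses) is what yields the scene model.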
  • The process of 3D reconstruction may involve image acquisition, camera calibration, feature extraction, stereo matching, and the reconstruction itself. Since 3D reconstruction is a relatively mature existing technology, it is not repeated here. For example, methods such as simultaneous localization and mapping (SLAM) can be used to implement 3D reconstruction of real scenes.
  • After obtaining the scene model, a virtual object can be selected based on a preset overlay strategy and superimposed on the scene model at the position where it needs to appear, to obtain the three-dimensional model.
  • the preset overlay strategy may be a strategy for determining the content and location to be enhanced, which is not limited herein.
  • In this embodiment, the person camera module is used to locate the human eye, so that the real scene and virtual objects are dynamically rendered from the human eye's perspective. When the relative position of the human eye and the camera module changes, the displayed projection image responds accordingly, enhancing the sense of three-dimensionality and immersion.
  • In an example, whether the relative position of the human eye and the person camera module has changed is determined according to the current relative position of the human eye and the person camera module and the relative position obtained last time.
  • When the relative position has changed, steps 204 and 206 are performed; when it has not changed, steps 204 and 206 are skipped, avoiding the waste of resources that constant real-time recomputation would incur.
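The change check described above might be implemented with a simple distance threshold; the threshold value below is an assumed placeholder, not a figure from the specification.

```python
import numpy as np

MOVE_THRESHOLD_MM = 5.0   # assumed tolerance; would need tuning per device

def eye_moved(current, previous, threshold=MOVE_THRESHOLD_MM):
    """True when the eye/camera relative position changed enough to rerun
    steps 204 and 206; False lets the cached projection image be reused."""
    if previous is None:                  # first frame: always render
        return True
    return float(np.linalg.norm(current - previous)) > threshold
```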
  • FIG. 3A is a flowchart of another augmented reality image display method shown in this specification according to an exemplary embodiment.
  • the method can be applied to a mobile device, and the augmented reality display content of the mobile device is changed from a composite image from a camera perspective to a composite image from a human eye perspective.
  • the method may include:
  • step 302 a three-dimensional reconstruction of the scene on the back of the device is performed through an image collected by a rear camera, and a virtual object is superimposed to obtain a three-dimensional model.
  • step 304 the human eye position of the user is detected using a face detection algorithm through the image collected by the front camera.
  • In step 306, according to the position of the human eye, the projections of the reconstructed three-dimensional scene and of the virtual object onto the device screen are recalculated to obtain a projection image.
  • step 308 the projected image is displayed on the device screen.
  • The parts of FIG. 3A that are similar to FIG. 2 and the related art are not described again here.
  • The display position of the virtual object in this embodiment is compared with the display position in the prior art with reference to FIG. 3B.
  • The angle of view of the rear camera is often greater than the angle over which the human eye can see objects through the screen frame (referred to as the human eye's angle of view). Therefore, the area occluded by the virtual object at the camera's angle of view is greater than the area occluded at the human eye's angle of view.
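The human eye's angle of view through the screen frame can be estimated from the screen width and the eye-to-screen distance; the numbers in the usage note are illustrative assumptions, not measurements from the specification.

```python
import math

def view_angle_through_screen(screen_width_mm, eye_distance_mm):
    """Horizontal angle (degrees) subtended at the eye by the screen frame,
    i.e. the 'human eye angle of view' limited by the screen."""
    return 2.0 * math.degrees(math.atan(screen_width_mm / (2.0 * eye_distance_mm)))
```

For instance, a 70 mm wide screen held 350 mm from the eye subtends only about 11.4 degrees, far narrower than a typical rear-camera field of view of around 70 degrees, which is why the camera-perspective rendering occludes more of the scene.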
  • 32 indicates the display position of the virtual object on the screen after using the solution of this embodiment
  • 34 indicates the display position of the virtual object on the screen after using the prior art solution.
  • the display content is adjusted by judging the position of the human eye by the front camera, so that the display scene is closer to the effect of the human eye angle of view, and the sense of substitution and three-dimensionality is stronger.
  • Using 3D scene reconstruction to model the background makes it possible to respond to changes in the human eye's position and to display the background from different angles.
  • The three-dimensional scene reconstruction method thus allows a more appropriate response in the scenario where the device is stationary while the viewer moves.
  • this specification also provides embodiments of an image display device for augmented reality and an electronic device to which it is applied.
  • the embodiment of the augmented reality image display device of the present specification can be applied to a computer device.
  • The device embodiments may be implemented by software, or by hardware, or by a combination of software and hardware. Taking software implementation as an example, a device in the logical sense is formed by the processor of the computer equipment where it is located reading the corresponding computer program instructions from non-volatile memory into memory and running them.
  • FIG. 4 is a hardware structure diagram of the computer equipment where the augmented reality image display device of this specification is located. In addition to the processor 410, network interface 420, memory 430, and non-volatile memory shown in FIG. 4, the computer equipment where the augmented reality image display device 431 is located may generally include other hardware according to the actual function of the equipment; details are not described again.
  • FIG. 5 is a block diagram of an image display apparatus for augmented reality shown in this specification according to an exemplary embodiment.
  • the apparatus includes:
  • the relative position determining module 52 is configured to: obtain a person image collected by a person camera module, and determine a relative position between the human eye and the person camera module by using a relationship between a human eye area in the person image and the person image;
  • the human eye position determining module 54 is configured to determine human eye position information based on at least the relative position of the human eye and the person camera module and the position information of the scene camera module;
  • An image rendering module 56 is configured to: use the human eye position information as the position information of the scene camera module in the rendering parameters to render a three-dimensional model to obtain a projection image projected on a display screen, the three-dimensional model being obtained by combining the virtual object with the real scene scanned by the scene camera module.
  • the device is provided in an electronic device, the person camera module includes a front camera of the electronic device, and the scene camera module includes a rear camera of the electronic device.
  • the device further includes a three-dimensional model building module (not shown in FIG. 5), for:
  • performing three-dimensional reconstruction on the real scene scanned by the scene camera module to obtain a scene model; and
  • superimposing the virtual object on the scene model to obtain the three-dimensional model.
  • the human eye position determination module is specifically configured to:
  • the position information of the human eye is calculated by combining the relative position of the human eye and the scene camera module and the position information of the scene camera module.
  • the device further includes a position determination module (not shown in FIG. 5), configured to:
  • whether the relative position of the human eye and the person camera module has changed is determined according to the current relative position of the human eye and the person camera module and the relative position obtained last time.
  • For relevant parts, reference may be made to the description of the method embodiments.
  • The device embodiments described above are only illustrative; the modules described as separate components may or may not be physically separated, and components displayed as modules may or may not be physical modules, and may be located in one place or distributed across multiple network modules. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution in this specification. Those of ordinary skill in the art can understand and implement them without creative effort.
  • an embodiment of the present specification further provides a computer device including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the following method when the program is executed:
  • acquiring a person image captured by a person camera module, and determining the relative position of the human eye and the person camera module from the relationship between the human eye region in the person image and the person image;
  • determining human eye position information based at least on the relative position of the human eye and the person camera module and the position information of a scene camera module;
  • using the human eye position information as the position information of the scene camera module in the rendering parameters, rendering a three-dimensional model to obtain a projection image projected on a display screen, the three-dimensional model being obtained by combining a virtual object with the real scene scanned by the scene camera module.
  • a computer storage medium stores program instructions in the storage medium, and the program instructions include:
  • acquiring a person image captured by a person camera module, and determining the relative position of the human eye and the person camera module from the relationship between the human eye region in the person image and the person image;
  • determining human eye position information based at least on the relative position of the human eye and the person camera module and the position information of a scene camera module;
  • using the human eye position information as the position information of the scene camera module in the rendering parameters, rendering a three-dimensional model to obtain a projection image projected on a display screen, the three-dimensional model being obtained by combining a virtual object with the real scene scanned by the scene camera module.
  • the embodiments of the present specification may take the form of a computer program product implemented on one or more storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing program code.
  • Computer-usable storage media includes permanent and non-permanent, removable and non-removable media, and information can be stored by any method or technology.
  • Information may be computer-readable instructions, data structures, modules of a program, or other data.
  • Examples of computer storage media include, but are not limited to: phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape/disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible to a computing device.


Abstract

Embodiments of this specification provide an augmented reality image display method, apparatus, and device. In these embodiments, a person image captured by a person camera module is acquired, and the relative position of the human eye and the person camera module is determined from the relationship between the human eye region in the person image and the image containing it. Human eye position information is then determined based at least on the relative position of the human eye and the person camera module and the position information of a scene camera module. Using the human eye position information as the position information of the scene camera module in the rendering parameters, a three-dimensional model is rendered to obtain a projection image projected on a display screen. The augmented reality display content is thus changed from a projection image from the camera module's viewpoint to a projection image from the human eye's viewpoint, so that the projection image follows changes in the eye position.

Description

Augmented reality image display method, apparatus, and device

Technical Field
This specification relates to the field of image processing technology, and in particular to augmented reality image display methods, apparatuses, and devices.
Background
Augmented Reality (AR) may refer to technology that, through the position and angle of a camera combined with image analysis, allows a virtual world to be combined with and interact with real-world scenes. This technology can superimpose the real environment and virtual objects onto the same picture so that they coexist, giving users a sensory experience beyond reality. In an AR scene, the camera position can be used as one of the rendering parameters to render a three-dimensional model obtained from virtual objects and the real scene, producing a projection image. However, the displayed projection image is tied to the pose of the mobile device; when the device is stationary but the photographer moves, the device cannot respond accordingly.
Summary
To overcome problems in the related art, this specification provides augmented reality image display methods, apparatuses, and devices.
According to a first aspect of the embodiments of this specification, an augmented reality image display method is provided, the method including:
acquiring a person image captured by a person camera module, and determining the relative position of the human eye and the person camera module from the relationship between the human eye region in the person image and the person image;
determining human eye position information based at least on the relative position of the human eye and the person camera module and the position information of a scene camera module;
using the human eye position information as the position information of the scene camera module in the rendering parameters, rendering a three-dimensional model to obtain a projection image projected on a display screen, the three-dimensional model being obtained by combining a virtual object with the real scene scanned by the scene camera module.
In one embodiment, the method is applied to an electronic device, the person camera module includes a front camera of the electronic device, and the scene camera module includes a rear camera of the electronic device.
In one embodiment, the step of building the three-dimensional model includes:
performing three-dimensional reconstruction of the real scene using real-scene images captured by the scene camera module, to obtain a scene model;
superimposing the virtual object onto the scene model according to a preset superimposition strategy, to obtain the three-dimensional model.
In one embodiment, determining the human eye position information based at least on the relative position of the human eye and the person camera module and the position information of the scene camera module includes:
obtaining the relative position of the person camera module and the scene camera module;
using the relative position of the person camera module and the scene camera module to convert the relative position of the human eye and the person camera module into the relative position of the human eye and the scene camera module;
computing the human eye position information by combining the relative position of the human eye and the scene camera module with the position information of the scene camera module.
In one embodiment, the method further includes:
before determining the human eye position, determining that the relative position of the human eye and the person camera module has changed, based on the currently obtained relative position of the human eye and the person camera module and the previously obtained relative position of the human eye and the person camera module.
According to a second aspect of the embodiments of this specification, an augmented reality image display apparatus is provided, the apparatus including:
a relative position determination module configured to: acquire a person image captured by a person camera module, and determine the relative position of the human eye and the person camera module from the relationship between the human eye region in the person image and the person image;
a human eye position determination module configured to: determine human eye position information based at least on the relative position of the human eye and the person camera module and the position information of a scene camera module;
an image rendering module configured to: use the human eye position information as the position information of the scene camera module in the rendering parameters, and render a three-dimensional model to obtain a projection image projected on a display screen, the three-dimensional model being obtained by combining a virtual object with the real scene scanned by the scene camera module.
In one embodiment, the apparatus is provided in an electronic device, the person camera module includes a front camera of the electronic device, and the scene camera module includes a rear camera of the electronic device.
In one embodiment, the apparatus further includes a three-dimensional model building module configured to:
perform three-dimensional reconstruction of the real scene using real-scene images captured by the scene camera module, to obtain a scene model;
superimpose the virtual object onto the scene model according to a preset superimposition strategy, to obtain the three-dimensional model.
In one embodiment, the human eye position determination module is specifically configured to:
obtain the relative position of the person camera module and the scene camera module;
use the relative position of the person camera module and the scene camera module to convert the relative position of the human eye and the person camera module into the relative position of the human eye and the scene camera module;
compute the human eye position information by combining the relative position of the human eye and the scene camera module with the position information of the scene camera module.
In one embodiment, the apparatus further includes a position determination module configured to:
before the human eye position is determined, determine that the relative position of the human eye and the person camera module has changed, based on the currently obtained relative position of the human eye and the person camera module and the previously obtained relative position of the human eye and the person camera module.
According to a third aspect of the embodiments of this specification, a computer device is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the method of any of the above when executing the program.
The technical solutions provided by the embodiments of this specification may include the following beneficial effects:
In the embodiments of this specification, a person image captured by a person camera module is acquired, and the relative position of the human eye and the person camera module is determined from the relationship between the human eye region in the person image and the person image; human eye position information is determined based at least on this relative position and the position information of the scene camera module; and with the human eye position information used as the position information of the scene camera module in the rendering parameters, the three-dimensional model is rendered together with the other rendering parameters to obtain a projection image projected on the display screen. The augmented reality display content is thus changed from a projection image from the camera module's viewpoint to one from the human eye's viewpoint, so that the projection image follows changes in the eye position.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit this specification.
Brief Description of the Drawings
The accompanying drawings here are incorporated into and constitute part of this specification; they show embodiments consistent with this specification and, together with the specification, serve to explain its principles.
FIG. 1 is a schematic diagram of AR scene shooting provided by this specification according to an exemplary embodiment.
FIG. 2 is a flowchart of an augmented reality image display method shown in this specification according to an exemplary embodiment.
FIG. 3A is a flowchart of another augmented reality image display method shown in this specification according to an exemplary embodiment.
FIG. 3B is a schematic comparison of virtual object display positions shown in this specification according to an exemplary embodiment.
FIG. 4 is a hardware structure diagram of a computer device on which an augmented reality image display apparatus of this specification resides, according to an exemplary embodiment.
FIG. 5 is a block diagram of an augmented reality image display apparatus shown in this specification according to an exemplary embodiment.
Detailed Description
Exemplary embodiments will be described in detail here, examples of which are shown in the accompanying drawings. Where the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with this specification; rather, they are merely examples of apparatuses and methods consistent with some aspects of this specification as detailed in the appended claims.
The terminology used in this specification is for the purpose of describing particular embodiments only and is not intended to limit this specification. The singular forms "a", "the", and "said" used in this specification and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used in this specification to describe various kinds of information, the information should not be limited to these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of this specification, first information may also be called second information and, similarly, second information may also be called first information. Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", or "in response to determining".
Augmented Reality (AR) technology is a new technology that seamlessly integrates real-world information and virtual-world information: through computer technology, virtual information is applied to the real world, and the real environment and virtual objects are superimposed in real time onto the same picture or space so that they exist simultaneously.
A common AR application scenario is that a user shoots the real environment with the camera module of a handheld or wearable mobile device, and the software providing the AR service renders one or more virtual objects onto the captured initial image data. The key to realizing this scenario is how to combine the virtual objects with the actually captured real scene. On one hand, the software providing the AR service may preconfigure one or more models corresponding to virtual objects, each virtual object's model specifying the state-evolution rules of that object to determine its different motion states. On the other hand, the software may determine, from the image data captured by the device, the position of the virtual object in the real scene and hence where on the image data to render it; once rendering succeeds, the user sees a picture of the real environment with the virtual object superimposed.
However, the three-dimensional model built from the virtual objects and the real scene is rendered from the viewpoint of the camera module. Augmented-reality display solutions rely on the device's gyroscope, accelerometer, and gravity sensor to sense changes in the device's orientation. Consequently, when the camera module does not move but the photographer/viewer does, the rendered image does not respond accordingly, and the senses of immersion and depth are weak.
For example, FIG. 1 is a schematic diagram of AR scene shooting provided by this specification according to an exemplary embodiment. FIG. 1 takes a puppy as the virtual object and an AR system displayed on an ordinary display as the example; in this case the user can see the fusion of the real environment and the virtual object on the display screen without wearing any display device. The photographer/viewer shoots the real scene with the phone's rear camera, and a projection image including the puppy is shown on the phone screen. However, when the photographer/viewer holds the phone still while the relative position of the eyes and the phone changes, the picture shown on the phone screen does not change at all.
In view of this, this specification provides an augmented reality image display method that changes the device's augmented reality display content from a composite image from the camera's viewpoint to a composite image from the human eye's viewpoint, so that the displayed image is closer to the effect of the eye's viewpoint, enhancing the senses of depth and immersion. A camera module forms images mainly by its lens projecting the subject onto the imaging surface of a camera tube or solid-state image sensor. How large a range of scenery a camera lens can cover is usually expressed as an angle, called the lens's viewing angle. The human eye viewing angle referred to in the embodiments of this specification is not the full viewing angle the human eye can see, but the viewing angle that can be seen through the display screen.
The embodiments of this specification are illustrated below with reference to the accompanying drawings.
As shown in FIG. 2, which is a flowchart of an augmented reality image display method shown in this specification according to an exemplary embodiment, the method includes:
In step 202, a person image captured by a person camera module is acquired, and the relative position of the human eye and the person camera module is determined from the relationship between the human eye region in the person image and the person image.
In step 204, human eye position information is determined based at least on the relative position of the human eye and the person camera module and the position information of a scene camera module.
In step 206, with the human eye position information used as the position information of the scene camera module in the rendering parameters, a three-dimensional model is rendered to obtain a projection image projected on a display screen, the three-dimensional model being obtained by combining a virtual object with the real scene scanned by the scene camera module.
In the embodiments of this specification, the person camera module and the scene camera module are different camera modules with different shooting regions. In one example, the two modules shoot in opposite directions, the person camera module's lens is on the same face of the electronic device as the display screen (its lens surface may even lie in the same plane as the screen), and, further, the two camera modules are provided in the same electronic device. In practice, since images captured by the rear camera are sharper than those captured by the front camera, the photographer/viewer usually shoots the real scene with the rear camera, while the front camera's lens surface lies in the plane of the display screen. The person camera module may therefore be the front camera and the scene camera module the rear camera, so that the rear camera serves the augmented reality application while the front camera assists the AR enhancement.
It should be understood that the person camera module and the scene camera module are both camera modules; the different names merely distinguish them. In other examples, some terminals may have display screens on both the front and the back, in which case the rear camera may serve as the person camera module and the front camera as the scene camera module; alternatively, the person camera module and the scene camera module may be camera modules provided on different devices, and so on.
The human eye position information represents the position of the photographer's/viewer's eye in space, and may be the eye's three-dimensional coordinates in the world coordinate system or in the scene camera module's coordinate system. Steps 202 and 204 describe how it is determined. As one application example, the relative position of the human eye and the person camera module may be determined first, and the human eye position information then determined from that relative position together with the position information of the scene camera module.
Regarding step 202: the person camera module can capture person images, in particular images of the photographer within the module's shooting range. The relative position of the human eye and the person camera module may be a relative pose, including relative distance and relative direction. In one example, the relative position may be represented by a directed vector.
The relative position of the human eye and the person camera module can be obtained by running a face detection algorithm on the person image. For example, the face region in the person image may be detected first, the eye region then determined within the face region from the relationship between eyes and face, and the relative position of the human eye and the person camera module determined from the relationship between the eye region and the image.
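A minimal sketch of how the eye region's pixel coordinates might be turned into a 3-D offset from the front camera, assuming a pinhole camera model, known intrinsics, and an assumed average interpupillary distance (the helper names and the IPD value are illustrative, not from the specification):

```python
import numpy as np

# Assumed average human interpupillary distance, in meters.
IPD_M = 0.063

def eye_offset_from_camera(left_eye_px, right_eye_px, fx, fy, cx, cy):
    """Estimate the eye midpoint's 3-D offset from the front camera (pinhole model).

    Depth follows from similar triangles: the real-world eye separation IPD_M
    projects to a pixel separation of |left - right|, so z = fx * IPD_M / pixel_sep.
    """
    left = np.asarray(left_eye_px, dtype=float)
    right = np.asarray(right_eye_px, dtype=float)
    pixel_sep = np.linalg.norm(left - right)
    z = fx * IPD_M / pixel_sep          # distance from the camera plane
    mid = (left + right) / 2.0          # eye midpoint in pixel coordinates
    x = (mid[0] - cx) * z / fx          # back-project through the intrinsics
    y = (mid[1] - cy) * z / fy
    return np.array([x, y, z])

# Eye centers as reported by any face/landmark detector (hypothetical values):
offset = eye_offset_from_camera((290.0, 240.0), (353.0, 240.0),
                                fx=1000.0, fy=1000.0, cx=320.0, cy=240.0)
```

The deep-learning approach described next would replace this geometric estimate with a learned regression, but the output has the same form: a directed vector from the camera to the eye.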
In one embodiment, a deep-learning model can be used to determine the relative position of the human eye and the person camera module. For example, training samples may be built from person images annotated with the relative position of the human eye and the camera module, and a preset initial model trained on these samples to obtain a detection model for the relative position of the human eye and the camera module. At the application stage, the detection model is run on the image to be detected to obtain the relative position of the human eye and the camera module. It should be understood that, in other examples, each training sample may further include other features that help improve the relative-position detection result, such as the face bounding box. Other approaches may also obtain the relative position of the human eye and the person camera module by recognizing the person image, which are not detailed one by one here.
Regarding step 204: the position information of the scene camera module represents the position of the scene camera module in space, and may be its three-dimensional coordinates in the world coordinate system or in the scene camera module's coordinate system; for example, it may be obtained when the scene camera module is calibrated. It should be understood that the human eye position and the scene camera module position are coordinates in the same coordinate system. In image measurement and machine vision applications, to determine the relationship between the three-dimensional geometric position of a point on an object's surface and its corresponding point in the image, a geometric model of camera imaging can be established; the parameters of this geometric model are the camera parameters, which may include intrinsics, extrinsics, distortion parameters, and so on. In practice, the camera may be calibrated using calibration methods from the related art, such as linear calibration, nonlinear optimization calibration, or Tsai's classic two-step method, without limitation here.
After the relative position of the human eye and the person camera module and the position information of the scene camera module are obtained, the human eye position information can be determined based at least on the relative position of the human eye and the person camera module and the position information of the scene camera module.
In some application scenarios, the person camera module and the scene camera module are mounted close together, so the relative position between the two modules can be ignored. This holds in particular when the person camera module and the scene camera module are mounted back to back. In that case, the human eye position information can be determined directly from the relative position of the human eye and the person camera module and the position information of the scene camera module. For example, if the position of the rear camera in the scene is X and the position of the human eye relative to the front camera is Y, the eye position may be X + Y and the viewing direction may be -Y.
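A minimal numeric sketch of this X + Y composition (assuming all quantities are 3-D vectors expressed in one shared world frame; the concrete values are illustrative):

```python
import numpy as np

# X: world position of the rear (scene) camera, e.g. from calibration or SLAM.
X = np.array([1.0, 1.5, 0.0])
# Y: position of the eye relative to the front (person) camera, which here is
# treated as coincident with the rear camera but facing the opposite way.
Y = np.array([0.0, 0.05, 0.4])

eye_position = X + Y                        # eye location in the world frame
view_direction = -Y / np.linalg.norm(Y)    # eye looks back toward the device
```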
In some application scenarios, to improve the accuracy of the human eye position information, the relative position of the person camera module and the scene camera module may also be taken into account. When the person camera module and the scene camera module are mounted on the same device, their relative position is fixed and can be determined from the device information of that device. Accordingly, determining the human eye position information based at least on the relative position of the human eye and the person camera module and the position information of the scene camera module may include:
obtaining the relative position of the person camera module and the scene camera module;
using the relative position of the person camera module and the scene camera module to convert the relative position of the human eye and the person camera module into the relative position of the human eye and the scene camera module;
computing the human eye position information by combining the relative position of the human eye and the scene camera module with the position information of the scene camera module.
It can be seen that in this embodiment, the relative position of the human eye and the scene camera module can be obtained through the relative position of the person camera module and the scene camera module, thereby improving the accuracy of the human eye position information.
The embodiments of this specification aim to replace the camera viewpoint with the human eye viewpoint, so that the background scene (the real scene) and the virtual objects are rendered dynamically from the eye's viewpoint, enhancing the senses of depth and immersion. Therefore, with the human eye position information used as the position information of the scene camera module in the rendering parameters, the three-dimensional model is rendered to obtain a projection image projected on the display screen. The three-dimensional model is obtained by combining a virtual object with the real scene scanned by the scene camera module, and the rendering parameters are the parameters required to render the three-dimensional model.
When rendering the model to obtain the projection image, the principal rendering parameters are the camera position and the projection plane information, and this embodiment mainly adjusts the camera position in the rendering parameters so that the camera viewpoint becomes the human eye viewpoint. To this end, the human eye position information can be used as the position information of the scene camera module in the rendering parameters, replacing the scene camera module's viewpoint with the eye's viewpoint, and the three-dimensional model is rendered with the adjusted rendering parameters to obtain the projection image projected on the display screen. The projection plane information in the rendering parameters may be determined from the display screen information. The rendering parameters further include other parameters required for rendering, such as lighting parameters, which are not enumerated here.
In this embodiment, the rendering parameters can be adjusted according to the eye position, and the rendering parameters and the three-dimensional model are fed to a rendering module, which renders the projection image.
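As an illustration of swapping the camera position for the eye position in the rendering parameters, the sketch below builds a standard right-handed look-at view matrix from an eye position (a generic computer-graphics construction under assumed coordinates, not code from the specification):

```python
import numpy as np

def look_at(eye, target, up=(0.0, 1.0, 0.0)):
    """Right-handed look-at view matrix mapping world space to eye space."""
    eye, target, up = (np.asarray(v, dtype=float) for v in (eye, target, up))
    f = target - eye
    f /= np.linalg.norm(f)                          # forward axis
    s = np.cross(f, up); s /= np.linalg.norm(s)     # right axis
    u = np.cross(s, f)                              # true up axis
    view = np.eye(4)
    view[0, :3], view[1, :3], view[2, :3] = s, u, -f
    view[:3, 3] = -view[:3, :3] @ eye               # move the eye to the origin
    return view

# Render viewpoint set to the computed eye position rather than the rear camera's:
view = look_at(eye=[0.0, 0.0, 2.0], target=[0.0, 0.0, 0.0])
```

In a real renderer this matrix (together with the projection plane derived from the screen) would simply replace the one built from the scene camera module's pose.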
In the conventional pipeline of an AR system, the process starts from the real world and proceeds through digital imaging; the system then perceives and understands the three-dimensional world from the image data together with sensor data, and at the same time derives an understanding of three-dimensional interaction. The purpose of 3D interaction understanding is to tell the system what content to "augment"; the purpose of 3D environment understanding is to tell the system where to "augment". Once the system has determined the content and location to augment, the virtual and the real can be combined, which is done by the rendering module. Finally, the composited video is delivered to the user's visual system, achieving the augmented reality effect.
The three-dimensional model of this specification may be a model obtained by combining a virtual object with the real scene scanned by the scene camera module, built from scene modeling plus virtual object superimposition. One method of building the three-dimensional model is given below; in this embodiment, the building step may include:
performing three-dimensional reconstruction of the real scene using real-scene images captured by the scene camera module, to obtain a scene model;
superimposing the virtual object onto the scene model according to a preset superimposition strategy, to obtain the three-dimensional model.
The scene model, also called a spatial model, includes but is not limited to an initialization scene model used to implement augmented reality. In the embodiments of this specification, the scene model can be obtained by three-dimensional reconstruction of the real scene. Three-dimensional reconstruction (3D reconstruction) builds 3D models of objects in the real scene from input data. Vision-based three-dimensional reconstruction may refer to capturing data images of scene objects with a camera, analyzing and processing those images, and deriving the three-dimensional information of objects in the real environment with computer vision knowledge.
In one embodiment, two-dimensional images can be used as input to reconstruct the three-dimensional scene model. From RGB images of an object taken at different angles, the object's three-dimensional model can be reconstructed using the relevant computer graphics and vision techniques.
With the advent of depth cameras, in another embodiment the scene camera module may be a depth camera. For points in the real scene, each frame captured by the depth camera includes not only a color RGB image of the points in the scene but also the distance from each point to the vertical plane in which the depth camera lies. This distance is called the depth value, and the depth values together form the depth image of the frame. The depth image can be understood as a grayscale image in which the gray value of each point represents that point's depth value, i.e. the real distance from the point's position to the camera's vertical plane. The RGB image and the depth image captured by the depth camera can therefore be used as input to reconstruct the three-dimensional scene model.
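A common back-projection step in such RGB-D reconstruction converts a depth image to a camera-space point cloud via the pinhole intrinsics (a generic sketch with toy values, not code from the specification):

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth image (in meters) to camera-space 3-D points.

    Each pixel (u, v) with depth z maps to ((u - cx) z / fx, (v - cy) z / fy, z).
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinate grids
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1)             # shape (h, w, 3)

# Toy 2x2 depth image where every point is 2 m from the camera plane:
depth = np.full((2, 2), 2.0)
points = depth_to_points(depth, fx=500.0, fy=500.0, cx=0.5, cy=0.5)
```

A full pipeline would fuse such per-frame point clouds (e.g. via SLAM, as noted below) into one scene model.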
The three-dimensional reconstruction process may involve image acquisition, camera calibration, feature extraction, stereo matching, three-dimensional reconstruction, and so on. Since three-dimensional reconstruction is a relatively mature existing technology, it is not described further here. For example, simultaneous localization and mapping (SLAM) and similar methods can be used to reconstruct the real scene in three dimensions.
After the scene model is obtained, a virtual object can be selected according to a preset superimposition strategy and the position where it is to be superimposed can be located, so that the virtual object is superimposed onto the scene model to obtain the three-dimensional model. The preset superimposition strategy may be a strategy for determining the content and location to be augmented, without limitation here.
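A minimal sketch of such a superimposition step, assuming the scene model exposes an anchor point (e.g. a detected support surface) at which the object is placed; the data shapes and names are illustrative:

```python
import numpy as np

def superimpose(scene_points, object_points, anchor):
    """Place a virtual object's geometry at an anchor in the scene model.

    scene_points / object_points: (N, 3) point arrays; anchor: 3-D position in
    the scene model where the object's local origin should land.
    """
    placed = np.asarray(object_points) + np.asarray(anchor)  # translate into the scene frame
    # Here the combined model is just scene geometry plus the placed object;
    # a real pipeline would also merge materials, occlusion data, etc.
    return np.vstack([np.asarray(scene_points), placed])

scene = np.zeros((4, 3))                                  # toy scene model
dog = np.array([[0.0, 0.0, 0.0], [0.0, 0.1, 0.0]])        # toy virtual object
model = superimpose(scene, dog, anchor=[1.0, 0.0, 2.0])
```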
It can be seen from the above embodiments that, while the scene camera module is used for the augmented reality application, the person camera module is used to locate the human eye, so that the real scene and the virtual objects are rendered dynamically from the eye's viewpoint. When the position of the eye relative to the camera module changes, the displayed projection image responds accordingly, enhancing the senses of depth and immersion.
In one embodiment, before the human eye position is determined, it is determined that the relative position of the human eye and the person camera module has changed, based on the currently obtained relative position of the human eye and the person camera module and the previously obtained one. Steps 204 and 206 are then executed when the relative position of the human eye and the person camera module has changed, and skipped when it has not, avoiding the waste of resources caused by continuous real-time computation.
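One way to implement this gate is to compare the two relative positions against a small tolerance before re-rendering (a sketch; the tolerance value is an assumption, not from the specification):

```python
import numpy as np

TOLERANCE_M = 0.005  # treat sub-5 mm jitter as "unchanged" (assumed value)

def position_changed(current, previous, tol=TOLERANCE_M):
    """Return True when the eye/front-camera relative position moved beyond tol."""
    if previous is None:                 # first frame: always render
        return True
    delta = np.asarray(current, dtype=float) - np.asarray(previous, dtype=float)
    return bool(np.linalg.norm(delta) > tol)

# Execute steps 204 and 206 only when the eye actually moved:
prev = np.array([0.0, 0.05, 0.400])
now = np.array([0.0, 0.05, 0.402])
should_rerender = position_changed(now, prev)
```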
The various technical features of the above implementations can be combined arbitrarily as long as there is no conflict or contradiction between them; they are not all described here for reasons of space, so any combination of the various technical features of the above implementations also falls within the scope disclosed by this specification.
One such combination is illustrated below.
As shown in FIG. 3A, which is a flowchart of another augmented reality image display method shown in this specification according to an exemplary embodiment, the method can be applied to a mobile device to change the device's augmented reality display content from a composite image from the camera's viewpoint to a composite image from the human eye's viewpoint. The method may include:
In step 302, the scene behind the device is three-dimensionally reconstructed from images captured by the rear camera, and a virtual object is superimposed to obtain a three-dimensional model.
In step 304, the position of the user's eyes is detected from images captured by the front camera using a face detection algorithm.
In step 306, the projections of the reconstructed three-dimensional scene and of the virtual object onto the device screen are recomputed from the eye position, yielding a projection image.
In step 308, the projection image is displayed on the device screen.
FIG. 3A is similar to the related techniques of FIG. 2, which are not repeated here.
For ease of understanding, FIG. 3B compares the display position of the virtual object in this embodiment with its display position in the prior art. The viewing angle of the rear camera is usually larger than the angle of the scenery the human eye can see through the screen frame (the human eye viewing angle for short); therefore, the region occluded by the virtual object under the camera's viewpoint is larger than the region occluded under the eye's viewpoint. In the figure, 32 denotes the on-screen display position of the virtual object with the solution of this embodiment, and 34 denotes its on-screen display position with the prior-art solution.
This embodiment adjusts the displayed content according to the eye position judged through the front camera, so that the displayed scene is closer to the effect of the human eye's viewpoint, with stronger senses of immersion and depth. Using three-dimensional scene reconstruction to model the background makes it possible to respond to changes in eye position by showing the background from different angles. At the same time, three-dimensional scene reconstruction allows a more appropriate response to scenes where the device is stationary but the background moves.
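The "view through the screen frame" described above is commonly modeled with an off-axis (asymmetric) perspective frustum whose window is the physical screen rectangle as seen from the eye. The sketch below computes OpenGL-style frustum bounds from the eye position and the screen edges (a generic graphics construction under assumed coordinates, not code from the specification):

```python
import numpy as np

def screen_frustum(eye, screen_left_x, screen_right_x,
                   screen_bottom_y, screen_top_y, screen_z, near):
    """Off-axis frustum bounds (l, r, b, t at the near plane) for an eye looking
    along -z at an axis-aligned screen rectangle lying in the plane z = screen_z.
    Scaling by near / d projects the screen edges onto the near plane.
    """
    ex, ey, ez = eye
    d = ez - screen_z                    # eye-to-screen distance
    scale = near / d
    left = (screen_left_x - ex) * scale
    right = (screen_right_x - ex) * scale
    bottom = (screen_bottom_y - ey) * scale
    top = (screen_top_y - ey) * scale
    return left, right, bottom, top

# Eye centered 0.4 m in front of a 0.16 m x 0.08 m screen at z = 0:
bounds = screen_frustum((0.0, 0.0, 0.4), -0.08, 0.08, -0.04, 0.04, 0.0, near=0.01)
# Moving the eye sideways makes the frustum asymmetric, shifting the view:
shifted = screen_frustum((0.05, 0.0, 0.4), -0.08, 0.08, -0.04, 0.04, 0.0, near=0.01)
```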
Corresponding to the foregoing embodiments of the augmented reality image display method, this specification also provides embodiments of an augmented reality image display apparatus and of the electronic device to which it is applied.
The embodiments of the augmented reality image display apparatus of this specification can be applied to a computer device. The apparatus embodiments can be implemented in software, in hardware, or in a combination of both. Taking software implementation as an example, as an apparatus in the logical sense, the apparatus is formed by the processor of the computer device on which it resides reading the corresponding computer program instructions from non-volatile storage into memory and running them. At the hardware level, as shown in FIG. 4, which is a hardware structure diagram of the computer device on which the augmented reality image display apparatus of this specification resides, in addition to the processor 410, network interface 420, memory 430, and non-volatile storage 440 shown in FIG. 4, the computer device on which the augmented reality image display apparatus 431 of the embodiment resides may also include other hardware according to the device's actual function, which is not described further.
As shown in FIG. 5, which is a block diagram of an augmented reality image display apparatus shown in this specification according to an exemplary embodiment, the apparatus includes:
a relative position determination module 52 configured to: acquire a person image captured by a person camera module, and determine the relative position of the human eye and the person camera module from the relationship between the human eye region in the person image and the person image;
a human eye position determination module 54 configured to: determine human eye position information based at least on the relative position of the human eye and the person camera module and the position information of a scene camera module;
an image rendering module 56 configured to: use the human eye position information as the position information of the scene camera module in the rendering parameters, and render a three-dimensional model to obtain a projection image projected on a display screen, the three-dimensional model being obtained by combining a virtual object with the real scene scanned by the scene camera module.
In one embodiment, the apparatus is provided in an electronic device, the person camera module includes a front camera of the electronic device, and the scene camera module includes a rear camera of the electronic device.
In one embodiment, the apparatus further includes a three-dimensional model building module (not shown in FIG. 5) configured to:
perform three-dimensional reconstruction of the real scene using real-scene images captured by the scene camera module, to obtain a scene model;
superimpose the virtual object onto the scene model according to a preset superimposition strategy, to obtain the three-dimensional model.
In one embodiment, the human eye position determination module is specifically configured to:
obtain the relative position of the person camera module and the scene camera module;
use the relative position of the person camera module and the scene camera module to convert the relative position of the human eye and the person camera module into the relative position of the human eye and the scene camera module;
compute the human eye position information by combining the relative position of the human eye and the scene camera module with the position information of the scene camera module.
In one embodiment, the apparatus further includes a position determination module (not shown in FIG. 5) configured to:
before the human eye position is determined, determine that the relative position of the human eye and the person camera module has changed, based on the currently obtained relative position of the human eye and the person camera module and the previously obtained relative position of the human eye and the person camera module.
Since the apparatus embodiments basically correspond to the method embodiments, the relevant parts may refer to the description of the method embodiments. The apparatus embodiments described above are merely schematic: the modules described as separate components may or may not be physically separate, and the components shown as modules may or may not be physical modules, i.e. they may be located in one place or distributed across multiple network modules. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this specification. Those of ordinary skill in the art can understand and implement this without creative effort.
Correspondingly, an embodiment of this specification further provides a computer device including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the following method when executing the program:
acquiring a person image captured by a person camera module, and determining the relative position of the human eye and the person camera module from the relationship between the human eye region in the person image and the person image;
determining human eye position information based at least on the relative position of the human eye and the person camera module and the position information of a scene camera module;
using the human eye position information as the position information of the scene camera module in the rendering parameters, rendering a three-dimensional model to obtain a projection image projected on a display screen, the three-dimensional model being obtained by combining a virtual object with the real scene scanned by the scene camera module.
The embodiments in this specification are described progressively; for identical or similar parts between embodiments, reference may be made between them, and each embodiment focuses on its differences from the others. In particular, the device embodiments are described relatively simply because they are basically similar to the method embodiments; for relevant parts, refer to the description of the method embodiments.
A computer storage medium stores program instructions, the program instructions including:
acquiring a person image captured by a person camera module, and determining the relative position of the human eye and the person camera module from the relationship between the human eye region in the person image and the person image;
determining human eye position information based at least on the relative position of the human eye and the person camera module and the position information of a scene camera module;
using the human eye position information as the position information of the scene camera module in the rendering parameters, rendering a three-dimensional model to obtain a projection image projected on a display screen, the three-dimensional model being obtained by combining a virtual object with the real scene scanned by the scene camera module.
The embodiments of this specification may take the form of a computer program product implemented on one or more storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing program code. Computer-usable storage media include permanent and non-permanent, removable and non-removable media, and information storage may be implemented by any method or technology. Information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to: phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape/disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible to a computing device.
After considering the specification and practicing the invention applied for here, those skilled in the art will readily conceive of other implementations of this specification. This specification is intended to cover any variations, uses, or adaptations that follow its general principles, including such common general knowledge or customary technical means in this technical field as are not applied for in this specification. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of this specification indicated by the following claims.
It should be understood that this specification is not limited to the precise structure described above and shown in the accompanying drawings, and various modifications and changes may be made without departing from its scope. The scope of this specification is limited only by the appended claims.
The above are merely preferred embodiments of this specification and are not intended to limit it; any modification, equivalent replacement, improvement, etc. made within the spirit and principles of this specification shall fall within its scope of protection.

Claims (11)

  1. An augmented reality image display method, the method comprising:
    acquiring a person image captured by a person camera module, and determining the relative position of the human eye and the person camera module from the relationship between the human eye region in the person image and the person image;
    determining human eye position information based at least on the relative position of the human eye and the person camera module and the position information of a scene camera module;
    using the human eye position information as the position information of the scene camera module in the rendering parameters, rendering a three-dimensional model to obtain a projection image projected on a display screen, the three-dimensional model being obtained by combining a virtual object with the real scene scanned by the scene camera module.
  2. The method according to claim 1, wherein the method is applied to an electronic device, the person camera module comprises a front camera of the electronic device, and the scene camera module comprises a rear camera of the electronic device.
  3. The method according to claim 1, wherein the step of building the three-dimensional model comprises:
    performing three-dimensional reconstruction of the real scene using real-scene images captured by the scene camera module, to obtain a scene model;
    superimposing the virtual object onto the scene model according to a preset superimposition strategy, to obtain the three-dimensional model.
  4. The method according to any one of claims 1 to 3, wherein determining the human eye position information based at least on the relative position of the human eye and the person camera module and the position information of the scene camera module comprises:
    obtaining the relative position of the person camera module and the scene camera module;
    using the relative position of the person camera module and the scene camera module to convert the relative position of the human eye and the person camera module into the relative position of the human eye and the scene camera module;
    computing the human eye position information by combining the relative position of the human eye and the scene camera module with the position information of the scene camera module.
  5. The method according to any one of claims 1 to 3, further comprising:
    before determining the human eye position, determining that the relative position of the human eye and the person camera module has changed, based on the currently obtained relative position of the human eye and the person camera module and the previously obtained relative position of the human eye and the person camera module.
  6. An augmented reality image display apparatus, the apparatus comprising:
    a relative position determination module configured to: acquire a person image captured by a person camera module, and determine the relative position of the human eye and the person camera module from the relationship between the human eye region in the person image and the person image;
    a human eye position determination module configured to: determine human eye position information based at least on the relative position of the human eye and the person camera module and the position information of a scene camera module;
    an image rendering module configured to: use the human eye position information as the position information of the scene camera module in the rendering parameters, and render a three-dimensional model to obtain a projection image projected on a display screen, the three-dimensional model being obtained by combining a virtual object with the real scene scanned by the scene camera module.
  7. The apparatus according to claim 6, wherein the apparatus is provided in an electronic device, the person camera module comprises a front camera of the electronic device, and the scene camera module comprises a rear camera of the electronic device.
  8. The apparatus according to claim 6, further comprising a three-dimensional model building module configured to:
    perform three-dimensional reconstruction of the real scene using real-scene images captured by the scene camera module, to obtain a scene model;
    superimpose the virtual object onto the scene model according to a preset superimposition strategy, to obtain the three-dimensional model.
  9. The apparatus according to any one of claims 6 to 8, wherein the human eye position determination module is specifically configured to:
    obtain the relative position of the person camera module and the scene camera module;
    use the relative position of the person camera module and the scene camera module to convert the relative position of the human eye and the person camera module into the relative position of the human eye and the scene camera module;
    compute the human eye position information by combining the relative position of the human eye and the scene camera module with the position information of the scene camera module.
  10. The apparatus according to any one of claims 6 to 8, further comprising a position determination module configured to:
    before the human eye position is determined, determine that the relative position of the human eye and the person camera module has changed, based on the currently obtained relative position of the human eye and the person camera module and the previously obtained relative position of the human eye and the person camera module.
  11. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the method according to any one of claims 1 to 5 when executing the program.
PCT/CN2019/098557 2018-09-28 2019-07-31 增强现实的图像展示方法、装置及设备 WO2020063100A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811142256.0A CN109615703B (zh) 2018-09-28 2018-09-28 增强现实的图像展示方法、装置及设备
CN201811142256.0 2018-09-28

Publications (1)

Publication Number Publication Date
WO2020063100A1 true WO2020063100A1 (zh) 2020-04-02

Family

ID=66002749

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/098557 WO2020063100A1 (zh) 2018-09-28 2019-07-31 增强现实的图像展示方法、装置及设备

Country Status (3)

Country Link
CN (1) CN109615703B (zh)
TW (1) TWI712918B (zh)
WO (1) WO2020063100A1 (zh)

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109615703B (zh) * 2018-09-28 2020-04-14 阿里巴巴集团控股有限公司 增强现实的图像展示方法、装置及设备
CN112797897B (zh) * 2019-04-15 2022-12-06 Oppo广东移动通信有限公司 物体几何参数的测量方法、装置和终端
CN110428388B (zh) * 2019-07-11 2023-08-08 创新先进技术有限公司 一种图像数据生成方法及装置
CN112306222A (zh) * 2019-08-06 2021-02-02 北京字节跳动网络技术有限公司 一种增强现实方法、装置、设备及存储介质
US10993417B2 (en) * 2019-08-14 2021-05-04 International Business Machines Corporation Detection and management of disease outbreaks in livestock using health graph networks
CN110930518A (zh) * 2019-08-29 2020-03-27 广景视睿科技(深圳)有限公司 基于增强现实技术的投影方法及投影设备
CN110928627B (zh) * 2019-11-22 2023-11-24 北京市商汤科技开发有限公司 界面展示方法及装置、电子设备和存储介质
CN111405263A (zh) * 2019-12-26 2020-07-10 的卢技术有限公司 一种双摄像头结合共用于增强抬头显示的方法及系统
CN111179438A (zh) * 2020-01-02 2020-05-19 广州虎牙科技有限公司 Ar模型动态固定方法、装置、电子设备和存储介质
WO2021184388A1 (zh) * 2020-03-20 2021-09-23 Oppo广东移动通信有限公司 图像展示方法及装置、便携式电子设备
CN111553972B (zh) * 2020-04-27 2023-06-30 北京百度网讯科技有限公司 用于渲染增强现实数据的方法、装置、设备及存储介质
CN111625101B (zh) * 2020-06-03 2024-05-17 上海商汤智能科技有限公司 一种展示控制方法及装置
CN114125418A (zh) * 2020-08-25 2022-03-01 陕西红星闪闪网络科技有限公司 一种全息游客服务中心及其实现方法
CN112672139A (zh) * 2021-03-16 2021-04-16 深圳市火乐科技发展有限公司 投影显示方法、装置及计算机可读存储介质
TWI779922B (zh) * 2021-11-10 2022-10-01 財團法人資訊工業策進會 擴增實境處理裝置以及方法
CN114401414B (zh) * 2021-12-27 2024-01-23 北京达佳互联信息技术有限公司 沉浸式直播的信息显示方法及系统、信息推送方法
CN114706936B (zh) * 2022-05-13 2022-08-26 高德软件有限公司 地图数据处理方法及基于位置的服务提供方法

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130194110A1 (en) * 2012-02-01 2013-08-01 Electronics And Telecommunications Research Institute Automotive augmented reality head-up display apparatus and method
CN108153502A (zh) * 2017-12-22 2018-06-12 长江勘测规划设计研究有限责任公司 基于透明屏幕的手持式增强现实显示方法及装置
CN108181994A (zh) * 2018-01-26 2018-06-19 成都科木信息技术有限公司 用于ar头盔的人机交互方法
CN108287609A (zh) * 2018-01-26 2018-07-17 成都科木信息技术有限公司 用于ar眼镜的图像绘制方法
CN109615703A (zh) * 2018-09-28 2019-04-12 阿里巴巴集团控股有限公司 增强现实的图像展示方法、装置及设备

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8730309B2 (en) * 2010-02-23 2014-05-20 Microsoft Corporation Projectors and depth cameras for deviceless augmented reality and interaction
US9652893B2 (en) * 2014-04-29 2017-05-16 Microsoft Technology Licensing, Llc Stabilization plane determination based on gaze location
AU2017215349B2 (en) * 2016-02-04 2021-12-16 Magic Leap, Inc. Technique for directing audio in augmented reality system
CN106302132A (zh) * 2016-09-14 2017-01-04 华南理工大学 一种基于增强现实的3d即时通讯系统与方法
CN106710002A (zh) * 2016-12-29 2017-05-24 深圳迪乐普数码科技有限公司 基于观察者视角定位的ar实现方法及其系统
CN107038746B (zh) * 2017-03-27 2019-12-24 联想(北京)有限公司 一种信息处理方法及电子设备

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130194110A1 (en) * 2012-02-01 2013-08-01 Electronics And Telecommunications Research Institute Automotive augmented reality head-up display apparatus and method
CN108153502A (zh) * 2017-12-22 2018-06-12 长江勘测规划设计研究有限责任公司 基于透明屏幕的手持式增强现实显示方法及装置
CN108181994A (zh) * 2018-01-26 2018-06-19 成都科木信息技术有限公司 用于ar头盔的人机交互方法
CN108287609A (zh) * 2018-01-26 2018-07-17 成都科木信息技术有限公司 用于ar眼镜的图像绘制方法
CN109615703A (zh) * 2018-09-28 2019-04-12 阿里巴巴集团控股有限公司 增强现实的图像展示方法、装置及设备

Also Published As

Publication number Publication date
TW202013149A (zh) 2020-04-01
CN109615703A (zh) 2019-04-12
TWI712918B (zh) 2020-12-11
CN109615703B (zh) 2020-04-14

Similar Documents

Publication Publication Date Title
TWI712918B (zh) 擴增實境的影像展示方法、裝置及設備
US11632533B2 (en) System and method for generating combined embedded multi-view interactive digital media representations
US10949978B2 (en) Automatic background replacement for single-image and multi-view captures
US10269177B2 (en) Headset removal in virtual, augmented, and mixed reality using an eye gaze database
US10650574B2 (en) Generating stereoscopic pairs of images from a single lens camera
US20200234397A1 (en) Automatic view mapping for single-image and multi-view captures
US11748907B2 (en) Object pose estimation in visual data
US20230419438A1 (en) Extraction of standardized images from a single-view or multi-view capture
US20220078385A1 (en) Projection method based on augmented reality technology and projection equipment
EP3915050A1 (en) Damage detection from multi-view visual data
TWI531212B (zh) 呈現立體影像之系統及方法
JP7479729B2 (ja) 三次元表現方法及び表現装置
JP2016504828A (ja) 単一のカメラを用いて3d画像を取り込む方法およびシステム
US9897806B2 (en) Generation of three-dimensional imagery to supplement existing content
US11138743B2 (en) Method and apparatus for a synchronous motion of a human body model
JP2022518402A (ja) 三次元再構成の方法及び装置
CN108564654B (zh) 三维大场景的画面进入方式
WO2019213392A1 (en) System and method for generating combined embedded multi-view interactive digital media representations
CN108540790A (zh) 一种用于移动终端的立体图像获取方法、装置及移动终端
US11044464B2 (en) Dynamic content modification of image and video based multi-view interactive digital media representations
Louis et al. Rendering stereoscopic augmented reality scenes with occlusions using depth from stereo and texture mapping
WO2020167528A1 (en) Forming seam to join images
US20230217001A1 (en) System and method for generating combined embedded multi-view interactive digital media representations
JPWO2018117099A1 (ja) 画像処理装置及びプログラム
CN111754558B (zh) 用于rgb-d摄像系统与双目显像系统的匹配方法及其相关系统

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19865766

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19865766

Country of ref document: EP

Kind code of ref document: A1