CN117135332A - Real-time splicing system based on fusion of synthesized image and video - Google Patents

Real-time splicing system based on fusion of synthesized image and video

Info

Publication number
CN117135332A
Authority
CN
China
Prior art keywords: image, video stream, real, environment, virtual
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310389810.XA
Other languages
Chinese (zh)
Inventor
屠晓伟 (Tu Xiaowei)
张蒙 (Zhang Meng)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Jianmin Optoelectronic Technology Co., Ltd.
Original Assignee
Suzhou Jianmin Optoelectronic Technology Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Jianmin Optoelectronic Technology Co., Ltd.
Priority to CN202310389810.XA
Publication of CN117135332A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N 13/106 Processing image signals
    • H04N 13/156 Mixing image signals
    • H04N 13/122 Improving the 3D impression of stereoscopic images by modifying image signal contents, e.g. by filtering or adding monoscopic depth cues
    • H04N 13/30 Image reproducers
    • H04N 13/332 Displays for viewing with the aid of special glasses or head-mounted displays [HMD]
    • H04N 13/344 Displays for viewing with the aid of special glasses or head-mounted displays [HMD] with head-mounted left-right displays

Abstract

The application discloses a real-time splicing system based on fusion of a synthesized image and a video, which comprises the following steps: the real environment and the virtual environment are simultaneously captured into the MR glasses in the form of video streams; a pixel filter/selector module screens the pixels of the R image of the real environment video stream and of the V image of the virtual environment video stream; the position where the virtual environment is to be displayed is marked in the real environment; the video stream image information of the marked real environment is converted into a pixel filter/selector; the pixel filter/selector filters/selects the required pixels from the R image of the real environment video stream and from the V image of the virtual environment video stream; and a new mixed M image is formed. By capturing both the real environment and the virtual environment into the MR glasses and synthesizing them, the application provides the user with an immersive 3D environment whose display effect is more realistic than that of typical immersive and non-immersive schemes.

Description

Real-time splicing system based on fusion of synthesized image and video
Technical Field
The application belongs to the technical field of image processing, and particularly relates to a real-time splicing system based on fusion of a synthesized image and a video.
Background
VR (virtual reality) headsets are favored by many industries because they can provide an immersive 3D display experience. Popular products on the market include the Oculus Quest, Sony's PlayStation VR, HTC's Vive and the like. However, these products only provide a highly realistic display of 3D virtual images. In order to enhance the realism of the user experience, MR (mixed reality) technology introduces real-scene information into the virtual environment and establishes an interactive feedback loop among the virtual world, the real world and the user: a new visual environment is created by merging the real and virtual worlds, in which physical and digital objects coexist and interact in real time. MR is thus a further development of virtual reality technology.
There are currently two dominant ways to merge the virtual world with the real world. The first is a non-immersive scheme: the VR glasses employ a translucent display device with the real world as the background, so the user sees virtual objects superimposed on the real world; typical applications are virtual office desktops, auxiliary information displays and the like. The second is an image-processing scheme: objects in the real world, such as a person's hands or body, are captured by a camera and an algorithm, and the captured object images are superimposed into the virtual world, so that the user sees both the virtual world and the real world. A leading-edge application is a flight simulator based on a VR headset, in which image recognition and segmentation techniques place images of the pilot's hands into the virtual cockpit to increase the pilot's immersion.
However, both of the above approaches have significant drawbacks. In the first, non-immersive scheme, the user always sees superimposed images, lacks a sense of realism and has a poor experience; the scheme is also demanding on the environment: when the real environment is bright, the virtual scene is not clearly visible, and when the real environment is dark, the user cannot see the real environment. The second scheme places high demands on software and hardware; with current technology the real-time performance of the image display is poor, the objects in the real scene that can be captured are limited, and the stability is insufficient. At present there is no good solution to the drawbacks of these two approaches.
Disclosure of Invention
In view of the defects of the prior art, the application aims to provide a real-time splicing system based on the fusion of a synthesized image and a video that solves the above technical problems.
The aim of the application can be achieved by the following technical scheme:
a real-time splicing system based on fusion of a synthesized image and a video, comprising the following steps:
S100, the real environment and the virtual environment are simultaneously captured into the MR glasses in the form of video streams;
S200, a pixel filter/selector module screens the pixels of the R image of the real environment video stream and of the V image of the virtual environment video stream, through the following sub-steps:
S201, the position where the virtual environment is to be displayed is marked in the real environment;
S202, the video stream image information of the marked real environment is converted into a pixel filter/selector;
S203, the pixel filter/selector filters/selects the required pixels from the R image of the real environment video stream and from the V image of the virtual environment video stream;
S300, the pixel filter/selector generation module re-synthesizes the R image of the real environment video stream and the V image of the virtual environment video stream to form a new mixed M image.
Furthermore, the MR glasses of S100 have two image display channels, and each image display channel includes two video stream inputs, namely a real environment video stream input and a virtual environment video stream input.
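To make the channel/input structure just described concrete, the sketch below models one display channel with its two inputs in Python; the class and field names are illustrative assumptions, not terminology from the application.

```python
from dataclasses import dataclass
from typing import Callable
import numpy as np

# A frame source is anything that yields the next frame as an (H, W, 3) array.
FrameSource = Callable[[], np.ndarray]

@dataclass
class DisplayChannel:
    """One image display channel of the MR glasses (left or right eye)."""
    real_input: FrameSource     # video stream input 2: frames from the real camera
    virtual_input: FrameSource  # video stream input 1: frames rendered from the virtual scene

@dataclass
class MRGlasses:
    left: DisplayChannel
    right: DisplayChannel
```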
Furthermore, the video stream information of the real environment video stream input is image information of the current environment collected by the real camera.
Further, the video stream information of the virtual environment video stream input is image stream information synthesized in the virtual environment.
Further, the video stream information of the virtual environment video stream input is generated virtually from the collected pose of the virtual 3D environment and the pose of the virtual camera.
Further, in S100 a positioning system is built into the MR glasses and is used to measure the current position and posture of the eyes.
Further, the methods for the pixel filter/selector include an infrared light display method, a specific color display method, a physical space calculation method and a binary image calculation method.
Further, in S203 the pixel filter/selector adopts the binary image calculation method: where the binary image is 0, the pixel value at the corresponding position of the R image of the real environment video stream is selected as the pixel value of the M image, and where the binary image is 1, the pixel value at the corresponding position of the V image of the virtual environment video stream is selected as the pixel value of the M image.
Further, the binary image screens the pixels of each frame of image in the video stream information.
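As a minimal sketch of the binary image calculation method just described, the per-pixel selection rule can be written as M(x, y) = (1 - B(x, y))·R(x, y) + B(x, y)·V(x, y) for a binary mask B, and implemented per frame with NumPy as below; the 8-bit three-channel frame layout and the function and array names are assumptions for illustration, not terms from the application.

```python
import numpy as np

def compose_mixed_frame(r_img: np.ndarray, v_img: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Form the mixed M image from a real frame R, a virtual frame V and a 0/1 mask.

    Where mask == 0 the real environment pixel is kept; where mask == 1 the
    virtual environment pixel is kept, matching the selection rule of S203.
    """
    assert r_img.shape == v_img.shape and mask.shape == r_img.shape[:2]
    mask3 = mask.astype(bool)[..., None]   # broadcast the single-channel mask over the color channels
    return np.where(mask3, v_img, r_img)
```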
The application has the beneficial effects that:
1. The virtual-and-real image synthesis system of the MR glasses can realize an immersive mixed-reality scene: the user sees the fused real 3D environment and virtual 3D environment at the same time. Unlike existing schemes that simply superimpose images, which suffer from poor fidelity, poor reliability and high software and hardware requirements, the present application provides the user with a vivid visual experience.
2. The application fuses the real environment and the virtual environment at the pixel level using the pixel filter/selector, so that compared with an image recognition scheme the MR process is simpler and more stable, and real-time performance is improved.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.
Fig. 1 is a schematic diagram of virtual-real image synthesis of MR glasses according to an embodiment of the application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
As shown in Fig. 1, an embodiment of the present application provides a real-time splicing system based on fusion of a synthesized image and a video. The real environment and the virtual environment are first captured into the MR glasses in the form of video streams. For each frame, the required pixels in the real image (R image) and the virtual image (V image) are screened out by a pixel filter/selector. Finally, the selected pixels are fused into a new mixed image (M image) that is displayed to the user.
The method specifically comprises the following steps.
S100, the real environment and the virtual environment are simultaneously captured into the MR glasses in the form of video streams. To generate the virtual and real images, the MR glasses comprise two image display channels, one for the left eye and one for the right eye (referred to respectively as the left-eye channel and the right-eye channel), and each channel comprises two video stream inputs. The MR glasses are also provided with a positioning system for measuring the position and posture of the glasses. Since the left-eye and right-eye channels work in the same way, the left-eye channel is taken as the example in the following.
Initially, one video stream is captured in real time by a real camera mounted on the MR glasses, which records the real environment seen by the user's left eye. The other video stream is a synthesized image stream of the virtual environment, which composes the virtual environment seen by the user's left eye; the specific virtual environment image to be displayed is determined by the pose of the virtual 3D environment and the pose of the virtual camera. The pose of the virtual camera is tied to the positioning system on the MR glasses (the positioning system is built into the glasses and measures the current position and posture of the eyes). Thus, when the user's head changes pose, the R images captured by the real camera change, and the pose of the virtual camera changes correspondingly in real time to generate the required V image.
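One way to realise the coupling between the positioning system and the virtual camera described above is simply to copy the tracked pose into the render camera on every frame. The sketch below assumes hypothetical tracker and renderer interfaces (read_pose, set_pose, render) that the application does not specify.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class Pose:
    position: Tuple[float, float, float]             # left-eye position in world coordinates
    orientation: Tuple[float, float, float, float]   # orientation quaternion (w, x, y, z)

def render_virtual_frame(tracker, virtual_scene, virtual_camera):
    """Render the V image for the left-eye channel.

    The virtual camera mirrors the pose reported by the glasses' positioning
    system, so head movement changes the R and V images consistently.
    """
    eye_pose: Pose = tracker.read_pose()   # hypothetical positioning-system call
    virtual_camera.set_pose(eye_pose)      # hypothetical renderer call
    return virtual_scene.render(virtual_camera)
```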
The real environment video stream input 2 carries the video stream collected by the real camera in the current environment. The virtual environment video stream input 1 carries the image stream synthesized in the virtual environment, which is generated virtually from the pose of the virtual 3D environment and the pose of the virtual camera.
S200, a pixel filter/selector module is used to screen the pixels of the R image of the real environment video stream and of the V image of the virtual environment video stream, as follows.
S201, the position where the virtual environment is to be displayed is marked in the real environment (the marking may use a specific color for later identification; the mark may be of any shape that is easy to extract, for example an infrared light-emitting panel placed in the real environment). The mark distinguishes the display area of the real environment from the display area of the virtual environment. The image then passes through the beam splitter of module 3 and, via projection, reaches the pixel filter/selector generation module 4 and the real environment image pixel filter/selection module 5. The image captured by the real camera on the MR glasses therefore contains two parts: a marked area and an unmarked area.
S202, because the marked area has a specific color, the image can be converted into a binary image: the area of the binary image whose pixel value is 1 represents the marked area, and the other area represents the unmarked area. In this way the video stream image information of the marked real environment is converted into a pixel filter/selector. The application adopts the binary image calculation method, and the binary image screens the pixels of every frame of image in the video stream information.
S203, the binary image is adopted as the pixel filter/selector, and the required pixels are filtered/selected from the R image of the real environment video stream and from the V image of the virtual environment video stream. Specifically, where the binary image is 0, the pixel value at the corresponding position of the R image of the real environment video stream is selected as the pixel value of the M image; where the binary image is 1, the pixel value at the corresponding position of the V image of the virtual environment video stream is selected as the pixel value of the M image. The virtual environment image pixel filter/selection module 6 receives the virtual scene image directly from the virtual environment video stream input 1 and the pixel filter/selector from the pixel filter/selector generation module 4, and then selects the required virtual scene image pixels. The right-eye channel works on the same principle.
S300, the pixel filter/selector generation module re-synthesizes the R image of the real environment video stream and the V image of the virtual environment video stream to form a new mixed M image. The synthesized image is finally shown on display module 7; wearing the designed MR glasses, the user sees the virtual-real fused 3D image on display module 7.
In summary, the above embodiment discloses a new system for synthesizing virtual and real images in MR glasses. The required pixels are selected from the real environment video image and the virtual environment video image according to the pixel filter/selector and synthesized into a new image. The embodiment can fuse images of real and virtual scenes to produce an immersive visual effect, and the system is simple in structure and reliable. The application can be widely used in mixed-reality fields such as vehicle driving simulation, virtual exhibition halls, sports and individual training.
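To illustrate the overall per-frame flow summarized above, the sketch below combines a plausible specific-color marker threshold for S201/S202 with the compose_mixed_frame and render_virtual_frame helpers from the earlier sketches. The OpenCV HSV range is only an assumed example for a green marker panel and would need calibration in practice, and the capture and display calls are hypothetical.

```python
import cv2
import numpy as np

def marker_mask(r_img: np.ndarray) -> np.ndarray:
    """Return a (H, W) uint8 mask: 1 inside the marked area, 0 elsewhere."""
    hsv = cv2.cvtColor(r_img, cv2.COLOR_BGR2HSV)
    # Assumed threshold for a saturated green marker panel; values need calibration.
    lower, upper = np.array([45, 80, 80]), np.array([85, 255, 255])
    raw = cv2.inRange(hsv, lower, upper)                     # 0 or 255 per pixel
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    raw = cv2.morphologyEx(raw, cv2.MORPH_CLOSE, kernel)     # close small holes in the mask
    return (raw > 0).astype(np.uint8)

def left_eye_loop(real_camera, tracker, virtual_scene, virtual_camera, display):
    """Per-frame virtual/real fusion for one display channel of the MR glasses."""
    while True:
        r_img = real_camera.read_frame()                     # hypothetical capture call
        v_img = render_virtual_frame(tracker, virtual_scene, virtual_camera)
        mask = marker_mask(r_img)                            # 1 = show virtual, 0 = show real
        m_img = compose_mixed_frame(r_img, v_img, mask)
        display.show(m_img)                                  # hypothetical display call
```

The right-eye channel would run the same loop with its own real camera and eye offset.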
The foregoing has shown and described the basic principles, principal features and advantages of the application. It will be understood by those skilled in the art that the present application is not limited to the embodiments described above, and that the above embodiments and descriptions are merely illustrative of the principles of the present application, and various changes and modifications may be made without departing from the spirit and scope of the application, which is defined in the appended claims.

Claims (9)

1. A real-time splicing system based on fusion of a synthesized image and a video, characterized by comprising the following steps:
S100, the real environment and the virtual environment are simultaneously captured into the MR glasses in the form of video streams;
S200, a pixel filter/selector module screens the pixels of the R image of the real environment video stream and of the V image of the virtual environment video stream;
S201, the position where the virtual environment is to be displayed is marked in the real environment;
S202, the video stream image information of the marked real environment is converted into a pixel filter/selector;
S203, the pixel filter/selector filters/selects the required pixels from the R image of the real environment video stream and from the V image of the virtual environment video stream;
S300, the pixel filter/selector generation module re-synthesizes the R image of the real environment video stream and the V image of the virtual environment video stream to form a new mixed M image.
2. The real-time splicing system based on fusion of a synthesized image and a video according to claim 1, wherein the MR glasses of S100 have two image display channels, namely a left image display channel and a right image display channel, and each image display channel includes two video stream inputs, namely a real environment video stream input 2 and a virtual environment video stream input 1.
3. The real-time splicing system based on fusion of a synthesized image and a video according to claim 2, wherein the video stream information of the real environment video stream input is image information of the current environment acquired by the real camera.
4. The real-time splicing system based on fusion of a synthesized image and a video according to claim 2, wherein the video stream information of the virtual environment video stream input is image stream information synthesized in the virtual environment.
5. The real-time splicing system based on fusion of a synthesized image and a video according to claim 4, wherein the video stream information of the virtual environment video stream input is generated virtually from the acquired pose of the virtual 3D environment and the pose of the virtual camera.
6. The real-time splicing system based on fusion of a synthesized image and a video according to claim 1, wherein in S100 a positioning system is built into the MR glasses and is used to measure the current position and posture of the eyes.
7. The real-time splicing system based on fusion of a synthesized image and a video according to claim 1, wherein the methods for the pixel filter/selector include an infrared light display method, a specific color display method, a physical space calculation method and a binary image calculation method.
8. The real-time splicing system based on fusion of a synthesized image and a video according to claim 7, wherein in S203 the pixel filter/selector adopts the binary image calculation method: where the binary image is 0, the pixel value at the corresponding position of the R image of the real environment video stream is selected as the pixel value of the M image, and where the binary image is 1, the pixel value at the corresponding position of the V image of the virtual environment video stream is selected as the pixel value of the M image.
9. The real-time splicing system based on fusion of a synthesized image and a video according to claim 8, wherein the binary image screens the pixels of each frame of image in the video stream information.
CN202310389810.XA (filed 2023-04-13, priority 2023-04-13) · Real-time splicing system based on fusion of synthesized image and video · Pending · CN117135332A (en)

Priority Applications (1)

Application Number: CN202310389810.XA · Priority Date: 2023-04-13 · Filing Date: 2023-04-13 · Title: Real-time splicing system based on fusion of synthesized image and video

Applications Claiming Priority (1)

Application Number: CN202310389810.XA · Priority Date: 2023-04-13 · Filing Date: 2023-04-13 · Title: Real-time splicing system based on fusion of synthesized image and video

Publications (1)

Publication Number: CN117135332A · Publication Date: 2023-11-28

Family

ID=88861676

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310389810.XA Pending CN117135332A (en) 2023-04-13 2023-04-13 Real-time splicing system based on fusion of synthesized image and video

Country Status (1)

Country Link
CN (1) CN117135332A (en)


Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination