CN110675506B - System, method and equipment for realizing three-dimensional augmented reality of multi-channel video fusion - Google Patents

System, method and equipment for realizing three-dimensional augmented reality of multi-channel video fusion Download PDF

Info

Publication number
CN110675506B
CN110675506B (application CN201911145076.2A)
Authority
CN
China
Prior art keywords
video
dimensional
texture
fusion
augmented reality
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911145076.2A
Other languages
Chinese (zh)
Other versions
CN110675506A (en)
Inventor
石立阳
程远初
高星
徐建明
陈奇毅
朱文辉
华文
李德纮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
PCI Technology Group Co Ltd
Original Assignee
PCI Technology Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by PCI Technology Group Co Ltd filed Critical PCI Technology Group Co Ltd
Publication of CN110675506A publication Critical patent/CN110675506A/en
Application granted granted Critical
Publication of CN110675506B publication Critical patent/CN110675506B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 19/00 Manipulating 3D models or images for computer graphics
    • G06T 19/006 Mixed reality
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 15/00 3D [Three Dimensional] image rendering
    • G06T 15/04 Texture mapping
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/162 Segmentation; Edge detection involving graph-based methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N 13/106 Processing image signals
    • H04N 13/122 Improving the 3D impression of stereoscopic images by modifying image signal contents, e.g. by filtering or adding monoscopic depth cues
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N 13/106 Processing image signals
    • H04N 13/161 Encoding, multiplexing or demultiplexing different image signal components
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N 13/106 Processing image signals
    • H04N 13/167 Synchronising or controlling image signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N 13/30 Image reproducers
    • H04N 13/398 Synchronisation thereof; Control thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20072 Graph-based image processing

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Graphics (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)
  • Image Generation (AREA)
  • Processing Or Creating Images (AREA)
  • Controls And Circuits For Display Device (AREA)

Abstract

The embodiments of the present application disclose a system, a method and a device for realizing three-dimensional augmented reality with multi-channel video fusion. In the technical solution provided by the embodiments, video streams are generated from images captured by multiple video acquisition devices, and the returned video streams are synchronized in time. After the textures of the video frames decoded from the video streams are mapped into the three-dimensional scene, the textures in the overlapping regions of the texture mapping are reconstructed according to the weights of the texture contributions of the projectors. This reduces the overlap of projection areas among multiple projectors and improves the display quality of the rendered three-dimensional scene.

Description

System, method and equipment for realizing three-dimensional augmented reality of multi-channel video fusion
Technical Field
The embodiment of the application relates to the field of computer images, in particular to a system, a method and equipment for realizing three-dimensional augmented reality of multi-channel video fusion.
Background
Virtual-real fusion (mixed reality, MR) technology matches and combines a virtual environment with the real environment, reducing the workload of three-dimensional modeling and improving user experience and credibility by drawing on real scenes and real objects. With the growing availability of video imagery, MR technology has attracted increasing discussion and research.
Video fusion technology takes existing video images and fuses them into a three-dimensional virtual environment, achieving a unified video presentation with depth.
When video is fused into a three-dimensional scene, the prior art suffers from overlapping projection areas among multiple projectors, which adversely affects the display of the scene.
Disclosure of Invention
The embodiments of the present application provide a system, a method and a device for realizing three-dimensional augmented reality with multi-channel video fusion, which reconstruct the textures in the overlapping regions of the texture mapping so as to fuse the texture mapping and reduce the adverse effect on scene display caused by overlapping projection regions among multiple projectors.
In a first aspect, an embodiment of the present application provides a three-dimensional augmented reality system for implementing multi-channel video fusion, including a three-dimensional scene system, a video real-time solution system, an image projection system, an image fusion system, and a virtual three-dimensional rendering system, where:
a three-dimensional scene system, storing a three-dimensional scene of the site;
the video real-time calculating system is used for calculating the received video stream in real time to obtain a video frame;
the image projection system is used for determining the mapping relation between the pixels in the video frame and the three-dimensional points in the three-dimensional scene and performing texture mapping on the video frame in the three-dimensional scene according to the mapping relation so as to complete the image projection of the video frame;
the image fusion system is used for determining texture values corresponding to three-dimensional points in an overlapped area of texture mapping and reconstructing textures in the overlapped area of the texture mapping according to the texture values to complete fusion of the texture mapping, wherein the texture values are determined according to weights of texture contributions of projectors corresponding to the overlapped area of the texture mapping;
and the virtual three-dimensional rendering system renders the fused texture and the three-dimensional scene.
Furthermore, the system also comprises a data synchronization system, wherein the video stream is generated by acquiring images at a plurality of positions on site by a multi-path image acquisition system, and the video stream generated by the multi-path image acquisition system is returned by a multi-path image real-time return control system; the data synchronization system performs data synchronization on the returned video streams, wherein the data synchronization is specifically time synchronization, so that the returned video streams in the same batch are located in the same time slice space.
Further, the video real-time solution system comprises a video frame extraction module and a hardware decoder, wherein:
the video frame extraction module extracts frame data from the video stream by using an FFMPEG library;
and the hardware decoder is used for resolving the frame data to obtain the video frame.
Further, the weight determination formula for the texture contribution of the projector corresponding to the texture mapping overlapping region is r = p/(α × d);
wherein r is the weight contributed by the texture of the projector, p is the pixel resolution of the image of the projector, α is the included angle between two straight lines, and d is the distance from the position of the projector to the corresponding three-dimensional point.
Furthermore, the texture value corresponding to each three-dimensional point in the texture mapping overlapping region is determined by the formula T = (Σ Iᵢ × rᵢ) / Σ rᵢ.
Furthermore, the image fusion system is further configured to determine a dividing line, which is used to intercept the video frames of the different paths; the intercepted video frames are then fused, and the texture of the three-dimensional points around the dividing line is obtained by weighting with the texture-contribution weights of the projectors corresponding to the intercepted video frames.
Further, the determination method of the dividing line is as follows:
and converting the video frames corresponding to the overlapped area to the same viewpoint, and obtaining a dividing line fusing the video frames corresponding to the overlapped area by using a GraphCut method.
Furthermore, when the image fusion system intercepts video frames of different paths, the image fusion system back projects the dividing lines into the video frames, and intercepts the actual use area of each video frame to obtain the corresponding video frame part.
In a second aspect, an embodiment of the present application provides a method for implementing a three-dimensional augmented reality for multi-channel video fusion, including:
the three-dimensional scene system stores a field three-dimensional scene;
the video real-time resolving system performs real-time resolving on the received video stream to obtain a video frame;
the image projection system determines the mapping relation between the pixels in the video frame and the three-dimensional points in the three-dimensional scene, and performs texture mapping on the video frame in the three-dimensional scene according to the mapping relation so as to complete the image projection of the video frame;
the image fusion system determines texture values corresponding to three-dimensional points in an overlapped area of texture mapping, reconstructs texture of the overlapped area of the texture mapping according to the texture values to complete fusion of the texture mapping, wherein the texture values are determined according to weights of texture contributions of projectors corresponding to the overlapped area of the texture mapping;
and rendering the fused texture and the three-dimensional scene by the virtual three-dimensional rendering system.
In a third aspect, an embodiment of the present application provides a three-dimensional augmented reality device for implementing multi-channel video fusion, including: a display screen, a memory, and one or more processors;
the display screen is used for displaying a three-dimensional scene fusing multiple paths of videos;
the memory for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method for realizing three-dimensional augmented reality with multi-channel video fusion described in the second aspect.
In this solution, the multi-channel image acquisition system captures images of the site and generates video streams; the multi-channel image real-time return control system and the data synchronization system return the video streams and synchronize them in time, so that returned video streams of the same batch fall into the same time slice space. The video real-time solution system resolves the returned video to obtain video frames, and the image projection system maps the textures of those video frames onto the three-dimensional scene. The image fusion system then reconstructs the textures in the overlapping regions of the texture mapping according to the weights of the texture contributions of the projectors, which reduces the overlap of projection areas among multiple projectors. Finally, the virtual three-dimensional rendering system renders the fused textures together with the three-dimensional scene, improving the display quality of the rendered three-dimensional scene.
Drawings
Fig. 1 is a schematic structural diagram of a system for implementing a three-dimensional augmented reality for multi-channel video fusion provided in an embodiment of the present application;
fig. 2 is a schematic structural diagram of another system for implementing a three-dimensional augmented reality with multi-channel video fusion provided in an embodiment of the present application;
fig. 3 is a flowchart of a method for implementing a three-dimensional augmented reality with multi-channel video fusion according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of a three-dimensional augmented reality device for implementing multi-channel video fusion according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, specific embodiments of the present application will be described in detail with reference to the accompanying drawings. It is to be understood that the specific embodiments described herein are merely illustrative of the application and are not limiting of the application. It should be further noted that, for the convenience of description, only some but not all of the relevant portions of the present application are shown in the drawings. Before discussing exemplary embodiments in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart may describe the operations (or steps) as a sequential process, many of the operations can be performed in parallel, concurrently or simultaneously. In addition, the order of the operations may be re-arranged. The process may be terminated when its operations are completed, but may have additional steps not included in the figure. The processes may correspond to methods, functions, procedures, subroutines, and the like.
Fig. 1 shows a schematic structural diagram of a system for implementing a three-dimensional augmented reality for multi-channel video fusion according to an embodiment of the present application. Referring to fig. 1, the system for implementing multi-channel video-fused three-dimensional augmented reality includes a three-dimensional scene system 110, a video real-time solution system 140, an image projection system 150, an image fusion system 160, and a virtual three-dimensional rendering system 170. Wherein:
the three-dimensional scene system 110 stores a three-dimensional scene of a scene, and uses the three-dimensional scene as a base map for digital fusion. The source of the three-dimensional scene can be obtained by adding from an external server or by performing local three-dimensional modeling, the three-dimensional scene is stored locally after being obtained, and the three-dimensional scene is used as a base map of digital fusion as a starting point of basic analysis.
The video real-time solution system 140 is used for performing real-time solution on the received video stream to obtain video frames.
The image projection system 150 is configured to determine a mapping relationship between pixels in the video frame and three-dimensional points in the three-dimensional scene, and perform texture mapping on the video frame in the three-dimensional scene according to the mapping relationship, so as to complete image projection of the video frame.
The image fusion system 160 is configured to determine texture values corresponding to three-dimensional points in an overlapping region of texture mapping, and reconstruct textures in the overlapping region of texture mapping according to the texture values to complete fusion of texture mapping, where the texture values are determined according to weights of texture contributions of projectors corresponding to the overlapping region of texture mapping. Specifically, the image fusion system 160 cuts the video frame according to the segmentation lines obtained by the GraphCut method. When video frames of different paths are captured, the image fusion system 160 back-projects the dividing lines into the video frames, and captures the actual use area of each video frame to obtain the corresponding video frame portion. A projector is understood to be a representation of a video capture device (e.g., a camera, etc.) in a virtual scene.
The virtual three-dimensional rendering system 170 renders the fused texture and three-dimensional scene. Illustratively, the virtual three-dimensional rendering system 170 acquires a three-dimensional scene from the three-dimensional scene system 110, uses the three-dimensional scene as a base map, fuses the fused textures in the three-dimensional scene frame by frame according to the mapping result, and renders the fused three-dimensional scene for visual display.
The video real-time solution system 140 resolves the received video streams in real time to obtain video frames, and the image projection system 150 determines the mapping relationship of the pixels of the video frames in the three-dimensional scene. The image fusion system 160 cuts the video frames in the texture overlapping area along the dividing lines, keeping the actually used area of each video frame, and then splices and fuses the cut video frames. Finally, the virtual three-dimensional rendering system 170 renders the result, completing a clear and intuitive visual display.
Fig. 2 is a schematic structural diagram of another system for implementing three-dimensional augmented reality with multi-channel video fusion according to an embodiment of the present application. Referring to fig. 2, the system for realizing the multi-channel video-fused three-dimensional augmented reality includes a three-dimensional scene system 110, a data synchronization system 120, a video real-time calculation system 140, an image projection system 150, an image fusion system 160, and a virtual three-dimensional rendering system 170, wherein the data synchronization system 120 is connected to a multi-channel image real-time feedback control system 130, and the multi-channel image real-time feedback control system 130 is connected to a multi-channel image acquisition system 180.
The multi-path images acquired by the multi-path image acquisition system 180 are transmitted back to the data synchronization system 120 in real time through the multi-path image real-time transmission control system 130 for synchronization, the synchronized video stream is resolved in real time by the video real-time resolving system 140, and the resolved result is mapped, merged and visually displayed in the virtual three-dimensional rendering system 170 with the three-dimensional scene through the image projection system 150 and the image merging system 160.
Specifically, the three-dimensional scene system 110 stores a three-dimensional scene of a scene, and uses the three-dimensional scene as a digital fusion base map. The source of the three-dimensional scene can be obtained by adding from an external server or by performing local three-dimensional modeling, the three-dimensional scene is stored locally after being obtained, and the three-dimensional scene is used as a base map of digital fusion as a starting point of basic analysis.
Further, the three-dimensional scene system 110 divides the three-dimensional data of the three-dimensional scene into blocks. When the on-site scene is updated, the three-dimensional scene system 110 receives a three-dimensional update data packet for the affected blocks; the packet indicates which blocks' three-dimensional data are to be updated, and the three-dimensional scene system 110 replaces the three-dimensional data of the corresponding blocks with the data in the packet, so as to keep the three-dimensional scene up to date.
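A minimal sketch of this block-wise update, assuming a simple in-memory store keyed by block identifier (the packet layout and block indexing scheme are illustrative assumptions, not taken from this disclosure):

```python
# Illustrative sketch of block-wise three-dimensional scene updates.
# The packet layout (block_id -> block data) is an assumption for illustration.

class SceneStore:
    def __init__(self, blocks):
        # blocks: dict mapping block_id -> three-dimensional data of that block
        self.blocks = dict(blocks)

    def apply_update(self, update_packet):
        """Replace only the blocks named in the update packet."""
        for block_id, new_data in update_packet.items():
            if block_id not in self.blocks:
                raise KeyError(f"unknown scene block: {block_id}")
            self.blocks[block_id] = new_data

# Usage: only block (2, 3) is refreshed when the on-site scene changes there.
store = SceneStore({(2, 3): "old mesh", (2, 4): "unchanged mesh"})
store.apply_update({(2, 3): "re-modelled mesh"})
```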
Specifically, the multi-channel image capturing system 180 includes a multi-channel video capturing device for capturing images at a plurality of locations on a site and generating a video stream.
In this embodiment, the multi-channel video capture system should support a maximum of no fewer than 100 video capture devices (such as cameras). Each video capture device should have at least 2 megapixels and a resolution of 1920×1080, and the following functions can be selected according to actual needs: built-in ICR dual-filter day/night switching, fog penetration, electronic image stabilization, multiple white-balance modes, automatic iris, H.264 encoding support, and the like.
Each video acquisition device monitors different areas of a site, and the monitoring range of the multi-channel video acquisition device covers the site range corresponding to the three-dimensional scene, namely the site concerned range is monitored.
Further, the multi-channel image real-time feedback control system 130 is configured to feedback the video stream generated by the multi-channel image capturing system 180.
In this embodiment, the effective transmission distance of the multi-channel image real-time feedback control system 130 should be no less than 3 km, the video bitrate no less than 8 Mbps, and the latency no more than 80 ms, so as to keep the display timely.
Illustratively, an access switch is arranged at the side of the multi-path image acquisition system 180, and is used for collecting the video stream generated by the multi-path image acquisition system 180 and converging the collected video stream into a convergence switch or a middle station, the convergence switch or the middle station preprocesses the video stream and then sends the preprocessed video stream to the multi-path image real-time backhaul control system 130, and the multi-path image real-time backhaul control system 130 transmits the video stream to the data synchronization system 120 for synchronization processing.
Optionally, the connection between the aggregation switch or the middle station and the access switches on both sides may be a wired and/or wireless communication connection. For a wired connection, interfaces such as RS232, RS485, RJ45 or a bus can be used; for a wireless connection, short-range communication modules such as WiFi, ZigBee or Bluetooth can be used when the devices are close to each other, while long-range wireless links such as a wireless bridge, a 4G module or a 5G module can be used when they are far apart.
The data synchronization system 120 receives the video streams returned by the multi-channel video real-time return control system 130 and performs data synchronization on the returned video streams. The synchronized video streams are sent to the video real-time solution system 140 for resolution. The data synchronization is specifically time synchronization, so that the returned video streams of the same batch are located in the same time slice space. In this embodiment, the data synchronization system 120 should support data synchronization for video streams returned by no fewer than 100 video capture devices. A time slice space can be understood as an abstraction of real time into intervals of fixed size.
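The time slice space can be pictured with a short sketch: frames from different streams are bucketed by a fixed slice length, so that frames returned in the same batch fall into the same slice. The 40 ms slice length and the frame tuple layout are assumptions made only for illustration:

```python
from collections import defaultdict

SLICE_MS = 40  # assumed fixed size of a time slice, in milliseconds

def group_into_time_slices(frames):
    """frames: iterable of (camera_id, timestamp_ms, frame_data).
    Returns a dict mapping slice_index -> list of frames falling in that slice."""
    slices = defaultdict(list)
    for cam_id, ts_ms, data in frames:
        slices[ts_ms // SLICE_MS].append((cam_id, ts_ms, data))
    return slices

# Frames whose timestamps land in the same 40 ms window are processed as one batch.
batches = group_into_time_slices([(1, 1001, b"..."), (2, 1012, b"..."), (1, 1041, b"...")])
```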
Specifically, the video real-time calculation system 140 is configured to perform real-time calculation on the video stream to obtain video frames.
Further, the video real-time solution system 140 includes a video frame extraction module 141 and a hardware decoder 142, wherein:
the video frame extracting module 141 extracts frame data from the video stream using the FFMPEG library. The FFMPEG library is a set of open source computer programs that can be used to record, convert digital audio, video, and convert them into streams, and can fulfill the requirement of extracting frame data in this embodiment.
And a hardware decoder 142 for resolving the frame data to obtain a video frame. In this embodiment, the hardware decoder 142 is an independent video decoding module built in the NVIDIA graphics card, and supports h.264 and h.265 decoding, with a maximum resolution of 8K.
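As a rough illustration of this demux-and-decode step (not the implementation of this embodiment), the ffmpeg command-line tool can pull a stream apart and hand back raw decoded frames; the '-hwaccel cuda' flag and the 1920×1080 frame size are assumptions of this sketch:

```python
import subprocess
import numpy as np

W, H = 1920, 1080  # frame size taken from the camera specification above

def frames_from_stream(url):
    """Yield decoded RGB frames from an RTSP or file source via the ffmpeg CLI.
    '-hwaccel cuda' asks ffmpeg to use a GPU decoder where available; drop it to
    fall back to software decoding."""
    cmd = ["ffmpeg", "-hwaccel", "cuda", "-i", url,
           "-f", "rawvideo", "-pix_fmt", "rgb24", "pipe:1"]
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)
    frame_bytes = W * H * 3
    while True:
        buf = proc.stdout.read(frame_bytes)
        if len(buf) < frame_bytes:   # end of stream
            break
        yield np.frombuffer(buf, np.uint8).reshape(H, W, 3)
```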
Specifically, the image projection system 150 is configured to determine a mapping relationship between a pixel in a video frame and a three-dimensional point in a three-dimensional scene, and perform texture mapping on the video frame in the three-dimensional scene according to the mapping relationship, so as to complete image projection of the video frame.
Illustratively, the pose information of the camera needs to be solved in order to determine the mapping relationship between pixels in the video frame and three-dimensional points in the three-dimensional virtual environment. Registering video frames with the virtual scene is essentially a camera calibration problem, requiring knowledge of the camera's intrinsic and extrinsic parameters. The position of the camera is known when it shoots the video, the placement of the two-dimensional picture within the three-dimensional scene can be derived from the camera's spatial position and pose, and the corresponding three-dimensional points can then be obtained from that correspondence. Given a point (xᵢ, yᵢ) on a video frame and the corresponding three-dimensional point (Xᵢ, Yᵢ, Zᵢ), there is a 3 × 4 matrix M such that:
w · (xᵢ, yᵢ, 1)ᵀ = M · (Xᵢ, Yᵢ, Zᵢ, 1)ᵀ
where w is a scaling factor, and M can be decomposed into a 3 × 3 camera intrinsic matrix K, a 3 × 3 rotation matrix R and a translation vector T. The depth value Zᵢ of the three-dimensional point is obtained through a reconstruction process: triangles are formed using the camera's different positions and orientations, and the depth information is back-computed and used as the value of Zᵢ. The scaling factor w is generally given empirically. The translation vector T is the true absolute position of the camera, i.e. the coordinates of the camera in the scene coordinate system; f is the focal length of the camera; s is the skew of the camera, usually 0; x₀ and y₀ give the principal point of the video frame, generally its center point. The 3 × 4 matrix M encodes the pose and position transformation of the physical camera itself, and these camera parameters can be obtained through the SDK provided by the camera manufacturer. In theory, the matrix M can be computed from 6 pairs of corresponding points, but because of matching errors, more correspondences are needed to calibrate the camera.
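A small numeric sketch of the relation w · (xᵢ, yᵢ, 1)ᵀ = M · (Xᵢ, Yᵢ, Zᵢ, 1)ᵀ with M decomposed as K[R | T]; the numeric values below are purely illustrative:

```python
import numpy as np

def make_projection_matrix(f, s, x0, y0, R, T):
    """Build the 3x4 matrix M = K [R | T] from the parameters named above."""
    K = np.array([[f, s, x0],
                  [0, f, y0],
                  [0, 0, 1.0]])
    return K @ np.hstack([R, T.reshape(3, 1)])

def project(M, X):
    """Map a 3-D scene point X to pixel coordinates (x, y)."""
    p = M @ np.append(X, 1.0)   # this is w * (x, y, 1)
    return p[:2] / p[2]         # divide out the scaling factor w

# Example with illustrative values only.
M = make_projection_matrix(f=1200.0, s=0.0, x0=960.0, y0=540.0,
                           R=np.eye(3), T=np.array([0.0, 0.0, 10.0]))
x, y = project(M, np.array([1.0, 2.0, 0.0]))
```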
The correspondence between a three-dimensional point in the three-dimensional scene and a pixel in the video frame is obtained by calibrating the intrinsic and extrinsic parameters of the camera. The mapping relationship is essentially determined by a 4 × 4 matrix N, which represents the transformation of the virtual camera in graphics. In this system, projective texture mapping is implemented on top of OpenGL, and N can be decomposed into a 4 × 4 view matrix V and a 4 × 4 projection matrix P. Given a three-dimensional point in space, its texture coordinates are calculated as follows:
(s, t, q, u)ᵀ = N · (X, Y, Z, 1)ᵀ = P · V · (X, Y, Z, 1)ᵀ
where (s, t) represents the texture coordinates, u is used to determine whether the three-dimensional point is in front of the camera (> 0) or behind it (< 0), and q represents the depth value of the three-dimensional point, which ranges within (-1, 1) and needs to be normalized to (0, 1). Since depth values are taken into account, the camera intrinsic matrix K needs to be extended to a 4 × 4 matrix, and P and V are calculated as follows:
(The definitions of P and V appear as an equation image in the original publication: P is built from the intrinsic matrix K extended to 4 × 4 together with the clipping-plane distances and the frame size, and V from the rotation R and the translation T.)
where F is the distance from the camera to the far clipping plane, N is the distance from the camera to the near clipping plane, and W and H are the width and height of the video frame.
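Since the construction of P and V is given only as an equation image in the original publication, the sketch below shows one conventional OpenGL-style construction from the quantities named above; the exact sign conventions and the remapping from (-1, 1) to (0, 1) are assumptions, not the published formula:

```python
import numpy as np

def view_matrix(R, T):
    """4x4 view matrix V assembled from the rotation R and translation T."""
    V = np.eye(4)
    V[:3, :3] = R
    V[:3, 3] = T
    return V

def projection_matrix(f, x0, y0, W, H, N, F):
    """One conventional OpenGL-style projection matrix built from the intrinsics
    (focal length f, principal point x0, y0), the frame size W x H, and the
    near/far clipping distances N and F. Sign conventions are an assumption."""
    return np.array([
        [2 * f / W, 0.0,       1 - 2 * x0 / W,      0.0],
        [0.0,       2 * f / H, 1 - 2 * y0 / H,      0.0],
        [0.0,       0.0,       -(F + N) / (F - N), -2 * F * N / (F - N)],
        [0.0,       0.0,       -1.0,                0.0]])

def texture_coords(P, V, X):
    """Compute the texture coordinates of scene point X as described above:
    u > 0 means the point is in front of the camera; q is the depth value."""
    s, t, q, u = P @ V @ np.append(X, 1.0)
    s, t, q = s / u, t / u, q / u                        # perspective division
    return 0.5 * (s + 1), 0.5 * (t + 1), 0.5 * (q + 1), u  # remap (-1, 1) to (0, 1)
```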
Specifically, the image fusion system 160 is configured to determine texture values corresponding to three-dimensional points in an overlap region of texture mapping, and reconstruct textures in the overlap region of texture mapping according to the texture values to complete fusion of texture mapping, where the texture values are determined according to weights of texture contributions of projectors corresponding to the overlap region of texture mapping.
Further, the weight of the texture contribution of the projector corresponding to the texture mapping overlapping region is determined by the formula r = p/(α × d), where r is the weight of the projector's texture contribution, p is the pixel resolution of the projector's image, α is the included angle between two straight lines, and d is the distance from the projector's position to the corresponding three-dimensional point. The included angle between the two straight lines is the angle formed by the two opposite slanted edges of the conical region cast by the projector, i.e. the opening angle of the projection cone.
After the weights are obtained, the texture value is calculated by the texture value determination formula: the texture value corresponding to each three-dimensional point in the texture mapping overlapping region is T = (Σ Iᵢ × rᵢ) / Σ rᵢ, where Iᵢ is the original texture color value from the corresponding projector and rᵢ is the weight of that projector's texture contribution.
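A direct reading of the two formulas with illustrative numbers (treating α in radians and the texture values as RGB triples is an assumption of this sketch):

```python
import numpy as np

def contribution_weight(p, alpha, d):
    """r = p / (alpha * d): pixel resolution p, projection cone angle alpha (radians),
    distance d from the projector to the three-dimensional point."""
    return p / (alpha * d)

def blended_texture(colors, weights):
    """T = (sum_i I_i * r_i) / (sum_i r_i) for one three-dimensional point
    covered by several projectors."""
    colors = np.asarray(colors, dtype=float)     # shape (n_projectors, 3), RGB
    weights = np.asarray(weights, dtype=float)   # shape (n_projectors,)
    return (weights[:, None] * colors).sum(axis=0) / weights.sum()

# Two overlapping projectors seen at one point (all values are illustrative):
r1 = contribution_weight(p=1920 * 1080, alpha=np.radians(60), d=12.0)
r2 = contribution_weight(p=1920 * 1080, alpha=np.radians(60), d=30.0)
T = blended_texture([[200, 180, 160], [190, 185, 170]], [r1, r2])
```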
It should be noted that the image fusion system 160 is further configured to determine a partition line, where the partition line is used to intercept video frames of different paths and fuse the intercepted video frames, and the texture of the three-dimensional point around the partition line is obtained by weighting the weight of the texture contribution of the projector corresponding to the intercepted video frame.
For example, the dividing line is determined by selecting the image closest to the virtual viewpoint as the main projection source and projecting the main projection source first, followed by the other projection sources. The main projection source is determined mainly by the difference between the projector's position and view angle and the virtual viewpoint's position and direction; if the difference is within a threshold, that image is taken as the main projection source. If no main projection source exists, in other words if several videos have similar contribution rates, the video frames corresponding to the overlapping area are transformed to the same viewpoint and the dividing line for fusing them is obtained with the GraphCut method.
Optionally, the image fusion system 160 cuts the video frame according to the segmentation line obtained by the GraphCut method. When video frames of different paths are captured, the image fusion system 160 back-projects the dividing lines into the video frames, and captures the actual use area of each video frame to obtain the corresponding video frame portion. In the subsequent fusion process, the use area of the video frame is obtained according to the obtained dividing line for fusion, so that the calculation amount in the fusion process is reduced, and the real-time performance is improved.
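A small sketch of this cropping step, assuming the dividing line has already been back-projected into the frame as a boolean mask (how that mask is produced from the GraphCut seam is not shown here):

```python
import numpy as np

def crop_to_used_region(frame, use_mask):
    """Keep only the part of the frame on this camera's side of the dividing line.
    'use_mask' is the seam region back-projected into this frame (True = used pixel)."""
    ys, xs = np.nonzero(use_mask)
    y0, y1, x0, x1 = ys.min(), ys.max() + 1, xs.min(), xs.max() + 1
    cropped = frame[y0:y1, x0:x1].copy()
    cropped[~use_mask[y0:y1, x0:x1]] = 0   # blank out pixels beyond the seam
    return cropped, (y0, x0)               # offset for placing it back at fusion time

# Later fusion only touches the cropped region, which is what keeps per-frame cost down.
```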
Specifically, the virtual three-dimensional rendering system 170 renders the fused texture and the three-dimensional scene. Illustratively, the virtual three-dimensional rendering system 170 acquires a three-dimensional scene from the three-dimensional scene system 110, uses the three-dimensional scene as a base map, fuses the fused textures in the three-dimensional scene frame by frame according to the mapping result, and renders the fused three-dimensional scene for visual display.
The multi-channel images acquired by the multi-channel image acquisition system 180 are returned in real time by the multi-channel image real-time feedback control system 130 and time-synchronized by the data synchronization system 120. The video real-time solution system 140 resolves the synchronized video streams in real time to obtain video frames, and the image projection system 150 determines the mapping relationship of the pixels of the video frames in the three-dimensional scene. The image fusion system 160 cuts the video frames in the texture overlapping area along the dividing lines, keeping the actually used area of each video frame, and then splices and fuses the cut video frames. Finally, the virtual three-dimensional rendering system 170 renders the result, completing a clear and intuitive visual display.
Fig. 3 is a flowchart of a method for implementing a multi-channel video fused three-dimensional augmented reality according to an embodiment of the present application, where the method for implementing a multi-channel video fused three-dimensional augmented reality according to this embodiment may be implemented by a system for implementing a multi-channel video fused three-dimensional augmented reality, and the system for implementing a multi-channel video fused three-dimensional augmented reality may be implemented in a hardware and/or software manner and integrated in a computer or other device. Referring to fig. 3, the method for implementing a multi-channel video fused three-dimensional augmented reality includes:
s101: the three-dimensional scene system stores a field three-dimensional scene.
Specifically, the source of the three-dimensional scene may be obtained by adding from an external server, or may be obtained by performing local three-dimensional modeling, and the three-dimensional scene is stored locally after being obtained, and is used as a base map of digital fusion as a starting point of basic analysis.
Furthermore, the three-dimensional scene system divides the three-dimensional data of the three-dimensional scene into blocks. When the on-site scene is updated, the three-dimensional scene system receives a three-dimensional update data packet for the affected blocks; the packet indicates which blocks' three-dimensional data are to be updated, and the three-dimensional scene system replaces the three-dimensional data of the corresponding blocks with the data in the packet, thereby keeping the three-dimensional scene up to date.
S102: and the video real-time calculating system carries out real-time calculation on the received video stream to obtain a video frame.
Illustratively, the video stream is generated by acquiring images of a plurality of positions on the scene by a multi-path image acquisition system, and the video stream generated by the multi-path image acquisition system is transmitted back by a multi-path image real-time transmission control system and is subjected to data synchronization by a data synchronization system.
S103: the image projection system determines the mapping relation between the pixels in the video frame and the three-dimensional points in the three-dimensional scene, and performs texture mapping on the video frame in the three-dimensional scene according to the mapping relation so as to complete the image projection of the video frame.
S104: and the image fusion system determines texture values corresponding to three-dimensional points in the texture mapping overlapping area, reconstructs textures in the texture mapping overlapping area according to the texture values to complete fusion of the texture mapping, wherein the texture values are determined according to weights of texture contributions of the projectors corresponding to the texture mapping overlapping area.
S105: and rendering the fused texture and the three-dimensional scene by the virtual three-dimensional rendering system.
Illustratively, the virtual three-dimensional rendering system acquires a three-dimensional scene from the three-dimensional scene system, takes the three-dimensional scene as a base map, fuses the fused textures in the three-dimensional scene frame by frame according to a mapping result, and renders the fused three-dimensional scene for visual display.
The video real-time solution system resolves the received video streams in real time to obtain video frames, and the image projection system determines the mapping relationship of the pixels of the video frames in the three-dimensional scene. The image fusion system cuts the video frames in the texture overlapping area along the dividing lines, keeping the actually used area of each video frame, and then splices and fuses the cut video frames. Finally, the virtual three-dimensional rendering system renders the result, completing a clear and intuitive visual display.
On the basis of the foregoing embodiment, fig. 4 is a schematic structural diagram of a three-dimensional augmented reality device for implementing multi-channel video fusion according to an embodiment of the present application. Referring to fig. 4, the three-dimensional augmented reality device for implementing multi-channel video fusion provided in this embodiment may be a computer, and includes: a display screen 24, memory 22, and one or more processors 21; the display screen 24 is used for displaying a three-dimensional scene fused with multiple paths of videos; the memory 22 for storing one or more programs; when executed by the one or more processors 21, the one or more programs enable the one or more processors 21 to implement the method for implementing multi-channel video fusion three-dimensional augmented reality as provided in the embodiments of the present application.
The memory 22 is a computer readable storage medium for storing software programs, computer executable programs, and modules for implementing a method for multi-channel video fusion three-dimensional augmented reality according to any embodiment of the present application. The memory 22 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to use of a three-dimensional augmented reality device that implements multi-channel video fusion, and the like. Further, the memory 22 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, the memory 22 may further include a memory remotely located from the processor 21, and these remote memories may be connected to a three-dimensional augmented reality device that implements multi-way video fusion through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
Further, the three-dimensional augmented reality device for realizing multi-channel video fusion further comprises a communication module 23, and the communication module 23 is used for establishing wired and/or wireless connection with other devices and performing data transmission.
The processor 21 executes various functional applications and data processing of the three-dimensional augmented reality device for implementing multi-channel video fusion by running software programs, instructions and modules stored in the memory 22, that is, implements the above-described method for implementing multi-channel video fusion.
The three-dimensional augmented reality device for realizing multi-channel video fusion can be used for executing the method for realizing the three-dimensional augmented reality for realizing the multi-channel video fusion provided by the embodiment, and has corresponding functions and beneficial effects.
The present application further provides a storage medium containing computer-executable instructions, where the computer-executable instructions, when executed by a computer processor, are configured to perform a method for implementing a multi-channel video fused three-dimensional augmented reality as provided in an embodiment of the present application, where the method for implementing a multi-channel video fused three-dimensional augmented reality includes: the three-dimensional scene system stores a field three-dimensional scene; the video real-time resolving system performs real-time resolving on the received video stream to obtain a video frame; the image projection system determines the mapping relation between the pixels in the video frame and the three-dimensional points in the three-dimensional scene, and performs texture mapping on the video frame in the three-dimensional scene according to the mapping relation so as to complete the image projection of the video frame; the image fusion system determines texture values corresponding to three-dimensional points in an overlapped area of texture mapping, reconstructs texture of the overlapped area of the texture mapping according to the texture values to complete fusion of the texture mapping, wherein the texture values are determined according to weights of texture contributions of projectors corresponding to the overlapped area of the texture mapping; and rendering the fused texture and the three-dimensional scene by the virtual three-dimensional rendering system.
Storage medium: any of various types of memory devices or storage devices. The term "storage medium" is intended to include: installation media such as CD-ROM, floppy disk, or tape devices; computer system memory or random access memory such as DRAM, DDR RAM, SRAM, EDO RAM, Rambus RAM, etc.; non-volatile memory such as flash memory or magnetic media (e.g., hard disk or optical storage); registers or other similar types of memory elements, etc. The storage medium may also include other types of memory or combinations thereof. In addition, the storage medium may be located in a first computer system in which the program is executed, or may be located in a different second computer system connected to the first computer system through a network (such as the internet). The second computer system may provide program instructions to the first computer for execution. The term "storage medium" may include two or more storage media that may reside in different locations, such as in different computer systems that are connected by a network. The storage medium may store program instructions (e.g., embodied as a computer program) that are executable by one or more processors 21.
Of course, the storage medium containing the computer-executable instructions provided in the embodiments of the present application is not limited to the method for implementing a multi-channel video fused three-dimensional augmented reality as described above, and may also perform related operations in the method for implementing a multi-channel video fused three-dimensional augmented reality as provided in any embodiments of the present application.
The system for realizing three-dimensional augmented reality with multi-channel video fusion and the device for realizing three-dimensional augmented reality with multi-channel video fusion provided in the above embodiments can execute the method for realizing three-dimensional augmented reality with multi-channel video fusion provided in any embodiment of the present application. For technical details not described in detail in the above embodiments, reference may be made to the method provided in any embodiment of the present application.
The foregoing is considered as illustrative of the preferred embodiments of the invention and the technical principles employed. The present application is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present application has been described in more detail with reference to the above embodiments, the present application is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present application, and the scope of the present application is determined by the scope of the claims.

Claims (10)

1. The system for realizing the three-dimensional augmented reality of multi-channel video fusion is characterized by comprising a three-dimensional scene system, a video real-time calculating system, an image projection system, an image fusion system and a virtual three-dimensional rendering system, wherein:
a three-dimensional scene system storing a three-dimensional scene of the site;
the video real-time calculating system is used for calculating the received video stream in real time to obtain a video frame;
the image projection system is used for determining the mapping relation between the pixels in the video frame and the three-dimensional points in the three-dimensional scene and performing texture mapping on the video frame in the three-dimensional scene according to the mapping relation so as to complete the image projection of the video frame;
the image fusion system is used for determining texture values corresponding to three-dimensional points in an overlapped area of texture mapping and reconstructing textures in the overlapped area of the texture mapping according to the texture values to complete fusion of the texture mapping, wherein the texture values are determined according to weights of texture contributions of projectors corresponding to the overlapped area of the texture mapping;
and the virtual three-dimensional rendering system renders the fused texture and the three-dimensional scene.
2. The system for realizing three-dimensional augmented reality of multi-channel video fusion according to claim 1, further comprising a data synchronization system, wherein the video stream is generated by a multi-channel image acquisition system by acquiring images of multiple positions on site, and the video stream generated by the multi-channel image acquisition system is returned by a multi-channel image real-time return control system; the data synchronization system performs data synchronization on the returned video streams, wherein the data synchronization is specifically time synchronization, so that the returned video streams in the same batch are located in the same time slice space.
3. The system for realizing multi-channel video fused three-dimensional augmented reality according to claim 1, wherein the video real-time solution system comprises a video frame extraction module and a hardware decoder, wherein:
the video frame extraction module extracts frame data from the video stream by using an FFMPEG library;
and the hardware decoder is used for resolving the frame data to obtain the video frame.
4. The system for realizing three-dimensional augmented reality of multi-channel video fusion according to claim 1, wherein the weight determination formula for the texture contribution of the projector corresponding to the texture mapping overlapping region is r = p/(α × d);
wherein r is the weight contributed by the texture of the projector, p is the pixel resolution of the image of the projector, α is the included angle between two straight lines, and d is the distance from the position of the projector to the corresponding three-dimensional point.
5. The system for implementing multi-channel video fused three-dimensional augmented reality as claimed in claim 4, wherein the texture value corresponding to each three-dimensional point in the texture mapping overlapping region is determined by the formula T = (Σ Iᵢ × rᵢ) / Σ rᵢ, where Iᵢ is the original texture color value from the corresponding projector and rᵢ is the weight of that projector's texture contribution.
6. The system of claim 5, wherein the image fusion system is further configured to determine a partition line, the partition line is configured to intercept video frames of different paths and fuse the intercepted video frames, and the texture of the three-dimensional point around the partition line is obtained by weighting the weight of the texture contribution of the projector corresponding to the intercepted video frame.
7. The system for realizing three-dimensional augmented reality of multi-channel video fusion according to claim 6, wherein the dividing line is determined by:
and converting the video frames corresponding to the overlapped area to the same viewpoint, and obtaining a dividing line fusing the video frames corresponding to the overlapped area by using a GraphCut method.
8. The system of claim 7, wherein when the image fusion system intercepts the video frames of different paths, it back-projects the dividing line into the video frames, and intercepts the actual used area of each video frame to obtain the corresponding video frame portion.
9. The method for realizing the three-dimensional augmented reality of the multi-channel video fusion is characterized by comprising the following steps of:
the three-dimensional scene system stores a field three-dimensional scene;
the video real-time resolving system performs real-time resolving on the received video stream to obtain a video frame;
the image projection system determines the mapping relation between the pixels in the video frame and the three-dimensional points in the three-dimensional scene, and performs texture mapping on the video frame in the three-dimensional scene according to the mapping relation so as to complete the image projection of the video frame;
the image fusion system determines texture values corresponding to three-dimensional points in an overlapped area of texture mapping, reconstructs texture of the overlapped area of the texture mapping according to the texture values to complete fusion of the texture mapping, wherein the texture values are determined according to weights of texture contributions of projectors corresponding to the overlapped area of the texture mapping;
and rendering the fused texture and the three-dimensional scene by the virtual three-dimensional rendering system.
10. A three-dimensional augmented reality device for realizing multi-channel video fusion is characterized by comprising: a display screen, a memory, and one or more processors;
the display screen is used for displaying a three-dimensional scene fusing multiple paths of videos;
the memory for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method for realizing three-dimensional augmented reality with multi-channel video fusion of claim 9.
CN201911145076.2A 2019-08-21 2019-11-21 System, method and equipment for realizing three-dimensional augmented reality of multi-channel video fusion Active CN110675506B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910775121.6A CN110517356A (en) 2019-08-21 2019-08-21 Realize system, the method and apparatus of the three-dimensional enhanced reality of multi-channel video fusion
CN2019107751216 2019-08-21

Publications (2)

Publication Number Publication Date
CN110675506A CN110675506A (en) 2020-01-10
CN110675506B true CN110675506B (en) 2021-07-09

Family

ID=68626098

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201910775121.6A Pending CN110517356A (en) 2019-08-21 2019-08-21 Realize system, the method and apparatus of the three-dimensional enhanced reality of multi-channel video fusion
CN201911145076.2A Active CN110675506B (en) 2019-08-21 2019-11-21 System, method and equipment for realizing three-dimensional augmented reality of multi-channel video fusion

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN201910775121.6A Pending CN110517356A (en) 2019-08-21 2019-08-21 Realize system, the method and apparatus of the three-dimensional enhanced reality of multi-channel video fusion

Country Status (2)

Country Link
CN (2) CN110517356A (en)
WO (1) WO2021031455A1 (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110517356A (en) * 2019-08-21 2019-11-29 佳都新太科技股份有限公司 Realize system, the method and apparatus of the three-dimensional enhanced reality of multi-channel video fusion
CN111415416B (en) * 2020-03-31 2023-12-15 武汉大学 Method and system for fusing monitoring real-time video and scene three-dimensional model
CN112380894B (en) * 2020-09-30 2024-01-19 北京智汇云舟科技有限公司 Video overlapping region target deduplication method and system based on three-dimensional geographic information system
CN112312230B (en) * 2020-11-18 2023-01-31 秒影工场(北京)科技有限公司 Method for automatically generating 3D special effect for film
CN112489225A (en) * 2020-11-26 2021-03-12 北京邮电大学 Method and device for fusing video and three-dimensional scene, electronic equipment and storage medium
CN112714304B (en) * 2020-12-25 2022-03-18 新华邦(山东)智能工程有限公司 Large-screen display method and device based on augmented reality
CN112949292B (en) * 2021-01-21 2024-04-05 中国人民解放军61540部队 Method, device, equipment and storage medium for processing return data of cluster unmanned aerial vehicle
CN114036347B (en) * 2021-11-18 2022-06-03 北京中关村软件园发展有限责任公司 Cloud platform supporting digital fusion service and working method
CN117152400B (en) * 2023-10-30 2024-03-19 武汉苍穹融新科技有限公司 Method and system for fusing multiple paths of continuous videos and three-dimensional twin scenes on traffic road
CN117560578B (en) * 2024-01-12 2024-04-16 北京睿呈时代信息科技有限公司 Multi-channel video fusion method and system based on three-dimensional scene rendering and irrelevant to view points
CN117853678B (en) * 2024-03-08 2024-05-17 陕西天润科技股份有限公司 Method for carrying out three-dimensional materialization transformation on geospatial data based on multi-source remote sensing

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20050082377A (en) * 2004-02-18 2005-08-23 삼성전자주식회사 Method and apparatus for integrated modeling of 3d object considering its physical features
CN101673403A (en) * 2009-10-10 2010-03-17 安防制造(中国)有限公司 Target following method in complex interference scene
CN104599243A (en) * 2014-12-11 2015-05-06 北京航空航天大学 Virtual and actual reality integration method of multiple video streams and three-dimensional scene
CN107067458A (en) * 2017-01-15 2017-08-18 曲阜师范大学 It is a kind of to strengthen the method for visualizing of texture advection

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7978928B2 (en) * 2007-09-18 2011-07-12 Seiko Epson Corporation View projection for dynamic configurations
CN102547350B (en) * 2012-02-02 2014-04-16 北京大学 Method for synthesizing virtual viewpoints based on gradient optical flow algorithm and three-dimensional display device
CN107918948B (en) * 2017-11-02 2021-04-16 深圳市自由视像科技有限公司 4D video rendering method
CN110060351B (en) * 2019-04-01 2023-04-07 叠境数字科技(上海)有限公司 RGBD camera-based dynamic three-dimensional character reconstruction and live broadcast method
CN110517356A (en) * 2019-08-21 2019-11-29 佳都新太科技股份有限公司 Realize system, the method and apparatus of the three-dimensional enhanced reality of multi-channel video fusion

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20050082377A (en) * 2004-02-18 2005-08-23 삼성전자주식회사 Method and apparatus for integrated modeling of 3d object considering its physical features
CN101673403A (en) * 2009-10-10 2010-03-17 安防制造(中国)有限公司 Target following method in complex interference scene
CN104599243A (en) * 2014-12-11 2015-05-06 北京航空航天大学 Virtual and actual reality integration method of multiple video streams and three-dimensional scene
CN107067458A (en) * 2017-01-15 2017-08-18 曲阜师范大学 It is a kind of to strengthen the method for visualizing of texture advection

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Novel view synthesis with light-weight view-dependent texture mapping for a stereoscopic HMD; Thiwat Rongsirigul et al.; Proceedings of the IEEE International Conference on Multimedia and Expo; 2017-08-31; pp. 704-708 *
Parallel optimized modeling method for the voxel-based visual hull; Zhang Shujun et al.; Journal of Image and Graphics; 2011-04-16; Vol. 16, No. 4; pp. 686-692 *

Also Published As

Publication number Publication date
CN110517356A (en) 2019-11-29
CN110675506A (en) 2020-01-10
WO2021031455A1 (en) 2021-02-25

Similar Documents

Publication Publication Date Title
CN110675506B (en) System, method and equipment for realizing three-dimensional augmented reality of multi-channel video fusion
US11195314B2 (en) Artificially rendering images using viewpoint interpolation and extrapolation
US10600233B2 (en) Parameterizing 3D scenes for volumetric viewing
US11019259B2 (en) Real-time generation method for 360-degree VR panoramic graphic image and video
US11636637B2 (en) Artificially rendering images using viewpoint interpolation and extrapolation
US10733475B2 (en) Artificially rendering images using interpolation of tracked control points
CN112738010B (en) Data interaction method and system, interaction terminal and readable storage medium
US10726593B2 (en) Artificially rendering images using viewpoint interpolation and extrapolation
CN112738534B (en) Data processing method and system, server and storage medium
US20130321593A1 (en) View frustum culling for free viewpoint video (fvv)
JP2000215311A (en) Method and device for generating virtual viewpoint image
JPH08331607A (en) Three-dimensional display image generating method
CN112738495B (en) Virtual viewpoint image generation method, system, electronic device and storage medium
CN116228862A (en) AR glasses cooperative system combined with large-space vision positioning
CN115712351A (en) Hierarchical rendering and interaction method and system for multi-person remote mixed reality sharing scene
CN112738646B (en) Data processing method, device, system, readable storage medium and server
CN112738009B (en) Data synchronization method, device, synchronization system, medium and server
CN112734821B (en) Depth map generation method, computing node cluster and storage medium
WO2022191070A1 (en) 3d object streaming method, device, and program
CN117376540A (en) Virtual visual angle synthesis method and device based on depth map
CN116208725A (en) Video processing method, electronic device and storage medium
CN114071115A (en) Free viewpoint video reconstruction and playing processing method, device and storage medium
Chuchvara Real-time video-plus-depth content creation utilizing time-of-flight sensor-from capture to display

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Room 306, zone 2, building 1, Fanshan entrepreneurship center, Panyu energy saving technology park, No. 832 Yingbin Road, Donghuan street, Panyu District, Guangzhou City, Guangdong Province

Applicant after: Jiadu Technology Group Co.,Ltd.

Address before: Room 306, zone 2, building 1, Fanshan entrepreneurship center, Panyu energy saving technology park, No. 832 Yingbin Road, Donghuan street, Panyu District, Guangzhou City, Guangdong Province

Applicant before: PCI-SUNTEKTECH Co.,Ltd.

CB02 Change of applicant information
GR01 Patent grant
GR01 Patent grant