CN111726594A - Implementation method for efficient optimization rendering and pose anti-distortion fusion - Google Patents
- Publication number: CN111726594A
- Application number: CN201910218901.0A
- Authority: CN (China)
- Prior art keywords: rendering, pose, GPU, FOV, panoramic
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N9/00—Details of colour television systems
- H04N9/12—Picture reproducers
- H04N9/31—Projection devices for colour picture display, e.g. using electronic spatial light modulators [ESLM]
- H04N9/3179—Video signal processing therefor
- H04N9/3141—Constructional details thereof
Abstract
The invention provides a method for realizing efficient optimization rendering and pose anti-distortion fusion, which relates to the embedded field and comprises the following steps. S1: acquiring external input data with the CPU, the external input data comprising four types: a panoramic or 3D data source, pose information, the FOV (field of view) and the projection mode. S2: performing hardware decoding with the VPU and passing the decoded output to the GPU. S3: performing map rendering with the GPU, the map rendering comprising projection modeling, chroma space conversion, FOV initialization and pose fusion. S4: performing loop iteration as required for the video or image. The invention makes full use of hardware acceleration, provides a comprehensive real-time video pipeline rendering method that supports multiple projection modes, the FOV and pose information, and solves VR (virtual reality) rendering of high-resolution panoramic videos, panoramic images and 3D video images by exploiting the combined performance of the VPU (video processing unit), CPU (central processing unit) and GPU (graphics processing unit).
Description
Technical Field
The invention relates to the field of embedded systems, and in particular to a method for realizing efficient optimization rendering and pose anti-distortion fusion.
Background
With the rise of VR (virtual reality), how to make the most common handheld devices support the generation and output of VR panoramas has become a popular research topic.
The most common ISV (independent software vendor) solutions are based on CPU instruction acceleration, for example decoding accelerated with hand-written assembly. For panoramic rendering, however, the traditional instruction-based approach is essentially infeasible: too many matrix operations are involved, and basic instruction acceleration falls far short. Hardware-based acceleration does exist, but it is largely limited to video decoding, and for matrix-heavy application scenarios such as panorama rendering and anti-distortion a conventional CPU is very inefficient. With the rise of VR, rendering is increasingly performed on the GPU, but without combining the common VR-related attributes of the GPU (graphics processing unit), CPU (central processing unit) and VPU (video processing unit), it is difficult to optimize overall hardware performance and VR rendering together.
Disclosure of Invention
In view of the above drawbacks of the prior art, an object of the present invention is to provide a method for implementing efficient optimization rendering and pose anti-distortion fusion that supports multiple projection modes and exploits the performance of the VPU, CPU and GPU to optimize overall hardware performance and VR rendering of high-resolution panoramic video.
The invention provides a method for realizing efficient optimization rendering and pose anti-distortion fusion, which comprises the following steps:
s1: acquiring external input data with the CPU, the external input data comprising four types: a panoramic or 3D data source, pose information, the FOV (field of view) and the projection mode;
s2: performing hardware decoding with the VPU and passing the decoded output to the GPU;
s3: performing map rendering with the GPU, the map rendering comprising projection modeling, chroma space conversion, FOV initialization and pose fusion;
s4: performing loop iteration as required for the video or image.
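The four-step loop above can be sketched in simplified form; the sketch below is illustrative only (the callback name `render_frame` and the argument layout are hypothetical, not the patent's API):

```python
def run_pipeline(frames, pose, fov_deg, projection, render_frame):
    """Illustrative sketch of the S1-S4 loop: the CPU gathers the four
    external inputs, VPU-decoded frames stand in for the texture input,
    and a GPU-side render step (here a plain callback) runs per frame."""
    outputs = []
    for frame in frames:                      # S2: decoded frames
        # S3: map rendering (projection, chroma conversion, pose fusion)
        outputs.append(render_frame(frame, pose, fov_deg, projection))
    return outputs                            # S4: one iteration per frame

# usage: a trivial render callback that tags each frame with its projection
result = run_pipeline([0, 1, 2], pose=(0, 0, 0), fov_deg=90,
                      projection="sphere",
                      render_frame=lambda f, p, fov, proj: (f, proj))
```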
Further, the panoramic or 3D data source is a video or an image, and the video and the image are in an equirectangular (equal latitude-longitude) panoramic format or in a 3D format; the pose information is the output data of a device capable of providing three-dimensional pose information; the FOV is the display field angle.
Further, the projection mode includes a plane projection mode, a spherical projection mode, and a cubic projection mode.
Further, anti-distortion is also treated as a special projection mode.
Further, the map rendering comprises the following steps:
s3.1: performing projection modeling based on the projection mode;
s3.2: initializing the GPU according to the external input data;
s3.3: stitching and blending through a customized vertex shader;
s3.4: performing chroma space conversion through a customized fragment shader;
s3.5: performing pose fusion based on the pose information and the FOV;
s3.6: performing real-time rendering with the GPU pipeline, and controlling output display with a ping-pong buffer mechanism.
As described above, the implementation method for efficient optimization rendering and pose anti-distortion fusion has the following beneficial effects. The invention makes full use of the handheld device's VR-related hardware (CPU, VPU and GPU) to render panoramic videos and images. By combining the characteristic requirements of panoramic content (large resolution, different projection modes, pose information and FOV) and effectively using hardware acceleration to meet the rendering demand, it provides effective technical support for a general-purpose embedded system to complete panoramic rendering efficiently. It fully exploits the hardware capability of a general-purpose system, greatly reduces the hardware requirements of panoramic rendering, strongly promotes the practical adoption of panoramic applications, and can decode panoramas at 2K, 4K, 6K and higher future resolutions.
Drawings
FIG. 1 is a general flow chart of the implementation method disclosed in the embodiment of the invention;
FIG. 2 is a diagram showing the relationship among the CPU, GPU and VPU disclosed in the embodiment of the present invention;
FIG. 3 is a flow chart illustrating the stitching and blending step disclosed in the embodiments of the present invention;
FIG. 4 is a flowchart illustrating the chroma space conversion step disclosed in the embodiments of the present invention;
FIG. 5 is a flowchart illustrating the pose fusion step disclosed in the embodiments of the present invention.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict.
It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention, and the components related to the present invention are only shown in the drawings rather than drawn according to the number, shape and size of the components in actual implementation, and the type, quantity and proportion of the components in actual implementation may be changed freely, and the layout of the components may be more complicated.
As shown in fig. 1 and fig. 2, the present invention provides a method for implementing efficient optimization rendering and pose anti-distortion fusion, the method includes the following steps:
s1: acquiring external input data with the CPU, the external input data comprising four types: a panoramic or 3D data source, pose information, the FOV (field of view) and the projection mode;
The panoramic or 3D data source can be a video or an image, in an equirectangular (equal latitude-longitude) panoramic format with a typical aspect ratio of 2:1; for stereo content, the left-right (side-by-side) layout has a 4:1 ratio and the top-bottom layout a 1:1 ratio. The pose information is the output data of a device that provides three-dimensional pose information, such as Theta/Phi/Gamma (the rotation angles about the X, Y and Z axes); this is typically gyroscope output, but is not limited to a gyroscope. The FOV is generally the display field angle; commonly used angles include 90°, 110° and 130°. Common projection modes are the plane, spherical and cubic projection modes;
in addition, anti-distortion is also a special projection mode;
s2: performing hardware decoding by using the VPU, and transmitting decoded output to the GPU;
in traditional video rendering, video decoding is the dominant cost; for panoramic rendering it is no longer dominant, but it remains an important link: the VPU performs hardware decoding to obtain the texture input, which is then passed to the GPU for high-performance computation;
besides ordinary decoded video data, the texture input also includes watermark or logo data;
s3: performing map rendering with the GPU, the map rendering comprising projection modeling, chroma space conversion, FOV initialization and pose fusion;
the map rendering comprises the following steps:
s3.1: performing projection modeling based on the projection mode;
s3.2: initializing a GPU according to external input data;
s3.3: stitching and blending through a customized vertex shader. Taking spherical projection as an example, the customized vertex shader computes the spherical XYZ coordinates, the UV (texture) coordinates, the blending weights and the vertex-order coordinates of the panoramic or 3D data source, and then stitches and blends it. As shown in fig. 3, the number of cells per row and column is derived from the original image of the panoramic or 3D data source through a LUT (look-up table); configuring the row and column counts balances rendering quality against efficiency;
the LUT can be output data from self-calibration, or generated by a third-party tool such as PTGui; it is a look-up table used to unfold specific positions after feature matching;
s3.4: performing chroma space conversion through the customized fragment shader. As shown in fig. 4, a YUV-to-RGB conversion matrix is configured in the customized fragment shader, converting the YUV data of each frame into RGB information suitable for LCD display;
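The per-pixel conversion such a fragment shader applies can be illustrated with the BT.601 full-range coefficients (an assumption for illustration; the patent does not name a specific standard):

```python
def yuv_to_rgb(y, u, v):
    """Full-range BT.601 YUV -> RGB for one pixel, inputs and outputs
    in [0, 255]; chroma is centred at 128."""
    d, e = u - 128.0, v - 128.0
    clamp = lambda x: max(0, min(255, round(x)))
    return (clamp(y + 1.402 * e),
            clamp(y - 0.344136 * d - 0.714136 * e),
            clamp(y + 1.772 * d))

# zero chroma (U = V = 128) leaves a neutral grey unchanged
assert yuv_to_rgb(128, 128, 128) == (128, 128, 128)
```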
s3.5: performing pose fusion based on the pose information and the FOV. As shown in FIG. 5, a view projection is first built from the FOV input, it is then fused with the pose matrix, and the final projection is obtained by applying the initialized magnification attribute;
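The three-stage fusion just described (FOV-based view projection, pose matrix, magnification) amounts to a matrix product. A sketch with plain row-major 4x4 matrices; the conventions (near/far planes, rotation order) are assumptions, not taken from the patent:

```python
import math

def perspective(fov_deg, aspect, near=0.1, far=100.0):
    """Perspective projection built from the FOV input."""
    f = 1.0 / math.tan(math.radians(fov_deg) / 2.0)
    return [[f / aspect, 0, 0, 0],
            [0, f, 0, 0],
            [0, 0, (far + near) / (near - far), 2 * far * near / (near - far)],
            [0, 0, -1, 0]]

def rot_y(theta):
    """Rotation about Y, standing in for one gyroscope angle."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0, s, 0], [0, 1, 0, 0], [-s, 0, c, 0], [0, 0, 0, 1]]

def scale(k):
    """Uniform scale implementing the magnification attribute."""
    return [[k, 0, 0, 0], [0, k, 0, 0], [0, 0, k, 0], [0, 0, 0, 1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

# final projection = view projection x pose x magnification
final = matmul(matmul(perspective(90.0, 1.0), rot_y(0.0)), scale(1.0))
```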
s3.6: performing real-time rendering with the GPU pipeline, and controlling output display with a ping-pong buffer mechanism.
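The ping-pong buffer mechanism named above can be modelled with two slots: rendering always targets the back slot while the front slot is displayed, and a swap flips the roles (a minimal sketch, not the patent's GPU implementation):

```python
class PingPongBuffer:
    """Two-slot double buffer: write into the back slot, display the
    front slot, swap to present a finished frame without tearing."""
    def __init__(self):
        self.slots = [None, None]
        self.front = 0                       # slot currently displayed

    def write(self, frame):
        self.slots[1 - self.front] = frame   # render into the back slot

    def swap(self):
        self.front = 1 - self.front          # present the back slot

    def displayed(self):
        return self.slots[self.front]

pp = PingPongBuffer()
pp.write("frame-0")   # GPU renders frame 0 into the back slot
pp.swap()             # frame 0 becomes visible
```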
S4: performing loop iteration as required for the video or image;
Anti-distortion, logo and watermark rendering can each be regarded as a special form of ordinary rendering and follow the same logic: through customization of the vertex shader and fragment shader, anti-distortion processing is applied during ordinary rendering, and the watermark and logo are rendered into the output.
In conclusion, the invention makes full use of hardware acceleration, provides a comprehensive real-time video pipeline rendering method that supports multiple projection modes, the FOV (field of view) and pose information, and solves VR rendering of high-resolution panoramic video by exploiting the performance of the VPU, CPU and GPU. The invention therefore effectively overcomes various defects of the prior art and has high industrial value.
The foregoing embodiments are merely illustrative of the principles and utilities of the present invention and are not intended to limit the invention. Any person skilled in the art can modify or change the above-mentioned embodiments without departing from the spirit and scope of the present invention. Accordingly, it is intended that all equivalent modifications or changes which can be made by those skilled in the art without departing from the spirit and technical spirit of the present invention be covered by the claims of the present invention.
Claims (5)
1. An implementation method for efficient optimization rendering and pose anti-distortion fusion is characterized by comprising the following steps:
s1: acquiring external input data by using a CPU, wherein the external input data comprises four types, namely a panoramic or 3D data source, pose information, an FOV (field of view) and a projection mode;
s2: performing hardware decoding by using the VPU, and transmitting decoded output to the GPU;
s3: performing map rendering by using a GPU, wherein the map rendering comprises projection modeling, chroma space conversion, FOV initialization and pose fusion;
s4: and performing loop iteration according to the requirements of the video or the image.
2. The method for realizing efficient optimization rendering and pose anti-distortion fusion according to claim 1, characterized in that: the panoramic or 3D data source is a video or an image, and the video and the image are in an equirectangular (equal latitude-longitude) panoramic format or a 3D format; the pose information is the output data of a device capable of providing three-dimensional pose information; the FOV is the display field angle.
3. The method for realizing efficient optimization rendering and pose anti-distortion fusion according to claim 1, characterized in that: the projection mode comprises a plane projection mode, a spherical projection mode and a cubic projection mode.
4. The method for realizing efficient optimization rendering and pose anti-distortion fusion according to claim 1, characterized in that: anti-distortion is also a special projection mode.
5. The method for realizing efficient optimization rendering and pose anti-distortion fusion according to claim 1, wherein the map rendering comprises the following steps:
s3.1: performing projection modeling based on the projection mode;
s3.2: initializing a GPU according to external input data;
s3.3: stitching and blending through a customized vertex shader;
s3.4: performing chroma space conversion through a customized fragment shader;
s3.5: performing pose fusion based on the pose information and the FOV;
s3.6: performing real-time rendering by using a pipeline of the GPU, and controlling output display by using a ping-pong buffer mechanism.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910218901.0A CN111726594A (en) | 2019-03-21 | 2019-03-21 | Implementation method for efficient optimization rendering and pose anti-distortion fusion |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111726594A (en) | 2020-09-29 |
Family
ID=72562771
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111726594A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112164378A (en) * | 2020-10-28 | 2021-01-01 | 上海盈赞通信科技有限公司 | VR glasses all-in-one machine anti-distortion method and device |
CN112437287A (en) * | 2020-11-23 | 2021-03-02 | 成都易瞳科技有限公司 | Panoramic image scanning and splicing method |
CN113205599A (en) * | 2021-04-25 | 2021-08-03 | 武汉大学 | GPU accelerated video texture updating method in video three-dimensional fusion |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2016095057A1 (en) * | 2014-12-19 | 2016-06-23 | Sulon Technologies Inc. | Peripheral tracking for an augmented reality head mounted device |
US20170339391A1 (en) * | 2016-05-19 | 2017-11-23 | Avago Technologies General Ip (Singapore) Pte. Ltd. | 360 degree video system with coordinate compression |
CN107844190A (en) * | 2016-09-20 | 2018-03-27 | 腾讯科技(深圳)有限公司 | Image presentation method and device based on Virtual Reality equipment |
US20180174619A1 (en) * | 2016-12-19 | 2018-06-21 | Microsoft Technology Licensing, Llc | Interface for application-specified playback of panoramic video |
US20180176483A1 (en) * | 2014-12-29 | 2018-06-21 | Metaio Gmbh | Method and sytem for generating at least one image of a real environment |
CN108616731A (en) * | 2016-12-30 | 2018-10-02 | 艾迪普(北京)文化科技股份有限公司 | 360 degree of VR panoramic images images of one kind and video Real-time Generation |
Legal Events
- PB01: Publication
- SE01: Entry into force of request for substantive examination
- TA01: Transfer of patent application right. Effective date of registration: 2023-04-12. Applicant after: Shanghai taojinglihua Information Technology Co.,Ltd., Room 2903, 29th Floor, No. 28 Xinjinqiao Road, China (Shanghai) Pilot Free Trade Zone, Pudong New Area, Shanghai, 200136. Applicant before: Shanghai flying ape Information Technology Co.,Ltd., Building 13, 728 Lingyan South Road, Pudong New Area, Shanghai, 200126.