CN113254828B - Seamless multi-mode content mixing exhibition method based on nonlinear editing technology - Google Patents

Seamless multi-mode content mixing exhibition method based on nonlinear editing technology

Info

Publication number
CN113254828B
CN113254828B CN202110562948.6A CN202110562948A
Authority
CN
China
Prior art keywords
node
modal
nonlinear
nodes
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110562948.6A
Other languages
Chinese (zh)
Other versions
CN113254828A (en)
Inventor
赵海英
皮帮瀚
梁昊光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEIJING INTERNATIONAL STUDIES UNIVERSITY
Beijing University of Posts and Telecommunications
Original Assignee
BEIJING INTERNATIONAL STUDIES UNIVERSITY
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BEIJING INTERNATIONAL STUDIES UNIVERSITY, Beijing University of Posts and Telecommunications filed Critical BEIJING INTERNATIONAL STUDIES UNIVERSITY
Priority to CN202110562948.6A priority Critical patent/CN113254828B/en
Publication of CN113254828A publication Critical patent/CN113254828A/en
Application granted granted Critical
Publication of CN113254828B publication Critical patent/CN113254828B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/958Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking

Abstract

A seamless multi-modal content mixed-presentation method based on non-linear editing technology comprises the following steps: first, multi-modal data are read, stored and processed using a multi-format content presentation method; then, using non-linear editing technology, the multi-modal data need only be uploaded once and can be arranged and edited repeatedly in any order, breaking the single-timeline editing limitation; the non-linear structure is constructed and the content is presented in mixed form through the corresponding structure; finally, the multi-modal content is rendered efficiently to the page using a Canvas-layered rendering method. Compared with linear editing performed in a fixed time sequence, editing multi-modal content through a non-linear structure is fast, simple and highly flexible. The embodiment of the invention effectively addresses the low efficiency and single form of producing augmented reality content on the web page side, so that a user can automatically experience multi-modal Web AR content with a story line.

Description

Seamless multi-mode content mixing exhibition method based on nonlinear editing technology
Technical Field
The invention relates to the technical field of virtual reality, in particular to a seamless multi-modal content mixed-presentation method based on non-linear editing technology, concerning the application of augmented reality on the web page side.
Background
Cultural relics are the spiritual resources of a culture; they belong to and form part of it. The digital dissemination of cultural heritage is an indispensable part of building cultural confidence and strength. In recent years, virtual-reality-related technologies have developed rapidly, and Augmented Reality (AR) has become one of the most expressive means of digitizing cultural relics. Web-based augmented reality (Web AR) lets users experience AR effects through a web page alone, without installing any software, giving it natural cross-platform and lightweight advantages. Combining digital cultural-relic exhibition with Web AR technology is undoubtedly one of the best presentation forms in the field of digital cultural-heritage dissemination.
However, Web AR-based cultural relic display systems also have a number of disadvantages: a high entry threshold, low development efficiency, high maintenance cost, complex multi-modal data and few means of sharing. In existing Web AR cultural relic display systems, multi-modal data are complex and hard to integrate effectively; this not only leaves Web AR content production in a single form, but also makes it difficult to produce quality content with a story line.
Disclosure of Invention
The invention aims to solve the above problems in the Web AR content production process, and provides a seamless multi-modal content mixed-presentation method based on non-linear editing technology.
The invention provides a seamless multi-modal content mixed-presentation method based on non-linear editing technology, characterized by comprising the following steps:
step 1, reading, storing and computing multi-modal data suitable for Web AR, the multi-modal data comprising three-dimensional models, pictures, text, videos and audio;
step 2, first constructing the nonlinear structure of the modal data nodes, then combining the independent modal data nodes according to the constructed nonlinear structure, and finally presenting the data in mixed form in the Web AR scene according to the configured resource information of each modality, the specific process being as follows:
2.1, constructing the nonlinear structure of the multi-modal data: first a universal node structure is defined for each modality's data, and each node's attributes are refined according to the characteristics of its modality; the attributes of the universal node structure and their meanings are as follows:
(1) NodeId: a unique identifier of a node for distinguishing different nodes;
(2) NodeName: the name of the node;
(3) NodeType: the type of the node, of which there are at least 7: Start (start node), End (end node), Model (three-dimensional model node), Picture (picture node), Text (text node), Video (video node) and Audio (audio node);
(4) NodeLife: the life cycle of the node, a four-tuple NodeLife = (lifeLength, canSkip, startTime, endTime), where lifeLength is the life length of the node, i.e. its duration; canSkip indicates whether the node can be skipped; startTime and endTime represent the start time and end time of the node, respectively;
(5) NodeProps: the attributes of the node, comprising at least fileName (file name), fileType (file type), filePath (file storage path), upLoader (uploader) and uplodeTime (upload time);
(6) NodeConnections: the association attributes of the current node, represented as a whole by a set; each element of the set is a two-tuple connection (source, target), where source is the source node and target is the target node; each two-tuple represents a directed connecting line associated with the current node;
(7) NodeInfo: node information including node creation time and creator information;
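The universal node structure in 2.1 can be sketched as a plain object. This is an illustrative sketch only; the factory function `createNode`, the camel-cased field names and the defaults are assumptions, not the patent's own code:

```javascript
// Hypothetical sketch of the universal node structure described above.
// Field names mirror the patent's attribute list (NodeId, NodeType, ...).
function createNode({ nodeId, nodeName, nodeType, life, props, connections = [], info = {} }) {
  // The seven node types listed under NodeType.
  const validTypes = ["Start", "End", "Model", "Picture", "Text", "Video", "Audio"];
  if (!validTypes.includes(nodeType)) {
    throw new Error(`Unknown NodeType: ${nodeType}`);
  }
  return {
    nodeId,                       // unique identifier distinguishing the node
    nodeName,                     // name of the node
    nodeType,                     // one of the seven node types
    nodeLife: {                   // four-tuple life cycle
      lifeLength: life.lifeLength,
      canSkip: life.canSkip ?? false,
      startTime: life.startTime,
      endTime: life.endTime,
    },
    nodeProps: props,             // fileName, fileType, filePath, upLoader, upload time
    nodeConnections: connections, // set of directed (source, target) two-tuples
    nodeInfo: info,               // creation time and creator information
  };
}
```

A Start or End node would use the same shape, typically with empty props.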
2.2, a two-tuple is used to represent the total time period of a nonlinear structure, T = (start, end), where start and end correspond respectively to the start and end time points of the total period and start < end; the total duration of the nonlinear structure is then L = end − start. For the durations d1 and d2 of any two nodes n1 and n2 in the same nonlinear structure, the possible timing relationships are as follows:
(1) neighbor(d1, d2): n1 and n2 are in sequential order and adjacent, so d2 starts immediately after d1 ends;
(2) gradation(d1, d2): n1 and n2 are in sequential order but not adjacent, so d2 starts some time after d1 ends;
(3) async(d1, d2): n1 and n2 have no order relation; d1 and d2 execute independently and do not affect each other;
according to these timing relationships, for the data of the five modalities, the three-dimensional model is displayed first; text and pictures are then displayed in sequence while audio plays simultaneously; finally, after the pictures and audio have all finished, the video is displayed, yielding the time nodes of the display sequence; the order and duration of these time nodes are analyzed to obtain the relationship between the duration of each node of the nonlinear structure and the total time period T; finally, the logical combination flow of the modal nodes is obtained through time-mapping conversion;
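The three timing relations of 2.2 can be sketched as a small scheduler that derives absolute start and end times from node durations within the total period T. The function name `schedule`, the relation-record shape and the single-pass evaluation (relations listed in dependency order) are assumptions for illustration:

```javascript
// Illustrative scheduler for the neighbor / gradation / async relations.
// nodes: map of node id -> duration; relations: [{ type, a, b, gap? }].
// Returns a map of node id -> { start, end } within the total period.
function schedule(totalStart, nodes, relations) {
  const times = {};
  for (const id of Object.keys(nodes)) {
    // async default: every node starts at the beginning of the period
    times[id] = { start: totalStart, end: totalStart + nodes[id] };
  }
  for (const r of relations) {
    if (r.type === "neighbor") {
      // b starts immediately after a ends
      times[r.b].start = times[r.a].end;
    } else if (r.type === "gradation") {
      // b starts a gap after a ends
      times[r.b].start = times[r.a].end + (r.gap ?? 0);
    } // "async": no ordering constraint between a and b
    times[r.b].end = times[r.b].start + nodes[r.b];
  }
  return times;
}
```

For example, a model followed immediately by text, a picture one unit after the text, and audio running asynchronously from the start reproduces the kind of timeline the patent describes for the five modalities.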
step 3, adopting a Canvas-based layered rendering method that integrates a WebRTC video stream, a WebGL three-dimensional scene and web page DOM elements, to realize layered rendering of the Web AR page and mixed presentation of the nonlinear structure; the Canvas-based layering method comprises the following specific steps:
3.1, creating a < canvas > element and adding it to the DOM structure;
3.2, adding size and positioning attributes to the < canvas > element to support layering;
3.3, adding a transparency style element to the canvas to generate a transparent background;
3.4, repeating the three steps to generate a multilayer canvas;
and 3.5, binding the node data that requires layered rendering and display to the different Canvas layers through the logical combination flow of the modal nodes of the nonlinear structure.
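Steps 3.1 to 3.5 can be sketched as follows. `layerStyle` and `createLayers` are hypothetical helper names; the style values shown are one plausible way to satisfy the size, positioning and transparency requirements, not the patent's implementation:

```javascript
// Size and positioning attributes that let the canvases stack (step 3.2),
// plus a transparent background (step 3.3).
function layerStyle(zIndex, width, height) {
  return {
    position: "absolute",
    top: "0",
    left: "0",
    width: `${width}px`,
    height: `${height}px`,
    zIndex: String(zIndex),
    background: "transparent",
  };
}

// Repeat the create/size/transparency steps to build a multi-layer canvas
// (step 3.4). `doc` is the page's Document; `container` a positioned element.
function createLayers(doc, container, count, width, height) {
  const layers = [];
  for (let i = 0; i < count; i++) {
    const canvas = doc.createElement("canvas"); // step 3.1
    canvas.width = width;
    canvas.height = height;
    Object.assign(canvas.style, layerStyle(i, width, height));
    container.appendChild(canvas);
    layers.push(canvas);
  }
  return layers; // step 3.5 binds node data to these layers elsewhere
}
```

In a Web AR page, the bottom layer would typically carry the camera video stream, a middle layer the WebGL scene, and upper layers the 2D modal content.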
Compared with linear editing performed in a fixed time sequence, the invention edits multi-modal content through a nonlinear structure, which is fast, simple and highly flexible. The seamless multi-modal content mixed-presentation method based on non-linear editing technology can therefore effectively solve the low production efficiency and single form of web-page-side augmented reality (Web AR) content, allowing a user to automatically experience multi-modal Web AR content with a story line.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in their description are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a seamless multimodal content mixing presentation method based on a nonlinear editing technology according to an embodiment of the present invention.
FIG. 2 is a schematic diagram of a time node with timing dependent display.
Fig. 3 is a schematic diagram of the relationship between the duration of each node of the nonlinear structure and the total time period T.
Fig. 4 is a schematic diagram of a logic combination flow of each modal node.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person skilled in the art from these embodiments without creative effort fall within the protection scope of the present invention.
Fig. 1 is a flowchart of a seamless multimodal content mixing presentation method based on a nonlinear editing technology according to an embodiment of the present invention. The principle of the method is as follows: firstly, reading, storing and processing multi-mode data by using a multi-mode content presentation method supported by multiple formats; then, constructing a nonlinear structure by utilizing a nonlinear editing technology and performing mixed presentation by a corresponding structure through a time sequence control method; and finally, rendering the multi-modal content to the page efficiently by using a Canvas layering-based rendering method.
Specifically, the seamless multimodal content mixing presentation method based on the nonlinear editing technology in the embodiment includes the following steps:
s110, reading, storing, calculating and processing multi-mode content suitable for Web AR, including 5 types of modal data such as models, pictures, characters, videos and audios; taking the first four hexadecimal codes of the header of the binary file to judge the accurate type of the file for the multi-modal data file to be processed; and for model data, parsing, storing and rendering in three formats, namely obj, fbx, glTF and the like, are provided.
S120, defining a general node structure for each modality's data and refining the node attributes according to the characteristics of each modality. The specific steps are as follows:
The nonlinear structure of the multi-modal data is constructed by first defining a universal node structure for each modality's data and refining the node attributes according to the characteristics of each modality; the attributes of the universal node structure and their meanings are as follows:
(1) NodeId: a unique identifier of a node for distinguishing different nodes;
(2) NodeName: the name of the node;
(3) NodeType: the type of the node, of which there are at least 7: Start (start node), End (end node), Model (three-dimensional model node), Picture (picture node), Text (text node), Video (video node) and Audio (audio node);
(4) NodeLife: the life cycle of the node, a four-tuple NodeLife = (lifeLength, canSkip, startTime, endTime), where lifeLength is the life length of the node, i.e. its duration; canSkip indicates whether the node can be skipped; startTime and endTime represent the start time and end time of the node, respectively;
(5) NodeProps: the attributes of the node, comprising at least fileName (file name), fileType (file type), filePath (file storage path), upLoader (uploader) and uplodeTime (upload time);
(6) NodeConnections: the association attributes of the current node, represented as a whole by a set; each element of the set is a two-tuple connection (source, target), where source is the source node and target is the target node; each two-tuple represents a directed connecting line associated with the current node;
(7) NodeInfo: node information including node creation time and creator information;
s130, combining the independent modal data nodes according to the constructed nonlinear structure, and displaying in a Web AR scene according to the configuration resource information of each mode in a mixed mode, wherein the specific process is as follows:
using a two-tuple to represent the total time period of a nonlinear structure, T = (start, end), where start and end correspond respectively to the start and end time points of the total period and start < end; the total duration of the nonlinear structure is then L = end − start; for the durations d1 and d2 of any two nodes n1 and n2 in the same nonlinear structure, the possible timing relationships are as follows:
(1) neighbor(d1, d2): n1 and n2 are in sequential order and adjacent, so d2 starts immediately after d1 ends;
(2) gradation(d1, d2): n1 and n2 are in sequential order but not adjacent, so d2 starts some time after d1 ends;
(3) async(d1, d2): n1 and n2 have no order relation; d1 and d2 execute independently and do not affect each other;
according to these timing relationships, for the data of the five modalities, the three-dimensional model is displayed first; text and pictures are then displayed in sequence while audio plays simultaneously; finally, after the pictures and audio have all finished, the video is displayed, yielding the time nodes of the display sequence (shown in fig. 2).
The order and duration of these time nodes are analyzed to obtain the relationship between the duration of each node of the nonlinear structure and the total time period T (as shown in fig. 3); finally, the logical combination flow of the modal nodes is obtained through time-mapping conversion (as shown in fig. 4).
S140, adopting a Canvas-based layered rendering method that integrates a WebRTC video stream, a WebGL three-dimensional scene and web page DOM elements, to realize layered rendering of the Web AR page and mixed presentation of the nonlinear structure; the Canvas-based layering method comprises the following specific steps:
3.1, creating and adding a < canvas > element to the DOM structure;
3.2, adding size and positioning attributes to the < canvas > element to support layering;
3.3, adding a transparency style element to the canvas to generate a transparent background;
3.4, repeating the three steps to generate a multilayer canvas;
and 3.5, binding the node data that requires layered rendering and display to the different Canvas layers through the logical combination flow of the modal nodes of the nonlinear structure.
in addition to the above embodiments, the present invention may have other embodiments. All technical solutions formed by adopting equivalent substitutions or equivalent transformations fall within the protection scope of the claims of the present invention.

Claims (3)

1. A seamless multi-modal content mixing presentation method based on a non-linear editing technology is characterized by comprising the following steps:
step 1, reading, storing and computing multi-modal data suitable for Web AR, the multi-modal data comprising three-dimensional models, pictures, text, videos and audio;
step 2, first constructing the nonlinear structure of the modal data nodes, then combining the independent modal data nodes according to the constructed nonlinear structure, and finally presenting the data in mixed form in the Web AR scene according to the configured resource information of each modality, the specific process being as follows:
2.1, constructing the nonlinear structure of the multi-modal data: first a universal node structure is defined for each modality's data, and each node's attributes are refined according to the characteristics of its modality; the attributes of the universal node structure and their meanings are as follows:
(1) NodeId: a unique identifier of a node for distinguishing different nodes;
(2) NodeName: the name of the node;
(3) NodeType: the type of the node, of which there are at least 7: Start (start node), End (end node), Model (three-dimensional model node), Picture (picture node), Text (text node), Video (video node) and Audio (audio node);
(4) NodeLife: the life cycle of the node, a four-tuple NodeLife = (lifeLength, canSkip, startTime, endTime), where lifeLength is the life length of the node, i.e. its duration; canSkip indicates whether the node can be skipped; startTime and endTime represent the start time and end time of the node, respectively;
(5) NodeProps: the attributes of the node, comprising at least fileName (file name), fileType (file type), filePath (file storage path), upLoader (uploader) and uplodeTime (upload time);
(6) NodeConnections: the association attributes of the current node, represented as a whole by a set; each element of the set is a two-tuple connection (source, target), where source is the source node and target is the target node; each two-tuple represents a directed connecting line associated with the current node;
(7) NodeInfo: node information including node creation time and creator information;
2.2, a two-tuple is used to represent the total time period of a nonlinear structure, T = (start, end), where start and end correspond respectively to the start and end time points of the total period and start < end; the total duration of the nonlinear structure is then L = end − start. For the durations d1 and d2 of any two nodes n1 and n2 in the same nonlinear structure, the possible timing relationships are as follows:
(1) neighbor(d1, d2): n1 and n2 are in sequential order and adjacent, so d2 starts immediately after d1 ends;
(2) gradation(d1, d2): n1 and n2 are in sequential order but not adjacent, so d2 starts some time after d1 ends;
(3) async(d1, d2): n1 and n2 have no order relation; d1 and d2 execute independently and do not affect each other;
according to these timing relationships, for the data of the five modalities, the three-dimensional model is displayed first; text and pictures are then displayed in sequence while audio plays simultaneously; finally, after the pictures and audio have all finished, the video is displayed, yielding the time nodes of the display sequence; the order and duration of these time nodes are analyzed to obtain the relationship between the duration of each node of the nonlinear structure and the total time period T; finally, the logical combination flow of the modal nodes is obtained through time-mapping conversion;
step 3, adopting a Canvas-based layered rendering method that integrates a WebRTC video stream, a WebGL three-dimensional scene and web page DOM elements, to realize layered rendering of the Web AR page and mixed presentation of the nonlinear structure; the Canvas-based layering method comprises the following specific steps:
3.1, creating a < canvas > element and adding it to the DOM structure;
3.2, adding size and positioning attributes to the < canvas > element to support layering;
3.3, adding a transparency style element to the canvas to generate a transparent background;
3.4, repeating the three steps to generate a multilayer canvas;
and 3.5, binding the node data that requires layered rendering and display to the different Canvas layers through the logical combination flow of the modal nodes of the nonlinear structure.
2. The seamless multi-modal content mixed-presentation method based on non-linear editing technology as claimed in claim 1, wherein: in step 1, the multi-modal data file to be processed is converted into binary form, and the first four hexadecimal codes of the file header are used to determine the exact type of the file.
3. The seamless multi-modal content mixed-presentation method based on non-linear editing technology as claimed in claim 1, wherein: in step 1, for data of the model modality, parsing, storage and rendering are provided for three formats: obj, fbx and glTF.
CN202110562948.6A 2021-05-24 2021-05-24 Seamless multi-mode content mixing exhibition method based on nonlinear editing technology Active CN113254828B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110562948.6A CN113254828B (en) 2021-05-24 2021-05-24 Seamless multi-mode content mixing exhibition method based on nonlinear editing technology


Publications (2)

Publication Number Publication Date
CN113254828A CN113254828A (en) 2021-08-13
CN113254828B true CN113254828B (en) 2022-09-16

Family

ID=77183910

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110562948.6A Active CN113254828B (en) 2021-05-24 2021-05-24 Seamless multi-mode content mixing exhibition method based on nonlinear editing technology

Country Status (1)

Country Link
CN (1) CN113254828B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103679800A (en) * 2013-11-21 2014-03-26 北京航空航天大学 System for generating virtual scenes of video images and method for constructing frame of system
CN109376857A (en) * 2018-09-03 2019-02-22 上海交通大学 A kind of multi-modal depth internet startup disk method of fusion structure and attribute information
US10665030B1 (en) * 2019-01-14 2020-05-26 Adobe Inc. Visualizing natural language through 3D scenes in augmented reality
CN111639408A (en) * 2020-05-27 2020-09-08 上海实迅网络科技有限公司 AR technology-based urban pipe network pipeline 3D model display method and system
CN111858954A (en) * 2020-06-29 2020-10-30 西南电子技术研究所(中国电子科技集团公司第十研究所) Task-oriented text-generated image network model


Also Published As

Publication number Publication date
CN113254828A (en) 2021-08-13


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant