CN117241063B - Live broadcast interaction method and system based on virtual reality technology - Google Patents

Live broadcast interaction method and system based on virtual reality technology

Info

Publication number
CN117241063B
CN117241063B (application CN202311499656.8A)
Authority
CN
China
Prior art keywords
scene
rendering
data
virtual
target
Prior art date
Legal status
Active
Application number
CN202311499656.8A
Other languages
Chinese (zh)
Other versions
CN117241063A
Inventor
郭勇
苑朋飞
靳世凯
赵存喜
Current Assignee
Zhongying Nian Nian Beijing Technology Co ltd
Original Assignee
China Film Annual Beijing Culture Media Co ltd
Priority date
Filing date
Publication date
Application filed by China Film Annual Beijing Culture Media Co ltd
Priority to CN202311499656.8A
Publication of CN117241063A
Application granted
Publication of CN117241063B
Status: Active


Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 30/00: Reducing energy consumption in communication networks
    • Y02D 30/70: Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Processing Or Creating Images (AREA)

Abstract

The invention relates to the field of artificial intelligence, and discloses a live broadcast interaction method and system based on virtual reality technology, which are used for improving the real-time performance of live broadcast interaction and the user experience through virtual reality technology. The method comprises the following steps: calibrating the virtual space positions of initial scene elements in a target live broadcast scene, determining each virtual space position, and creating a virtual scene to obtain a plurality of target scene elements; performing scene element state identification to obtain a plurality of scene element state data and performing live broadcast rendering in real time to obtain initial live broadcast virtual scene data; performing scene element tracking identification to obtain real-time operation interaction data of a plurality of scene elements and performing operation interaction responses to obtain target live virtual scene data; performing virtual environment rendering delay analysis and rendering delay feature extraction to obtain a rendering delay feature set; and inputting the rendering delay feature set into a rendering strategy analysis model to perform live virtual scene rendering strategy optimization analysis, obtaining a target scene rendering strategy, and updating the scene rendering parameters accordingly.

Description

Live broadcast interaction method and system based on virtual reality technology
Technical Field
The invention relates to the field of artificial intelligence, in particular to a live broadcast interaction method and system based on a virtual reality technology.
Background
With the development of virtual reality technology and live broadcast interaction, there is an increasing demand for live experiences that are more immersive and interactive. Traditional live broadcast modes struggle to meet users' expectations for richer and more realistic interactive experiences, which has given rise to live broadcast interaction methods based on virtual reality technology.
Traditional live broadcasts are typically unidirectional: viewers can only watch the content and cannot actively interact with it. The advent of virtual reality technology has made live broadcasts more interactive, and viewers can participate more deeply in the live broadcast process. However, realizing highly interactive live broadcast requires overcoming technical challenges of virtual reality technology, such as rendering delay and real-time responsiveness.
Disclosure of Invention
The invention provides a live broadcast interaction method and system based on virtual reality technology, which are used for improving the real-time performance of live broadcast interaction and the user experience through virtual reality technology.
The first aspect of the present invention provides a live broadcast interaction method based on a virtual reality technology, which includes:
calibrating virtual space positions of a plurality of initial scene elements in a target live broadcast scene, determining the virtual space position of each initial scene element, and creating a virtual scene of the target live broadcast scene based on the virtual space position to obtain a plurality of target scene elements;
performing scene element state identification on the plurality of target scene elements to obtain a plurality of scene element state data, and performing live broadcast rendering on the target live broadcast scene in real time by a virtual reality technology to obtain initial live broadcast virtual scene data;
performing scene element tracking identification on the initial live virtual scene data to obtain a plurality of scene element real-time operation interaction data, and performing operation interaction response on the initial live virtual scene data through the virtual reality technology to obtain target live virtual scene data;
performing virtual environment rendering delay analysis on the target live virtual scene data to generate virtual environment rendering delay data;
performing rendering delay feature extraction on the virtual environment rendering delay data to obtain a rendering delay feature set;
and inputting the rendering delay feature set into a preset rendering strategy analysis model to perform live virtual scene rendering strategy optimization analysis to obtain a target scene rendering strategy, and performing scene rendering parameter updating on the target live virtual scene data according to the target scene rendering strategy.
With reference to the first aspect, in a first implementation manner of the first aspect of the present invention, the calibrating a virtual space position of a plurality of initial scene elements in a target live scene, determining a virtual space position of each initial scene element, and creating a virtual scene for the target live scene based on the virtual space position, so as to obtain a plurality of target scene elements, includes:
performing scene element identification on a target live scene to obtain a plurality of initial scene elements;
distributing element identification codes of each initial scene element to obtain element identification codes corresponding to each initial scene element;
performing element construction analysis on the plurality of initial scene elements according to the element identification codes, and determining size construction data corresponding to each initial scene element;
performing virtual space position analysis on the plurality of initial scene elements according to the size construction data, and determining the virtual space position of each initial scene element;
and performing virtual scene mapping on the target live scene based on the virtual space position corresponding to each target scene element, and creating a plurality of corresponding target scene elements.
With reference to the first aspect, in a second implementation manner of the first aspect of the present invention, the performing scene element state recognition on the plurality of target scene elements to obtain a plurality of scene element state data, and performing live broadcast rendering on the target live broadcast scene in real time by using a virtual reality technology to obtain initial live broadcast virtual scene data includes:
carrying out scene element state identification on the plurality of target scene elements through preset element monitoring buried points to obtain a plurality of scene element state data;
Modeling parameter analysis is carried out on the plurality of target scene elements based on the scene element state data to obtain a plurality of scene element modeling parameters;
performing real-time rendering engine parameter analysis on the target live scene based on the scene element state data to obtain target rendering engine parameters;
and performing live broadcast rendering on the plurality of target scene elements in the target live broadcast scene in real time based on the plurality of scene element modeling parameters and the target rendering engine parameters to obtain initial live broadcast virtual scene data.
With reference to the first aspect, in a third implementation manner of the first aspect of the present invention, the performing scene element tracking identification on the initial live virtual scene data to obtain real-time operation interaction data of a plurality of scene elements, and performing operation interaction response on the initial live virtual scene data by using the virtual reality technology to obtain target live virtual scene data includes:
performing scene element range analysis on the initial live virtual scene data to determine a target scene element position range;
performing interactive operation region analysis on the initial live virtual scene data according to the target scene element position range, and determining a target interactive operation region;
carrying out scene element tracking on each target scene element through the target interactive operation region, and determining real-time operation interaction data of a plurality of scene elements;
performing movement state analysis on the real-time operation interaction data of the plurality of scene elements to obtain corresponding scene element interaction tracks;
performing operation interaction response on a plurality of target scene elements based on the scene element interaction track through the virtual reality technology to obtain operation interaction response data;
and according to the operation interaction response data, performing rendering data updating on the initial live virtual scene data to generate target live virtual scene data.
With reference to the first aspect, in a fourth implementation manner of the first aspect of the present invention, the performing a virtual environment rendering delay analysis on the target live virtual scene data, to generate virtual environment rendering delay data, includes:
performing operation interaction time stamp extraction on the real-time operation interaction data of the plurality of scene elements to obtain first time stamp data;
carrying out hash calculation on the first timestamp data to obtain a plurality of first hash values, and encoding the plurality of first hash values through a preset first encoding mode to obtain a plurality of first encoded values;
extracting the interaction response time stamp of the operation interaction response data to obtain second time stamp data;
performing hash calculation on the second timestamp data to obtain a plurality of second hash values, and encoding the second hash values in a preset second encoding mode to obtain a plurality of second encoded values;
and performing coding matching and data alignment on the plurality of first coding values and the plurality of second coding values, performing virtual environment rendering delay calculation on the target live virtual scene data, and generating virtual environment rendering delay data.
With reference to the first aspect, in a fifth implementation manner of the first aspect of the present invention, the performing a rendering delay feature extraction on the rendering delay data of the virtual environment to obtain a rendering delay feature set includes:
inputting the virtual environment rendering delay data into a preset double-layer convolution long-short time network, wherein the double-layer convolution long-short time network comprises a first-layer convolution long-short time network and a second-layer convolution long-short time network;
performing forward hidden feature extraction on the virtual environment rendering delay data through the first layer convolution long-short time network to obtain forward hidden features;
performing backward hidden feature extraction on the virtual environment rendering delay data through the second-layer convolution long-short time network to obtain backward hidden features;
and carrying out feature fusion on the forward hidden features and the backward hidden features to obtain a rendering delay feature set.
With reference to the first aspect, in a sixth implementation manner of the first aspect of the present invention, the inputting the rendering delay feature set into a preset rendering strategy analysis model to perform live virtual scene rendering strategy optimization analysis to obtain a target scene rendering strategy, and performing scene rendering parameter updating on the target live virtual scene data according to the target scene rendering strategy, includes:
inputting the rendering delay feature set into a preset rendering strategy analysis model, wherein the rendering strategy analysis model comprises a plurality of weak classifiers and a strategy optimization network, and each weak classifier comprises an input layer, a threshold circulation network and a fully-connected network;
performing feature coding mapping on the rendering delay feature set through an input layer in the weak classifier to generate a standard rendering delay feature vector;
performing high-dimensional feature extraction on the standard rendering delay feature vector through a threshold circulation network in the weak classifier to obtain a high-dimensional rendering delay feature vector;
performing live virtual scene rendering parameter compensation operation on the high-dimensional rendering delay feature vector through a fully connected network in the weak classifier to obtain a rendering parameter compensation predicted value of each weak classifier;
acquiring classifier weight data of the weak classifiers, and carrying out weighted average calculation on the rendering parameter compensation predicted value according to the classifier weight data to obtain a target parameter compensation predicted value;
performing strategy initialization on the target parameter compensation predicted value through an improved genetic algorithm in the strategy optimization network to generate an initial scene rendering strategy group;
performing live virtual scene rendering strategy optimization analysis on the initial scene rendering strategy group to obtain a target scene rendering strategy;
and updating scene rendering parameters of the target live virtual scene data according to the target scene rendering strategy.
The second aspect of the present invention provides a live broadcast interaction system based on a virtual reality technology, where the live broadcast interaction system based on the virtual reality technology includes:
the creation module is used for calibrating virtual space positions of a plurality of initial scene elements in a target live broadcast scene, determining the virtual space position of each initial scene element, and creating the virtual scene of the target live broadcast scene based on the virtual space position to obtain a plurality of target scene elements;
The identification module is used for carrying out scene element state identification on the plurality of target scene elements to obtain a plurality of scene element state data, and carrying out live broadcast rendering on the target live broadcast scene in real time through a virtual reality technology to obtain initial live broadcast virtual scene data;
the response module is used for carrying out scene element tracking identification on the initial live virtual scene data to obtain a plurality of scene element real-time operation interaction data, and carrying out operation interaction response on the initial live virtual scene data through the virtual reality technology to obtain target live virtual scene data;
the analysis module is used for carrying out virtual environment rendering delay analysis on the target live virtual scene data and generating virtual environment rendering delay data;
the extraction module is used for extracting rendering delay characteristics of the virtual environment rendering delay data to obtain a rendering delay characteristic set;
and the updating module is used for inputting the rendering delay characteristic set into a preset rendering strategy analysis model to perform live virtual scene rendering strategy optimization analysis to obtain a target scene rendering strategy, and performing scene rendering parameter updating on the target live virtual scene data according to the target scene rendering strategy.
In the technical scheme provided by the invention, virtual space position calibration is performed on the target live broadcast scene, the virtual space positions are determined, and virtual scene creation is performed to obtain a plurality of target scene elements; scene element state identification is performed to obtain a plurality of scene element state data, and live broadcast rendering is performed in real time to obtain initial live broadcast virtual scene data; scene element tracking identification is performed to obtain real-time operation interaction data of a plurality of scene elements, and operation interaction responses are performed to obtain target live virtual scene data; virtual environment rendering delay analysis and rendering delay feature extraction are performed to obtain a rendering delay feature set; and the rendering delay feature set is input into a rendering strategy analysis model to obtain a target scene rendering strategy and update the scene rendering parameters accordingly. The invention can render and present the target live scene in real time through virtual reality technology, so that the audience obtains a more realistic and immersive viewing experience, and compared with traditional live broadcast, the delay is effectively reduced. Scene element tracking and interaction response enable the user to interact with elements in the virtual environment in real time, improving the user's participation and interactivity. Through rendering delay analysis and rendering strategy optimization, rendering parameters can be dynamically adjusted, so that the live scene achieves the best rendering effect under different equipment and network environments and provides high-quality visual presentation. Scene element state identification and real-time rendering engine parameter analysis allow the system to adjust live content according to user interaction and preference, achieving a personalized user experience; the scene element tracking technology, combined with the sensor data of the virtual reality equipment, enables high-precision identification of and response to user operations, improving the naturalness and accuracy of interaction. Through rendering strategy optimization analysis, the most suitable rendering strategy can be selected in different situations, so that high-quality live broadcast pictures can be provided under various equipment and network conditions, and the real-time performance of live broadcast interaction is improved.
Drawings
FIG. 1 is a schematic diagram of one embodiment of a live interaction method based on virtual reality technology in an embodiment of the present invention;
FIG. 2 is a flow chart of scene element status recognition in an embodiment of the invention;
FIG. 3 is a flowchart of scene element tracking and identification according to an embodiment of the present invention;
FIG. 4 is a flow chart of virtual environment rendering delay analysis in an embodiment of the invention;
fig. 5 is a schematic diagram of an embodiment of a live interaction system based on virtual reality technology according to an embodiment of the present invention.
Detailed Description
The embodiment of the invention provides a live broadcast interaction method and system based on virtual reality technology, which are used for improving the real-time performance of live broadcast interaction and the user experience through virtual reality technology. The terms "first," "second," "third," "fourth" and the like in the description and in the claims and in the above drawings, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate, such that the embodiments described herein may be implemented in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or apparatus.
For easy understanding, the following describes a specific flow of an embodiment of the present invention, referring to fig. 1, and one embodiment of a live interaction method based on a virtual reality technology in the embodiment of the present invention includes:
s101, calibrating virtual space positions of a plurality of initial scene elements in a target live scene, determining the virtual space position of each initial scene element, and creating a virtual scene of the target live scene based on the virtual space positions to obtain a plurality of target scene elements;
it can be understood that the execution subject of the present invention may be a live broadcast interaction system based on virtual reality technology, and may also be a terminal or a server, which is not limited herein. The embodiment of the present invention is described by taking a server as the execution subject as an example.
Specifically, scene element identification is performed, and a plurality of initial scene elements are identified in a target live scene, wherein the elements can be actual objects, characters or virtual objects. These elements include buildings, cars, virtual characters, or other visual objects. And allocating element identification codes to each initial scene element. Each initial scene element is assigned a unique element identification code that facilitates tracking and identification of the elements in subsequent steps. The element identification code is a key datum for associating the initial scene element with its virtual spatial location and other attributes. And carrying out element construction analysis according to the element identification codes to determine the size construction data of each initial scene element. This includes attributes related to the size, shape, and other dimensions of the element. These data will be used to construct each target scene element accurately in the virtual scene. Further, virtual space position analysis is performed on the initial scene elements according to the size construction data to determine an accurate position of each initial scene element in the virtual space. This step involves the position coordinates, orientation, and relative position information of the element with respect to other elements. This ensures accurate placement of each element in the virtual environment. And performing virtual scene mapping based on the virtual space position corresponding to each target scene element, and creating a plurality of corresponding target scene elements. The virtual representation of the initial scene element is matched to a location in the actual virtual space to generate a plurality of target scene elements. These elements will be presented in a virtual live scene and available for viewing and interaction by the user. Consider, for example, a virtual reality live show. The initial scene elements include virtual musicians and musical instruments. These musicians and instruments are identified and then each musician and instrument is assigned a unique element identification code. Elemental construction analysis is performed to determine the appearance characteristics of the musician and the size of the instrument. Subsequently, the position of the musician and the placement of the instrument are determined by virtual spatial position analysis. Virtual representations of these musicians and instruments are mapped into virtual reality scenes to create virtual reality live shows in which spectators can enjoy and interact with the musicians' performances.
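A minimal sketch of this calibration step follows, purely for illustration and not part of the patent itself: the SceneElement class, the helper names and the centimetre units are assumptions, and the side-by-side placement stands in for whatever virtual space position analysis the system actually performs.

    import uuid
    from dataclasses import dataclass

    @dataclass
    class SceneElement:
        """One initial scene element and its calibration results (illustrative only)."""
        name: str                            # e.g. "virtual musician", "piano"
        size_cm: tuple                       # size construction data (width, height, depth)
        element_id: str = ""                 # unique element identification code
        position: tuple = (0.0, 0.0, 0.0)    # virtual space position (x, y, z)

    def assign_identifiers(elements):
        """Give every initial scene element a unique element identification code."""
        for element in elements:
            element.element_id = uuid.uuid4().hex[:8]
        return elements

    def calibrate_positions(elements, spacing_cm=50.0):
        """Toy virtual space position analysis: place elements side by side along x,
        using each element's width so that their bounding boxes do not overlap."""
        cursor = 0.0
        for element in elements:
            width = element.size_cm[0]
            element.position = (cursor + width / 2.0, 0.0, 0.0)
            cursor += width + spacing_cm
        return elements

    if __name__ == "__main__":
        scene = [SceneElement("virtual musician", (60, 175, 40)),
                 SceneElement("piano", (150, 120, 100))]
        for e in calibrate_positions(assign_identifiers(scene)):
            print(e.element_id, e.name, e.position)

In a real deployment the positions would come from the scene analysis described above rather than from this toy layout; the sketch only shows how identification codes, size construction data and virtual space positions can be kept together per element.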
S102, performing scene element state identification on a plurality of target scene elements to obtain a plurality of scene element state data, and performing live broadcast rendering on a target live broadcast scene in real time by a virtual reality technology to obtain initial live broadcast virtual scene data;
specifically, the scene element state recognition is performed on a plurality of target scene elements by using preset element monitoring buried points. The states of elements in the target scene are detected and identified by sensors, cameras or other data sources. These status data may include information of the position, posture, action, color, etc. of the element. For example, considering the context of a virtual concert, element monitoring burial points can identify the location and performance status of singers, band members, and stage property. Modeling parameter analysis is performed using state data of a plurality of scene elements. This step involves matching the state data with a previously established model to obtain modeling parameters for a plurality of scene elements. These parameters can be used to accurately describe the appearance and behavior of the element. For example, the modeling parameters may include virtual appearance characteristics of the singer and acoustic characteristics of the instrument. Real-time rendering engine parameter analysis is also performed based on the state data of the scene elements. This step helps determine parameters of the real-time rendering engine used to render the virtual scene. These parameters can affect the appearance, quality, and performance of the virtual scene. For example, rendering engine parameters include lighting settings, texture resolution, and rendering speed. And then, based on modeling parameters of the plurality of scene elements and target rendering engine parameters, performing live broadcast rendering on the plurality of target scene elements in the target live broadcast scene in real time by using a virtual reality technology. The virtual representation of the scene element is combined with its modeling parameters and rendering engine parameters to render the virtual scene in real-time. This process requires a high degree of computational power and graphics processing to ensure that the virtual scene exhibits high quality effects in real-time interactions.
S103, carrying out scene element tracking identification on the initial live virtual scene data to obtain a plurality of scene element real-time operation interaction data, and carrying out operation interaction response on the initial live virtual scene data through a virtual reality technology to obtain target live virtual scene data;
it should be noted that, the scene element range analysis is performed on the initial live virtual scene data, and the position range of the target scene element is determined. This may include bounding boxes or regions of elements for subsequent interactive analysis and response. And carrying out interactive interaction operation area analysis according to the position range of the target scene element. This step helps to determine the target interactive operational area, i.e. the specific area where the user can interact with the scene element. This may include a contact area or trigger area of the element. And then, tracking the scene elements of each target scene element through the target interactive operation area so as to acquire real-time operation interactive data of a plurality of scene elements. This includes ways in which the user interacts with the element, such as dragging, clicking, gestures, etc. These data will be used to analyze the interactive behavior of the user. And carrying out movement state analysis on the real-time operation interaction data of the plurality of scene elements to obtain the interaction track of each scene element. This is to know how the user interacts with the elements, including movements, transformations, and state changes of the elements. And then, performing operation interaction response on the plurality of target scene elements based on the interaction track of the scene elements through a virtual reality technology. The virtual scene is updated in real time according to the interactive behavior of the user so as to reflect the operation of the user. For example, if the user clicks on an object in the virtual environment, the system may trigger a corresponding response in the virtual scene, such as an animation or sound effect of the object. And according to the operation interaction response data, performing rendering data updating on the initial live virtual scene data to generate target live virtual scene data. Elements in the virtual scene will be updated according to the user's interactive behavior to provide a consistent and immersive virtual live experience. Consider, for example, a live interaction with a virtual museum. In this embodiment, the initial live virtual scene data includes various exhibits in the virtual museum. The user may interact with these exhibits using the virtual reality head display and the handle. The system will analyze the location range of each exhibit and determine the area in which the user can interact with the exhibit. When a user clicks on an exhibit using a handle, the system tracks the clicking operation, records the user's interaction trajectory, and triggers a corresponding interaction response, such as introduction information of the exhibit or a view of the amplified exhibit. These responses update the virtual museum's scene in real time to meet the user's interaction needs, thereby providing a more attractive virtual museum visiting experience.
S104, performing virtual environment rendering delay analysis on the target live virtual scene data to generate virtual environment rendering delay data;
specifically, operation interaction time stamp extraction is performed on real-time operation interaction data of a plurality of scene elements. A time stamp is recorded in each interaction event for subsequent analysis of the rendering delay. These timestamps may be used to track the user's interactive operations and responses. Hash computation is performed on the first timestamp data to generate a plurality of first hash values. Ha Xiji is to convert the time stamp data into a hash value of fixed length for subsequent comparison and matching. Each interaction event generates a corresponding first hash value. And encoding the plurality of first hash values through a preset first encoding mode to obtain a plurality of first encoded values. Encoding is the conversion of hash values into a data format that can be transmitted or stored for subsequent processing. The first encoded value helps to identify a characteristic of the interaction event. And meanwhile, extracting the interaction response time stamp of the operation interaction response data to acquire second time stamp data. These time stamps represent the time at which the operational interaction response occurs, e.g. the time stamp at which the user clicked on an element in the virtual scene. Hash computation is performed on the second timestamp data to generate a plurality of second hash values. These hash values are used to identify timestamp data of the operational interaction response. And encoding the plurality of second hash values through a preset second encoding mode to obtain a plurality of second encoding values. The second encoded value helps to identify a characteristic of the operational interaction response. Subsequently, the plurality of first encoded values and the plurality of second encoded values are code matched and data aligned. The first encoded value and the second encoded value are compared to determine an association therebetween. This makes it possible to identify the correspondence between the user's operation and the response of the system. And performing virtual environment rendering delay calculation on the target live virtual scene data based on the results of the code matching and the data alignment. This process helps determine the delay in rendering the virtual environment, i.e., the time difference between the user's interactive operation and the response of the system. This delay data may be used to improve the user experience, ensuring that the response of the virtual environment remains synchronized with the user's operation.
S105, performing rendering delay feature extraction on the virtual environment rendering delay data to obtain a rendering delay feature set;
specifically, virtual environment rendering delay data is input into a preset double-layer convolution long-short time network. This network includes two key components: a first layer convolving the long and short time network and a second layer convolving the long and short time network. The role of these networks is to extract important features from the rendering delay data in order to further analyze and improve the virtual reality experience. And performing forward hidden characteristic extraction on the virtual environment rendering delay data through the first layer convolution long-short time network. This step uses Convolutional Neural Networks (CNNs) to analyze the data and capture information about the forward features. These forward hidden features may include trends, patterns, peaks, and other rendering delay related features in the delay data. And then, performing backward hidden characteristic extraction on the virtual environment rendering delay data through a second-layer convolution long-short time network. This step continues to use CNN, but it focuses on capturing backward features, which are typically related to dynamic changes in rendering delay and tail features. The forward hidden feature and the backward hidden feature are then fused together to form a set of rendering delay features. The purpose of this fusion process is to combine the forward and backward features together to obtain a more comprehensive, global feature set that better characterizes rendering delay. Feature fusion may employ various methods such as concatenation, weighted averaging, or other fusion techniques to ensure that the final feature set has more information. For example, rendering delay refers to the time from user operation to the virtual reality head display rendering the corresponding feedback. By monitoring and recording the rendering delay data, the system obtains a large amount of time series data. These time series data are fed into a preset double-layer convolution long-short-time network. The first layer network analyzes forward features in the data such as user head movements, gesture operations, and element dynamics in the virtual environment. The layer two network analyzes backward characteristics in the data, including the sustained impact of user operations and response delays of virtual environment elements. The forward and backward features are fused together to form a complete set of rendering delay features. This feature set may be used to analyze the trend of rendering delays, discover delay problems, and take appropriate measures to improve the quality of the virtual reality live stream to provide a smoother, more pleasing user experience.
S106, inputting the rendering delay feature set into a preset rendering strategy analysis model to conduct live virtual scene rendering strategy optimization analysis, obtaining a target scene rendering strategy, and conducting scene rendering parameter updating on target live virtual scene data according to the target scene rendering strategy.
Specifically, the rendering delay feature set is input into a preset rendering strategy analysis model. This model includes a plurality of weak classifiers and a strategy optimization network, and each weak classifier consists of an input layer, a threshold circulation network, and a fully connected network. In each weak classifier, the input layer performs feature encoding mapping on the rendering delay feature set to generate a standard rendering delay feature vector. The threshold circulation network then performs high-dimensional feature extraction on the standard rendering delay feature vector to obtain a high-dimensional rendering delay feature vector; these high-dimensional features contain more information and can more accurately characterize the rendering delay. The fully connected network then performs a live virtual scene rendering parameter compensation operation on the high-dimensional rendering delay feature vector to obtain the rendering parameter compensation predicted value of each weak classifier. These predicted values represent adjustments to the rendering parameters that would optimize rendering quality. Classifier weight data of the plurality of weak classifiers is then obtained, and the weight data is used to measure the credibility of each classifier. A weighted average calculation is performed on the rendering parameter compensation predicted values according to the weight data to obtain the target parameter compensation predicted value, which integrates the predictions of multiple classifiers to improve accuracy. Strategy initialization is then performed on the target parameter compensation predicted value through an improved genetic algorithm in the strategy optimization network. This step creates an initial scene rendering strategy population as the starting point for the optimization analysis. An optimization analysis is then performed on the initial scene rendering strategy population to find the best rendering strategy. This analysis may include simulating rendering results under different strategies, evaluating the performance of each strategy, and selecting the strategy with the best performance. According to the obtained target scene rendering strategy, the scene rendering parameters of the target live virtual scene data are updated. These parameter updates may include graphics quality, frame rate, lighting effects, and other parameters that affect virtual scene rendering.
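The weighted-average fusion of the weak classifiers' compensation predictions is easy to show in isolation; the parameter names and weights below are made up for the example, and the classifiers themselves (input layer, threshold circulation network, fully connected network) are not modelled here.

    def ensemble_compensation(predictions, weights):
        """Weighted average of the rendering parameter compensation predictions
        produced by several weak classifiers."""
        total_weight = sum(weights)
        return {key: sum(w * p[key] for p, w in zip(predictions, weights)) / total_weight
                for key in predictions[0]}

    if __name__ == "__main__":
        # each weak classifier predicts a compensation for two rendering parameters
        predictions = [{"resolution_scale": 0.90, "frame_rate": 66.0},
                       {"resolution_scale": 0.80, "frame_rate": 72.0},
                       {"resolution_scale": 0.85, "frame_rate": 60.0}]
        classifier_weights = [0.5, 0.3, 0.2]   # classifier weight data (assumed values)
        print(ensemble_compensation(predictions, classifier_weights))

A toy genetic-algorithm loop standing in for the strategy optimization network is sketched at the end of this description.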
In the embodiment of the invention, virtual space position calibration is performed on the target live broadcast scene, the virtual space positions are determined, and virtual scene creation is performed to obtain a plurality of target scene elements; scene element state identification is performed to obtain a plurality of scene element state data, and live broadcast rendering is performed in real time to obtain initial live broadcast virtual scene data; scene element tracking identification is performed to obtain real-time operation interaction data of a plurality of scene elements, and operation interaction responses are performed to obtain target live virtual scene data; virtual environment rendering delay analysis and rendering delay feature extraction are performed to obtain a rendering delay feature set; and the rendering delay feature set is input into a rendering strategy analysis model to obtain a target scene rendering strategy and update the scene rendering parameters accordingly. The invention can render and present the target live scene in real time through virtual reality technology, so that the audience obtains a more realistic and immersive viewing experience, and compared with traditional live broadcast, the delay is effectively reduced. Scene element tracking and interaction response enable the user to interact with elements in the virtual environment in real time, improving the user's participation and interactivity. Through rendering delay analysis and rendering strategy optimization, rendering parameters can be dynamically adjusted, so that the live scene achieves the best rendering effect under different equipment and network environments and provides high-quality visual presentation. Scene element state identification and real-time rendering engine parameter analysis allow the system to adjust live content according to user interaction and preference, achieving a personalized user experience; the scene element tracking technology, combined with the sensor data of the virtual reality equipment, enables high-precision identification of and response to user operations, improving the naturalness and accuracy of interaction. Through rendering strategy optimization analysis, the most suitable rendering strategy can be selected in different situations, so that high-quality live broadcast pictures can be provided under various equipment and network conditions, and the real-time performance of live broadcast interaction is improved.
In a specific embodiment, the process of executing step S101 may specifically include the following steps:
(1) Performing scene element identification on a target live scene to obtain a plurality of initial scene elements;
(2) Distributing element identification codes of each initial scene element to obtain element identification codes corresponding to each initial scene element;
(3) Performing element construction analysis on a plurality of initial scene elements according to the element identification codes, and determining size construction data corresponding to each initial scene element;
(4) Performing virtual space position analysis on a plurality of initial scene elements according to the size construction data, and determining the virtual space position of each initial scene element;
(5) And performing virtual scene mapping on the target live scene based on the virtual space position corresponding to each target scene element, and creating a plurality of corresponding target scene elements.
Specifically, scene element identification is performed on the target live scene, so as to obtain a plurality of initial scene elements. Through computer vision techniques and image processing, the system will analyze various elements in the live scene, including characters, objects, background, etc. This may be achieved by techniques such as object detection, image segmentation, feature extraction, etc. And carrying out element identification code allocation on each initial scene element. Each identified element is assigned a unique identification code or tag so that the system can accurately distinguish them. This step is the basis for ensuring that the individual operations and interactions of the different elements can be performed in the subsequent processing. Subsequently, elemental structure analysis is performed. The appearance and characteristics of each initial scene element are analyzed. This includes characteristics in terms of color, shape, texture, size, etc. of the element. Next, size construction data is determined, which determines the size and scale of each initial scene element. This is to ensure that in the virtual environment, the size of the element matches the corresponding element in the actual scene, so that the virtual environment is more realistic. A virtual space location analysis is performed that involves determining the location and coordinates of each initial scene element in the virtual environment. This step ensures that each element can be precisely placed in the virtual environment so that the user can interact with them. And performing virtual scene mapping on the target live scene based on the virtual space position of each target scene element to create a plurality of corresponding target scene elements. This is to ensure that the virtual scene remains consistent with the actual scene, and the user can interact with the scene elements in the virtual environment to enjoy a realistic virtual experience.
In a specific embodiment, as shown in fig. 2, the process of executing step S102 may specifically include the following steps:
s201, carrying out scene element state identification on a plurality of target scene elements through preset element monitoring buried points to obtain a plurality of scene element state data;
s202, carrying out modeling parameter analysis on a plurality of target scene elements based on state data of the plurality of scene elements to obtain a plurality of scene element modeling parameters;
s203, performing real-time rendering engine parameter analysis on the target live scene based on the state data of the plurality of scene elements to obtain target rendering engine parameters;
s204, performing live broadcast rendering on a plurality of target scene elements in the target live broadcast scene based on the plurality of scene element modeling parameters and the target rendering engine parameters to obtain initial live broadcast virtual scene data.
Specifically, scene element state identification is carried out on the plurality of target scene elements through preset element monitoring buried points. Predefined monitoring buried points are embedded in the virtual live scene and can track the states and behaviors of the various elements. For example, for a virtual live program, buried points may be set to monitor the status of elements such as the presenter, guests, and background images, including their location, motion, expression, and the like. Based on these buried-point data, the system obtains state data for a plurality of scene elements. These data include the current state of each element, such as the position of the presenter, the actions of a guest, and changes in the background image, and they provide the basis for subsequent analysis and rendering. Modeling parameter analysis is performed on the plurality of scene elements based on their state data. The system may use the state data to build a model to better understand the nature and behavior of the elements. For example, if a host in a virtual live broadcast has multiple states, the model may help the system understand how these states are associated with the host's animation model. Meanwhile, through the state data, the system can also conduct real-time rendering engine parameter analysis. The system evaluates the real-time rendering requirements of the virtual live scene, such as lighting, texture, and animation parameters, which helps to determine how to present the individual elements for the best visual effect. With the modeling parameters and rendering engine parameters of the plurality of scene elements, the system starts to perform live rendering in real time. This step includes integrating virtual elements into the actual live stream to create the initial live virtual scene data, which contains a presentation of all virtual elements that the user can interact with in the virtual environment. For example, for virtual hosts and guests, state recognition tells the system their location, motion, and expression; through modeling parameter analysis, the system determines how to present these virtual hosts and guests so that they appear more realistic; and through rendering engine parameter analysis and real-time rendering, the system seamlessly integrates the virtual hosts and guests into the actual live broadcast, providing a vivid virtual live broadcast experience for users.
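The final rendering step of this embodiment, combining the element modeling parameters with the target rendering engine parameters into a real-time loop, can be caricatured as below; the frame-budget logic and parameter names are assumptions, and render_frame is only a placeholder for an actual rendering engine call.

    import time

    def render_frame(elements, engine_params):
        """Placeholder for one rendering pass: a real system would drive a rendering
        engine here; this stub only reports what would be drawn."""
        return f"{len(elements)} elements @ {engine_params['texture_resolution']}px textures"

    def live_render(elements, engine_params, frames=3):
        """Toy real-time live rendering loop producing 'initial live virtual scene data'
        while trying to hold the target frame rate."""
        frame_budget = 1.0 / engine_params["target_fps"]
        scene_data = []
        for index in range(frames):
            start = time.perf_counter()
            scene_data.append({"frame": index,
                               "content": render_frame(elements, engine_params)})
            elapsed = time.perf_counter() - start
            time.sleep(max(0.0, frame_budget - elapsed))   # wait out the rest of the frame
        return scene_data

    if __name__ == "__main__":
        params = {"texture_resolution": 1024, "target_fps": 60}
        for frame in live_render([{"id": "host"}, {"id": "guest"}], params):
            print(frame)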
In a specific embodiment, as shown in fig. 3, the process of executing step S103 may specifically include the following steps:
s301, analyzing a scene element range of initial live virtual scene data, and determining a target scene element position range;
s302, analyzing an interactive operation area of the initial live virtual scene data according to the position range of the target scene element, and determining a target interactive operation area;
s303, carrying out scene element tracking on each target scene element through a target interaction operation area, and determining real-time operation interaction data of a plurality of scene elements;
s304, carrying out movement state analysis on the real-time operation interaction data of the plurality of scene elements to obtain corresponding scene element interaction tracks;
s305, performing operation interaction response on a plurality of target scene elements based on the scene element interaction track through a virtual reality technology to obtain operation interaction response data;
and S306, updating rendering data of the initial live virtual scene data according to the operation interaction response data, and generating target live virtual scene data.
Specifically, scene element range analysis is performed on the initial live virtual scene data, and the position range of the target scene elements is determined. The individual elements in the virtual scene are bounding-box detected or contour analyzed to find their location ranges. Based on the target scene element position range, interactive operation region analysis is carried out on the initial live virtual scene data. It is determined which regions can be used for interactive operations, such as touch, gestures, or other interaction means; this operation region is typically related to the location and size of the target scene element. Scene element tracking is then performed on each target scene element through the target interactive operation area so as to determine the real-time position and state of each target scene element, which includes tracking movement, rotation, scaling and other interactive operations on the elements. Movement state analysis is then carried out on the real-time operation interaction data of the plurality of scene elements to obtain the interaction tracks of the scene elements. This step helps to understand the manner in which the participant interacts with an element, such as dragging, clicking or rotating; these interaction trajectories capture the dynamic nature of the interaction. Operation interaction responses are then performed on the plurality of target scene elements based on the scene element interaction tracks through virtual reality technology. The virtual environment updates the scene elements in real time according to the interaction tracks so as to simulate the interaction effect; for example, if a participant drags an element in a virtual live broadcast, the virtual environment will respond in real time and update the location of the element. Finally, according to the operation interaction response data, the rendering data of the initial live virtual scene data is updated to generate the target live virtual scene data. The virtual environment presents the updated data to the participants to reflect their interactions and operations.
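The movement state analysis of step S304 can be illustrated with a small helper that turns timestamped controller samples into a displacement, a mean speed and a coarse operation label; the 0.05 m drag threshold and the sample format are assumptions made for the example.

    def movement_state(trajectory):
        """Toy movement state analysis over real-time operation interaction data:
        each sample is (timestamp_s, x, y, z)."""
        if len(trajectory) < 2:
            return {"displacement": 0.0, "mean_speed": 0.0, "op": "click"}
        (t0, *p0), (t1, *p1) = trajectory[0], trajectory[-1]
        displacement = sum((b - a) ** 2 for a, b in zip(p0, p1)) ** 0.5
        duration = max(t1 - t0, 1e-6)
        # heuristic: a large displacement is treated as a drag, otherwise as a click
        return {"displacement": round(displacement, 3),
                "mean_speed": round(displacement / duration, 3),
                "op": "drag" if displacement > 0.05 else "click"}

    if __name__ == "__main__":
        samples = [(0.00, 0.0, 1.0, 2.0), (0.10, 0.02, 1.0, 2.0), (0.25, 0.20, 1.1, 2.0)]
        print(movement_state(samples))   # classified as a drag along the element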
In a specific embodiment, as shown in fig. 4, the process of executing step S104 may specifically include the following steps:
s401, operation interaction time stamp extraction is carried out on real-time operation interaction data of a plurality of scene elements, and first time stamp data are obtained;
s402, carrying out hash calculation on first timestamp data to obtain a plurality of first hash values, and encoding the plurality of first hash values through a preset first encoding mode to obtain a plurality of first encoded values;
s403, extracting interaction response time stamps of the operation interaction response data to obtain second time stamp data;
s404, carrying out hash calculation on the second timestamp data to obtain a plurality of second hash values, and coding the plurality of second hash values in a preset second coding mode to obtain a plurality of second coding values;
and S405, performing code matching and data alignment on the plurality of first code values and the plurality of second code values, and performing virtual environment rendering delay calculation on the target live broadcast virtual scene data to generate virtual environment rendering delay data.
Specifically, the system extracts operation interaction time stamps from the real-time operation interaction data of a plurality of scene elements, wherein the time stamps record the interaction time of the user and the virtual environment. These time stamps are used to track the order and timing of interactions. The system performs a hash calculation on the timestamp data, converting each timestamp into a unique hash value. This helps to ensure the uniqueness and integrity of the data. Then, the system encodes the hash values by using a preset encoding mode to obtain corresponding encoded values. The purpose of encoding is to reduce the size of the data, improve the transmission efficiency, while maintaining critical information of the data. The system then performs similar operations to process the operational interaction response data. It extracts the interactive response time stamp from the data, performs hash computation, and then encodes. Once the system has obtained the first encoded value and the second encoded value, it is code matched and data aligned. The system will compare the two sets of encoded values to find a matching pair of time stamps. This helps determine how the user's interactions affect the virtual environment and which actions trigger the response of the virtual environment. The system uses these matched pairs of timestamps to calculate rendering delay data for the virtual environment. This delay data indicates how the user interaction affects the presentation speed of the virtual environment. This is important to ensure fluency and real-time of virtual live broadcast. For example, if a spectator waves his hand in the virtual environment towards the presenter, the system will capture this gesture and generate a corresponding operational interactive response, such as the presenter's response to the spectator or an interactive action. The operational interaction response data is used to update the initial virtual scene data to generate target virtual live scene data. This ensures that the audience can interact with the virtual moderator and obtain an immersive virtual live experience in the virtual environment.
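Continuing the hashing sketch given under step S104 of the main flow, the alignment and delay calculation might look like this. The patent aligns the two streams by matching the encoded hash values; to keep the example self-contained, each event is assumed here to also carry a shared event_id, which is an assumption rather than part of the patented method.

    from statistics import mean

    def align_and_measure(operations, responses):
        """Align operation interaction events with their interaction responses and
        compute the virtual environment rendering delay for each matched pair."""
        responses_by_id = {resp["event_id"]: resp for resp in responses}
        delays_ms = []
        for op in operations:
            resp = responses_by_id.get(op["event_id"])
            if resp is not None:
                delays_ms.append(resp["t_ms"] - op["t_ms"])
        return {"per_event_ms": delays_ms,
                "mean_ms": mean(delays_ms) if delays_ms else None}

    if __name__ == "__main__":
        ops = [{"event_id": "wave-1", "t_ms": 1000}, {"event_id": "click-2", "t_ms": 1500}]
        resps = [{"event_id": "wave-1", "t_ms": 1048}, {"event_id": "click-2", "t_ms": 1531}]
        print(align_and_measure(ops, resps))   # rendering delays of 48 ms and 31 ms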
In a specific embodiment, the process of executing step S105 may specifically include the following steps:
(1) Inputting the virtual environment rendering delay data into a preset double-layer convolution long-short time network, wherein the double-layer convolution long-short time network comprises a first-layer convolution long-short time network and a second-layer convolution long-short time network;
(2) Performing forward hidden characteristic extraction on the virtual environment rendering delay data through a first layer convolution long-short time network to obtain forward hidden characteristics;
(3) Performing backward hidden characteristic extraction on the virtual environment rendering delay data through a second layer convolution long-short time network to obtain backward hidden characteristics;
(4) And carrying out feature fusion on the forward hidden features and the backward hidden features to obtain a rendering delay feature set.
Specifically, the virtual environment rendering delay data is input into a preset double-layer convolution long-short time network. The double-layer convolution long-short time network consists of two main parts, a first-layer convolution long-short time network and a second-layer convolution long-short time network, whose purpose is to extract the forward and backward hidden features for subsequent analysis and fusion. Forward hidden feature extraction is then performed on the virtual environment rendering delay data through the first-layer convolution long-short time network. This network analyzes the forward variation of the virtual environment rendering delay data and captures rendering delay characteristics at different time steps, including information on the rate of change and intensity of the delay, and outputs the forward hidden features. Backward hidden feature extraction is performed on the virtual environment rendering delay data through the second-layer convolution long-short time network. This layer is similar to the first layer, but primarily analyzes the backward variation of the virtual environment rendering delay data, capturing rendering delay characteristics over different time steps, including information on how the delay subsides and its reverse rate of change, and outputs the backward hidden features. The forward hidden features and the backward hidden features are then fused to obtain the rendering delay feature set. Feature fusion may take a variety of approaches, such as simple feature-level concatenation or weighted averaging. The fused feature set captures both forward and backward information in the virtual environment rendering delay data, providing a comprehensive rendering delay feature representation. For example, the rendering delay data includes the rendering speeds and delay conditions of different elements in the virtual scene. The system extracts the forward and backward hidden features and fuses them into a rendering delay feature set. With these features, the system better understands the pattern of rendering delays, and can thereby better adjust the rendering strategy of the virtual live scene to ensure that the viewer gets the best virtual experience.
In a specific embodiment, the process of executing step S106 may specifically include the following steps:
(1) Inputting a rendering delay feature set into a preset rendering strategy analysis model, wherein the rendering strategy analysis model comprises a plurality of weak classifiers and a strategy optimization network, and each weak classifier comprises an input layer, a threshold circulation network and a full-connection network;
(2) Performing feature coding mapping on the rendering delay feature set through an input layer in the weak classifier to generate a standard rendering delay feature vector;
(3) Performing high-dimensional feature extraction on the standard rendering delay feature vector through a threshold circulation network in the weak classifier to obtain a high-dimensional rendering delay feature vector;
(4) Performing live virtual scene rendering parameter compensation operation on the high-dimensional rendering delay feature vector through a fully connected network in the weak classifier to obtain a rendering parameter compensation predicted value of each weak classifier;
(5) Obtaining classifier weight data of a plurality of weak classifiers, and carrying out weighted average calculation on the rendering parameter compensation predicted value according to the classifier weight data to obtain a target parameter compensation predicted value;
(6) Carrying out strategy initialization on the target parameter compensation predicted value through an improved genetic algorithm in a strategy optimization network to generate an initial scene rendering strategy group;
(7) Performing live virtual scene rendering strategy optimization analysis on the initial scene rendering strategy group to obtain a target scene rendering strategy;
(8) And updating scene rendering parameters of the target live virtual scene data according to the target scene rendering strategy.
Specifically, the rendering delay feature set is input into a preset rendering strategy analysis model that includes a plurality of weak classifiers and a strategy optimization network. Each weak classifier has three key components: an input layer, a threshold circulation network and a fully connected network; the task of these weak classifiers is to analyze and optimize the scene rendering based on the input rendering delay feature set. The input layer performs feature coding mapping on the rendering delay feature set to generate a standard rendering delay feature vector, ensuring that the feature data is ready for further processing. The threshold circulation network performs high-dimensional feature extraction on the standard rendering delay feature vector, capturing more complex and abstract features so that the patterns and associations of the rendering delay are better understood. The fully connected network uses the high-dimensional rendering delay feature vector to perform a live virtual scene rendering parameter compensation operation, producing a rendering parameter compensation predicted value for each weak classifier; these predicted values can be used to adjust the rendering parameters of the virtual live scene and improve rendering quality. Because the multiple weak classifiers generate compensation predictions for different aspects and features, their predictions are weighted and averaged according to the classifier weight data to obtain the target parameter compensation predicted value, integrating the results of the weak classifiers into a final prediction. The strategy optimization network then uses an improved genetic algorithm to perform strategy initialization on the target parameter compensation predicted value, generating an initial scene rendering strategy group based on the current context and requirements. This strategy group is iteratively optimized to obtain the target scene rendering strategy that gives the best rendering effect for the virtual live scene. Finally, the scene rendering parameters of the target live virtual scene data are updated according to the optimized strategy; these updates ensure that the virtual live broadcast provides the best visual and interactive effects in various situations, improving the viewer experience and guaranteeing the quality of the virtual live broadcast. For example, in a virtual exhibition live broadcast, the rendering strategy model can dynamically adjust the rendering parameters to accommodate the complexity and visual needs of different exhibits: as a viewer moves closer to a particular exhibit, the rendering parameters are automatically adjusted to provide higher resolution and more detail. Such optimization improves the viewing experience of the virtual exhibition live broadcast, so that the audience can better interact with and explore the virtual environment.
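A compact Python/PyTorch sketch of this ensemble-plus-optimization flow follows. Reading the threshold circulation network as a GRU-style gated recurrent layer, the layer widths, the classifier weights, and the toy single-parameter genetic search are all assumptions for illustration; the operators and fitness function of the improved genetic algorithm are not detailed here.

import random
import torch
import torch.nn as nn

class WeakClassifier(nn.Module):
    # One weak classifier: input layer -> gated recurrent network -> fully
    # connected network. Treating the "threshold circulation network" as a GRU
    # and the layer widths used here are assumptions.
    def __init__(self, feat_dim=64, hidden=32):
        super().__init__()
        self.input_layer = nn.Linear(feat_dim, hidden)   # feature coding mapping
        self.gru = nn.GRU(hidden, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, 1)                   # rendering parameter compensation

    def forward(self, feats):                            # feats: (batch, time, feat_dim)
        x = self.input_layer(feats)
        h, _ = self.gru(x)                               # high-dimensional delay features
        return self.fc(h[:, -1])                         # one compensation value per sample

def ensemble_compensation(classifiers, weights, feats):
    # Weighted average of the weak classifiers' compensation predictions.
    preds = torch.stack([clf(feats) for clf in classifiers])   # (n, batch, 1)
    w = torch.tensor(weights).view(-1, 1, 1)
    return (w * preds).sum(dim=0) / w.sum()

def optimize_strategy(compensation, generations=20, pop_size=16):
    # Toy genetic search over a single rendering parameter (a resolution scale
    # in [0.5, 1.0]); a stand-in for the improved genetic algorithm, whose
    # operators and fitness function are assumptions here.
    def fitness(scale):
        predicted_delay = max(0.0, compensation) * scale   # assume delay grows with resolution
        return scale - predicted_delay                     # trade quality against delay
    population = [random.uniform(0.5, 1.0) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = [min(1.0, max(0.5, (random.choice(parents) + random.choice(parents)) / 2
                                 + random.gauss(0, 0.02)))
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=fitness)                    # target scene rendering strategy

feats = torch.rand(1, 8, 64)                               # rendering delay feature set
classifiers = [WeakClassifier() for _ in range(3)]
target = ensemble_compensation(classifiers, [0.5, 0.3, 0.2], feats).item()
print(optimize_strategy(target))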
The live broadcast interaction method based on the virtual reality technology in the embodiment of the present invention is described above, and the live broadcast interaction system based on the virtual reality technology in the embodiment of the present invention is described below, referring to fig. 5, where one embodiment of the live broadcast interaction system based on the virtual reality technology in the embodiment of the present invention includes:
the creation module 501 is configured to perform virtual space position calibration on a plurality of initial scene elements in a target live scene, determine a virtual space position of each initial scene element, and perform virtual scene creation on the target live scene based on the virtual space position, so as to obtain a plurality of target scene elements;
the identifying module 502 is configured to identify a scene element state of the plurality of target scene elements to obtain a plurality of scene element state data, and perform live broadcast rendering on the target live broadcast scene in real time by using a virtual reality technology to obtain initial live broadcast virtual scene data;
a response module 503, configured to perform scene element tracking recognition on the initial live virtual scene data to obtain a plurality of scene element real-time operation interaction data, and perform operation interaction response on the initial live virtual scene data through the virtual reality technology to obtain target live virtual scene data;
The analysis module 504 is configured to perform virtual environment rendering delay analysis on the target live virtual scene data, and generate virtual environment rendering delay data;
the extracting module 505 is configured to perform rendering delay feature extraction on the virtual environment rendering delay data to obtain a rendering delay feature set;
the updating module 506 is configured to input the rendering delay feature set into a preset rendering strategy analysis model to perform live virtual scene rendering strategy optimization analysis, obtain a target scene rendering strategy, and perform scene rendering parameter update on the target live virtual scene data according to the target scene rendering strategy.
Through the cooperation of the above components, virtual space position calibration is performed on the target live broadcast scene, the virtual space positions are determined, and the virtual scene is created to obtain a plurality of target scene elements; scene element state identification is performed to obtain a plurality of scene element state data and the scene is rendered live in real time to obtain initial live virtual scene data; scene element tracking identification is performed to obtain real-time operation interaction data of a plurality of scene elements and an operation interaction response is made to obtain target live virtual scene data; virtual environment rendering delay analysis and rendering delay feature extraction are performed to obtain a rendering delay feature set; and the rendering delay feature set is input into the rendering strategy analysis model to optimize the live virtual scene rendering strategy and update the scene rendering parameters. The invention can render and present the target live scene in real time through the virtual reality technology, so that the audience obtains a more realistic and immersive viewing experience, and delay is effectively reduced compared with traditional live broadcast. Scene element tracking and interaction response enable the user to interact with elements in the virtual environment in real time, improving user participation and interactivity. Through rendering delay analysis and rendering strategy optimization, rendering parameters can be dynamically adjusted so that the live scene achieves the best rendering effect on different devices and network environments, providing high-quality visual presentation. Scene element state identification and real-time rendering engine parameter analysis allow the system to adjust live content according to user interaction and preference, achieving a personalized user experience. Combining scene element tracking with the sensor data of the virtual reality equipment enables high-precision identification of and response to user operations, improving the naturalness and accuracy of interaction. Through rendering strategy optimization analysis, the most suitable rendering strategy can be selected in different situations, so that a high-quality live broadcast picture is provided under various device and network conditions and the real-time performance of live broadcast interaction is improved.
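Purely as an illustration of how the six modules hand data to one another, the following Python skeleton chains them in order; every method body is a placeholder standing in for the processing described above, not the patented implementation, and all return values are hypothetical.

class VRLiveInteractionSystem:
    # Skeleton of the six cooperating modules (501 to 506); placeholder logic only.

    def create_virtual_scene(self, scene):          # creation module 501
        return ["presenter", "exhibit"]             # target scene elements
    def identify_and_render(self, elements):        # identification module 502
        return {"initial_scene": elements}
    def track_and_respond(self, scene):             # response module 503
        return {"target_scene": scene, "interactions": ["wave"]}
    def analyze_rendering_delay(self, scene):       # analysis module 504
        return [85, 92]                             # rendering delay data (ms)
    def extract_delay_features(self, delays):       # extraction module 505
        return {"mean_delay": sum(delays) / len(delays)}
    def optimize_rendering_strategy(self, feats):   # updating module 506
        return {"resolution_scale": 0.9 if feats["mean_delay"] > 80 else 1.0}

    def run(self, target_live_scene):
        elements = self.create_virtual_scene(target_live_scene)
        initial = self.identify_and_render(elements)
        target = self.track_and_respond(initial)
        delays = self.analyze_rendering_delay(target)
        feats = self.extract_delay_features(delays)
        return self.optimize_rendering_strategy(feats)

print(VRLiveInteractionSystem().run("virtual exhibition"))   # {'resolution_scale': 0.9}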
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random-access memory (RAM), a magnetic disk, or an optical disk.
The above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (7)

1. The live broadcast interaction method based on the virtual reality technology is characterized by comprising the following steps of:
calibrating virtual space positions of a plurality of initial scene elements in a target live broadcast scene, determining the virtual space position of each initial scene element, and creating a virtual scene of the target live broadcast scene based on the virtual space position to obtain a plurality of target scene elements;
performing scene element state identification on the plurality of target scene elements to obtain a plurality of scene element state data, and performing live broadcast rendering on the target live broadcast scene in real time by a virtual reality technology to obtain initial live broadcast virtual scene data;
performing scene element tracking identification on the initial live virtual scene data to obtain a plurality of scene element real-time operation interaction data, and performing operation interaction response on the initial live virtual scene data through the virtual reality technology to obtain target live virtual scene data;
performing virtual environment rendering delay analysis on the target live virtual scene data to generate virtual environment rendering delay data;
performing rendering delay feature extraction on the virtual environment rendering delay data to obtain a rendering delay feature set;
inputting the rendering delay feature set into a preset rendering strategy analysis model to perform live virtual scene rendering strategy optimization analysis to obtain a target scene rendering strategy, and performing scene rendering parameter update on the target live virtual scene data according to the target scene rendering strategy; the method specifically comprises the following steps: inputting the rendering delay feature set into a preset rendering strategy analysis model, wherein the rendering strategy analysis model comprises a plurality of weak classifiers and a strategy optimization network, and each weak classifier comprises an input layer, a threshold circulation network and a fully-connected network; performing feature coding mapping on the rendering delay feature set through an input layer in the weak classifier to generate a standard rendering delay feature vector; performing high-dimensional feature extraction on the standard rendering delay feature vector through a threshold circulation network in the weak classifier to obtain a high-dimensional rendering delay feature vector; performing live virtual scene rendering parameter compensation operation on the high-dimensional rendering delay feature vector through a fully connected network in the weak classifier to obtain a rendering parameter compensation predicted value of each weak classifier; acquiring classifier weight data of the weak classifiers, and carrying out weighted average calculation on the rendering parameter compensation predicted value according to the classifier weight data to obtain a target parameter compensation predicted value; performing strategy initialization on the target parameter compensation predicted value through an improved genetic algorithm in the strategy optimization network to generate an initial scene rendering strategy group; performing live virtual scene rendering strategy optimization analysis on the initial scene rendering strategy group to obtain a target scene rendering strategy; and updating scene rendering parameters of the target live virtual scene data according to the target scene rendering strategy.
2. The live interaction method based on virtual reality technology according to claim 1, wherein the performing virtual space position calibration on a plurality of initial scene elements in a target live scene, determining a virtual space position of each initial scene element, and performing virtual scene creation on the target live scene based on the virtual space positions, to obtain a plurality of target scene elements, includes:
performing scene element identification on a target live scene to obtain a plurality of initial scene elements;
distributing element identification codes of each initial scene element to obtain element identification codes corresponding to each initial scene element;
performing element construction analysis on the plurality of initial scene elements according to the element identification codes, and determining size construction data corresponding to each initial scene element;
performing virtual space position analysis on the plurality of initial scene elements according to the size construction data, and determining the virtual space position of each initial scene element;
and performing virtual scene mapping on the target live scene based on the virtual space position corresponding to each target scene element, and creating a plurality of corresponding target scene elements.
3. The live broadcast interaction method based on virtual reality technology according to claim 1, wherein the performing scene element state recognition on the plurality of target scene elements to obtain a plurality of scene element state data, and performing live broadcast rendering on the target live broadcast scene in real time by virtual reality technology to obtain initial live broadcast virtual scene data includes:
carrying out scene element state identification on the plurality of target scene elements through preset element monitoring buried points to obtain a plurality of scene element state data;
modeling parameter analysis is carried out on the plurality of target scene elements based on the scene element state data to obtain a plurality of scene element modeling parameters;
performing real-time rendering engine parameter analysis on the target live scene based on the scene element state data to obtain target rendering engine parameters;
and performing live broadcast rendering on the plurality of target scene elements in the target live broadcast scene in real time based on the plurality of scene element modeling parameters and the target rendering engine parameters to obtain initial live broadcast virtual scene data.
4. The live broadcast interaction method based on virtual reality technology according to claim 3, wherein the performing scene element tracking recognition on the initial live broadcast virtual scene data to obtain a plurality of scene element real-time operation interaction data, and performing operation interaction response on the initial live broadcast virtual scene data through the virtual reality technology to obtain target live broadcast virtual scene data includes:
performing scene element range analysis on the initial live virtual scene data to determine a target scene element position range;
performing interactive operation region analysis on the initial live virtual scene data according to the target scene element position range, and determining a target interactive operation region;
carrying out scene element tracking on each target scene element through the target interactive operation region, and determining real-time operation interactive data of a plurality of scene elements;
performing movement state analysis on the real-time operation interaction data of the plurality of scene elements to obtain corresponding scene element interaction tracks;
performing operation interaction response on a plurality of target scene elements based on the scene element interaction track through the virtual reality technology to obtain operation interaction response data;
and according to the operation interaction response data, performing rendering data updating on the initial live virtual scene data to generate target live virtual scene data.
5. The live interaction method based on virtual reality technology according to claim 4, wherein the performing virtual environment rendering delay analysis on the target live virtual scene data generates virtual environment rendering delay data, and comprises:
performing operation interaction time stamp extraction on the real-time operation interaction data of the plurality of scene elements to obtain first time stamp data;
carrying out hash calculation on the first timestamp data to obtain a plurality of first hash values, and encoding the plurality of first hash values through a preset first encoding mode to obtain a plurality of first encoded values;
extracting the interaction response time stamp of the operation interaction response data to obtain second time stamp data;
performing hash calculation on the second timestamp data to obtain a plurality of second hash values, and encoding the second hash values in a preset second encoding mode to obtain a plurality of second encoded values;
and performing coding matching and data alignment on the plurality of first coding values and the plurality of second coding values, performing virtual environment rendering delay calculation on the target live virtual scene data, and generating virtual environment rendering delay data.
6. The live interaction method based on virtual reality technology according to claim 1, wherein the performing a rendering delay feature extraction on the virtual environment rendering delay data to obtain a rendering delay feature set includes:
inputting the virtual environment rendering delay data into a preset double-layer convolution long-short time network, wherein the double-layer convolution long-short time network comprises a first-layer convolution long-short time network and a second-layer convolution long-short time network;
performing forward hidden feature extraction on the virtual environment rendering delay data through the first layer convolution long-short time network to obtain forward hidden features;
performing backward hidden characteristic extraction on the virtual environment rendering delay data through the second-layer convolution long-short time network to obtain backward hidden characteristics;
and carrying out feature fusion on the forward hidden features and the backward hidden features to obtain a rendering delay feature set.
7. The live broadcast interaction system based on the virtual reality technology is characterized by comprising:
the creation module is used for calibrating virtual space positions of a plurality of initial scene elements in a target live broadcast scene, determining the virtual space position of each initial scene element, and creating the virtual scene of the target live broadcast scene based on the virtual space position to obtain a plurality of target scene elements;
the identification module is used for carrying out scene element state identification on the plurality of target scene elements to obtain a plurality of scene element state data, and carrying out live broadcast rendering on the target live broadcast scene in real time through a virtual reality technology to obtain initial live broadcast virtual scene data;
the response module is used for carrying out scene element tracking identification on the initial live virtual scene data to obtain a plurality of scene element real-time operation interaction data, and carrying out operation interaction response on the initial live virtual scene data through the virtual reality technology to obtain target live virtual scene data;
the analysis module is used for carrying out virtual environment rendering delay analysis on the target live virtual scene data and generating virtual environment rendering delay data;
the extraction module is used for extracting rendering delay characteristics of the virtual environment rendering delay data to obtain a rendering delay characteristic set;
the updating module is used for inputting the rendering delay feature set into a preset rendering strategy analysis model to perform live virtual scene rendering strategy optimization analysis to obtain a target scene rendering strategy, and performing scene rendering parameter updating on the target live virtual scene data according to the target scene rendering strategy; the method specifically comprises the following steps: inputting the rendering delay feature set into a preset rendering strategy analysis model, wherein the rendering strategy analysis model comprises a plurality of weak classifiers and a strategy optimization network, and each weak classifier comprises an input layer, a threshold circulation network and a fully-connected network; performing feature coding mapping on the rendering delay feature set through an input layer in the weak classifier to generate a standard rendering delay feature vector; performing high-dimensional feature extraction on the standard rendering delay feature vector through a threshold circulation network in the weak classifier to obtain a high-dimensional rendering delay feature vector; performing live virtual scene rendering parameter compensation operation on the high-dimensional rendering delay feature vector through a fully connected network in the weak classifier to obtain a rendering parameter compensation predicted value of each weak classifier; acquiring classifier weight data of the weak classifiers, and carrying out weighted average calculation on the rendering parameter compensation predicted value according to the classifier weight data to obtain a target parameter compensation predicted value; performing strategy initialization on the target parameter compensation predicted value through an improved genetic algorithm in the strategy optimization network to generate an initial scene rendering strategy group; performing live virtual scene rendering strategy optimization analysis on the initial scene rendering strategy group to obtain a target scene rendering strategy; and updating scene rendering parameters of the target live virtual scene data according to the target scene rendering strategy.
CN202311499656.8A 2023-11-13 2023-11-13 Live broadcast interaction method and system based on virtual reality technology Active CN117241063B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311499656.8A CN117241063B (en) 2023-11-13 2023-11-13 Live broadcast interaction method and system based on virtual reality technology

Publications (2)

Publication Number Publication Date
CN117241063A CN117241063A (en) 2023-12-15
CN117241063B true CN117241063B (en) 2024-01-26

Family

ID=89084548

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110708597A (en) * 2019-10-12 2020-01-17 腾讯科技(深圳)有限公司 Live broadcast delay monitoring method and device, electronic equipment and readable storage medium
CN114237475A (en) * 2021-12-31 2022-03-25 视伴科技(北京)有限公司 Method, system, device and medium for constructing virtual photo field
CN114268802A (en) * 2021-12-21 2022-04-01 北京达佳互联信息技术有限公司 Virtual space display method and device, electronic equipment and storage medium
CN116909407A (en) * 2023-09-12 2023-10-20 深圳康荣电子有限公司 Touch display screen panoramic interaction method and control system based on virtual reality

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111632373B (en) * 2020-05-30 2021-05-28 腾讯科技(深圳)有限公司 Method and device for starting game and computer readable storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: 701, 7th floor, and 801, 8th floor, Building 1, Courtyard 8, Gouzitou Street, Changping District, Beijing, 102200

Patentee after: Zhongying Nian Nian (Beijing) Technology Co.,Ltd.

Country or region after: China

Address before: No. 6304, Beijing shunhouyu Business Co., Ltd., No. 32, Wangfu street, Beiqijia Town, Changping District, Beijing 102200

Patentee before: China Film annual (Beijing) culture media Co.,Ltd.

Country or region before: China