CN110418095B

CN110418095B - Virtual scene processing method and device, electronic equipment and storage medium

Info

Publication number: CN110418095B
Application number: CN201910578517.1A
Authority: CN
Inventors: 贺杰; 戴景文
Original assignee: Guangdong Virtual Reality Technology Co Ltd
Current assignee: Guangdong Virtual Reality Technology Co Ltd
Priority date: 2019-06-28
Filing date: 2019-06-28
Publication date: 2021-09-14
Anticipated expiration: 2039-06-28
Also published as: CN110418095A

Abstract

The embodiment of the application discloses a processing method and device of a virtual scene, electronic equipment and a storage medium. The processing method of the virtual scene comprises the following steps: generating a virtual session scene corresponding to a remote session, wherein the virtual session scene at least comprises virtual objects corresponding to one or more terminal devices in the remote session; acquiring user data corresponding to the one or more terminal devices; analyzing the user data to obtain user emotion information corresponding to the one or more terminal devices; and when the user emotion information meets the set emotion condition, obtaining adjustment content matched with the user emotion information meeting the set emotion condition, and adjusting the virtual session scene according to the adjustment content. The method can improve the remote session effect.

Description

Virtual scene processing method and device, electronic equipment and storage medium

Technical Field

The present application relates to the field of display technologies, and in particular, to a method and an apparatus for processing a virtual scene, an electronic device, and a storage medium.

Background

In recent years, with the rapid development of network technology and technology, more and more people use electronic devices to perform remote conversations (e.g., chatting, conferencing, etc.). In a common remote conversation, voice input by a user and an image collected by a camera are transmitted, but the user is difficult to feel personally on the scene, so that a conversation effect is poor.

Disclosure of Invention

The embodiment of the application provides a processing method and device of a virtual scene, electronic equipment and a storage medium, which can feed back the emotion of a user in a remote session and improve the effect of the remote session.

In a first aspect, an embodiment of the present application provides a method for processing a virtual scene, where the method includes: generating a virtual session scene corresponding to a remote session, wherein the virtual session scene at least comprises virtual objects corresponding to one or more terminal devices in the remote session; acquiring user data corresponding to the one or more terminal devices; analyzing the user data to obtain user emotion information corresponding to the one or more terminal devices; and when the user emotion information meets the set emotion condition, obtaining adjustment content matched with the user emotion information meeting the set emotion condition, and adjusting the virtual session scene according to the adjustment content.

In a second aspect, an embodiment of the present application provides an apparatus for processing a virtual scene, where the apparatus includes: the system comprises a scene generation module, a data acquisition module, an emotion analysis module and a scene adjustment module, wherein the scene generation module is used for generating a virtual session scene corresponding to a remote session, and the virtual session scene at least comprises virtual objects corresponding to one or more terminal devices in the remote session; the data acquisition module is used for acquiring user data corresponding to the remote equipment; the emotion analysis module is used for analyzing the user data to obtain user emotion information corresponding to the one or more terminal devices; and the scene adjusting module is used for acquiring adjusting content matched with the user emotion information meeting the set emotion condition when the user emotion information meets the set emotion condition, and adjusting the virtual session scene according to the adjusting content.

In a third aspect, an embodiment of the present application provides an electronic device, including: one or more processors; a memory; one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more programs configured to perform the method of processing a virtual scene as provided in the first aspect above.

In a fourth aspect, an embodiment of the present application provides a storage medium, where a program code is stored in the computer-readable storage medium, and the program code may be called by a processor to execute the processing method for the virtual scene provided in the first aspect.

According to the scheme, a virtual session scene corresponding to the remote session is generated, the virtual session scene at least comprises virtual objects corresponding to one or more terminal devices in the remote session, user data corresponding to the one or more terminal devices are obtained, the user data are analyzed, emotion information of users corresponding to the one or more terminal devices is obtained, when the emotion information of the users meets set emotion conditions, adjustment content matched with the emotion information of the users meeting the set emotion conditions is obtained, the virtual session scene is adjusted according to the adjustment content, emotion of the users in the remote session can be fed back, real feeling of the users is given, and the effect of the remote session is improved.

Drawings

In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.

Fig. 1 shows a schematic diagram of an application scenario suitable for use in an embodiment of the present application.

Fig. 2 shows another schematic diagram of an application scenario suitable for use in an embodiment of the present application.

FIG. 3 shows a flow diagram of a method for processing a virtual scene according to one embodiment of the present application.

Fig. 4 is a schematic diagram illustrating a display effect according to an embodiment of the present application.

Fig. 5 shows a flow chart of a processing method of a virtual scene according to another embodiment of the present application.

Fig. 6 is a schematic diagram illustrating a display effect according to another embodiment of the present application.

Fig. 7 shows a flow chart of a processing method of a virtual scenario according to yet another embodiment of the present application.

Fig. 8 is a schematic diagram illustrating a display effect according to another embodiment of the present application.

Fig. 9 is a flowchart illustrating a method for generating a virtual scene according to still another embodiment of the present application.

Fig. 10 is a schematic diagram illustrating a display effect according to still another embodiment of the present application.

Fig. 11 shows a block diagram of a processing apparatus for a virtual scene according to an embodiment of the present application.

Fig. 12 is a block diagram of a terminal device for executing a processing method of a virtual scene according to an embodiment of the present application.

Fig. 13 is a block diagram of a server for executing a processing method of a virtual scene according to an embodiment of the present application.

Fig. 14 is a storage unit for storing or carrying program codes for implementing a processing method of a virtual scene according to an embodiment of the present application.

Detailed Description

In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.

An application scenario of the virtual scenario generation method provided in the embodiment of the present application is described below.

Referring to fig. 1, a schematic diagram of an application scenario of a virtual scenario generation method provided in an embodiment of the present application is shown, where the application scenario includes an interactive system 10, and the interactive system 10 may be applied to a remote session. The interactive system 10 includes: a plurality of terminal devices 100 and a server 200, wherein the terminal devices 100 are connected with the server 200.

In some embodiments, the terminal device 100 is communicatively connected to the server 200 through a network, so that data interaction between the terminal device 100 and the server 200 is possible. The terminal device 100 may access the network where the router is located, and communicate with the server 200 through the network where the router is located, and of course, the terminal device 100 may also communicate with the server 200 through a data network.

In some embodiments, the terminal device 100 may be a head-mounted display device, and may also be a mobile device such as a mobile phone or a tablet. When the terminal device 100 is a head-mounted display device, the head-mounted display device may be an integrated head-mounted display device. The terminal device 100 may also be an intelligent terminal such as a mobile phone connected to an external/access head-mounted display device, that is, the terminal device 100 may be inserted or accessed into the external head-mounted display device as a processing and storage device of the head-mounted display device, and display virtual content on the head-mounted display device. In the remote session, the terminal device 100 may be configured to display a Virtual session scene of the remote session, so as to implement AR (Augmented Reality) display or VR (Virtual Reality) display on a scene picture of the Virtual session scene, and improve a display effect of the scene picture in the remote session. In another embodiment, the terminal device 100 may also be a display device such as a computer, a tablet computer, or a television, and the terminal device 100 may display a 2D (2 Dimensions) screen corresponding to the virtual conversation scene.

In some embodiments, the terminal device 100 may collect information data in a remote session (e.g., collect facial information, voice data, etc. of a user) to build a three-dimensional model from the information data. In other embodiments, the terminal device 100 may also perform modeling according to information data such as face information, voice data, and body model stored in advance, or may perform modeling in combination with the information data stored in advance and the collected information data. For example, the terminal device 100 may collect face information in real time to establish a face model, where the face information may include expression information and morphological action information (such as head-off, head-on, etc.), and then integrate the face model with a preset body model, so that the time for modeling and rendering is saved, and the expression and morphological action of the user can be obtained in real time. In some embodiments, the terminal device 100 may transmit the collected information data to the server 200 or other terminal devices 100.

In some embodiments, referring to fig. 2, the interactive system 100 may also include an information collecting device 300, where the information collecting device 300 is configured to collect the information data (for example, collect facial information, voice data, etc. of the user) and transmit the collected information data to the terminal device 100 or the server 200. In some embodiments, the information collecting device 300 may include a camera, an audio module, and the like, and may also include various sensors such as a light sensor and an acoustic sensor. As a specific embodiment, the information collecting apparatus 300 may be a photographing device (such as an RGB-D Depth camera) having functions of a common color camera (RGB) and a Depth camera (Depth) to acquire Depth data of a photographed user, so as to obtain a three-dimensional structure corresponding to the user. In a specific embodiment, the information collecting apparatus 300 and the terminal device 100 may be in the same field environment so as to collect information of a user corresponding to the terminal device 100, and the information collecting apparatus 300 may be connected or not connected to the terminal device 100, which is not limited herein.

In some embodiments, the server 200 may be a local server or a cloud server, and the type of the specific server 200 may not be limited in this embodiment. In the remote session, the server 200 may be configured to implement data interaction between multiple terminal devices 100/information collecting apparatuses 300, so as to ensure data transmission and synchronization between multiple terminal devices 100/information collecting apparatuses 300, and implement virtual session scenes, synchronization of audio and video data, data transmission between terminal devices 100/information collecting apparatuses 300, and the like in the remote session.

In some embodiments, when at least two terminal devices 100 exist in the same field environment (for example, in the same room) among the plurality of terminal devices 100 in the remote session, the at least two terminal devices 100 in the same field environment may also be connected through a communication method such as bluetooth, WiFi (Wireless Fidelity), ZigBee (ZigBee technology), or the like, or may also be connected through a wired communication method such as a data line, so as to implement data interaction between the at least two terminal devices 100 in the same field environment. Of course, the connection mode between at least two terminal devices 100 in the same field environment may not be limited in the embodiment of the present application.

A specific processing method of the virtual scene is described below.

Referring to fig. 3, an embodiment of the present application provides a method for processing a virtual scene, where the method for processing a virtual scene may include:

step S110: and generating a virtual session scene corresponding to the remote session, wherein the virtual session scene at least comprises virtual objects corresponding to one or more terminal devices in the remote session.

The remote session refers to a process of performing remote interaction and communication through multiple ends established by data communication. The virtual conversation scene is a 3D (3 Dimensions) scene in a virtual space, and the virtual conversation scene may at least include a virtual object, and a position of the virtual object in the virtual conversation scene relative to a world coordinate origin in a world coordinate system is fixed. The virtual object may include a virtual character model, a virtual character avatar, and the like corresponding to the terminal device in the remote session, for example, a simulated stereo image of a user corresponding to the terminal device; the virtual object may also include a virtual object image, a virtual animal image, and the like associated with the user, which is not limited herein, for example, the virtual object may also include a virtual document content shared by the terminal device in the remote session, and the like. Of course, the specific content in the virtual session scene may not be limited, and for example, a virtual conference table, a virtual tablecloth, a virtual ornament, and the like may also be included in the virtual session scene.

In some embodiments, the server may generate the virtual session scene according to participation data of the terminal devices participating in the remote session and content data of the virtual object. The participation data can include one or more data of the identity information of the user corresponding to the terminal device, the time for the terminal device to join the remote session, the spatial position of the terminal device in the real scene, the posture of the terminal device and the place where the terminal device is located; the content data of the virtual object may be three-dimensional model data of the virtual object, and the three-dimensional model data may include colors, model vertex coordinates, model contour data, and the like for constructing a model corresponding to the three-dimensional model. The participation data of the terminal device may be acquired from the terminal device or an information acquisition device in the same real scene as the terminal device, and the content data of the virtual object may be stored locally or acquired from the terminal device, which is not limited herein.

As a specific implementation manner, the server may perform position arrangement on the virtual object corresponding to the terminal device according to the participation data, determine the position of the virtual object in the virtual space according to the position arrangement result, and then render and generate the virtual session scene including the virtual object corresponding to the terminal device according to the content data of the virtual object and the position of the virtual object in the virtual space.

In the embodiment of the application, the generated virtual session scene may be used to generate a scene picture of the virtual session scene displayed by the terminal device, and the picture data of the scene picture is sent to the terminal device, and the terminal device may display the scene picture, so that a user may observe a 3D virtual session scene, and may observe virtual objects corresponding to other terminal devices in a remote session, so that the user feels a strong sense of reality. For example, referring to fig. 4, fig. 4 shows a scene diagram of a remote conference scene, where the terminal device 100 may be a head-mounted display device, the user 601 is at a position around a physical table body in a real scene, the user 601 may observe a scene picture of a virtual conversation scene through the head-mounted display device, and the scene picture of the virtual conversation scene may include virtual characters 701 of other users participating in the remote conference.

Step S120: and acquiring user data corresponding to one or more terminal devices.

In some embodiments, the server may obtain user data corresponding to each terminal device in the remote session to determine user emotion information corresponding to the terminal device according to the user data. The user data may include, but is not limited to, a facial image of the user, voice data of the user, and the like, and may also include, for example, body characteristics data such as a heartbeat, a blood pressure, and the like of the user. The user data corresponding to the terminal device may be collected by the terminal device, or may be collected by an information collecting device in the same field environment as the terminal device, which is not limited herein.

Step S130: and analyzing the user data to obtain user emotion information corresponding to one or more terminal devices.

In some embodiments, the server may analyze user expressions, user moods, and the like of the user according to user data corresponding to the terminal device in the remote session, and obtain user emotion information corresponding to the terminal device according to an analysis result. The emotion information of the user may be information representing emotion of the user, and the emotion of the user may include joy, anger, sadness, surprise, fear, confusion, concentration, vague, and the like, which is not limited herein.

Step S140: and when the emotion information of the user meets the set emotion condition, obtaining the adjustment content matched with the emotion information of the user meeting the set emotion condition, and adjusting the virtual session scene according to the adjustment content.

After the server obtains the user emotion information corresponding to the terminal device through analysis, whether the user emotion information of the terminal device meets the set emotion condition or not can be determined. Wherein the set emotional condition may include a specified emotion, e.g., anger, surprise, confusion, etc.; the set emotional condition may further include a degree of a specified emotion, for example, a degree of anger, a degree of surprise, and the like. The specific emotional condition may not be limited, and may be set according to the actual scene and requirements of the remote session.

When the server determines that the user emotion information meets the set emotion condition, the server can acquire the adjustment content matched with the user emotion information meeting the set emotion condition. The adjustment content is used to adjust the virtual session scene, and the adjustment content may include performing a specific adjustment operation on a display screen in the virtual session scene, for example, the adjustment content may include performing sharpness adjustment, brightness adjustment, content replacement, content occlusion, and the like in the virtual session scene, and the specific adjustment content may not be limited, for example, the adjustment content may also include marking at least part of the content, and marking content data of the content, and the like.

In some embodiments, after the server acquires the adjustment content matched with the user emotion information meeting the set emotion condition, the virtual session scene may be adjusted according to the adjustment content. As a specific embodiment, the server may adjust a virtual object according to the adjustment content, where the virtual object may be a virtual object corresponding to a target terminal device, and the target terminal device may be a terminal device corresponding to user emotion information that satisfies a set emotion condition, or may be another terminal device, which is not limited herein. For example, the server may adjust the definition, brightness, etc. of the virtual object corresponding to the target terminal device.

In the embodiment of the present application, the steps of generating a virtual session scene, acquiring user data, analyzing emotion information of a user, and adjusting the virtual session scene, that is, the steps from step S110 to step S140, may also be executed by the terminal device. When the terminal device serves as an execution subject, and the steps are executed, the virtual session scene generated by the terminal device may include a virtual object corresponding to at least one terminal device, the obtained user data may include user data corresponding to at least one terminal device, and the at least one terminal device is another terminal device in the remote session except the execution subject. For example, when the remote session includes two terminal devices (a first terminal device and a second terminal device), and the first terminal device executes the above steps, the generated virtual session scene only includes a virtual object corresponding to the second terminal device, and only obtains user data corresponding to the second terminal device.

The processing method of the virtual scene provided by the embodiment of the application can realize that the virtual session scene is adjusted according to the adjustment content matched with the emotion information of the user meeting the set emotion condition in the virtual session scene of the remote session, the emotion of the user can be reflected, the users can know the emotion of the user in the remote session conveniently, the real feeling of the user is provided, and the effect of the remote session is improved.

Referring to fig. 5, another embodiment of the present application provides a method for processing a virtual scene, where the method for processing a virtual scene may include:

step S210: and generating a virtual session scene corresponding to the remote session, wherein the virtual session scene at least comprises virtual objects corresponding to one or more terminal devices in the remote session.

Step S220: and acquiring user data corresponding to one or more terminal devices.

In the embodiment of the present application, step S210 and step S220 may refer to the contents of the above embodiments, and are not described herein again.

Step S230: and analyzing the user data to obtain user emotion information corresponding to one or more terminal devices.

In some embodiments, the user data of the terminal device acquired by the server may include a facial image of the user. The facial image of the user may be acquired by a camera of the terminal device, or may be acquired by an image acquisition device in the same field environment as the terminal device, which is not limited herein.

As an embodiment, analyzing the user data to obtain user emotion information corresponding to one or more terminal devices may include:

acquiring at least one of user expressions and user moods corresponding to one or more terminal devices according to the facial image; and acquiring user emotion information corresponding to one or more terminal devices according to at least one of the user expression and the user emotion.

The server can perform expression recognition on the facial image according to the facial image of the user to obtain the user expression corresponding to the terminal equipment. The expression of the user may be used to characterize the expression of the user's emotion on the face, for example, smiling, pouring, blunting, crying, and the like, and is not limited herein. As an embodiment, the server may perform preprocessing such as marking, graying, and normalization on the facial image, extract the features of the five sense organs in the facial image, and determine the user expression according to the extracted features of the five sense organs, where a specific manner of analyzing the user expression may not be limited.

The server can also perform emotion recognition on the face image according to the face image of the user to obtain the emotion of the user corresponding to the terminal equipment. The user's mood may be used to characterize mental activities expressed by expressions that reveal the user's face, e.g., drowsiness, fear, pleasure, etc. As an embodiment, after extracting the user expression according to the facial image, the server may analyze the user emotion according to the user expression, and a specific manner of analyzing the user emotion may not be limited.

In some embodiments, the server obtains the emotion information of the user after obtaining the expression and the emotion of the user.

As another embodiment, the user data of the terminal device acquired by the server may also include voice data of the user. The voice data of the user may be collected by an audio input module such as a microphone of the terminal device, or may be collected by an audio collection device in the same field environment as the terminal device, which is not limited herein.

In some embodiments, analyzing the user data to obtain emotion information of the user corresponding to one or more terminal devices includes:

acquiring user voice atmosphere corresponding to one or more terminal devices according to the voice data; and acquiring user emotion information corresponding to one or more terminal devices according to the user tone.

The server can perform voice analysis on the voice data according to the voice data of the user to obtain the voice tone of the user corresponding to the terminal equipment. User tone may be used to characterize the form of sound of a particular statement, e.g., question, exclamation, statement, etc., subject to a particular ideological and emotional condition, and is not limited thereto. As a specific implementation manner, the server may analyze the voice data to obtain voice parameters related to the speaking mood, such as voice volume, tone, voice content, and the like, and determine the user mood according to specific parameter values of the voice parameters, where the specific manner of analyzing the user mood may not be limited. The server can further analyze the user tone and obtain the user emotion information of the user. Of course, the embodiment of obtaining the emotion information of the user according to the mood of the user may not be limited.

In some embodiments, the server may also analyze the user emotion information of the user based on the facial image and the voice data of the user at the same time.

In other embodiments, the server may further input facial images, voice data, and the like to the training model by using the trained training model to obtain the user emotion information of the user. The training model can be obtained by inputting a large number of training sets into an initial model for training, the initial model can be a neural network model, a decision tree model and the like, and the training set can be composed of a large number of face images and voice data marked with user emotion.

Step S240: and when the user emotion information meets the set emotion condition, acquiring the adjustment content matched with the user emotion information meeting the set emotion condition.

In some embodiments, adjusting the content may include at least one of indication information for performing sharpness adjustment, brightness adjustment, content replacement, and content occlusion on at least a portion of the content in the virtual conversation scene; the adjustment content may further include an adjustment parameter for performing sharpness adjustment, an adjustment parameter for brightness adjustment, a content used for replacement, a content used for occlusion, and the like, which is not limited herein.

Furthermore, the adjustment content is matched with the user emotion information meeting the set emotion condition, so that the server can feed back the emotion of the user or avoid the influence of other users on the user emotion information meeting the set emotion condition after adjusting the virtual session scene according to the adjustment content. For example, when the user emotion information satisfying the set emotion condition is anger, the adjustment content may be to block the virtual object or to reduce the clarity of the virtual object. For example, when the user emotion information satisfying the set emotion condition is a casualty, the adjustment content may be to replace the virtual object or to reduce the brightness of the virtual object. Of course, the content of the adjustment specifically matching the emotion information of the user satisfying the set emotion condition may not be limited.

Step S250: and according to the adjustment content, at least one of adjusting the definition of at least part of the content in the virtual session scene, adjusting the brightness of at least part of the content in the virtual session scene, replacing at least part of the content in the virtual session scene, and blocking at least part of the content in the virtual session scene.

In some embodiments, the server may adjust at least a portion of the content in the virtual session scene based on the adjusted content. At least part of the content may be a virtual object corresponding to the terminal device in the virtual session scene.

As an embodiment, according to the adjustment content, adjusting at least part of the content in the virtual session scene may include:

acquiring terminal equipment of a first user corresponding to user emotion information meeting set emotion conditions; acquiring a first virtual object corresponding to terminal equipment of a first user in a virtual session scene; and adjusting the first virtual object according to the adjustment content.

The server can acquire the terminal equipment of the first user corresponding to the emotion information of the user according to the emotion information of the user meeting the set emotion condition. The server may obtain a first virtual object corresponding to a terminal device of a first user in a virtual session scene, use the first virtual object as at least part of content that needs to be adjusted, and may perform at least one of sharpness adjustment, brightness adjustment, replacement, and occlusion on the first virtual object, for example, the brightness of the first virtual object may be increased, and for example, the first virtual object may be occluded by a specified virtual content, which is not limited herein.

Therefore, after the first virtual object is adjusted, other users except the first user in the remote session scene can check the adjustment of the virtual object corresponding to the terminal device of the first user, and the user can know the emotion of the first user conveniently.

For example, referring to fig. 6, in the teleconference scene, the virtual session scene corresponding to the teleconference scene includes virtual character a, virtual character B, virtual character C, and virtual character D, where virtual character a corresponds to a terminal device corresponding to user 1, virtual character B corresponds to a terminal device corresponding to user 2, virtual character C corresponds to a terminal device corresponding to user 3, and virtual character D corresponds to a terminal device corresponding to user 4, and when the user emotion information corresponding to the terminal device of user 2 is a casualty, the brightness of virtual character B may be increased, so that user 1, user 3, and user 4 may see virtual character B in the scene picture of the virtual session scene more prominently, and user 1, user 3, and user 4 may know the emotion of user 2 conveniently.

As another embodiment, according to the adjustment content, adjusting at least a part of the content in the virtual session scene may include:

acquiring a first user corresponding to user emotion information meeting set emotion conditions; acquiring a second user associated with the first user in the virtual session scene, and acquiring a second virtual object corresponding to the terminal equipment of the second user; and adjusting the second virtual object according to the adjustment content.

In some embodiments, the user emotion information corresponding to the first user satisfies the set emotion condition. The second user is associated with the first user in the virtual session scenario. As an embodiment, the second user may be a user who affects the user emotion information of the first user, for example, the second user may be a user who generates the user emotion information satisfying the set emotion condition for the first user, and is not limited herein. The server may determine the second user associated with the first user according to the user data, the participation data of the terminal device in the remote session, and the like, for example, may determine the gazing direction of the first user according to eyeball data of the first user to determine the focus content watched by the first user, thereby determining the second user associated with the first user, and for example, may recognize a keyword according to voice data of the first user, and determine the second user associated with the first user according to the keyword, which is not limited herein.

The server may obtain a second virtual object corresponding to the terminal device of the second user in the virtual session scene, use the second virtual object as at least part of content that needs to be adjusted, and may adjust the second virtual object, for example, may reduce the definition of the second virtual object, replace the second virtual object with a specified virtual content, and the like, which is not limited herein. Therefore, after the second virtual object is adjusted, the emotion of the first user in the remote session scene can be prevented from being influenced by the second user.

For example, in a teleconference scene, a virtual session scene corresponding to the teleconference scene includes a virtual character a and a virtual character B, the virtual character a corresponds to a terminal device corresponding to the user 1, and the virtual character B corresponds to a terminal device corresponding to the user 2, when user emotion information corresponding to the terminal device of the user 2 is angry, the user 2 is angry due to the user 1, and then in a picture of the virtual session scene that needs to be displayed by the terminal device corresponding to the user 2, the definition of the virtual character a can be adjusted to be low, so that emotion interference of the user 1 to the user 2 is reduced.

In some embodiments, the user data of the terminal device in the remote session may include voice data. The processing method of the virtual scene may further include:

acquiring voice data corresponding to terminal equipment corresponding to user emotion information meeting set emotion conditions, and judging whether decibel values of the acquired voice data are larger than a set threshold value or not, wherein the first user is a user corresponding to the user emotion information meeting the set emotion conditions; and when the decibel value is larger than the set threshold value, reducing the decibel value of the acquired corresponding voice data.

When the server determines the user emotion information meeting the set emotion condition, the server can also acquire the voice data corresponding to the terminal device corresponding to the user emotion information, namely the voice data of the first user generating the user emotion information, and determine whether the decibel value of the voice data of the first user generating the user emotion information is larger than a set threshold value, and when the decibel value of the voice data of the first user is larger than the set threshold value, the server indicates that the speaking volume of the first user is larger, so that the server can reduce the decibel value of the voice data corresponding to the terminal device of the first user in the remote session, and other users are prevented from being influenced by the emotion of the first user.

In some embodiments, when determining the user emotion information satisfying the set emotion condition, the server may further determine whether a decibel value of voice data of a second user associated with a first user generating the user emotion information is greater than a set threshold, and when the voice data of the second user is greater than the set threshold, may reduce the decibel value of the voice data of the terminal device corresponding to the second user in the remote session, so as to reduce an influence of the second user on the emotion of the first user.

In the embodiment of the present application, the above steps may also be executed by the terminal device.

The processing method of the virtual scene provided by the embodiment of the application can realize that the virtual object in the virtual session scene of the remote session is adjusted according to the adjustment content matched with the emotion information of the user meeting the set emotion condition, the emotion of the user can be reflected, the emotion of the user in the remote session can be known among the users, the emotion interference to the user can be reduced, and the effect of the remote session is improved.

Referring to fig. 7, another embodiment of the present application provides a method for processing a virtual scene, where the method for processing a virtual scene may include:

step S310: and generating a virtual session scene corresponding to the remote session, wherein the virtual session scene at least comprises virtual objects corresponding to one or more terminal devices in the remote session.

Step S320: and acquiring user data corresponding to one or more terminal devices.

Step S330: and analyzing the user data to obtain user emotion information corresponding to one or more terminal devices.

In the embodiment of the present application, steps S310 to S330 may refer to the contents of the above embodiments, and are not described herein again.

Step S340: when the user emotion information meets the set emotion condition, obtaining adjustment content matched with the user emotion information meeting the set emotion condition, wherein the adjustment content comprises first content data of virtual mark content, and the virtual mark content corresponds to the user emotion information meeting the set emotion condition.

In some implementations, the adapted content may include first content data for tagged virtual tagged content; the adjustment content may further include indication information indicating that the content in the virtual session scene is marked, so that the server may mark the content in the virtual scene according to the adjustment content. The virtual markup content is used to tag emotions so that other users in the remote session learn the emotions of the users. In one embodiment, the virtual mark content may be a text corresponding to the user emotion information satisfying the set emotion condition, for example, when the user emotion information is anger, the virtual mark content may be a text such as "anger" or "anger". As another embodiment, the virtual tag content may be a virtual animated expression corresponding to the user emotion information satisfying the set emotion condition, where the virtual animated expression is used to represent the user emotion information, for example, when the user emotion information is a heart injury, the virtual tag content may be a heart injury virtual animated expression.

Step S350: and acquiring the terminal equipment of the first user corresponding to the user emotion information meeting the set emotion condition.

Step S360: and acquiring a first virtual object corresponding to the terminal equipment of the first user in the virtual session scene.

Step S370: virtual markup content is generated at a position of the first virtual object in the virtual session scene from the first content data, the virtual markup content for tagging an emotion of the first user.

In some embodiments, after acquiring the first virtual object corresponding to the terminal device of the first user, the server may regard the first virtual object as content that needs to be marked in the virtual session scene. Therefore, the server may mark the first virtual object according to the first content data in the adjusted content, that is, generate a virtual mark content in the virtual session scene, where a position of the virtual mark content in the virtual session scene may be within a preset range relative to the first virtual object in the virtual session scene. Wherein the position of the virtual label content refers to the spatial position of the virtual label content in the virtual space. As an embodiment, the server may determine a spatial position of the virtual tag content in the virtual space (e.g., a spatial position in the virtual space near the first virtual object) according to a spatial position of the first virtual object in the virtual space, and generate the virtual tag content in the virtual conversation scene in the virtual space according to the spatial position of the virtual tag content in the virtual space and the first content data. Therefore, after the virtual mark content corresponding to the first virtual object is generated in the virtual conversation scene, other users except the first user in the remote conversation can be facilitated to know the emotion of the first user.

For example, referring to fig. 8, in the teleconference scene, the virtual session scene corresponding to the teleconference scene includes virtual character a, virtual character B, virtual character C, and virtual character D, where virtual character a corresponds to the terminal device corresponding to the user 1, virtual character B corresponds to the terminal device corresponding to the user 2, virtual character C corresponds to the terminal device corresponding to the user 3, and virtual character D corresponds to the terminal device corresponding to the user 4, and when the user emotion information corresponding to the terminal device of the user 3 is a casualty, a virtual expression representing the casualty may be generated at the position of virtual character C, so that the user 1, the user 3, and the user 4 can know the emotion of the user 2.

In the embodiment of the present application, the steps from step S310 to step S370 may also be executed by the terminal device.

The processing method of the virtual scene provided by the embodiment of the application can generate the virtual mark content at the position of the virtual object in the virtual session scene according to the adjustment content matched with the emotion information of the user meeting the set emotion condition in the virtual session scene of the remote session, can reflect the emotion of the user, is convenient for the users to mutually know the emotion of the user in the remote session, and improves the effect of the remote session.

Referring to fig. 9, a further embodiment of the present application provides a method for processing a virtual scene, where the method for processing a virtual scene may include:

step S410: and generating a virtual session scene corresponding to the remote session, wherein the virtual session scene at least comprises virtual objects corresponding to one or more terminal devices in the remote session.

Step S420: and acquiring user data corresponding to one or more terminal devices.

Step S430: and analyzing the user data to obtain user emotion information corresponding to one or more terminal devices.

In the embodiment of the present application, steps S410 to S430 may refer to the contents of the above embodiments, and are not described herein again.

Step S440: and when the user emotion information meets the set emotion condition, acquiring the adjustment content matched with the user emotion information meeting the set emotion condition, wherein the adjustment content comprises second content data of the virtual prompt content.

In some implementations, the adjustment content can include second content data of the virtual cue content; the adjustment content may further include instruction information for instructing generation of the virtual prompt content in the virtual session scene, so that the server may generate the virtual prompt content in the virtual scene according to the adjustment content. The virtual prompting content is used for displaying in the terminal equipment of the first user generating the user emotion information meeting the set emotion condition so as to prompt the emotion of the first user, and the first user can know the emotion of the first user conveniently. In one embodiment, the virtual guidance content may be a text guidance content corresponding to the user emotion information that satisfies the set emotion condition, for example, when the user emotion information is angry, the virtual guidance content may be a text content such as "you are angry and please check their own emotion". As another embodiment, the virtual marker content may be a virtual animation corresponding to the emotion information of the user satisfying the set emotion condition, the virtual animation expression is used for representing the emotion information of the user, for example, when the emotion information of the user is vague, the virtual marker content may be a virtual animation representing the vague.

Step S450: and acquiring the terminal equipment of the first user corresponding to the user emotion information meeting the set emotion condition.

Step S460: and generating virtual prompt content according to the second content data, wherein the virtual prompt content is used for displaying in the terminal equipment of the first user so as to prompt the emotion of the first user.

In some embodiments, after acquiring the first virtual object corresponding to the terminal device of the first user, the server may generate the virtual prompt content in the scene screen of the virtual session scene for display by the terminal device of the first user according to the second content data. After the server generates the virtual prompt content in the scene picture of the virtual session scene displayed by the terminal device of the first user, the picture data of the scene picture can be sent to the terminal device of the first user, so that the first user can conveniently see the virtual prompt content through the terminal device, the first user can conveniently know the emotion of the first user, the emotion is restrained, and the first user can better communicate with other users in the remote session.

In some embodiments, the processing method of the virtual scene may further include:

acquiring user emotion information of other terminal equipment in the remote session except for the target equipment, wherein the target equipment is the terminal equipment of which the identity is a preset identity; and generating feedback data according to the user emotion information of other terminal equipment, and generating an emotion feedback picture according to the feedback data, wherein the emotion feedback picture is used for being displayed in the target equipment.

The server can acquire user emotion information of other terminal devices except the target device according to user emotion information of each terminal device in the remote session, namely acquire the user emotion information of other users, and generate emotion feedback data according to the user emotion information of other terminal devices. The emotional feedback data may characterize the user's emotional information of other users, which may be, for example, textual content, image content, and so on. The server can transmit the picture data of the emotion feedback picture generated according to the emotion feedback data to the target device, so that the target device can display the emotion feedback picture according to the picture data of the emotion feedback picture, and a user corresponding to the target device can know the emotion of each of other users conveniently.

For example, referring to fig. 10, in a session scene of remote teaching, a terminal device corresponding to a teacher may be a target device, and an emotion feedback screen generated according to user emotion information of student a1, student a2, and student a3 may be displayed by the target device, so that the teacher can know emotion, state, and teaching atmosphere of each student.

In some embodiments, other terminal devices in the remote session besides the target device may also transmit information such as the opinion and opinion information input by the user to the server, and the server may generate a virtual picture according to the information such as the opinion and opinion information of the user, where the virtual picture is displayed by the target device, so that the user of the target device can know the opinions and opinions of other users conveniently.

Of course, the embodiment in which the server generates the emotion feedback screen may be performed in the above embodiment.

In the embodiment of the present application, the steps performed by the server may also be performed by a terminal device.

The processing method of the virtual scene provided by the embodiment of the application can generate the virtual prompt content in the virtual session scene according to the adjustment content matched with the emotion information of the user meeting the set emotion condition in the virtual session scene of the remote session, prompts the emotion of the user, and facilitates the user to know the emotion of the user so as to suppress the emotion, so that the communication with other users in the remote session can be better realized, and the effect of the remote session can be improved.

Referring to fig. 11, a block diagram of a display device 400 for virtual content according to the present application is shown. The display apparatus 400 of the virtual content includes: a scene generation module 410, a data acquisition module 420, an emotion analysis module 430, and a scene adjustment module 440. The scene generating module 410 is configured to generate a virtual session scene corresponding to the remote session, where the virtual session scene at least includes virtual objects corresponding to one or more terminal devices in the remote session; the data obtaining module 420 is configured to obtain user data corresponding to the remote device; the emotion analysis module 430 is configured to analyze the user data to obtain user emotion information corresponding to one or more terminal devices; the scene adjusting module 440 is configured to, when the emotion information of the user meets the set emotion condition, obtain an adjustment content matched with the emotion information of the user meeting the set emotion condition, and adjust the virtual session scene according to the adjustment content.

In some implementations, the user data includes an image of a face of the user. The emotion analysis module 430 may be specifically configured to: acquiring at least one of user expressions and user moods corresponding to one or more terminal devices according to the facial image; and acquiring user emotion information corresponding to one or more terminal devices according to at least one of the user expression and the user emotion.

In some embodiments, the user data includes voice data of the user. The emotion analysis module 430 may be specifically configured to: acquiring user voice atmosphere corresponding to one or more terminal devices according to the voice data; and acquiring user emotion information corresponding to one or more terminal devices according to the user tone.

In some embodiments, the scene adjustment module 440 may be specifically configured to: acquiring a first user corresponding to user emotion information meeting set emotion conditions; acquiring a second user associated with the first user in the virtual session scene, and acquiring a second virtual object corresponding to the terminal equipment of the second user; and adjusting the second virtual object according to the adjustment content.

In some embodiments, the scene adjustment module 440 adjusts the virtual session scene, including: at least one of adjusting sharpness of at least a portion of the content in the virtual conversational scene, adjusting brightness of at least a portion of the content in the virtual conversational scene, replacing at least a portion of the content in the virtual conversational scene, and occluding at least a portion of the content in the virtual conversational scene.

In some embodiments, the adjustment content includes first content data of virtual tagged content corresponding to user emotional information satisfying the set emotional condition. The scene adjustment module 440 may be specifically configured to: acquiring terminal equipment of a first user corresponding to user emotion information meeting set emotion conditions; acquiring a first virtual object corresponding to terminal equipment of a first user in a virtual session scene; virtual markup content is generated at a position of the first virtual object in the virtual session scene from the first content data, the virtual markup content for tagging an emotion of the first user.

In some implementations, the adjustment content includes second content data of the virtual cue content. The scene adjustment module 440 may be specifically configured to: acquiring terminal equipment of a first user corresponding to user emotion information meeting set emotion conditions; and generating virtual prompt content according to the second content data, wherein the virtual prompt content is used for displaying in the terminal equipment of the first user so as to prompt the emotion of the first user.

In some embodiments, the processing apparatus 400 of the virtual scene may further include: the emotion information acquisition module and the feedback picture generation module. The emotion information acquisition module is used for acquiring user emotion information of other terminal equipment in the remote session except the target equipment, and the target equipment is the terminal equipment with the identity identification as the preset identification; and the feedback picture generation module is used for generating feedback data according to the user emotion information of other terminal equipment and generating an emotion feedback picture according to the feedback data, and the emotion feedback picture is used for being displayed in the target equipment.

In some embodiments, the user data includes voice data. The processing apparatus 400 of the virtual scene may further include: the device comprises a voice extraction module, a voice judgment module and a voice adjustment module. The voice extraction module is used for acquiring voice data of the terminal equipment corresponding to the user emotion information meeting the set emotion condition; the voice judging module is used for judging whether the decibel value of the acquired voice data is greater than a set threshold value or not; the voice adjusting module is used for reducing the decibel value of the acquired voice data when the decibel value is larger than the set threshold value.

In summary, according to the scheme provided by the application, a virtual session scene corresponding to a remote session is generated, the virtual session scene at least includes virtual objects corresponding to one or more terminal devices in the remote session, user data acquired by the one or more terminal devices is acquired, the user data is analyzed, emotion information of a user corresponding to the one or more terminal devices is acquired, when the emotion information of the user meets a set emotion condition, adjustment content matched with the emotion information of the user meeting the set emotion condition is acquired, the virtual session scene is adjusted according to the adjustment content, the emotion of the user in the remote session can be fed back, real feeling of the user is given, and the effect of the remote session is improved.

In this embodiment of the present application, the electronic device that executes the processing method of the virtual scene provided in the foregoing embodiment may be a server, or may be a terminal device.

Referring to fig. 12, a block diagram of a terminal device according to an embodiment of the present application is shown. The terminal device 100 may be a terminal device capable of running an application, such as a smart phone, a tablet computer, a head-mounted display device, and the like. The terminal device 100 in the present application may include one or more of the following components: a processor 110, a memory 120, and one or more applications, wherein the one or more applications may be stored in the memory 120 and configured to be executed by the one or more processors 110, the one or more programs configured to perform a method as described in the aforementioned method embodiments.

Processor 110 may include one or more processing cores. The processor 110 connects various parts within the entire terminal device 100 using various interfaces and lines, and performs various functions of the terminal device 100 and processes data by executing or executing instructions, programs, code sets, or instruction sets stored in the memory 120 and calling data stored in the memory 120. Alternatively, the processor 110 may be implemented in hardware using at least one of Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA), and Programmable Logic Array (PLA). The processor 110 may integrate one or more of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem, and the like. Wherein, the CPU mainly processes an operating system, a user interface, an application program and the like; the GPU is used for rendering and drawing display content; the modem is used to handle wireless communications. It is understood that the modem may not be integrated into the processor 110, but may be implemented by a communication chip.

The Memory 120 may include a Random Access Memory (RAM) or a Read-Only Memory (Read-Only Memory). The memory 120 may be used to store instructions, programs, code sets, or instruction sets. The memory 120 may include a stored program area and a stored data area, wherein the stored program area may store instructions for implementing an operating system, instructions for implementing at least one function (such as a touch function, a sound playing function, an image playing function, etc.), instructions for implementing various method embodiments described below, and the like. The storage data area may also store data created by the terminal device 100 in use, and the like.

In some embodiments, the terminal device 100 may further include an image sensor 130 for capturing images of real objects and capturing scene images of the target scene. The image sensor 130 may be an infrared camera or a visible light camera, and the specific type is not limited in the embodiment of the present application.

In one embodiment, the terminal device is a head-mounted display device, and may further include one or more of the following components in addition to the processor, the memory, and the image sensor described above: display module assembly, optical module assembly, communication module and power.

The display module may include a display control unit. The display control unit is used for receiving the display image of the virtual content rendered by the processor, and then displaying and projecting the display image onto the optical module, so that a user can view the virtual content through the optical module. The display device may be a display screen or a projection device, and may be used to display an image.

The optical module can adopt an off-axis optical system or a waveguide optical system, and a display image displayed by the display device can be projected to eyes of a user after passing through the optical module. The user sees the display image that display device throws through optical module group simultaneously. In some embodiments, the user can also observe the real environment through the optical module, and experience the augmented reality effect after the virtual content and the real environment are superimposed.

The communication module can be a module such as Bluetooth, WiFi (Wireless-Fidelity), ZigBee (Violet technology) and the like, and the head-mounted display device can be in communication connection with the terminal equipment through the communication module. The head-mounted display device in communication connection with the terminal equipment can perform information and instruction interaction with the terminal equipment. For example, the head-mounted display device may receive image data transmitted from the terminal device via the communication module, and generate and display virtual content of a virtual world from the received image data.

The power supply can supply power for the whole head-mounted display device, and the normal operation of each part of the head-mounted display device is ensured.

Referring to fig. 13, a block diagram of a server according to an embodiment of the present disclosure is shown. The server 200 may be a cloud server, a local server, or the like, and the server 200 may include one or more of the following components: a processor 210, a memory 220, and one or more applications, wherein the one or more applications may be stored in the memory 220 and configured to be executed by the one or more processors 210, the one or more programs configured to perform a method as described in the aforementioned method embodiments.

Referring to fig. 14, a block diagram of a computer-readable storage medium according to an embodiment of the present application is shown. The computer-readable storage medium 800 has stored therein program code that can be invoked by a processor to perform the methods described in the method embodiments above.

The computer-readable storage medium 800 may be an electronic memory such as a flash memory, an EEPROM (electrically erasable programmable read only memory), an EPROM, a hard disk, or a ROM. Alternatively, the computer-readable storage medium 800 includes a non-volatile computer-readable storage medium. The computer readable storage medium 800 has storage space for program code 810 to perform any of the method steps of the method described above. The program code can be read from or written to one or more computer program products. The program code 810 may be compressed, for example, in a suitable form.

Finally, it should be noted that: the above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not necessarily depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims

1. A processing method of a virtual scene is applied to a server, and the method comprises the following steps:

generating a virtual session scene corresponding to a remote session, wherein the virtual session scene at least comprises virtual objects corresponding to one or more terminal devices in the remote session, the virtual objects comprise virtual character models, the generated virtual session scene is used for generating virtual scene pictures displayed by the terminal devices, the terminal devices are head-mounted display devices, the virtual scene pictures are used for being displayed by the terminal devices so that users can observe the virtual scene pictures through the head-mounted display devices, and the virtual scene pictures can comprise virtual characters of other users participating in the remote session except the users corresponding to the terminal devices;

acquiring user data corresponding to the one or more terminal devices;

analyzing the user data to obtain user emotion information corresponding to the one or more terminal devices, wherein the user emotion information at least comprises joy, anger, sadness, surprise, fear, confusion, concentration and vague feeling;

when the user emotion information meets a set emotion condition, obtaining adjustment content matched with the user emotion information meeting the set emotion condition, wherein the user emotion information meeting the set emotion condition at least comprises anger and hurry; acquiring a first user corresponding to the user emotion information meeting the set emotion condition;

acquiring eyeball data of the first user, and determining the gazing direction of the first user;

according to the gazing direction, determining the focus content watched by the first user;

determining a second user associated with the first user according to the focus content, wherein the second user is a user influencing the user emotion of the first user, and acquiring a second virtual object corresponding to the terminal equipment of the second user;

and according to the adjustment content, the definition of the second virtual object is reduced or the second virtual object is replaced so as to reduce the influence of the second user on the emotion of the first user.

2. The method of claim 1, wherein the user data comprises a facial image of a user, and the analyzing the user data to obtain emotion information of the user corresponding to the one or more terminal devices comprises:

acquiring at least one of user expressions and user moods corresponding to the one or more terminal devices according to the facial image;

acquiring user emotion information corresponding to the one or more terminal devices according to at least one of the user expression and the user emotion;

and/or

The user data includes voice data of a user, and the analyzing the user data to obtain user emotion information corresponding to the one or more terminal devices includes:

acquiring user voice atmosphere corresponding to the one or more terminal devices according to the voice data;

and acquiring user emotion information corresponding to the one or more terminal devices according to the user tone.

3. The method according to claim 1 or 2, wherein after obtaining the user emotion information corresponding to the one or more terminal devices, the method further comprises:

acquiring user emotion information of other terminal equipment in the remote session except for target equipment, wherein the target equipment is terminal equipment of which the identity is a preset identity;

and generating feedback data according to the user emotion information of the other terminal equipment, and generating an emotion feedback picture according to the feedback data, wherein the emotion feedback picture is used for being displayed in the target equipment.

4. The method of claim 1 or 2, wherein the user data comprises voice data, the method further comprising:

acquiring voice data of the terminal equipment corresponding to the user emotion information meeting the set emotion condition;

judging whether the decibel value of the acquired voice data is greater than a set threshold value or not;

and when the decibel value is larger than the set threshold value, reducing the decibel value of the acquired voice data.

5. An apparatus for processing a virtual scene, applied to a server, the apparatus comprising: a scene generation module, a data acquisition module, an emotion analysis module and a scene adjustment module, wherein,

the scene generation module is used for generating a virtual session scene corresponding to a remote session, the virtual session scene at least comprises virtual objects corresponding to one or more terminal devices in the remote session, the virtual objects comprise virtual character models, the generated virtual session scene is used for generating virtual scene pictures displayed by the terminal devices, the terminal devices are head-mounted display devices, the virtual scene pictures are used for being displayed by the terminal devices so that users can observe the virtual scene pictures through the head-mounted display devices, and the virtual scene pictures can comprise virtual characters of other users participating in the remote session except the users corresponding to the terminal devices;

the data acquisition module is used for acquiring user data corresponding to the one or more terminal devices;

the emotion analysis module is used for analyzing the user data to obtain user emotion information corresponding to the one or more terminal devices, wherein the user emotion information at least comprises joy, anger, sadness, surprise, fear, confusion, concentration and vague feeling;

the scene adjusting module is used for acquiring adjusting content matched with the user emotion information meeting the set emotion condition when the user emotion information meets the set emotion condition, wherein the user emotion information meeting the set emotion condition at least comprises anger and injury; acquiring a first user corresponding to the user emotion information meeting the set emotion condition; acquiring eyeball data of the first user, and determining the gazing direction of the first user; according to the gazing direction, determining the focus content watched by the first user; determining a second user associated with the first user according to the focus content, wherein the second user is a user influencing the user emotion of the first user, and acquiring a second virtual object corresponding to the terminal equipment of the second user; and according to the adjustment content, the definition of the second virtual object is reduced or the second virtual object is replaced so as to reduce the influence of the second user on the emotion of the first user.

6. An electronic device, comprising:

one or more processors;

a memory;

one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs configured to perform the method of any of claims 1-4.

7. A computer-readable storage medium, having stored thereon program code that can be invoked by a processor to perform the method according to any one of claims 1 to 4.