CN117398682A - Scene construction method and device and electronic equipment - Google Patents

Scene construction method and device and electronic equipment

Info

Publication number
CN117398682A
Authority
CN
China
Prior art keywords
virtual
scene
description
virtual scene
audio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311149229.7A
Other languages
Chinese (zh)
Inventor
贺杰
胡永涛
戴景文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Virtual Reality Technology Co Ltd
Original Assignee
Guangdong Virtual Reality Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Virtual Reality Technology Co Ltd filed Critical Guangdong Virtual Reality Technology Co Ltd
Priority to CN202311149229.7A priority Critical patent/CN117398682A/en
Publication of CN117398682A publication Critical patent/CN117398682A/en
Pending legal-status Critical Current

Classifications

    • A: HUMAN NECESSITIES
    • A63: SPORTS; GAMES; AMUSEMENTS
    • A63F: CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00: Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/50: Controlling the output signals based on the game progress
    • A63F13/52: Controlling the output signals based on the game progress involving aspects of the displayed game scene
    • A: HUMAN NECESSITIES
    • A63: SPORTS; GAMES; AMUSEMENTS
    • A63F: CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00: Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/50: Controlling the output signals based on the game progress
    • A63F13/54: Controlling the output signals based on the game progress involving acoustic signals, e.g. for simulating revolutions per minute [RPM] dependent engine sounds in a driving game or reverberation against a virtual wall
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00: Manipulating 3D models or images for computer graphics
    • G06T19/006: Mixed reality
    • A: HUMAN NECESSITIES
    • A63: SPORTS; GAMES; AMUSEMENTS
    • A63F: CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F2300/00: Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game
    • A63F2300/60: Methods for processing data by generating or executing the game program
    • A63F2300/6063: Methods for processing data by generating or executing the game program for sound processing
    • A63F2300/6072: Methods for processing data by generating or executing the game program for sound processing of an input signal, e.g. pitch and rhythm extraction, voice recognition

Abstract

The application discloses a scene construction method and device and an electronic device. The method comprises: obtaining description audio in which a user describes a virtual scene; recognizing the description audio to obtain description text describing the virtual scene, the description text being at least one condition describing the virtual scene; and constructing a corresponding virtual scene from the description text based on a preset construction model. Therefore, no professional is required to write a corresponding script according to the scene requirements and construct the virtual scene from that manually written script. Only the description audio in which a user describes any scene needs to be acquired; the description audio can be converted to determine the description text describing the virtual scene, and the description text is input into a preset construction model to directly construct the corresponding virtual scene. This technical scheme of automatically constructing a virtual scene from a user's description audio therefore increases the types of scenes that can be constructed and improves the efficiency of constructing virtual scenes.

Description

Scene construction method and device and electronic equipment
Technical Field
The application relates to the technical field of internet, in particular to a scene construction method, a scene construction device and electronic equipment.
Background
With the rapid development of internet technology, electronic games are becoming an important part of daily life. Constructing game scenes that correspond to the different scene requirements of an electronic game is an important link in its design and updating process, so how to construct game scenes has become a technical hotspot. In the related art, corresponding scripts usually need to be manually written according to the different scene requirements, and the virtual scenes in the electronic game are constructed according to those scripts. However, this scheme of manually writing scripts results in a high cost and a low efficiency of constructing virtual scenes.
Disclosure of Invention
The embodiment of the application provides a scene construction method, a scene construction device and electronic equipment, which can improve the efficiency of constructing virtual scenes.
In a first aspect, an embodiment of the present application discloses a scene construction method, where the method includes:
acquiring description audio for describing the virtual scene by a user;
identifying the description audio to obtain a description text describing the virtual scene, wherein the description text is at least one condition describing the virtual scene;
based on a preset construction model, constructing a corresponding virtual scene according to the description text.
In a second aspect, embodiments of the present application disclose a scene building apparatus, including:
the acquisition unit is used for acquiring description audio for describing the virtual scene by a user;
the identification unit is used for identifying the description audio to obtain a description text for describing the virtual scene, wherein the description text is at least one condition for describing the virtual scene;
the construction unit is used for constructing a corresponding virtual scene according to the description text based on a preset construction model.
In a third aspect, an embodiment of the present application discloses an electronic device, where the electronic device includes a processor and a memory, where the memory stores a computer program, and the processor invokes the computer program to implement the above-mentioned scene construction method.
In a fourth aspect, embodiments of the present application disclose a computer readable storage medium storing program code that is invoked by a processor to implement the above-described scene construction method.
In a fifth aspect, the application discloses a computer program product comprising computer program code which, when run by a processor, causes the above-mentioned scene construction method to be performed.
In the embodiment of the application, description audio in which a user describes a virtual scene is obtained; the description audio is recognized to obtain description text describing the virtual scene, the description text being at least one condition describing the virtual scene; and a corresponding virtual scene is constructed from the description text based on a preset construction model. Therefore, no professional is required to write a corresponding script according to the scene requirements and construct the virtual scene from that manually written script. Only the description audio in which a user describes any scene needs to be acquired; the description audio can be converted to determine the description text describing the virtual scene, and the description text is input into a preset construction model to directly construct the corresponding virtual scene. This technical scheme of automatically constructing a virtual scene from a user's description audio therefore increases the types of scenes that can be constructed and improves the efficiency of constructing virtual scenes.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly introduced below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic diagram of a system architecture of a scene construction system of a scene construction method disclosed in an embodiment of the present application;
FIG. 2 is a schematic flow chart of a scene construction method disclosed in an embodiment of the present application;
FIG. 3 is a schematic flow chart of a specific scenario of a scene construction method disclosed in an embodiment of the present application;
FIG. 4 is another flow chart of a scene construction method disclosed in an embodiment of the present application;
FIG. 5 is a schematic illustration of a specific scenario of a scene construction method disclosed in an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a scene building apparatus according to an embodiment of the present disclosure;
fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Embodiments of the present application are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are exemplary only for the purpose of explaining the present application and are not to be construed as limiting the present application.
In order to better understand the solution of the present application, the following description will make clear and complete descriptions of the technical solution of the embodiment of the present application with reference to the accompanying drawings in the embodiment of the present application. It will be apparent that the described embodiments are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.
Currently, driven by internet technology, electronic games play an increasingly important role in people's daily life. Constructing different virtual scenes according to different scene requirements is an important link in the design and updating process of an electronic game. A virtual scene may include at least one virtual entity that executes a corresponding behavior policy, for example a leaf (virtual entity) that flutters (behavior policy), or a soldier (virtual entity) that runs forward (behavior policy). It should be noted that a virtual scene includes at least one virtual entity and the behavior policy being executed by that entity. The virtual entities in the virtual scene and their corresponding behavior policies can be displayed to the user, and the user can perform corresponding virtual interactions in the virtual scene based on the displayed virtual entities and the behavior policies they execute, thereby realizing the function of user virtual interaction through the electronic game.
In the related art, during the development and design stage of an electronic game, corresponding scripts need to be manually written according to the different scene requirements of different electronic games, and the virtual scenes corresponding to those scene requirements are constructed according to the scripts. However, the content of virtual scenes constructed this way is simple and uniform, cannot meet the requirement of expanding virtual interaction into various fields, and cannot yield virtual scenes corresponding to arbitrary scenes. In addition, the technical scheme of manually writing scripts according to scene requirements also results in a high cost of constructing virtual scenes, and therefore a low efficiency of constructing various different virtual scenes.
In order to solve the above-mentioned problems, in the embodiment of the present application, description audio in which a user describes a virtual scene is obtained; the description audio is recognized to obtain description text describing the virtual scene, the description text being at least one condition describing the virtual scene; and a corresponding virtual scene is constructed from the description text based on a preset construction model. Therefore, a user can output description audio describing any scene requirement by voice, and a corresponding virtual scene is constructed according to that audio, which increases the types of scenes that can be constructed and expands the fields in which virtual scenes can be built. In addition, only the description audio in which a user describes any scene needs to be acquired; the description audio can be converted to determine the description text describing the virtual scene, and the description text is input into a preset construction model to directly construct the corresponding virtual scene. No professional is required to write a corresponding script according to the scene requirements and construct the virtual scene from that manually written script. This technical scheme of automatically constructing a virtual scene from a user's description audio therefore increases the types of scenes that can be constructed and improves the efficiency of constructing virtual scenes.
For a better understanding of the embodiments of the present application, a system architecture diagram is first described below.
Referring to fig. 1, fig. 1 is a schematic structural diagram of a scene building system 100 according to an embodiment of the disclosure. As shown in fig. 1, the scene construction system 100 may include an input device 110, a data processor 120, a state machine 130, and a display device 140.
The input device 110 may be a microphone, a recorder, or the like that obtains the user's audio information; it may also be an input device such as a keyboard or a mouse used to obtain the user's text information or operation information.
The data processor 120 may be configured to obtain data of the input device 110, and construct a corresponding virtual scene according to the data of the input device 110. The data processor 120 may include a natural language processing module that identifies descriptive text resulting from descriptive audio conversion; the data processor 120 may also include a scene generation module that constructs a corresponding virtual scene from the descriptive text. The data processor 120 may also be used to send the constructed virtual scene to the state machine 130 and/or the display device 140.
The state machine 130 may be configured to store a scene description file describing a virtual scene; in some embodiments, the data processor 120 may obtain the scene description file stored in the state machine 130 and construct the corresponding virtual scene according to that file. The state machine 130 may also be used to store the virtual scene sent by the data processor 120; in some embodiments, the virtual scene stored in the state machine 130 is sent to the display device 140 for playing and display.
The display device 140 may be used to play and display virtual scenes generated by the data processor 120 and/or virtual scenes stored in the state machine 130. The display device 140 may present to the user the virtual entities in a virtual scene and the behavior policies those entities execute.
As an embodiment, the user may speak description audio describing the target virtual scene into the microphone of the input device 110; the natural language processing module of the data processor 120 converts the description audio into description text, and the scene generation module of the data processor 120 constructs a corresponding virtual scene according to the description text. After the virtual scene is constructed, the data processor 120 sends it to the state machine 130 for storage, and may also send it to the display device 140 for playing and display. The user then performs virtual interactions based on the virtual scene displayed on the display device 140.
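For illustration only, the following Python sketch (not part of the patent disclosure) shows one way the Fig. 1 dataflow could be wired together; the class names, method signatures, and component interfaces are all assumptions made for exposition.

```python
from dataclasses import dataclass, field

@dataclass
class VirtualScene:
    # Maps each virtual entity to the behavior policy it executes,
    # e.g. {"leaf": "flutter", "soldier": "run forward"}.
    entities: dict = field(default_factory=dict)

class SceneConstructionSystem:
    """Hypothetical glue for the Fig. 1 dataflow; all component
    interfaces here are assumptions, not the application's module APIs."""

    def __init__(self, nlp_module, scene_generator, state_machine, display):
        self.nlp = nlp_module              # converts description audio to description text
        self.generator = scene_generator   # builds script data / a scene from the text
        self.state_machine = state_machine # stores scenes and scene description files
        self.display = display             # plays and displays constructed scenes

    def handle_description_audio(self, description_audio: bytes) -> VirtualScene:
        description_text = self.nlp.recognize(description_audio)  # Fig. 2, step 202
        scene = self.generator.build(description_text)            # Fig. 2, step 203
        self.state_machine.store(scene)                           # persist for replay
        self.display.show(scene)                                  # present to the user
        return scene
```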
It should be noted that the system architecture shown in fig. 1 is only an example and does not limit the technical solution disclosed in the embodiments of the present application. As the interactive system architecture evolves and new application scenarios appear, the technical solution disclosed in the embodiments of the present application remains applicable to similar technical problems.
Referring to fig. 2, fig. 2 is a schematic flow chart of a scene construction method according to an embodiment of the present application. The scene construction method may be executed by a data processor, and may include the following steps.
201. And acquiring description audio for describing the virtual scene by the user.
In the construction stage of a virtual scene, virtual scenes of different types and different contents can be constructed according to different scene requirements, so as to realize user virtual interaction in a variety of virtual scenes. In the update stage of a virtual scene, a virtual entity and its corresponding behavior policy can be added to the constructed virtual scene according to the updated scene requirement; behavior policies to be updated can also be added to virtual entities already existing in the virtual scene. The interactive content in the virtual scene is thus updated by adding further virtual entities and their corresponding behavior policies and/or by updating the behavior policies of existing virtual entities on the basis of their original ones. This increases the interactive content available to the user in the virtual scene, and enhances the playability of the virtual scene when applied in an electronic game, or the effectiveness of its interactive content when applied in other fields.
A virtual scene can include at least one virtual entity and the behavior policy correspondingly executed by that entity. The scene requirements for constructing the virtual scene can therefore be determined by determining at least one virtual entity and the behavior policy it executes.
In some implementations, the scene requirements for constructing the virtual scene may be determined by receiving the description audio in which a user describes the virtual scene. That description audio can cover a virtual entity and the behavior policy correspondingly executed by the virtual entity.
Specifically, receiving the description audio in which a user describes a virtual scene may include: receiving audio data of the user; preprocessing the audio data to obtain processed description audio; and performing voice coding on the processed description audio to obtain the description audio describing the virtual scene.
It should be noted that the description audio may be audio in which any user describes the virtual scene according to his or her own requirements. While describing the virtual scene, the user may describe it in a colloquial, non-expert form, so description audio from any user can be obtained. The virtual scene therefore does not need to be described by a professional user in professional terminology, which lowers the threshold for providing scene requirements and reduces the subsequent cost of constructing virtual scenes from those requirements.
When the user describes the virtual scene, the input device collects the user's audio data and transmits it to the data processor. After receiving the audio data in which the user describes the virtual scene, the data processor can preprocess it: cleaning the audio data and removing audio that affects its clarity, such as noise, repeated content, and incomplete sentences, to obtain the processed description audio. After the processed description audio is obtained, the audio data can be encoded to obtain the description audio corresponding to the user's description of the virtual scene. The description audio may include audio in which the user describes at least one object (virtual entity) in the virtual scene, and may also include audio in which the user describes the actions and states (behavior policies) of at least one object in the virtual scene. It should be noted that encoding the audio data converts the user's description of the virtual scene from raw sound into audio.
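As a rough, hedged illustration of this receive-preprocess-encode flow, the following Python sketch gates out low-amplitude noise and packages the cleaned samples into a WAV container; the cleaning heuristic and function names are assumptions, not the application's actual preprocessing.

```python
import wave
from array import array

def preprocess_audio(raw_pcm: bytes, noise_floor: int = 500) -> bytes:
    """Toy cleaning pass standing in for the removal of noise, repeated
    content, and incomplete sentences: zero out samples whose amplitude
    sits below a noise floor.  Assumes raw_pcm is 16-bit signed PCM."""
    samples = array("h", raw_pcm)
    gated = array("h", (s if abs(s) >= noise_floor else 0 for s in samples))
    return gated.tobytes()

def encode_description_audio(cleaned_pcm: bytes, path: str, rate: int = 16000) -> None:
    """'Speech coding' sketched as packaging cleaned PCM into a WAV file,
    i.e. converting the user's sound into transportable description audio."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)     # mono microphone input
        wav.setsampwidth(2)     # 16-bit samples
        wav.setframerate(rate)
        wav.writeframes(cleaned_pcm)
```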
202. And identifying descriptive audio to obtain descriptive text describing the virtual scene, wherein the descriptive text is at least one condition for describing the virtual scene.
After the data processor acquires the description audio in which the user describes the virtual scene, it can perform text recognition on the description audio to obtain the description text expressing the user's scene requirements. The description text is the text obtained by converting the description audio from audio to text; it corresponds to the description audio and can express the user's scene requirements for the virtual scene.
In some embodiments, natural language processing (Natural Language Processing, NLP) may be performed on the description audio in which the user describes the virtual scene, resulting in the description text corresponding to the description audio. Specifically, natural language processing of the description audio may include: performing voice decoding on the description audio to obtain text content; and parsing the text content to determine the description text within it, where the description text may be used to describe virtual entities in the virtual scene and/or behavior policies corresponding to those entities.
It should be noted that the description audio is the encoded audio converted from the sound in which the user describes the virtual scene; correspondingly, the description text is the decoded text converted from that description audio. The description audio may be audio information in which the user describes the virtual entities and/or behavior policies in the virtual scene, that is, audio information with which the user defines the virtual scene through virtual entities and the behavior policies they execute. Correspondingly, the description text may be the corresponding text information describing the virtual entities existing in the virtual scene and the behavior policies they execute, that is, a condition set that limits the virtual scene, the condition set including at least one condition limiting the virtual scene.
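A minimal sketch of turning decoded text into such a condition set might look as follows, assuming (purely for illustration) that clauses arrive in an "entity: policy" form; a real system would rely on NLP parsing rather than this keyword split.

```python
from dataclasses import dataclass

@dataclass
class Condition:
    """One condition limiting the virtual scene: a virtual entity plus
    the behavior policy it should execute."""
    virtual_entity: str
    behavior_policy: str

def parse_description_text(description_text: str) -> list:
    """Toy extraction of the condition set from decoded text content."""
    conditions = []
    for clause in description_text.split(";"):
        if ":" in clause:
            entity, policy = clause.split(":", 1)
            conditions.append(Condition(entity.strip(), policy.strip()))
    return conditions

# parse_description_text("leaf: flutter; soldier: run forward")
# -> [Condition('leaf', 'flutter'), Condition('soldier', 'run forward')]
```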
203. Based on a preset construction model, constructing a corresponding virtual scene according to the description text.
The description text determined from the description audio may be used to represent the scene requirements of a virtual scene, which may include at least one virtual entity existing in the virtual scene and the behavior policy corresponding to that entity. In the related art, the corresponding script data is determined by manual writing according to the scene requirements of the virtual scene, and this manual process from scene requirements to script data is time-consuming and costly.
In order to solve the above-mentioned problem, in the embodiment of the present application, a description text describing a virtual scene may be input into a preset construction model, and corresponding script data may be automatically generated according to scene requirements in the description text.
It should be noted that the construction model may be a large language model (Large Language Model, LLM), which can be used to understand and generate the natural language commonly used by humans. Trained on a large amount of text data, a large language model can perform tasks such as in-context learning, contextual understanding, and instruction following. After the data processor determines the description text describing the virtual scene, the condition set limiting the virtual scene (the description of the virtual entities and/or behavior policies) is input into the large language model for recognition, a construction instruction for constructing the virtual scene is determined, and the corresponding script data is generated according to that instruction to construct the corresponding virtual scene.
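The application names no concrete model or prompt format, so the following sketch treats the large language model as a generic text-in/text-out callable and assumes a JSON schema for the generated script data; both assumptions are illustrative only.

```python
import json

PROMPT_TEMPLATE = (
    "You are a scene-construction assistant. Given the following conditions "
    "limiting a virtual scene, output JSON script data that maps each virtual "
    "entity to the behavior policy it executes.\n\nConditions:\n{conditions}"
)

def build_scene_script(description_text: str, llm) -> dict:
    """Sketch of feeding the condition set to a large language model and
    reading back script data.  `llm` is any text-in/text-out callable; the
    prompt wording and JSON schema are assumptions, not the application's."""
    prompt = PROMPT_TEMPLATE.format(conditions=description_text)
    raw_script = llm(prompt)
    return json.loads(raw_script)  # e.g. {"soldier": "run forward"}
```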
After the data processor constructs a corresponding virtual scene according to the scene requirement of the user, the constructed virtual scene can be sent to the display device for display and play. The user realizes virtual interaction based on the virtual scene displayed and played on the display device.
In other embodiments, the data processor may also obtain a specialized scene description file describing the virtual scene from the state machine, so that the corresponding virtual scene can be constructed by acquiring the scene description file from the state machine and generating the scene according to it.
Specifically, constructing the virtual scene may further include: in response to a user's selection operation on a scene description file, acquiring the scene description file from the state machine, where the scene description file includes text describing a virtual entity in the virtual scene and/or a behavior policy corresponding to the virtual entity; and constructing the corresponding virtual scene according to the scene description file based on the construction model.
It should be noted that the user may call the scene description file from the state machine. When the scene description file is acquired from the state machine, the user does not need to input data describing the virtual scene: the text content describing the virtual entities in the virtual scene and/or their corresponding behavior policies is determined directly from the user's selection of a preset scene description file in the state machine, the scene description file is input into the large language model for recognition, and the script data for constructing the virtual scene is determined, thereby constructing the virtual scene corresponding to the scene description file.
In other embodiments, the construction model, for example the large language model, may be obtained by acquiring description sample data and the script sample data corresponding to it, and training an initial construction model on that description sample data and script sample data. Deep learning can be performed on the description sample data and the corresponding script sample data to determine the representation between them. Specifically, the training process of a large language model requires a large amount of description sample data, and through self-supervision the model deep-learns the structure, grammar, and semantics of human natural language. In the self-supervised learning process, the large language model can derive the labels corresponding to the description sample data through deep learning, and the representation between the description samples and the script sample data is determined based on those labels.
Further, in the process of building the large language model (construction model), the model may also be fine-tuned after training. Specifically, in the fine-tuning stage, the large language model may be adjusted with user data collected during its specific use, so that it moves closer to the relevant field of its specific application. The learning and training process for the large language model (construction model) therefore further includes fine-tuning, which may include: acquiring interaction information in which the virtual entities in a virtual scene constructed from the description text execute their corresponding behavior policies, and fine-tuning the initial construction model according to that interaction information to obtain the fine-tuned large language model.
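A schematic of this train-then-fine-tune flow, with hypothetical `loss`/`update` hooks standing in for whatever framework an implementation would actually use, might read:

```python
def train_build_model(model, description_samples, script_samples, epochs=3):
    """Sketch of supervised training on (description sample, script sample)
    pairs.  `model.loss` and `model.update` are hypothetical hooks; the
    application names no concrete training framework."""
    for _ in range(epochs):
        for desc, script in zip(description_samples, script_samples):
            model.update(model.loss(desc, script))  # one gradient-style step
    return model

def fine_tune_build_model(model, interaction_logs):
    """Sketch of the fine-tuning stage: adapt the trained model with
    interaction information collected from scenes built from description
    text, nudging it toward the application's own domain."""
    for desc, observed_script in interaction_logs:
        model.update(model.loss(desc, observed_script))
    return model
```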
Referring to fig. 3, fig. 3 is a schematic flow chart of a specific process for constructing a virtual scene. Specifically, the data processor may receive condition-editing voice collected by the input device, which may include description audio obtained by encoding the user's voice describing an AI digital person (virtual entity) in terms of its own state, its observation state, and its output behavior (behavior policy); it may also include description audio obtained by encoding the user's entity scene description (virtual entity) voice; and it may also include description audio obtained by encoding the user's tactical description (behavior policy) voice. After receiving the condition-editing voice, the data processor performs natural language processing on the corresponding description audio to obtain the corresponding description text, the description text is input into the preset large language model, and the scene generation module constructs the corresponding virtual scene according to the description text. For example, the scene may include AI digital persons, scene objects (buildings, signs, tactical door panels, and wall structures), and tactical planning (tactical scene construction and indoor and outdoor tactical scene layout). In addition, the data processor can also directly call the specialized state machine description file and scene description file from the state machine and construct the corresponding virtual scene according to them.
In the method embodiment depicted in fig. 2, description audio in which a user describes a virtual scene is obtained; the description audio is recognized to obtain description text describing the virtual scene, the description text being at least one condition describing the virtual scene; and a corresponding virtual scene is constructed from the description text based on a preset construction model. Therefore, no professional is required to write a corresponding script according to the scene requirements and construct the virtual scene from that manually written script. Only the description audio in which a user describes any scene needs to be acquired; the description audio can be converted to determine the description text describing the virtual scene, and the description text is input into a preset construction model to directly construct the corresponding virtual scene. This technical scheme of automatically constructing a virtual scene from a user's description audio therefore increases the types of scenes that can be constructed and improves the efficiency of constructing virtual scenes.
Referring to fig. 4, fig. 4 is a schematic flow chart of another scenario construction method disclosed in an embodiment of the present application. The scene construction method may include the following steps.
401. And acquiring description audio for describing the virtual scene by the user.
402. And identifying the description audio to obtain a description text for describing the virtual scene.
The specific implementation process of steps 401 and 402 may be referred to the description of steps 201 and 202 in the embodiment corresponding to fig. 2, and the detailed description will not be repeated here.
403. The description type of the description text is identified.
After the description audio is converted to determine the corresponding description text, the description type of the description text can also be determined. The description text may describe virtual entities in the virtual scene and/or the behavior policies those entities correspondingly execute. The description type of the description text is determined by the different contents it describes, and may be classified by virtual entity or by behavior policy.
For example, where the description text is "a soldier runs on a runway", "soldier" may be identified in the description text as being of the virtual-entity description type, and "runs on a runway" as being of the behavior-policy description type.
404. And determining a construction model corresponding to the description type to obtain a preset construction model.
Further, after determining the description type of the description text, a corresponding build model may be determined according to the description type of the description text. It should be noted that, description texts of different description types may have corresponding build models, and correspondence between description types and build models may be determined in a learning training phase of the build models. Thus, after determining different description types of description texts, a build model corresponding to the description texts of the different description types can be determined. The construction model corresponding to the description text of the different description types can comprise: an action processing model, an expression processing model and a behavior processing model.
It should be noted that the construction models corresponding to description texts of different description types may be stored in the knowledge base data of the data processor. Referring to fig. 5, fig. 5 is a specific schematic diagram of the knowledge base data, which may include image data, text data, state machine description files, scene description files, video data, and audio data. The data in the knowledge base improves the effect and experience of constructing the virtual scene.
For example, where the description type of the description text "run on runway" is determined to be behavior policy, the large language model corresponding to that description text may be determined to be the corresponding behavior policy model.
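As a hedged sketch of steps 403-404, the following pairs a toy description-type classifier with a hypothetical registry mapping each type to its build model; the verb heuristic merely stands in for a trained classifier.

```python
from enum import Enum

class DescriptionType(Enum):
    VIRTUAL_ENTITY = "virtual entity"
    BEHAVIOR_POLICY = "behavior policy"

# Hypothetical registry pairing each description type with a build model,
# in the spirit of the action/expression/behavior processing models above.
MODEL_REGISTRY = {
    DescriptionType.VIRTUAL_ENTITY: "entity processing model",
    DescriptionType.BEHAVIOR_POLICY: "behavior processing model",
}

def classify_fragment(fragment: str) -> DescriptionType:
    """Toy classifier: verb-led fragments count as behavior policies."""
    verbs = ("run", "walk", "jump", "flutter", "fly")
    if fragment.strip().lower().startswith(verbs):
        return DescriptionType.BEHAVIOR_POLICY
    return DescriptionType.VIRTUAL_ENTITY

def select_build_model(fragment: str) -> str:
    return MODEL_REGISTRY[classify_fragment(fragment)]

# select_build_model("run on runway") -> "behavior processing model"
```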
405. Based on the construction model, script data of at least one virtual entity and a behavior strategy corresponding to the virtual entity in the constructed virtual scene are determined according to the description text.
After the portions of the description text with different description types are input into the corresponding construction models, each construction model can determine, from the description text, at least one virtual entity in the virtual scene and/or the behavior policy that entity correspondingly executes, automatically match the corresponding execution instructions, and automatically generate the script data corresponding to the scene requirements in the description text.
406. And constructing a virtual scene corresponding to the description text according to the script data, wherein the virtual scene comprises at least one virtual entity and a behavior strategy corresponding to the virtual entity.
After the script data satisfying the scene requirements corresponding to the description text is determined, a virtual scene may be constructed from the script data. The virtual scene can include a plurality of virtual entities and the behavior policies corresponding to those entities.
407. And determining the virtual state of the virtual object executing the behavior strategy in the virtual scene.
After the corresponding virtual scene is constructed according to the user's scene requirements, the virtual state of each virtual object in the virtual scene can be determined by detection. The virtual state may be an emergency state triggered by the virtual object in the virtual scene.
408. And activating, according to the virtual state, a passive strategy of a virtual entity in the virtual scene and/or of the behavior strategy corresponding to the virtual entity.
When the virtual state enters an emergency state, the passive strategy triggered in the virtual scene by the virtual entity entering that emergency state is determined. It should be noted that the correspondence between a virtual entity and the passive strategy triggered when it enters an emergency state may be set by the user at the development stage.
For example, when a soldier runs into the territory of an enemy camp, entering the enemy camp may trigger a corresponding passive strategy, which may be exiting the enemy camp's territory.
409. And according to the passive strategy, adjusting the virtual entity in the virtual scene and/or the behavior strategy corresponding to the virtual entity to obtain the adjusted virtual scene.
After the virtual entity in the virtual scene executes the corresponding passive strategy, the state information of the virtual entity and the behavior strategy it executes can be adjusted accordingly; once the virtual entity and its behavior strategy have been adjusted, the virtual scene as a whole is adjusted, yielding the correspondingly adjusted virtual scene.
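A minimal sketch of the detect-activate-adjust loop in steps 407-409, with the emergency-state-to-passive-policy mapping configured per entity as the text describes, could look like this (all names are illustrative):

```python
from dataclasses import dataclass, field

@dataclass
class SceneEntity:
    name: str
    behavior_policy: str
    # Developer-configured mapping (set at the development stage, per the
    # text above): emergency state -> passive policy to activate.
    passive_policies: dict = field(default_factory=dict)

def on_virtual_state(entity: SceneEntity, virtual_state: str) -> SceneEntity:
    """If the detected virtual state is an emergency state for this entity,
    activate the passive policy and adjust the behavior policy it executes;
    the scene containing the entity is thereby adjusted."""
    passive = entity.passive_policies.get(virtual_state)
    if passive is not None:
        entity.behavior_policy = passive
    return entity

# soldier = SceneEntity("soldier", "run forward",
#                       {"entered enemy camp": "exit the enemy camp territory"})
# on_virtual_state(soldier, "entered enemy camp").behavior_policy
# -> "exit the enemy camp territory"
```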
In the method embodiment depicted in fig. 4, description audio in which a user describes a virtual scene is obtained; the description audio is recognized to obtain description text describing the virtual scene, the description text being at least one condition describing the virtual scene; and a corresponding virtual scene is constructed from the description text based on a preset construction model. Therefore, no professional is required to write a corresponding script according to the scene requirements and construct the virtual scene from that manually written script. Only the description audio in which a user describes any scene needs to be acquired; the description audio can be converted to determine the description text describing the virtual scene, and the description text is input into a preset construction model to directly construct the corresponding virtual scene. This technical scheme of automatically constructing a virtual scene from a user's description audio therefore increases the types of scenes that can be constructed and improves the efficiency of constructing virtual scenes.
It is to be understood that identical or corresponding content in the different embodiments described above may be cross-referenced.
Referring to fig. 6, fig. 6 is a schematic structural diagram of a scene building apparatus 600 according to an embodiment of the disclosure. As shown in fig. 6, the scene constructing apparatus 600 may include:
an obtaining unit 601, configured to obtain description audio that describes a virtual scene by a user;
the identifying unit 602 is configured to identify the description audio, and obtain a description text describing the virtual scene, where the description text is at least one condition describing the virtual scene;
the first construction unit 603 is configured to construct a corresponding virtual scene according to the description text based on a preset construction model.
In some embodiments, the obtaining unit 601 may be specifically configured to:
receiving audio data of the user;
preprocessing the audio data to obtain processed descriptive audio;
and performing voice coding on the processed description audio to obtain the description audio describing the virtual scene.
In some embodiments, the identification unit 602 may be specifically configured to:
performing voice decoding on the description audio to obtain text content;
analyzing the text content to determine the description text in the text content, wherein the description text can be used for describing virtual entities in a virtual scene and/or behavior policies corresponding to the virtual entities.
In some embodiments, the scene building apparatus 600 may further include:
a determining unit 604, configured to identify a description type of the description text;
the determining unit 604 is further configured to determine a build model corresponding to the description type, and obtain the preset build model.
In some embodiments, the first construction unit 603 may be specifically configured to:
determining script data for constructing at least one virtual entity and a behavior strategy corresponding to the virtual entity in the virtual scene according to the description text based on the construction model;
and constructing a virtual scene corresponding to the description text according to the script data, wherein the virtual scene comprises at least one virtual entity and a behavior strategy corresponding to the virtual entity.
In some embodiments, the scene building apparatus 600 may further include:
a training unit 605, configured to obtain description sample data and script sample data corresponding to the description sample data;
the training unit 605 is further configured to train an initial build model according to the description sample data and the script sample data, so as to obtain the build model.
In some embodiments, the scene building apparatus 600 may further include:
and the updating unit 606 is used for acquiring interaction information of executing the corresponding behavior strategy by the virtual entity in the virtual scene constructed based on the description text.
In some embodiments, the training unit 605 may be specifically configured to:
training an initial construction model according to the description sample data, the script sample data and the interaction information to obtain the construction model.
In some embodiments, the scene building apparatus 600 may further include:
a passive trigger unit 607, configured to determine a virtual state of executing the behavior policy by a virtual object in the virtual scene;
the passive triggering unit 607 is further configured to activate, according to the virtual state, a passive policy of a virtual entity in the virtual scene and/or of the behavior policy corresponding to the virtual entity;
the passive triggering unit 607 is further configured to adjust a virtual entity in the virtual scene and/or a behavior policy corresponding to the virtual entity according to the passive policy, so as to obtain an adjusted virtual scene.
In some embodiments, the scene building apparatus 600 may further include:
a second construction unit 608, configured to acquire, in response to a user's selection operation on a scene description file, the scene description file from a state machine, where the scene description file includes text describing a virtual entity in the virtual scene and/or a behavior policy corresponding to the virtual entity;
The second construction unit 608 is further configured to construct a corresponding virtual scene according to the scene description file based on the construction model.
It will be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process of the apparatus and modules described above may refer to the corresponding process in the foregoing method embodiment, which is not repeated herein.
In several of the embodiments disclosed herein, the coupling of the modules to each other may be electrical, mechanical, or other.
In addition, each functional module in each embodiment of the present application may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module. The integrated modules may be implemented in hardware or in software functional modules.
As shown in fig. 7, the embodiment of the present application further discloses a schematic structural diagram of an electronic device 700. The electronic device 700 includes a processor 710 and a memory 720; the memory 720 stores computer program instructions that, when called by the processor 710, execute the various method steps disclosed in the foregoing embodiments. It will be appreciated by those skilled in the art that the structure shown in the drawings does not constitute a limitation of the electronic device, which may include more or fewer components than illustrated, combine certain components, or arrange components differently. Wherein:
Processor 710 may include one or more processing cores. The processor 710 connects various parts within the electronic device using various interfaces and lines, and monitors the electronic device as a whole by running or executing the instructions, programs, code sets, or instruction sets stored in the memory 720 and invoking the data stored in the memory 720, thereby performing the various functions of the electronic device and processing data. Alternatively, the processor 710 may be implemented in hardware in at least one of digital signal processing (Digital Signal Processing, DSP), field programmable gate array (Field-Programmable Gate Array, FPGA), and programmable logic array (Programmable Logic Array, PLA). The processor 710 may integrate one or a combination of a central processing unit (Central Processing Unit, CPU), a graphics processing unit (Graphics Processing Unit, GPU), a modem, and the like. The CPU mainly handles the operating system, user interface, application programs, and so on; the GPU is responsible for rendering and drawing display content; and the modem handles wireless communication. It will be appreciated that the modem may also not be integrated into the processor 710 and may instead be implemented by a separate communication chip.
The memory 720 may include random access memory (Random Access Memory, RAM) or read-only memory (Read-Only Memory, ROM). The memory 720 may be used to store instructions, programs, code sets, or instruction sets. The memory 720 may include a stored-program area and a stored-data area, where the stored-program area may store instructions for implementing an operating system, instructions for implementing at least one function (such as a touch function, a sound playing function, or an image playing function), instructions for implementing the various method embodiments described above, and the like. The stored-data area may store data created by the electronic device in use (such as a phonebook, audio and video data, or chat records). Accordingly, the memory 720 may also include a memory controller to control access to the memory 720 by the processor 710.
Although not shown, the electronic device 700 may further include a display unit or the like, which is not described herein. In particular, in this embodiment, the processor 710 in the electronic device loads executable files corresponding to the processes of one or more application programs into the memory 720, and the processor 710 executes the application programs stored in the memory 720, so as to implement the various method steps disclosed in the foregoing embodiments.
According to one aspect of the present application, a computer-readable storage medium is disclosed that stores computer instructions. The medium may be an electronic memory such as a flash memory, an EEPROM (electrically erasable programmable read-only memory), an EPROM, a hard disk, or a ROM. Optionally, the computer-readable storage medium comprises a non-transitory computer-readable storage medium (Non-Transitory Computer-Readable Storage Medium). The computer-readable storage medium has storage space for program code to perform any of the method steps described above.
According to one aspect of the present application, a computer program product or computer program is disclosed, comprising computer instructions stored in a computer-readable storage medium. The processor of the electronic device reads the computer instructions from the computer-readable storage medium and executes them to cause the electronic device to perform the methods provided in the various alternative implementations of the above embodiments.
The foregoing describes preferred embodiments of the present application and is not intended to limit the scope of the present application. Any modifications, equivalents, and adaptations of the embodiments described above made in accordance with the principles of the present application shall fall within the scope of protection of the present application.

Claims (10)

1. A method of scene construction, the method comprising:
acquiring description audio for describing the virtual scene by a user;
identifying the description audio to obtain a description text describing the virtual scene, wherein the description text is at least one condition describing the virtual scene;
based on a preset construction model, constructing a corresponding virtual scene according to the description text.
2. The method of claim 1, wherein the obtaining descriptive audio describing the virtual scene by the user comprises:
receiving audio data of the user;
preprocessing the audio data to obtain processed descriptive audio;
and performing voice coding on the processed description audio to obtain the description audio describing the virtual scene.
3. The method of claim 1, wherein the identifying the descriptive audio to obtain descriptive text describing the virtual scene comprises:
performing voice decoding on the description audio to obtain text content;
analyzing the text content to determine description text in the text content, wherein the description text can be used for describing virtual entities in a virtual scene and/or behavior strategies corresponding to the virtual entities.
4. A method as claimed in claim 3, wherein the method further comprises:
identifying the description type of the description text;
determining a construction model corresponding to the description type to obtain the preset construction model;
the constructing the corresponding virtual scene according to the description text based on the preset construction model comprises the following steps:
determining script data for constructing at least one virtual entity and a behavior strategy corresponding to the virtual entity in the virtual scene according to the description text based on the construction model;
and constructing a virtual scene corresponding to the description text according to the script data, wherein the virtual scene comprises at least one virtual entity and a behavior strategy corresponding to the virtual entity.
5. The method of claim 1, wherein the method further comprises:
acquiring description sample data and script sample data corresponding to the description sample data;
training an initial construction model according to the description sample data and the script sample data to obtain the construction model.
6. The method of claim 5, wherein the method further comprises:
acquiring interaction information of executing corresponding behavior strategies by virtual entities in a virtual scene constructed based on the description text;
Training an initial construction model according to the description sample data and the script sample data, and obtaining the construction model comprises the following steps:
training an initial construction model according to the description sample data, the script sample data and the interaction information to obtain the construction model.
7. The method of claim 4, wherein the method further comprises:
determining a virtual state of executing the behavior strategy by a virtual object in the virtual scene;
activating, according to the virtual state, a passive strategy of a virtual entity in the virtual scene and/or of the behavior strategy corresponding to the virtual entity;
and according to the passive strategy, adjusting a virtual entity in the virtual scene and/or a behavior strategy corresponding to the virtual entity to obtain an adjusted virtual scene.
8. The method of any one of claims 1-7, wherein the method further comprises:
responding to the selection operation of a user on a scene description file, acquiring the scene description file from a state machine, wherein the scene description file comprises text describing a virtual entity in the virtual scene and/or a behavior strategy corresponding to the virtual entity;
and constructing a corresponding virtual scene according to the scene description file based on the construction model.
9. A scene building apparatus, characterized in that the scene building apparatus comprises:
the acquisition unit is used for acquiring description audio for describing the virtual scene by a user;
the identification unit is used for identifying the description audio to obtain a description text for describing the virtual scene, wherein the description text is at least one condition for describing the virtual scene;
the first construction unit is used for constructing a corresponding virtual scene according to the description text based on a preset construction model.
10. An electronic device comprising a memory storing a computer program and a processor that invokes the computer program to implement the method of any one of claims 1-8.
CN202311149229.7A 2023-09-06 2023-09-06 Scene construction method and device and electronic equipment Pending CN117398682A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311149229.7A CN117398682A (en) 2023-09-06 2023-09-06 Scene construction method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311149229.7A CN117398682A (en) 2023-09-06 2023-09-06 Scene construction method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN117398682A 2024-01-16

Family

ID=89487799

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311149229.7A Pending CN117398682A (en) 2023-09-06 2023-09-06 Scene construction method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN117398682A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination