CN111939558A - Method and system for driving virtual character action by real-time voice - Google Patents

Method and system for driving virtual character action by real-time voice

Info

Publication number
CN111939558A
Authority
CN
China
Prior art keywords
voice
virtual character
unity
engine
module
Prior art date
Legal status
Pending
Application number
CN202010836241.5A
Other languages
Chinese (zh)
Inventor
Inventor not disclosed
Current Assignee
Beijing Zhongke Shenzhi Technology Co ltd
Original Assignee
Beijing Zhongke Shenzhi Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Beijing Zhongke Shenzhi Technology Co ltd
Priority to CN202010836241.5A
Publication of CN111939558A
Legal status: Pending


Classifications

    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00 Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/20 Input arrangements for video game devices
    • A63F13/21 Input arrangements for video game devices characterised by their sensors, purposes or types
    • A63F13/215 Input arrangements for video game devices characterised by their sensors, purposes or types comprising means for detecting acoustic signals, e.g. using a microphone
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00 Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/40 Processing input control signals of video game devices, e.g. signals generated by the player or derived from the environment
    • A63F13/42 Processing input control signals of video game devices, e.g. signals generated by the player or derived from the environment by mapping the input signals into game commands, e.g. mapping the displacement of a stylus on a touch screen to the steering angle of a virtual vehicle
    • A63F13/424 Processing input control signals of video game devices, e.g. signals generated by the player or derived from the environment by mapping the input signals into game commands, e.g. mapping the displacement of a stylus on a touch screen to the steering angle of a virtual vehicle involving acoustic input signals, e.g. by using the results of pitch or rhythm extraction or voice recognition
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00 Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/80 Special adaptations for executing a specific game genre or game mode
    • A63F13/825 Fostering virtual characters
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T13/00 Animation
    • G06T13/20 3D [Three Dimensional] animation
    • G06T13/40 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/16 Speech classification or search using artificial neural networks
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F2300/00 Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game
    • A63F2300/10 Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game characterized by input arrangements for converting player-generated signals into game device control signals
    • A63F2300/1081 Input via voice recognition
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F2300/00 Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game
    • A63F2300/60 Methods for processing data by generating or executing the game program
    • A63F2300/6063 Methods for processing data by generating or executing the game program for sound processing
    • A63F2300/6072 Methods for processing data by generating or executing the game program for sound processing of an input signal, e.g. pitch and rhythm extraction, voice recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Acoustics & Sound (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The invention discloses a method and a system for driving virtual character actions by real-time voice. The method comprises the following steps: establishing a virtual character action scene with the Unity engine (not limited to the Unity game engine; any real-time game engine, such as the Unreal engine, is supported); adding corresponding variable conditions for the different actions the virtual character can execute; integrating a voice interface into the Unity engine; acquiring voice data; uploading the acquired voice data to a voice recognition system through the voice interface, the voice recognition system performing content recognition on the voice data and then outputting a voice recognition result; receiving the voice recognition result in the Unity engine through the voice interface and matching it against the action variable conditions of the virtual character; and driving the virtual character, via the Unity engine, to execute the corresponding action according to the matched variable condition. The invention drives the virtual character to act directly through voice control, which simplifies the operation of the virtual character, reduces real-world limb interaction, and makes the control of the virtual character simpler and more convenient.

Description

Method and system for driving virtual character action by real-time voice
Technical Field
The invention relates to the technical field of motion simulation and animation games, in particular to a method and a system for driving virtual character actions by real-time voice.
Background
Virtual reality (VR) technology, also known as "smart technology," is a new practical technology developed in the 20th century. It integrates computer, electronic information, and simulation technologies; its basic implementation is a computer-simulated virtual environment that provides users with a sense of immersion.
With the development of virtual reality technology, people are no longer satisfied with being spectators; they want to participate in the VR scenes they watch. The most common VR interaction mode today is for the user to immerse in the VR scene from a first-person view by wearing a VR headset and then use hand controllers for gesture changes, object grasping, and other interactions with the scene. However, this existing interaction mode is built on limb movement or manual operation: operating the virtual character is not simple enough, and the character cannot be driven to act directly by real-time voice.
Disclosure of Invention
The invention aims to provide a method and a system for driving a virtual character to act by real-time voice.
In order to achieve the purpose, the invention adopts the following technical scheme:
a method for driving virtual character actions by real-time voice is provided, which comprises the following steps:
establishing a virtual character action scene by using a Unity engine (but not limited to a Unity game engine, all real-time game engines support, such as a Unreal game engine and the like);
adding corresponding variable conditions for the virtual character to execute different actions;
integrating a voice interface into a Unity engine;
acquiring voice data;
uploading the acquired voice data to a voice recognition system through the voice interface, and outputting a voice recognition result after the voice recognition system performs content recognition on the voice data;
the Unity engine receives the voice recognition result through the voice interface and matches the action variable conditions of the virtual character according to the voice recognition result;
and the Unity engine drives the virtual character to execute corresponding actions according to the matched variable conditions.
As a preferred aspect of the present invention, the voice interface integrated in the Unity engine is provided by a third party voice platform.
As a preferred scheme of the present invention, the voice interface provided by the third-party voice platform includes, but is not limited to, the REST API voice interface provided by the Baidu AI open platform or the Android SDK interface provided by Google.
As a preferred aspect of the present invention, the speech recognition system performs content recognition on the speech data through a speech recognition model, and the speech recognition model is trained with a restricted Boltzmann machine (RBM) stochastic neural network.
As a preferable aspect of the present invention, the method of driving the virtual character motion is expressed by the following formula (1):
$$\hat{b} = \frac{w_1\hat{q}_{j_1} + w_2\hat{q}_{j_2} + \cdots + w_n\hat{q}_{j_n}}{\left\|w_1\hat{q}_{j_1} + w_2\hat{q}_{j_2} + \cdots + w_n\hat{q}_{j_n}\right\|} \tag{1}$$

In formula (1), $\hat{b}$ represents the kinematic deformation of the virtual character's skeletal model; $\hat{q}_{j_1}$ is the dual quaternion of the motion pose of joint $j_1$ on the skeletal model, and $w_1$ is the weight of joint $j_1$; $\hat{q}_{j_n}$ is the dual quaternion of the motion pose of joint $j_n$, and $w_n$ is the weight of joint $j_n$.
As a preferred aspect of the present invention, the dual quaternion expressing the posture of the joint motion is expressed by the following formula (2):
$$\hat{q} = \cos\frac{\hat{\theta}}{2} + \hat{s}\,\sin\frac{\hat{\theta}}{2}, \qquad \hat{\theta} = \theta_0 + \epsilon s, \qquad \hat{s} = s_0 + \epsilon\,(r \times s_0) \tag{2}$$

In the above formula, $\hat{q}$ is the dual quaternion representing the joint pose on the virtual character's skeletal model; $s_0$ is the rotation axis of the joint motion; $\theta_0$ is the rotation angle of the joint motion; $\epsilon$ is the dual operator ($\epsilon^2 = 0$); $s$ is the translation of the joint along the rotation axis; and $r \times s_0$ is the moment of the rotation axis, $r$ being the rotation center of the joint.
The invention also provides a system for driving virtual character actions by real-time voice, which can implement the above method, the system comprising:
the virtual character action scene establishing module, used for providing designers with the Unity engine (not limited to the Unity game engine; any real-time game engine, such as the Unreal engine, is supported) to establish a virtual character action scene;
the virtual character action condition setting module, connected with the virtual character action scene establishing module and used for providing designers with the ability to add corresponding variable conditions for the different actions of the virtual character;
the voice interface integration module is used for providing designers with voice interfaces integrated into the Unity engine;
the voice data acquisition module is used for automatically acquiring and storing externally input voice data;
the voice data uploading module is connected with the voice data acquisition module and used for uploading the acquired voice data to a voice recognition system, and the voice recognition system performs content recognition on the voice data and then outputs a voice recognition result;
the voice recognition result receiving module is connected with the voice interface integration module and used for receiving the voice recognition result through the voice interface integrated in the Unity engine;
the variable condition matching module is respectively connected with the virtual character action condition setting module and the voice recognition result receiving module and is used for automatically matching the variable conditions of the virtual character actions according to the voice recognition result;
and the virtual character driving module is respectively connected with the variable condition matching module and the virtual character action scene establishing module and is used for generating a driving signal according to the matched variable condition and driving the virtual character to execute corresponding action.
As a preferred solution of the present invention, the voice interface integrated in the Unity engine is provided by a third-party voice platform, and the voice interface provided by the third-party voice platform includes, but is not limited to, the REST API voice interface provided by the Baidu AI open platform or the Android SDK interface provided by Google.
The invention directly drives the virtual character to act in a voice control mode, simplifies the operation process of the virtual character, reduces the limb interaction in reality and ensures that the control mode of the virtual character is simpler and more convenient.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required to be used in the embodiments of the present invention will be briefly described below. It is obvious that the drawings described below are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.
FIG. 1 is a flowchart illustrating a method for real-time voice-driven virtual character movement according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of a system for real-time voice-driven virtual character movement according to an embodiment of the present invention.
Detailed Description
The technical scheme of the invention is further explained by the specific implementation mode in combination with the attached drawings.
The drawings are for illustrative purposes only, show schematic rather than actual form, and are not to be construed as limiting the present patent. To better illustrate the embodiments of the present invention, some parts of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product; it will be understood by those skilled in the art that certain well-known structures in the drawings, and descriptions thereof, may be omitted.
The same or similar reference numerals in the drawings of the embodiments of the present invention correspond to the same or similar components. In the description of the present invention, it should be understood that terms such as "upper", "lower", "left", "right", "inner", and "outer", where they indicate an orientation or positional relationship, are based on the orientations shown in the drawings and are used only for convenience and simplicity of description; they do not indicate or imply that the referenced device or element must have a specific orientation or be constructed and operated in a specific orientation, and are therefore not to be construed as limiting the patent. The specific meanings of these terms can be understood by those skilled in the art according to the specific situation.
In the description of the present invention, unless otherwise explicitly specified and limited, terms such as "connected", where they indicate a connection relationship between components, are to be understood broadly: the connection may be fixed, detachable, or integral; mechanical or electrical; direct, or indirect through an intermediate medium; or an interaction between two components. The specific meanings of the above terms in the present invention can be understood by those skilled in the art in specific cases.
The method for driving the virtual character to act by the real-time voice provided by an embodiment of the present invention, as shown in fig. 1, includes:
Step S1: establish a virtual character action scene using the Unity engine (not limited to the Unity game engine; any real-time game engine, such as the Unreal engine, is supported). A virtual character action scene means that a virtual character is drawn by the Unity engine and then rendered executing corresponding actions, such as raising hands or running.
Step S2: add corresponding variable conditions for the different actions the virtual character executes. A variable condition here means the set of model parameters required to drive the virtual character to perform a given action. For the virtual character to "run", for example, every joint point on its skeletal model must move according to the desired "running" state, and the movement of each joint point is controlled by corresponding joint parameters; in the running state, for instance, the parameters of a first joint point (such as the knee joint) might be a rotation angle of 5° and a translation distance of no more than 10 cm. The rotation angle and translation distance that drive the knee joint into the running state are its variable parameters, i.e. the variable conditions.
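For illustration only, a "variable condition" of this kind can be pictured as plain per-joint parameter data held alongside the action name. The following Python sketch is not part of the patent; the class names, field names, and concrete values are assumptions used to make the idea concrete.

```python
from dataclasses import dataclass, field

@dataclass
class JointCondition:
    rotation_deg: float        # rotation angle of the joint for this action
    max_translation_cm: float  # bound on the joint's translation

@dataclass
class ActionCondition:
    action: str
    joints: dict = field(default_factory=dict)  # joint name -> JointCondition

# The "running" example from the text: the knee joint rotates 5 degrees and
# translates no more than 10 cm.
running = ActionCondition(
    action="running",
    joints={"knee": JointCondition(rotation_deg=5.0, max_translation_cm=10.0)},
)
```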
Step S3: integrate the voice interface into the Unity engine. The Unity engine currently has no built-in voice interface, so by itself it cannot drive virtual character actions with real-time voice. A voice interface provided by a third-party speech recognition platform can be integrated into the Unity engine through the interfaces the engine exposes.
Step S4, voice data is obtained;
step S5, the obtained voice data are uploaded to a voice recognition system through a voice interface, and the voice recognition system carries out content recognition on the voice data and then outputs a voice recognition result;
step S6, the Unity engine receives the voice recognition result through the voice interface and matches the action variable conditions of the virtual character according to the voice recognition result;
in step S7, the Unity engine drives the virtual character to execute corresponding actions according to the matched variable conditions.
Given the high cost and difficulty of independently developing a voice recognition system, the voice content recognition function is provided by a third-party voice platform. The technical innovation of the invention is to integrate a voice interface provided by a third-party voice platform into the Unity engine: after the third-party platform recognizes the voice content, the recognized content is matched against the variable conditions that drive the virtual character's actions, and on a successful match the virtual character is driven to execute the corresponding action. The variable conditions for driving the virtual character have a matching relationship with the voice content. For example, if the recognized content is "running" and the variable condition for driving the character to "run" is "driving strategy one", then whenever "running" is recognized, the system provided by the invention generates a running drive signal and drives the virtual character to "run" according to the preset "running" variable condition.
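The match between recognized voice content and a driving strategy amounts to a table lookup. A minimal sketch, assuming the recognition result arrives as a plain string and strategies are registered by name (the names below are hypothetical):

```python
from typing import Optional

# Hypothetical dispatch table: recognized voice content -> driving strategy,
# mirroring the "running" -> "driving strategy one" example above.
DRIVE_STRATEGIES = {
    "running": "driving strategy one",
    "raise hands": "driving strategy two",  # illustrative second entry
}

def match_action(recognized_text: str) -> Optional[str]:
    # Normalize the recognition result before matching, since recognizer
    # output may vary in case and surrounding whitespace.
    return DRIVE_STRATEGIES.get(recognized_text.strip().lower())

assert match_action(" Running ") == "driving strategy one"
```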
In the above technical solution, integrating the voice interface provided by the third-party voice platform into the Unity engine can be implemented through software programming; the specific integration process is not described here.
At present, speech recognition technology is mature, and several companies, such as Baidu and Google, provide open interfaces. A user can upload voice data through the REST API voice interface provided by the Baidu AI open platform; the platform recognizes the voice data, then outputs the content recognition result and feeds it back to the user. The REST API supports three languages (Mandarin, Cantonese, and English), requires a complete recording file of no more than 60 s, and supports three upload formats: pcm (uncompressed), wav (uncompressed, pcm-encoded), and amr (compressed).
Before calling the voice recognition interface through the REST API in a Unity script, authentication must be obtained according to the platform's authentication mechanism. After acquiring a valid token, the voice is recorded and converted into base64-encoded byte-stream data. Parameters such as the voice format, sampling rate, channel count, and token are then packaged in JSON format and uploaded via a POST request, and the recognition result is fed back. Alternatively, the recorded data can be placed in the HTTP body, the data type defined in the request header, and the recognition result obtained by accessing the endpoint with that request header. Both upload modes yield the same voice recognition result; the recognized text can be extracted from the reply by JSON parsing.
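The REST flow just described (token, base64 byte stream, JSON POST) can be sketched as below. The endpoint URLs and field names follow Baidu's published documentation but should be treated as assumptions here; consult the Baidu AI open platform documentation for the current interface.

```python
import base64
import requests

TOKEN_URL = "https://aip.baidubce.com/oauth/2.0/token"  # assumed endpoint
ASR_URL = "https://vop.baidu.com/server_api"            # assumed endpoint

def recognize(pcm_path: str, api_key: str, secret_key: str) -> str:
    # 1. Obtain an access token under the platform's authentication mechanism.
    token = requests.post(TOKEN_URL, params={
        "grant_type": "client_credentials",
        "client_id": api_key,
        "client_secret": secret_key,
    }).json()["access_token"]

    # 2. Read the complete recording (pcm/wav/amr, no more than 60 s) and
    #    convert it into base64-encoded byte-stream data.
    raw = open(pcm_path, "rb").read()

    # 3. Package voice format, sampling rate, channel count, and token in
    #    JSON and upload via a POST request.
    reply = requests.post(ASR_URL, json={
        "format": "pcm", "rate": 16000, "channel": 1,
        "cuid": "unity-demo-device",              # arbitrary device id
        "token": token,
        "speech": base64.b64encode(raw).decode("ascii"),
        "len": len(raw),                          # length of the raw audio
    }).json()

    # 4. Extract the recognized text from the JSON reply.
    return reply.get("result", [""])[0]
```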
The Baidu AI open platform recognizes voice content with high accuracy, but voice data uploaded through the REST API voice interface must be a complete recording before recognition can begin, which affects the response speed of driving the virtual character to some extent. To solve this problem, the invention provides another voice recognition scheme: voice data is uploaded through the Android SDK interface provided by Google, and the voice content is recognized by Google's speech recognition service. Integrating Unity with the Android SDK requires additional knowledge of Android engineering development, so the integration is more difficult.
The Android SDK recognizes voice data over a streaming protocol, processing and feeding back results simultaneously. Compared with the full platform support of the REST API, the Android SDK only supports the Android platform. Moreover, the SDK cannot be called directly: developers must create an Android library project, write custom classes, and call the SDK's event classes to expose the voice recognition function interface. Once the Android library is customized, the voice recognition capability is integrated into the Unity engine through the communication mechanism between Unity and Android.
The Android SDK interface supports uploading and recognizing voice simultaneously, which improves recognition efficiency, but it has the drawback that it cannot be used directly: developers must wire up the required libraries themselves, which places higher demands on the programmer.
In recent years, deep learning technology has developed rapidly, and some researchers have begun to study speech content recognition with deep learning models. Such a model cannot be integrated into the Unity engine directly, so it cannot drive virtual characters with real-time voice inside the engine. However, the recognized voice content can be converted into a corresponding driving instruction and sent to the Unity engine, where the instruction is matched against the variable conditions that drive the virtual character's actions; upon receiving the instruction, the Unity engine drives the virtual character to execute the corresponding action. Although this driving pipeline is more complex, it requires no voice interface inside the Unity engine and has a certain application value. Therefore, as a preferred solution, the speech recognition system provided by the invention performs content recognition on voice data through a speech recognition model, and more preferably the model is trained with a restricted Boltzmann machine (RBM) stochastic neural network. The training process of the speech recognition model is not described here.
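A minimal sketch of this alternative pipeline, assuming the external recognizer and the Unity engine exchange driving instructions over a local TCP socket; the port, the JSON message shape, and the existence of a Unity-side listener are illustrative assumptions, since the text does not fix a transport.

```python
import json
import socket

def send_drive_instruction(recognized_text: str, host: str = "127.0.0.1",
                           port: int = 5005) -> None:
    # Convert the recognized voice content into a driving instruction and
    # deliver it to the engine, which matches it against its variable
    # conditions and drives the virtual character accordingly.
    instruction = json.dumps({"command": recognized_text}).encode("utf-8")
    with socket.create_connection((host, port)) as conn:
        conn.sendall(instruction)

# e.g. after the external model recognizes "running":
# send_drive_instruction("running")
```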
As shown in fig. 2, the present invention further provides a system for real-time voice-driven virtual character movement, including:
the virtual character action scene establishing module 1 is used for providing designers with virtual character action scenes established by a Unity engine;
the virtual character action condition setting module 2 is connected with the virtual character action scene establishing module and is used for providing designers with the ability to add corresponding variable conditions for the different actions of the virtual character;
the voice interface integration module 3 is used for providing designers with voice interfaces integrated into the Unity engine;
the voice data acquisition module 4 is used for automatically acquiring and storing externally input voice data; the voice data acquisition module can be a microphone;
the voice data uploading module 5 is connected with the voice data acquiring module 4 and is used for uploading the acquired voice data to a voice recognition system 100, and the voice recognition system 100 performs content recognition on the voice data and then outputs a voice recognition result;
the voice recognition result receiving module 6 is connected with the voice interface integration module 3 and used for receiving the voice recognition result through a voice interface integrated in the Unity engine;
the variable condition matching module 7 is respectively connected with the virtual character action condition setting module 2 and the voice recognition result receiving module 6 and is used for automatically matching the variable conditions of the virtual character actions according to the voice recognition result;
and the virtual character driving module 8 is respectively connected with the variable condition matching module 7 and the virtual character action scene establishing module 1, and is used for generating a driving signal according to the matched variable condition and driving the virtual character to execute the corresponding action.
Preferably, the voice interface integrated in the Unity engine is provided by a third-party voice platform, and the voice interface of the third-party voice platform comprises the REST API voice interface provided by the Baidu AI open platform or the Android SDK interface provided by Google.
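Purely to illustrate how modules 4 to 8 chain together (the patent defines them as engine-side modules, not Python objects), the pipeline can be sketched with a stub recognizer; every name below is hypothetical.

```python
class VoiceDrivenCharacterSystem:
    def __init__(self, recognizer, conditions):
        self.recognizer = recognizer  # stands in for voice recognition system 100
        self.conditions = conditions  # action name -> variable condition

    def on_voice_data(self, audio_bytes: bytes) -> None:
        text = self.recognizer(audio_bytes)    # upload + receive (modules 5, 6)
        condition = self.conditions.get(text)  # variable condition match (module 7)
        if condition is not None:
            self.drive(condition)              # drive signal (module 8)

    def drive(self, condition) -> None:
        print(f"driving virtual character with condition: {condition}")

# Usage with a stub recognizer that always hears "running":
system = VoiceDrivenCharacterSystem(lambda audio: "running",
                                    {"running": "driving strategy one"})
system.on_voice_data(b"\x00\x01")
```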
In order to ensure the fidelity of the action of the virtual character, the invention also provides a virtual character driving method, which is expressed by the following formula (1):
$$\hat{b} = \frac{w_1\hat{q}_{j_1} + w_2\hat{q}_{j_2} + \cdots + w_n\hat{q}_{j_n}}{\left\|w_1\hat{q}_{j_1} + w_2\hat{q}_{j_2} + \cdots + w_n\hat{q}_{j_n}\right\|} \tag{1}$$

In formula (1), $\hat{b}$ represents the kinematic deformation of the virtual character's skeletal model; $\hat{q}_{j_1}$ is the dual quaternion of the motion pose of joint $j_1$ on the skeletal model, and $w_1$ is the weight of joint $j_1$; $\hat{q}_{j_n}$ is the dual quaternion of the motion pose of joint $j_n$, and $w_n$ is the weight of joint $j_n$.
In the present embodiment, the dual quaternion representing the joint movement posture is expressed by the following formula (2):
$$\hat{q} = \cos\frac{\hat{\theta}}{2} + \hat{s}\,\sin\frac{\hat{\theta}}{2}, \qquad \hat{\theta} = \theta_0 + \epsilon s, \qquad \hat{s} = s_0 + \epsilon\,(r \times s_0) \tag{2}$$

In the above formula, $\hat{q}$ is the dual quaternion representing the joint pose on the virtual character's skeletal model; $s_0$ is the rotation axis of the joint motion; $\theta_0$ is the rotation angle of the joint motion; $\epsilon$ is the dual operator ($\epsilon^2 = 0$); $s$ is the translation of the joint along the rotation axis; and $r \times s_0$ is the moment of the rotation axis, $r$ being the rotation center of the joint.
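The two formulas can be exercised numerically. The numpy sketch below assumes the standard dual quaternion conventions implied by the listed symbols (a pose stored as a real quaternion part plus a dual part); the joint values in the example are arbitrary.

```python
import numpy as np

def screw_dual_quaternion(s0, theta0, s, r):
    """Formula (2): dual quaternion of a joint rotating by theta0 about the
    axis s0 through center r, translating s along that axis."""
    s0 = np.asarray(s0, dtype=float)
    s0 /= np.linalg.norm(s0)              # unit rotation axis
    moment = np.cross(r, s0)              # r x s0, the moment of the axis
    c, sn = np.cos(theta0 / 2), np.sin(theta0 / 2)
    real = np.concatenate(([c], sn * s0))
    dual = np.concatenate(([-(s / 2) * sn],
                           (s / 2) * c * s0 + sn * moment))
    return np.stack([real, dual])         # shape (2, 4): real and dual parts

def blend(dual_quats, weights):
    """Formula (1): weighted blend of joint dual quaternions, normalized by
    the magnitude of the blended real part."""
    b = sum(w * q for w, q in zip(weights, dual_quats))
    return b / np.linalg.norm(b[0])

# Example: blend a knee pose (5 degrees about x, 0.1 units of translation)
# with the identity pose at equal weights.
knee = screw_dual_quaternion([1, 0, 0], np.radians(5), 0.1, [0, 0.5, 0])
identity = np.stack([np.array([1.0, 0, 0, 0]), np.zeros(4)])
print(blend([knee, identity], [0.5, 0.5]))
```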
In summary, the present invention drives the virtual character to act directly through voice control, simplifying the operation of the virtual character, reducing real-world limb interaction, and making the control of the virtual character simpler and more convenient.
It should be understood that the above-described embodiments are merely preferred embodiments of the invention and the technical principles applied thereto. It will be understood by those skilled in the art that various modifications, equivalents, changes, and the like can be made to the present invention. However, such variations are within the scope of the invention as long as they do not depart from the spirit of the invention. In addition, certain terms used in the specification and claims of the present application are not limiting, but are used merely for convenience of description.

Claims (8)

1. A method for driving virtual character actions by real-time voice is characterized by comprising the following steps:
establishing a virtual character action scene by using the Unity engine (not limited to the Unity game engine; any real-time game engine, such as the Unreal engine, is supported);
adding corresponding variable conditions for the virtual character to execute different actions;
integrating a voice interface into a Unity engine;
acquiring voice data;
uploading the acquired voice data to a voice recognition system through the voice interface, and outputting a voice recognition result after the voice recognition system performs content recognition on the voice data;
the Unity engine receives the voice recognition result through the voice interface and matches the action variable conditions of the virtual character according to the voice recognition result;
and the Unity engine drives the virtual character to execute corresponding actions according to the matched variable conditions.
2. The method for driving virtual character actions by real-time voice according to claim 1, wherein the voice interface integrated in the Unity engine (not limited to the Unity game engine; any real-time game engine, such as the Unreal engine, is supported) is provided by a third-party voice platform.
3. The method of claim 2, wherein the voice interface provided by the third-party voice platform includes, but is not limited to, the REST API voice interface provided by the Baidu AI open platform or the Android SDK interface provided by Google.
4. The method of claim 1, wherein the speech recognition system performs content recognition on the speech data through a speech recognition model, and the speech recognition model is trained with a restricted Boltzmann machine (RBM) stochastic neural network.
5. The method for driving virtual character motion by real-time voice according to claim 1, is characterized in that the method for driving the virtual character motion is expressed by the following formula (1):
$$\hat{b} = \frac{w_1\hat{q}_{j_1} + w_2\hat{q}_{j_2} + \cdots + w_n\hat{q}_{j_n}}{\left\|w_1\hat{q}_{j_1} + w_2\hat{q}_{j_2} + \cdots + w_n\hat{q}_{j_n}\right\|} \tag{1}$$

In formula (1), $\hat{b}$ represents the kinematic deformation of the virtual character's skeletal model; $\hat{q}_{j_1}$ is the dual quaternion of the motion pose of joint $j_1$ on the skeletal model, and $w_1$ is the weight of joint $j_1$; $\hat{q}_{j_n}$ is the dual quaternion of the motion pose of joint $j_n$, and $w_n$ is the weight of joint $j_n$.
6. The method of real-time voice-driven virtual character movement according to claim 5, wherein the dual quaternion representing the joint movement posture is expressed by the following formula (2):
$$\hat{q} = \cos\frac{\hat{\theta}}{2} + \hat{s}\,\sin\frac{\hat{\theta}}{2}, \qquad \hat{\theta} = \theta_0 + \epsilon s, \qquad \hat{s} = s_0 + \epsilon\,(r \times s_0) \tag{2}$$

In the above formula, $\hat{q}$ is the dual quaternion representing the joint pose on the virtual character's skeletal model; $s_0$ is the rotation axis of the joint motion; $\theta_0$ is the rotation angle of the joint motion; $\epsilon$ is the dual operator ($\epsilon^2 = 0$); $s$ is the translation of the joint along the rotation axis; and $r \times s_0$ is the moment of the rotation axis, $r$ being the rotation center of the joint.
7. A system for driving virtual character actions by real-time voice, which can realize the method as claimed in any one of claims 1 to 6, is characterized by comprising:
the virtual character action scene establishing module, used for providing designers with the Unity engine (not limited to the Unity game engine; any real-time game engine, such as the Unreal engine, is supported) to establish a virtual character action scene;
the virtual character action condition setting module, connected with the virtual character action scene establishing module and used for providing designers with the ability to add corresponding variable conditions for the different actions of the virtual character;
the voice interface integration module is used for providing designers with voice interfaces integrated into the Unity engine;
the voice data acquisition module is used for automatically acquiring and storing externally input voice data;
the voice data uploading module is connected with the voice data acquisition module and used for uploading the acquired voice data to a voice recognition system, and the voice recognition system performs content recognition on the voice data and then outputs a voice recognition result;
a voice recognition result receiving module, connected to the voice interface integration module, for receiving the voice recognition result through the voice interface integrated in the Unity engine;
the variable condition matching module is respectively connected with the virtual character action condition setting module and the voice recognition result receiving module and is used for automatically matching the variable conditions of the virtual character actions according to the voice recognition result;
and the virtual character driving module is respectively connected with the variable condition matching module and the virtual character action scene establishing module and is used for generating a driving signal according to the matched variable condition and driving the virtual character to execute corresponding action.
8. The system of claim 7, wherein the voice interface integrated into the Unity engine is provided by a third-party voice platform, the voice interface provided by the third-party voice platform including, but not limited to, the REST API voice interface provided by the Baidu AI open platform or the Android SDK interface provided by Google.
CN202010836241.5A 2020-08-19 2020-08-19 Method and system for driving virtual character action by real-time voice Pending CN111939558A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010836241.5A CN111939558A (en) 2020-08-19 2020-08-19 Method and system for driving virtual character action by real-time voice

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010836241.5A CN111939558A (en) 2020-08-19 2020-08-19 Method and system for driving virtual character action by real-time voice

Publications (1)

Publication Number Publication Date
CN111939558A (en) 2020-11-17

Family

ID=73342809

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010836241.5A Pending CN111939558A (en) 2020-08-19 2020-08-19 Method and system for driving virtual character action by real-time voice

Country Status (1)

Country Link
CN (1) CN111939558A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113763532A (en) * 2021-04-19 2021-12-07 腾讯科技(深圳)有限公司 Human-computer interaction method, device, equipment and medium based on three-dimensional virtual object
CN114283227A (en) * 2021-11-26 2022-04-05 北京百度网讯科技有限公司 Virtual character driving method and device, electronic device and readable storage medium
CN116168686A (en) * 2023-04-23 2023-05-26 碳丝路文化传播(成都)有限公司 Digital human dynamic simulation method, device and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102930599A (en) * 2012-10-18 2013-02-13 浙江大学 Hand motion three-dimensional simulation method based on dual quaternion
CN106485774A (en) * 2016-12-30 2017-03-08 当家移动绿色互联网技术集团有限公司 Expression based on voice Real Time Drive person model and the method for attitude
CN106710590A (en) * 2017-02-24 2017-05-24 广州幻境科技有限公司 Voice interaction system with emotional function based on virtual reality environment and method
CN107424602A (en) * 2017-05-25 2017-12-01 合肥泽诺信息科技有限公司 A kind of man-machine interactive game engine based on speech recognition and human body attitude
US20180247443A1 (en) * 2017-02-28 2018-08-30 International Business Machines Corporation Emotional analysis and depiction in virtual reality
CN110782513A (en) * 2019-10-30 2020-02-11 北京中科深智科技有限公司 Method for real-time motion capture data debouncing composite algorithm
CN110895931A (en) * 2019-10-17 2020-03-20 苏州意能通信息技术有限公司 VR (virtual reality) interaction system and method based on voice recognition

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102930599A (en) * 2012-10-18 2013-02-13 浙江大学 Hand motion three-dimensional simulation method based on dual quaternion
CN106485774A (en) * 2016-12-30 2017-03-08 当家移动绿色互联网技术集团有限公司 Expression based on voice Real Time Drive person model and the method for attitude
CN106710590A (en) * 2017-02-24 2017-05-24 广州幻境科技有限公司 Voice interaction system with emotional function based on virtual reality environment and method
US20180247443A1 (en) * 2017-02-28 2018-08-30 International Business Machines Corporation Emotional analysis and depiction in virtual reality
CN107424602A (en) * 2017-05-25 2017-12-01 合肥泽诺信息科技有限公司 A kind of man-machine interactive game engine based on speech recognition and human body attitude
CN110895931A (en) * 2019-10-17 2020-03-20 苏州意能通信息技术有限公司 VR (virtual reality) interaction system and method based on voice recognition
CN110782513A (en) * 2019-10-30 2020-02-11 北京中科深智科技有限公司 Method for real-time motion capture data debouncing composite algorithm

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Gao Ying et al.: "Virtual Reality Visual Scene Simulation Technology" (虚拟现实视景仿真技术), 31 March 2014, Northwestern Polytechnical University Press *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113763532A (en) * 2021-04-19 2021-12-07 腾讯科技(深圳)有限公司 Human-computer interaction method, device, equipment and medium based on three-dimensional virtual object
CN113763532B (en) * 2021-04-19 2024-01-19 腾讯科技(深圳)有限公司 Man-machine interaction method, device, equipment and medium based on three-dimensional virtual object
CN114283227A (en) * 2021-11-26 2022-04-05 北京百度网讯科技有限公司 Virtual character driving method and device, electronic device and readable storage medium
CN114283227B (en) * 2021-11-26 2023-04-07 北京百度网讯科技有限公司 Virtual character driving method and device, electronic equipment and readable storage medium
CN116168686A (en) * 2023-04-23 2023-05-26 碳丝路文化传播(成都)有限公司 Digital human dynamic simulation method, device and storage medium
CN116168686B (en) * 2023-04-23 2023-07-11 碳丝路文化传播(成都)有限公司 Digital human dynamic simulation method, device and storage medium

Similar Documents

Publication Publication Date Title
CN110688911B (en) Video processing method, device, system, terminal equipment and storage medium
US20230316643A1 (en) Virtual role-based multimodal interaction method, apparatus and system, storage medium, and terminal
WO2021043053A1 (en) Animation image driving method based on artificial intelligence, and related device
KR102503413B1 (en) Animation interaction method, device, equipment and storage medium
US20220150285A1 (en) Communication assistance system, communication assistance method, communication assistance program, and image control program
TWI430189B (en) System, apparatus and method for message simulation
CN111939558A (en) Method and system for driving virtual character action by real-time voice
CN107765852A (en) Multi-modal interaction processing method and system based on visual human
CN110400251A (en) Method for processing video frequency, device, terminal device and storage medium
CN107340859A (en) The multi-modal exchange method and system of multi-modal virtual robot
CN105126355A (en) Child companion robot and child companioning system
CN111045582A (en) Personalized virtual portrait activation interaction system and method
JP2014519082A5 (en)
CN111414506B (en) Emotion processing method and device based on artificial intelligence, electronic equipment and storage medium
US20240070397A1 (en) Human-computer interaction method, apparatus and system, electronic device and computer medium
CN110874859A (en) Method and equipment for generating animation
CN107972028A (en) Man-machine interaction method, device and electronic equipment
WO2023284435A1 (en) Method and apparatus for generating animation
WO2022267380A1 (en) Face motion synthesis method based on voice driving, electronic device, and storage medium
CN113205569A (en) Image drawing method and device, computer readable medium and electronic device
JP2017182261A (en) Information processing apparatus, information processing method, and program
CN117830476A (en) Virtual image generation method and related device
KR102120936B1 (en) System for providing customized character doll including smart phone
CN117857892B (en) Data processing method, device, electronic equipment, computer program product and computer readable storage medium based on artificial intelligence
CN112634684B (en) Intelligent teaching method and device

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20201117