CN110895931A - VR (virtual reality) interaction system and method based on voice recognition - Google Patents

VR (virtual reality) interaction system and method based on voice recognition

Info

Publication number
CN110895931A
Authority
CN
China
Prior art keywords
module, voice, user, processing, scene
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910986351.7A
Other languages
Chinese (zh)
Inventor
刘雨松 (Liu Yusong)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Yi Neng Tong Information Technology Co Ltd
Original Assignee
Suzhou Yi Neng Tong Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Yi Neng Tong Information Technology Co Ltd filed Critical Suzhou Yi Neng Tong Information Technology Co Ltd
Priority to CN201910986351.7A priority Critical patent/CN110895931A/en
Publication of CN110895931A publication Critical patent/CN110895931A/en
Pending legal-status Critical Current

Classifications

    • G10L 15/02: Feature extraction for speech recognition; selection of recognition unit
    • G06F 3/01: Input arrangements or combined input and output arrangements for interaction between user and computer
    • G10L 15/063: Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L 15/16: Speech classification or search using artificial neural networks
    • G10L 15/1822: Parsing for meaning understanding
    • G10L 15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 21/0208: Noise filtering for speech enhancement
    • G10L 2015/223: Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The invention relates to the field of voice recognition systems and discloses a VR (virtual reality) interaction system based on voice recognition. The system comprises a cloud end and a VR peripheral end: the cloud end comprises a voice recognition module, a semantic recognition module, a scene processing module, a storage module and a communication module; the VR peripheral end comprises a display module, a voice input module, a voice output module and its own communication module. The invention also discloses a method for the VR interaction system based on voice recognition, which comprises the following steps: constructing a knowledge base and dialog library; starting the cloud end and the VR peripheral end; the user wearing the VR peripheral; user input; and cloud processing. The method effectively overcomes the defects of poor interactivity and a strong sense of detachment in existing VR products, and realizes a more natural interactive experience between people and virtual scene characters.

Description

VR (virtual reality) interaction system and method based on voice recognition
Technical Field
The invention relates to the field of voice recognition systems, and in particular to a VR (virtual reality) interaction system and method based on voice recognition.
Background
VR, short for virtual reality technology, is an important branch of simulation technology. It combines simulation technology, computer graphics, human-machine interface technology, multimedia technology, sensing technology and network technology, and is a challenging frontier discipline spanning multiple fields of research. Virtual reality technology (VR) mainly covers the simulated environment, perception, natural skills and sensing equipment. The simulated environment is a real-time, dynamic, three-dimensional realistic image generated by a computer. Perception means that an ideal VR system should provide all the senses a person has: in addition to the visual perception generated by computer graphics technology, there are auditory, tactile, force and motion perception, and even smell and taste, together called multi-perception. Natural skills refer to head rotation, eye movement, gestures or other human actions; the computer processes data matching the participant's actions, responds to the user's input in real time, and feeds the results back to the user's senses. Sensing devices are three-dimensional interaction devices.
Virtual reality was proposed by the American company VPL Research in the 1980s. Concretely, it is a technique that comprehensively uses a computer graphics system and various interface devices for display and control to provide an immersive sensation in an interactive, computer-generated three-dimensional environment. Virtual reality technology is a computer simulation system that can create and let users experience a virtual world: the computer generates an interactive three-dimensional dynamic scene together with a simulation of entity behaviors fusing multi-source information, so as to immerse the user in the environment.
VR technology has broad prospects in medicine, education, real estate and design. At present, VR interaction relies mainly on motion capture and gesture recognition, and the user experience is poor, so voice interaction has become a strong demand from users. Speech recognition techniques today largely fall into two directions: traditional acoustic models and deep learning models. Traditional speech recognition generates an acoustic model by extracting the speaker's audio features under certain algorithms. Deep learning models have risen rapidly in recent years; a currently popular approach is the hidden Markov model combined with a deep neural network, a discriminative model trained from data. With continuing algorithmic progress and hardware upgrades, the advantages of deep learning models grow ever more obvious, and the invention adopts a speech recognition model based on deep learning. Existing VR products built on speech recognition still suffer from poor interactivity and a strong sense of detachment, failing to achieve natural interaction between people and virtual scene characters, and need improvement.
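The background frames speech recognition as feature extraction followed by a learned model. As a minimal illustration only (this is not the patent's implementation; the pre-emphasis coefficient, frame length and hop are common defaults assumed here), a frame-based spectral feature extractor can be sketched as:

```python
import numpy as np

def extract_features(signal, sample_rate=16000, frame_len=400, hop=160):
    """Frame a speech signal and compute log-magnitude spectral features.

    A stand-in for "noise reduction and de-reverberation followed by
    feature extraction"; a real system would add spectral subtraction,
    dereverberation, and mel filterbank / MFCC stages.
    """
    # Pre-emphasis: boost high frequencies to flatten the spectrum.
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    # Split into overlapping frames (25 ms frames, 10 ms hop at 16 kHz).
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    frames = np.stack([emphasized[i * hop:i * hop + frame_len]
                       for i in range(n_frames)])
    # Window each frame and take the magnitude spectrum.
    frames *= np.hamming(frame_len)
    spectra = np.abs(np.fft.rfft(frames, axis=1))
    # Log compression, as is typical before a neural acoustic model.
    return np.log(spectra + 1e-8)

# One second of synthetic audio yields a (frames, bins) feature matrix.
feats = extract_features(np.random.randn(16000))
print(feats.shape)  # (98, 201)
```

Such a feature matrix would then be fed to the deep-learning acoustic model the background describes.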
Disclosure of Invention
The present invention is directed to a VR interaction system and method based on speech recognition, so as to solve the problems in the background art.
In order to achieve this purpose, the invention provides the following technical scheme: a VR (virtual reality) interaction system based on voice recognition comprises a cloud end and a VR peripheral end, wherein the cloud end comprises a voice recognition module, a semantic recognition module, a scene processing module, a storage module and a communication module, and the VR peripheral end comprises a display module, a voice input module, a voice output module and its own communication module;
the voice recognition module is mainly used for primarily processing the voice of a user, namely extracting voice characteristics by means of noise reduction and reverberation removal on the basis of the voice input module, and then generating and checking a voice model by means of an algorithm based on deep learning, wherein the voice recognition module uses a plurality of algorithms and processing tools and is connected with the semantic recognition module;
the semantic recognition module carries out semantic processing again on the basis of the voice recognition module and deduces the user intention, the part needs to be analyzed according to the context to improve the accuracy, and the semantic recognition module is connected with the scene processing module;
the scene processing module analyzes the recognition result of the semantic recognition module, adjusts the layout transformation of the scene according to the result, and outputs the result through the display module, which needs the module to call a knowledge base in the storage module for relevant processing, and the scene processing module is connected with the storage module and the display module;
the scene processing module is used for calling the required dialog base knowledge base stored in the storage module according to the result of the previous step and outputting the dialog base through the voice output module, and the knowledge base is output through the display module;
the voice input module comprises a plurality of audio input devices and is connected with the voice output module;
the voice output module carries out voice output on the result in the storage module;
the communication module is responsible for communication among the peripheral devices.
Preferably, the audio input device of the voice input module comprises a microphone.
Preferably, the equipment of the voice output module comprises an earphone power amplifier.
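The description splits the scene processing module's output in two: the dialog goes to the voice output module, the scene/knowledge content to the display module. A dictionary-backed stand-in for the storage module's dialog and knowledge bases (the schema and entries here are assumptions, not the patent's) makes that split concrete:

```python
# A toy dialog/knowledge base standing in for the storage module.
DIALOG_BASE = {
    "greet": {"reply": "Welcome, traveler!", "animation": "wave"},
    "ask_way": {"reply": "The market is to the east.", "animation": "point_east"},
}

def scene_response(intent):
    """Return (spoken reply, avatar action) for a recognized intent.

    The reply is routed to the voice output module, the animation to
    the display module, matching the dual-output design.
    """
    entry = DIALOG_BASE.get(intent, {"reply": "Pardon?", "animation": "idle"})
    return entry["reply"], entry["animation"]

reply, animation = scene_response("greet")
print(reply, "/", animation)  # Welcome, traveler! / wave
```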
A method for the VR interaction system based on voice recognition comprises the following steps:
constructing the knowledge base and dialog library: first, store the corresponding dialog database in the storage module;
starting the cloud end and VR peripheral end: after the cloud end and the VR peripheral end are started, ensure the communication module works normally;
the user wears the VR peripheral: after wearing the VR peripheral, the user can perceive the virtual scene;
user input: the user inputs voice through the audio input peripheral, either following prompts from the virtual scene or on his own initiative;
cloud processing: after processing in the cloud, the user receives response information through the earphone on the VR end and at the same time sees the response actions and expressions of the virtual scene on the display device of the display module.
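The steps above can be sketched as one interaction cycle, with the cloud's three modules passed in as stand-in callables (all names and the dummy module behaviors are hypothetical, for illustration only):

```python
def vr_interaction_cycle(audio_frame, recognize, understand, respond):
    """One pass: input -> voice recognition -> semantic analysis ->
    scene processing -> dual (audio + display) output."""
    text = recognize(audio_frame)       # voice recognition module
    intent = understand(text)           # semantic recognition module
    reply, animation = respond(intent)  # scene processing + storage module
    return {"audio_out": reply, "display_out": animation}

# Dummy modules wired together for illustration.
result = vr_interaction_cycle(
    b"...",  # raw audio captured by the microphone
    recognize=lambda a: "hello there",
    understand=lambda t: "greet" if "hello" in t else "unknown",
    respond=lambda i: ("Welcome!", "wave") if i == "greet" else ("Pardon?", "idle"),
)
print(result)  # {'audio_out': 'Welcome!', 'display_out': 'wave'}
```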
Preferably, in a specific application of the above steps, the user inputs audio through an input device such as a microphone; the audio is transmitted to the cloud, where the voice recognition module performs voice recognition to obtain the user information preliminarily, and the semantic recognition module then performs semantic recognition, so that the cloud understands the user's instruction and infers the user's intent; the instruction is then processed in the scene processing module according to that intent, and the result is transmitted to the display module. Meanwhile, the knowledge base in the storage module is called and the corresponding result is returned to the VR peripheral end, where the user listens to the dialog information through the output equipment; the user thus watches through the display module and listens through the audio equipment of the voice output module, obtaining dual visual and auditory feedback for greater immersion.
Compared with the prior art, the invention has the following beneficial effects. The user can communicate with characters in the virtual scene through the voice input module on the VR peripheral end, such as a microphone. The voice recognition module first removes interference factors from the surrounding environment by noise reduction, de-reverberation and similar processing, then extracts voice features, performs analysis and modeling through a deep learning algorithm based on a deep neural network to generate a voice model, and then compares and recognizes the voice information input by the user, parsing out its content and instruction information. On the basis of this voice recognition, the semantic recognition module performs NLP word segmentation, keyword analysis and so on, combined with the context, to infer the user's probable intent. The scene processing module then performs scene processing according to the result from the semantic recognition module; it calls the knowledge base in the storage module to produce the corresponding scene response, including graphics adjustment and context processing, and the feedback in the output is the action transformation of the virtual character. These processing results are transmitted to the display module on the VR peripheral end through a data line or other means, and the corresponding voice information and actions are output at the same time, so that dual visual and auditory perception is achieved and the user's immersion is greatly enhanced.
The method effectively overcomes the defects of poor interactivity and a strong sense of detachment in existing VR products, and realizes a more natural interactive experience between people and virtual scene characters.
Drawings
Fig. 1 is a schematic diagram of a module structure according to the present invention.
In the figure: 1. cloud end; 2. VR peripheral end; 3. voice recognition module; 4. semantic recognition module; 5. scene processing module; 6. storage module; 7. display module; 8. voice input module; 9. voice output module.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In the description of the present invention, it should be noted that, unless otherwise explicitly specified or limited, the terms "disposed," "mounted," "connected," and "connected" are to be construed broadly, e.g., as meaning either a fixed connection, a removable connection, or an integral connection; can be mechanically or electrically connected; they may be connected directly or indirectly through intervening media, or they may be interconnected between two elements. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to specific situations.
Referring to fig. 1, the present invention provides a technical solution: a VR interaction system based on voice recognition comprises a cloud end 1 and a VR peripheral end 2, wherein the cloud end 1 comprises a voice recognition module 3, a semantic recognition module 4, a scene processing module 5, a storage module 6 and a communication module, and the VR peripheral end 2 comprises a display module 7, a voice input module 8, a voice output module 9 and its own communication module;
the voice recognition module 3 mainly performs primary processing on the voice of a user, namely on the basis of the voice input module 8, extracts voice features in a noise reduction and reverberation removal mode, and then generates and checks a voice model through an algorithm based on deep learning, wherein the voice recognition module 3 is connected with the semantic recognition module 4 by using a plurality of algorithms and processing tools;
the semantic recognition module 4 performs semantic processing on the output of the voice recognition module 3 and infers the user's intent; this part must be analyzed in context to improve accuracy; the semantic recognition module 4 is connected with the scene processing module 5;
the scene processing module 5 analyzes the recognition result of the semantic recognition module 4, adjusts the layout and transformation of the scene accordingly, and outputs the result through the display module 7; for this it calls the knowledge base in the storage module 6 for the relevant processing; the scene processing module 5 is connected with the storage module 6 and the display module 7;
the storage module 6 stores the knowledge base and the dialog library; the scene processing module 5 calls the required dialog library and knowledge base stored in the storage module 6 according to the result of the previous step, the dialog is output through the voice output module 9, and the knowledge content is output through the display module 7;
the voice input module 8 comprises a plurality of audio input devices, and the voice input module 8 is connected with the voice output module 9;
the voice output module 9 outputs the results from the storage module 6 as speech;
the communication module is responsible for communication between the cloud end 1 and the VR peripheral end 2.
Further, the audio input device of the voice input module 8 includes a microphone, and the equipment of the voice output module 9 includes an earphone power amplifier.
The principle method based on the system in the embodiment comprises the following steps:
constructing the knowledge base and dialog library: first, store the corresponding dialog database in the storage module 6;
starting the cloud 1 and VR peripheral end 2: after the cloud 1 and the VR peripheral end 2 are started, ensure the communication module works normally;
the user wears the VR peripheral: after wearing the VR peripheral, the user can perceive the virtual scene;
user input: the user inputs voice through the audio input peripheral, either following prompts from the virtual scene or on his own initiative;
cloud 1 processing: after processing in the cloud 1, the user receives response information through the earphone on the VR end and at the same time sees the response actions and expressions of the virtual scene on the display device of the display module 7.
In a specific application of the above steps, the use proceeds as follows: the user inputs audio through an input device such as a microphone; the audio is transmitted to the cloud 1, where the voice recognition module 3 performs voice recognition to obtain the user information preliminarily and the semantic recognition module 4 then performs semantic recognition, so that the cloud 1 understands the user's instruction and infers the user's intent; the instruction is then processed in the scene processing module 5 according to that intent, and the result is transmitted to the display module 7. Meanwhile, the knowledge base in the storage module 6 is called and the corresponding result is returned to the VR peripheral end 2, where the user listens to the dialog information through the output equipment of the VR peripheral end 2; the user thus watches through the display module 7 and listens through the audio equipment of the voice input module 8 and the voice output module 9, obtaining dual visual and auditory feedback for greater immersion.
It should be noted that the invention is not limited to a cloud plus VR peripheral arrangement: the "cloud" here refers to any device in which the scene control system and the intelligent voice system are independent of the VR peripheral, and it may equally be an actual cloud service; for ease of presentation and understanding, the invention uses the cloud as its example.
In general the core processor is independent of the VR peripheral end. Because a large amount of data must be computed, a processor with high computing performance is needed, and at the present stage such a processor cannot meet the requirement of an all-in-one machine that integrates the peripherals. The invention therefore changes the approach and proposes a brand-new method that places the processing in the cloud; the advantage of this mode is that cloud processing has better networking performance and is better suited to big-data processing.
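The patent leaves the cloud/peripheral wire format unspecified. Purely as an assumption, one plausible JSON message shape for shipping microphone audio to the cloud and splitting the returned answer into audio and display feedback might be:

```python
import json

def make_cloud_request(pcm_bytes, session_id):
    """Package a microphone capture for the cloud.

    The field names are assumptions, since the patent defines no
    wire format; raw PCM is hex-encoded for text transport.
    """
    return json.dumps({
        "session": session_id,
        "audio": pcm_bytes.hex(),
    })

def parse_cloud_response(payload):
    """Split the cloud's answer into the spoken reply (for the voice
    output module) and the avatar action (for the display module)."""
    msg = json.loads(payload)
    return msg["reply_text"], msg["avatar_action"]

req = make_cloud_request(b"\x00\x01", "sess-1")
text, action = parse_cloud_response(
    '{"reply_text": "Hello!", "avatar_action": "nod"}')
print(text, action)  # Hello! nod
```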
Under this new approach, the user communicates with characters in the virtual scene through the voice input module 8 on the VR peripheral 2, for example a microphone. The voice recognition module 3 first removes interference factors from the surrounding environment by noise reduction, de-reverberation and similar processing, then extracts voice features, performs analysis and modeling through a deep learning algorithm based on a deep neural network to generate a voice model, and then compares and recognizes the voice information input by the user, parsing out its content and instruction information. On this basis, the cloud 1 performs NLP word segmentation, keyword analysis and so on in the semantic recognition module 4, combining the context to infer the user's probable intent. The scene processing module 5 then performs scene processing according to the result from the semantic recognition module 4; it calls the knowledge base in the storage module 6 and produces the corresponding scene response, including graphics adjustment and context processing, and the feedback in the output is the action transformation of the virtual character. These processing results are transmitted to the display module 7 on the VR peripheral end 2 through a data line or other means, and the corresponding voice information is output at the same time. The storage module 6 contains a knowledge graph base, a dialog library and the like; on the basis of the scene processing, the cloud returns a response dialog from the dialog library of the storage module 6, while the cloud 1 performs the related scene processing in the scene processing module 5 according to the user's instructions, such as responsive expressions or actions. In this way dual visual and auditory perception is achieved, and the user's immersion is greatly enhanced.
This process analysis shows that the invention effectively overcomes the defects of poor interactivity and a strong sense of detachment in existing VR products, and realizes a more natural interactive experience between people and virtual scene characters.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative examples and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The scope of the invention is, therefore, indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims should not be construed as limiting the claim concerned.

Claims (5)

1. A VR interactive system based on speech recognition, characterized in that: it comprises a cloud end (1) and a VR (virtual reality) peripheral end (2), wherein the cloud end (1) comprises a voice recognition module (3), a semantic recognition module (4), a scene processing module (5), a storage module (6) and a communication module, and the VR peripheral end (2) comprises a display module (7), a voice input module (8), a voice output module (9) and its own communication module;
the voice recognition module (3) mainly performs primary processing of the user's voice: on the basis of the voice input module (8), voice features are extracted after noise reduction and de-reverberation, and a voice model is then generated and validated with a deep-learning-based algorithm; the voice recognition module (3) uses several algorithms and processing tools and is connected with the semantic recognition module (4);
the semantic recognition module (4) performs semantic processing on the output of the voice recognition module (3) and infers the user's intent; this part must be analyzed in context to improve accuracy; the semantic recognition module (4) is connected with the scene processing module (5);
the scene processing module (5) analyzes the recognition result of the semantic recognition module (4), adjusts the layout and transformation of the scene accordingly, and outputs the result through the display module (7); for this it calls the knowledge base in the storage module (6) for the relevant processing; the scene processing module (5) is connected with the storage module (6) and the display module (7);
the scene processing module (5) calls the required dialog library and knowledge base stored in the storage module (6) according to the result of the previous step; the dialog is output through the voice output module (9), and the knowledge content is output through the display module (7);
the voice input module (8) comprises a plurality of audio input devices, and the voice input module (8) is connected with the voice output module (9);
the voice output module (9) outputs the results from the storage module (6) as speech;
the communication module is responsible for communication between the cloud end (1) and the VR peripheral end (2).
2. The VR interaction system of claim 1, wherein: the audio input device of the voice input module (8) comprises a microphone.
3. The VR interaction system of claim 1, wherein: the equipment of the voice output module (9) comprises an earphone power amplifier.
4. A method for the voice-recognition-based VR interaction system of any of claims 1-3, characterized in that the method comprises the following steps:
constructing the knowledge base and dialog library: first, store the corresponding dialog library in the storage module (6);
starting the cloud end (1) and VR peripheral end (2): after the cloud end (1) and the VR peripheral end (2) are started, ensure the communication module works normally;
the user wears the VR peripheral: after wearing the VR peripheral, the user can perceive the virtual scene;
user input: the user inputs voice through the audio input peripheral, either following prompts from the virtual scene or on his own initiative;
cloud (1) processing: after processing in the cloud (1), the user receives response information through the earphone on the VR end and at the same time sees the response actions and expressions of the virtual scene on the display device of the display module (7).
5. The method of claim 4, characterized in that: in a specific application of the method steps, the user inputs audio through an input device such as a microphone; the audio is transmitted to the cloud end (1), where the voice recognition module (3) performs voice recognition to obtain the user information preliminarily and the semantic recognition module (4) then performs semantic recognition, so that the cloud end (1) understands the user's instruction and infers the user's intent; the instruction is then processed in the scene processing module (5) according to that intent, and the result is transmitted to the display module (7); meanwhile, the knowledge base in the storage module (6) is called and the corresponding result is returned to the VR peripheral end (2), where the user listens to the dialog information through the output equipment of the VR peripheral end (2); the user thus watches through the display module (7) and listens through the audio equipment of the voice input module (8) and the voice output module (9), obtaining dual visual and auditory feedback for greater immersion.
CN201910986351.7A 2019-10-17 2019-10-17 VR (virtual reality) interaction system and method based on voice recognition Pending CN110895931A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910986351.7A CN110895931A (en) 2019-10-17 2019-10-17 VR (virtual reality) interaction system and method based on voice recognition

Publications (1)

Publication Number Publication Date
CN110895931A true CN110895931A (en) 2020-03-20

Family

ID=69786337

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910986351.7A Pending CN110895931A (en) 2019-10-17 2019-10-17 VR (virtual reality) interaction system and method based on voice recognition

Country Status (1)

Country Link
CN (1) CN110895931A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106550156A (en) * 2017-01-23 2017-03-29 苏州咖啦魔哆信息技术有限公司 A kind of artificial intelligence's customer service system and its implementation based on speech recognition
CN109841217A (en) * 2019-01-18 2019-06-04 苏州意能通信息技术有限公司 A kind of AR interactive system and method based on speech recognition
US20190198019A1 (en) * 2017-12-26 2019-06-27 Baidu Online Network Technology (Beijing) Co., Ltd Method, apparatus, device, and storage medium for voice interaction
CN110335595A (en) * 2019-06-06 2019-10-15 平安科技(深圳)有限公司 Slotting based on speech recognition asks dialogue method, device and storage medium

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111696536A (en) * 2020-06-05 2020-09-22 北京搜狗科技发展有限公司 Voice processing method, apparatus and medium
CN111696536B (en) * 2020-06-05 2023-10-27 北京搜狗智能科技有限公司 Voice processing method, device and medium
CN111768768A (en) * 2020-06-17 2020-10-13 北京百度网讯科技有限公司 Voice processing method and device, peripheral control equipment and electronic equipment
CN111768768B (en) * 2020-06-17 2023-08-29 北京百度网讯科技有限公司 Voice processing method and device, peripheral control equipment and electronic equipment
CN111986297A (en) * 2020-08-10 2020-11-24 山东金东数字创意股份有限公司 Virtual character facial expression real-time driving system and method based on voice control
CN111939558A (en) * 2020-08-19 2020-11-17 北京中科深智科技有限公司 Method and system for driving virtual character action by real-time voice
CN112216278A (en) * 2020-09-25 2021-01-12 威盛电子股份有限公司 Speech recognition system, instruction generation system and speech recognition method thereof
CN113672155A (en) * 2021-07-02 2021-11-19 浪潮金融信息技术有限公司 Self-service operating system, method and medium based on VR technology
CN113672155B (en) * 2021-07-02 2023-06-30 浪潮金融信息技术有限公司 VR technology-based self-service operation system, method and medium
CN117391822A (en) * 2023-12-11 2024-01-12 中汽传媒(天津)有限公司 VR virtual reality digital display method and system for automobile marketing
CN117391822B (en) * 2023-12-11 2024-03-15 中汽传媒(天津)有限公司 VR virtual reality digital display method and system for automobile marketing

Similar Documents

Publication Publication Date Title
CN110895931A (en) VR (virtual reality) interaction system and method based on voice recognition
WO2022048403A1 (en) Virtual role-based multimodal interaction method, apparatus and system, storage medium, and terminal
CN106653052B (en) Virtual human face animation generation method and device
WO2022052481A1 (en) Artificial intelligence-based vr interaction method, apparatus, computer device, and medium
CN111145322B (en) Method, apparatus, and computer-readable storage medium for driving avatar
US20230042654A1 (en) Action synchronization for target object
CN113454708A (en) Linguistic style matching agent
CN108877336A (en) Teaching method, cloud service platform and tutoring system based on augmented reality
JP2022524944A (en) Interaction methods, devices, electronic devices and storage media
CN107003825A (en) System and method with dynamic character are instructed by natural language output control film
CN112668407A (en) Face key point generation method and device, storage medium and electronic equipment
Morishima Real-time talking head driven by voice and its application to communication and entertainment
CN205451551U (en) Speech recognition driven augmented reality human -computer interaction video language learning system
El Haddad et al. Laughter and smile processing for human-computer interactions
KR20060091329A (en) Interactive system and method for controlling an interactive system
US20220301250A1 (en) Avatar-based interaction service method and apparatus
Ding et al. Interactive multimedia mirror system design
CN114201596A (en) Virtual digital human use method, electronic device and storage medium
Chandrasiri et al. Internet communication using real-time facial expression analysis and synthesis
Morishima et al. Face-to-face communicative avatar driven by voice
Leandro Parreira Duarte et al. Coarticulation and speech synchronization in MPEG-4 based facial animation
Ohsugi et al. A Comparative Study of Statistical Conversion of Face to Voice Based on Their Subjective Impressions.
Santos-Pérez et al. AVATAR: an open source architecture for embodied conversational agents in smart environments
Sundblad et al. OLGA—a multimodal interactive information assistant
Zoric et al. Towards facial gestures generation by speech signal analysis using huge architecture

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200320