CN117711369A - Intelligent communication system for human and animal situational language - Google Patents
- Publication number
- CN117711369A (application CN202311737461.2A)
- Authority
- CN
- China
- Prior art keywords
- component
- animal
- language
- model
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G10L13/02 — Methods for producing synthetic speech; speech synthesisers
- G10L13/08 — Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
- G10L15/065 — Adaptation (creation of reference templates; training of speech recognition systems)
- G10L17/26 — Recognition of special voice characteristics, e.g. recognition of animal voices
- G10L21/0208 — Noise filtering (speech enhancement, e.g. noise reduction or echo cancellation)
- G06F18/213 — Feature extraction, e.g. by transforming the feature space; summarisation; mappings, e.g. subspace methods
Abstract
The invention belongs to the field of language communication and provides an intelligent communication system for human and animal situational language. The system comprises a database storing animal information and animal model information, and a speech and text recognition module comprising a speech signal preprocessing component, a feature extraction component, an acoustic model component and a language model component. The speech signal preprocessing component performs noise elimination and speech enhancement on an input speech signal; the feature extraction component extracts spectral features from the speech signal; the acoustic model component performs speech recognition on the spectral features; and the language model component further corrects and optimizes the recognition result. By combining speech recognition, image processing, animal model construction and speech synthesis, the invention realizes intelligent situational-language communication between humans and animals: it can give the user lifelike animal feedback in the form of animal language and actions, support personalized interaction, and provide more accurate feedback.
Description
Technical Field
The invention belongs to the field of language communication, and in particular relates to an intelligent communication system for human and animal situational language.
Background
In daily life there is often a need to interact and communicate with animals; for example, when visiting a zoo, one would like to better understand an animal's habits, emotions and needs. However, because of the large language barrier between humans and animals, people usually cannot truly communicate with animals or understand their intentions and emotions.
In the prior art, language communication between humans and animals relies mainly on the expertise of animal specialists, and usually achieves only one-way translation or simulation: an expert translates an animal's sounds or behavior into human language, or a human utterance is imitated as animal sounds and played back to the animal. Such approaches are limited and inconvenient, and cannot achieve genuine two-way communication between humans and animals. New techniques and methods therefore need to be developed and applied to achieve more accurate, natural and bidirectional language communication.
Disclosure of Invention
To solve the above technical problems, the invention provides an intelligent communication system for human and animal situational language, aiming at the problem that the prior art relies mainly on the expertise of animal specialists and achieves only one-way language translation or simulation.
An intelligent communication system for human and animal situational language comprises:
the database is used for storing animal information and animal model information;
a speech and text recognition module, which comprises a speech signal preprocessing component, a feature extraction component, an acoustic model component, a language model component and a text recognition component; the speech signal preprocessing component is used for performing noise elimination and speech enhancement on an input speech signal; the feature extraction component is used for extracting spectral features from the speech signal; the acoustic model component is used for performing speech recognition on the spectral features; the language model component is used for further correcting and optimizing the recognition result; the text recognition component is used for recognizing text content;
a shooting module, which comprises an image capturing component, an image processing component, and an animal expression and action recognition component; the image capturing component is used for acquiring animal images; the image processing component is used for preprocessing and enhancing the captured images; the animal expression and action recognition component is used for analyzing the processed images and recognizing and extracting the animal's expression features from them;
an animal image display module, which is used for comparing the animal image shot by the shooting module with the animal information in the database, acquiring the animal model information corresponding to that animal information, and generating a corresponding animal model; the animal image display module comprises a model making component, a skeletal animation component, an expression transformation component and an action generation component; the model making component is used for creating the animal model; the skeletal animation component is used for adding a skeleton to the animal model so that the model can be animated according to the animal's behavior and expression; the expression transformation component is used for adjusting the animal model's expression; the action generation component is used for generating the animal model's actions;
a speech synthesis module comprising a phoneme conversion component, an acoustic parameter generation component and a speech synthesis model component; the phoneme conversion component is used for converting the processed text information into a corresponding phoneme sequence; the acoustic parameter generation component is used for generating acoustic parameters from the phoneme sequence; the speech synthesis model component is used for converting the acoustic parameters into a speech signal.
Preferably, the system further comprises a computer vision module comprising an image preprocessing component, a target detection component and a key point identification component; the image preprocessing component is used for preprocessing the captured animal images; the target detection component is used for identifying an animal target in an animal image and framing its position; the key point identification component is used for identifying the animal's key points in the image.
Preferably, the system further comprises a context understanding module comprising a dialogue management component and a dialogue history tracking component; the dialogue management component is used for managing the dialogue flow; the dialogue history tracking component is used for tracking and analyzing previous dialogue history.
Preferably, the dialogue flow includes tracking and transition of dialogue states.
Preferably, the database also stores user interaction information.
Preferably, the system further comprises a multi-language support module comprising a language detection component, a translation engine interface component and a speech synthesis engine interface component; the language detection component is used for detecting the language of the input text; the translation engine interface component is used for interacting with an external translation engine to translate text; the speech synthesis engine interface component is used for interacting with an external speech synthesis engine to achieve multi-language speech synthesis.
Preferably, language translation information is also stored in the database.
Compared with the prior art, the invention has the following beneficial effects:
1. By combining speech recognition, image processing, animal model construction, speech synthesis and related technologies, the invention realizes intelligent situational-language communication between humans and animals. It can give the user lifelike animal feedback in the form of animal language and actions, support personalized interaction and provide more accurate feedback, thereby improving the system's intelligence and user satisfaction, providing a new theoretical basis and reference for research on animal behavior and language, and promoting research and exploration in related fields.
Drawings
FIG. 1 is a schematic diagram of the overall structure of the present invention;
FIG. 2 is a schematic diagram of a computer vision module structure according to the present invention;
FIG. 3 is a schematic diagram of a context understanding module structure of the present invention;
FIG. 4 is a schematic diagram of a multi-language support module according to the present invention;
FIG. 5 is a schematic structural diagram of the second embodiment of the present invention.
In the figure:
1. a database; 11. animal information; 12. language translation information; 13. user interaction information; 14. animal model information;
2. a speech and text recognition module; 21. a speech signal preprocessing component; 22. a feature extraction component; 23. an acoustic model component; 24. a language model component; 25. a text recognition component;
3. a shooting module; 31. an image capturing component; 32. an image processing component; 33. an animal expression and action recognition component;
4. an animal image display module; 41. a model making component; 42. a skeletal animation component; 43. an expression transformation component; 44. an action generation component;
5. a speech synthesis module; 51. a phoneme conversion component; 52. an acoustic parameter generation component; 53. a speech synthesis model component;
6. a computer vision module; 61. an image preprocessing component; 62. a target detection component; 63. a key point identification component;
7. a context understanding module; 71. a dialogue management component; 72. a dialogue history tracking component;
8. a multilingual support module; 81. a language detection component; 82. a translation engine interface component; 83. a speech synthesis engine interface component.
Detailed Description
Embodiments of the invention are described in further detail below with reference to the accompanying drawings and examples. The following examples illustrate the invention but are not intended to limit its scope.
Embodiment one: as shown in figs. 1 to 4, the invention provides an intelligent communication system for human and animal situational language, comprising a database 1, in which animal information 11, animal model information 14, user interaction information 13 and language translation information 12 are stored;
animal information 11: stores the characteristics, expressions, sounds, behaviors and other information of various animals, so that the system can translate and simulate correctly according to the animal species;
animal model information 14: stores information about each animal's body structure, such as geometry, body shape, weight, bone structure and organ size;
language translation information 12: stores the mapping between human language and animal expressions, actions and sounds, so that the system can translate human language into the correct animal expressions and sounds;
user interaction information 13: stores user input, historical interaction records and similar information, so that the system can interact in a personalized way and provide more accurate feedback.
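The patent does not fix a storage schema for these four record types; as a rough sketch, they might be modeled as follows (all field names here are illustrative assumptions, not part of the disclosure):

```python
from dataclasses import dataclass

@dataclass
class AnimalInfo:            # record type 11: species-level traits
    species: str
    expressions: list[str]   # e.g. ["ears back", "mouth open"]
    sounds: list[str]        # e.g. ["bleat", "chirp"]
    behaviors: list[str]

@dataclass
class AnimalModelInfo:       # record type 14: body-structure data for the 3D model
    species: str
    mesh_path: str           # geometry / body-shape asset
    skeleton_path: str       # bone-structure asset
    weight_kg: float

@dataclass
class TranslationMapping:    # record type 12: human phrase -> animal signal
    phrase: str
    animal_expression: str
    animal_sound: str

@dataclass
class UserInteraction:       # record type 13: per-user history for personalization
    user_id: str
    utterance: str
    system_response: str
```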
a speech and text recognition module 2, comprising a speech signal preprocessing component 21, a feature extraction component 22, an acoustic model component 23, a language model component 24 and a text recognition component 25; the speech signal preprocessing component 21 performs noise elimination and speech enhancement on the input speech signal; the feature extraction component 22 extracts spectral features from the speech signal; the acoustic model component 23 performs speech recognition on the spectral features; the language model component 24 further corrects and optimizes the recognition result; the text recognition component 25 recognizes text content;
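As a minimal sketch of components 21 and 22, assuming the `librosa` library is available: the pre-emphasis coefficient, noise gate and MFCC settings below are illustrative choices, not the patent's.

```python
import numpy as np
import librosa

def preprocess(signal: np.ndarray) -> np.ndarray:
    """Component 21: crude noise elimination and speech enhancement."""
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])  # pre-emphasis
    gate = 0.02 * np.max(np.abs(emphasized))     # simple noise-gate threshold
    emphasized[np.abs(emphasized) < gate] = 0.0
    return emphasized

def spectral_features(signal: np.ndarray, sr: int) -> np.ndarray:
    """Component 22: MFCCs as the spectral features handed to the acoustic model."""
    return librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=13)

sr = 16000
y = librosa.tone(440, sr=sr, duration=1.0)       # stand-in for a real recording
feats = spectral_features(preprocess(y), sr)     # shape: (13, n_frames)
```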
a photographing module 3, the photographing module 3 including an image capturing component 31, an image processing component 32, and an animal expression and motion recognition component 33; the image capturing component 31 is for capturing images of animals, and may capture still images or a continuous stream of images; the image processing component 32 is used for preprocessing and enhancing the captured image, including noise removal, image enhancement, image segmentation, etc.; the animal expression and motion recognition component 33 is configured to analyze the image processed image, recognize and extract expression features of the animal therefrom, and may employ computer vision and machine learning techniques, such as face detection, key point positioning, feature extraction, etc., to recognize and quantify the expression state of the animal;
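A hedged sketch of the kind of preprocessing and enhancement component 32 might perform, using OpenCV; the denoising and CLAHE parameters are assumptions. Component 33 would then run a trained expression model on the cleaned frame.

```python
import cv2
import numpy as np

def preprocess_frame(frame: np.ndarray) -> np.ndarray:
    """Component 32: denoise, then enhance contrast on the luminance channel."""
    denoised = cv2.fastNlMeansDenoisingColored(frame, None, 10, 10, 7, 21)
    lab = cv2.cvtColor(denoised, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    enhanced = cv2.merge((clahe.apply(l), a, b))
    return cv2.cvtColor(enhanced, cv2.COLOR_LAB2BGR)

frame = np.full((240, 320, 3), 128, dtype=np.uint8)  # stand-in for a captured image
clean = preprocess_frame(frame)                      # input to component 33
```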
an animal image display module 4, which compares the animal image shot by the shooting module 3 with the animal information 11 in the database 1, acquires the corresponding animal model information 14 and generates a corresponding animal model; the animal image display module 4 comprises a model making component 41, a skeletal animation component 42, an expression transformation component 43 and an action generation component 44; the model making component 41 creates the animal model; the skeletal animation component 42 adds a skeleton to the animal model so that it can be animated according to the animal's behavior and expression; the expression transformation component 43 adjusts the animal model's expression, for example changing the shape of the eyes and mouth; the action generation component 44 generates the animal model's actions, such as walking and jumping;
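The patent does not specify the animation math; one common way the expression transformation component 43 could adjust the eyes and mouth is linear blendshapes, sketched below with NumPy (the mesh names and weights are hypothetical).

```python
import numpy as np

def blend_expression(neutral: np.ndarray,
                     targets: dict[str, np.ndarray],
                     weights: dict[str, float]) -> np.ndarray:
    """Mix expression deltas into the neutral mesh; arrays are (n_vertices, 3)."""
    mesh = neutral.astype(float).copy()
    for name, w in weights.items():
        mesh += w * (targets[name] - neutral)
    return mesh

neutral = np.zeros((4, 3))                      # tiny placeholder mesh
mouth_open = neutral + [0.0, -0.1, 0.0]         # hypothetical target shape
posed = blend_expression(neutral, {"mouth_open": mouth_open}, {"mouth_open": 0.7})
```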
a speech synthesis module 5, comprising a phoneme conversion component 51, an acoustic parameter generation component 52 and a speech synthesis model component 53; the phoneme conversion component 51 converts the processed text information into a corresponding phoneme sequence; the acoustic parameter generation component 52 generates acoustic parameters from the phoneme sequence; the speech synthesis model component 53 converts the acoustic parameters into a speech signal.
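A toy sketch of the three synthesis stages 51–53; the phoneme lexicon and sine-burst rendering are deliberately simplistic stand-ins for a real synthesizer.

```python
import numpy as np

LEXICON = {"hello": ["HH", "AH", "L", "OW"]}    # component 51: toy lexicon
PITCH_HZ = {"HH": 300.0, "AH": 700.0, "L": 400.0, "OW": 500.0}  # illustrative values

def text_to_phonemes(text: str) -> list[str]:
    """Component 51: text -> phoneme sequence."""
    return [p for word in text.lower().split() for p in LEXICON.get(word, [])]

def phonemes_to_params(phones: list[str]) -> list[tuple[float, float]]:
    """Component 52: (frequency, duration) pairs as stand-in acoustic parameters."""
    return [(PITCH_HZ[p], 0.12) for p in phones]

def params_to_waveform(params: list[tuple[float, float]], sr: int = 16000) -> np.ndarray:
    """Component 53: render each parameter frame as a short sine burst."""
    bursts = [np.sin(2 * np.pi * f * np.arange(int(sr * d)) / sr) for f, d in params]
    return np.concatenate(bursts) if bursts else np.zeros(0)

wave = params_to_waveform(phonemes_to_params(text_to_phonemes("hello")))
```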
By combining speech recognition, image processing, animal model construction, speech synthesis and related technologies, the system realizes intelligent situational-language communication between humans and animals. It can give the user lifelike animal feedback in the form of animal language and actions, support personalized interaction and provide more accurate feedback, thereby improving the system's intelligence and user satisfaction, providing a new theoretical basis and reference for research on animal behavior and language, and promoting research and exploration in related fields.
As shown in fig. 2, the system further comprises a computer vision module 6, which comprises an image preprocessing component 61, a target detection component 62 and a key point identification component 63; the image preprocessing component 61 performs preprocessing operations such as denoising, scale normalization and cropping on the captured animal images, providing better input for subsequent target detection and key point identification; the target detection component 62 identifies an animal target in the image and frames its position; the key point identification component 63 identifies the animal's key points in the image.
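As one possible realization of components 62 and 63, a generic detector could frame the animal; the `ultralytics` package and model file below are assumptions, and a pose variant of the same model family would supply the key points for component 63.

```python
import cv2
import numpy as np
from ultralytics import YOLO   # assumed dependency; any box detector would do

detector = YOLO("yolov8n.pt")  # generic COCO model, which covers several animal classes

def detect_and_frame(frame: np.ndarray) -> np.ndarray:
    """Component 62: find animal targets and frame their positions."""
    results = detector(frame)[0]
    for box in results.boxes.xyxy.cpu().numpy():
        x1, y1, x2, y2 = map(int, box[:4])
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
    return frame

# Component 63 would use a pose model (e.g. "yolov8n-pose.pt") the same way,
# reading results.keypoints instead of results.boxes.
frame = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-in for a captured frame
framed = detect_and_frame(frame)
```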
The computer vision module 6 gives the system the ability to recognize and analyze animal images, raising its level of intelligence: by accurately locating targets and identifying key points, the system can understand and respond to an animal's actions and expressions more precisely, achieving more natural and realistic situational-language communication.
As shown in fig. 3, the system further comprises a context understanding module 7, which comprises a dialogue management component 71 and a dialogue history tracking component 72; the dialogue management component 71 manages the dialogue flow, including tracking and transition of dialogue states; the dialogue history tracking component 72 tracks and analyzes the previous dialogue history.
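A minimal sketch of how components 71 and 72 might track dialogue state and history; the state fields are illustrative.

```python
from dataclasses import dataclass, field

@dataclass
class DialogueState:
    turn: int = 0
    topic: str | None = None
    history: list[tuple[str, str]] = field(default_factory=list)

class DialogueManager:
    """Component 71 tracks and transitions state; component 72 keeps history."""
    def __init__(self) -> None:
        self.state = DialogueState()

    def step(self, user_input: str, system_reply: str, topic: str | None = None) -> None:
        self.state.turn += 1
        if topic is not None:
            self.state.topic = topic                 # dialogue-state transition
        self.state.history.append((user_input, system_reply))

    def recent_context(self, n: int = 3) -> list[tuple[str, str]]:
        return self.state.history[-n:]

dm = DialogueManager()
dm.step("Is the panda hungry?", "It is showing feeding behaviour.", topic="feeding")
```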
By tracking the dialogue flow and history, the system's interaction can be optimized, its response speed raised, and user satisfaction improved.
As shown in fig. 4, the system further comprises a multi-language support module 8, which comprises a language detection component 81, a translation engine interface component 82 and a speech synthesis engine interface component 83; the language detection component 81 detects the language of the input text; the translation engine interface component 82 interacts with an external translation engine to translate text, so that the system can accurately understand the user's input and translate it into different languages as required, enabling accurate multi-language communication; the speech synthesis engine interface component 83 interacts with an external speech synthesis engine to synthesize speech in multiple languages, so that the system can generate speech feedback in the language the user selects, providing an interactive experience closer to the user's needs.
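Components 81 and 82 could be wired roughly as follows, assuming the third-party `langdetect` package for detection; the translation engine is left abstract, since the patent only defines an interface to an external service.

```python
from langdetect import detect   # assumed third-party detector (pip install langdetect)

class TranslationEngine:
    """Component 82: thin interface to an external translation service."""
    def translate(self, text: str, target_lang: str) -> str:
        raise NotImplementedError   # a real system would call a cloud API here

def route_text(text: str, engine: TranslationEngine, target_lang: str) -> str:
    """Component 81 detects the source language; component 82 translates if needed."""
    source = detect(text)           # ISO 639-1 style code, e.g. "en" or "zh-cn"
    return text if source == target_lang else engine.translate(text, target_lang)
```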
Embodiment two: as shown in fig. 5, an input terminal is installed in a panda house. A visitor expresses his or her meaning in text or speech; the input is collected, stored in the database, and translated into panda sounds and expression actions, while a digital panda or panda doll displays the corresponding expression to the pandas in the house according to the translation result, so that the pandas can understand the visitor's meaning.
Conversely, a camera and microphone in the panda enclosure capture the pandas' expressions and sounds; these data are matched and interpreted against the database and converted into human language, and the interpretation is displayed as text on a screen at the live panda exhibit, accompanied by matching dubbing, so that the audience can both watch the panda's expressions and actions and understand what they mean.
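Putting the modules together, the animal-to-human direction of this embodiment might be orchestrated as below; every function and the lookup table are stubs standing in for the modules described above.

```python
def recognize_expression(frame) -> str:
    return "relaxed posture"            # stub for module 3 / component 33

def recognize_call(audio) -> str:
    return "bleat"                      # stub for module 2 applied to animal sound

# stand-in for record type 12 lookups in the database
INTERPRETATIONS = {("relaxed posture", "bleat"): "I am content and a little hungry."}

def panda_to_visitor(frame, audio) -> str:
    """Enclosure pipeline: capture -> match against the database -> human text."""
    key = (recognize_expression(frame), recognize_call(audio))
    return INTERPRETATIONS.get(key, "(no confident interpretation)")

# the returned text would be shown on the exhibit screen with matching dubbing
print(panda_to_visitor(frame=None, audio=None))
```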
Embodiment three: an input terminal is installed in the home. Family members express their meaning in text or speech; the input is collected, stored in a household database, and converted into pet sounds and actions, while a digital pet or pet doll displays the corresponding actions to the pet according to the conversion result, so that the pet can understand the family members' intentions.
Conversely, a camera and microphone in the home capture the pet's actions and sounds; these data are matched and interpreted against the database and converted into human language, and the interpretation is displayed as text on a home screen, accompanied by matching dubbing, so that family members can both watch the pet's behavior and expressions and understand what they mean.
While embodiments of the invention have been shown and described above for purposes of illustration, it will be understood that they are illustrative and not limiting, and that a person of ordinary skill in the art may make variations, modifications, substitutions and alterations to them within the scope of the invention.
Claims (7)
1. An intelligent communication system for human and animal situational language, characterized by comprising:
a database (1), wherein animal information (11) and animal model information (14) are stored in the database (1);
a speech and text recognition module (2) comprising a speech signal preprocessing component (21), a feature extraction component (22), an acoustic model component (23), a language model component (24) and a text recognition component (25); the speech signal preprocessing component (21) is used for performing noise elimination and speech enhancement on an input speech signal; the feature extraction component (22) is used for extracting spectral features from the speech signal; the acoustic model component (23) is used for performing speech recognition on the spectral features; the language model component (24) is used for further correcting and optimizing the recognition result; the text recognition component (25) is used for recognizing text content;
a shooting module (3) comprising an image capturing component (31), an image processing component (32) and an animal expression and action recognition component (33); the image capturing component (31) is used for acquiring animal images; the image processing component (32) is used for preprocessing and enhancing the captured images; the animal expression and action recognition component (33) is used for analyzing the processed images and recognizing and extracting the animal's expression features;
an animal image display module (4), used for comparing the animal image shot by the shooting module (3) with the animal information (11) in the database (1), acquiring the animal model information (14) corresponding to that animal information (11), and generating a corresponding animal model; the animal image display module (4) comprises a model making component (41), a skeletal animation component (42), an expression transformation component (43) and an action generation component (44); the model making component (41) is used for creating the animal model; the skeletal animation component (42) is used for adding a skeleton to the animal model so that the model can be animated according to the animal's behavior and expression; the expression transformation component (43) is used for adjusting the animal model's expression; the action generation component (44) is used for generating the animal model's actions;
a speech synthesis module (5) comprising a phoneme conversion component (51), an acoustic parameter generation component (52) and a speech synthesis model component (53); the phoneme conversion component (51) is used for converting the processed text information into a corresponding phoneme sequence; the acoustic parameter generation component (52) is used for generating acoustic parameters from the phoneme sequence; the speech synthesis model component (53) is used for converting the acoustic parameters into a speech signal.
2. The human and animal situational language intelligent communication system of claim 1, wherein: the system further comprises a computer vision module (6) comprising an image preprocessing component (61), a target detection component (62) and a key point identification component (63); the image preprocessing component (61) is used for preprocessing the captured animal images; the target detection component (62) is used for identifying an animal target in an animal image and framing its position; the key point identification component (63) is used for identifying the animal's key points in the image.
3. The human and animal situational language intelligent communication system of claim 1, wherein: the system further comprises a context understanding module (7) comprising a dialogue management component (71) and a dialogue history tracking component (72); the dialogue management component (71) is used for managing the dialogue flow; the dialogue history tracking component (72) is used for tracking and analyzing previous dialogue history.
4. The human and animal situational language intelligent communication system of claim 3, wherein: the dialogue flow includes tracking and transition of dialogue states.
5. The human and animal situational language intelligent communication system of claim 3, wherein: user interaction information (13) is also stored in the database (1).
6. The human and animal situational language intelligent communication system of claim 1, wherein: the system further comprises a multi-language support module (8) comprising a language detection component (81), a translation engine interface component (82) and a speech synthesis engine interface component (83); the language detection component (81) is used for detecting the language of the input text; the translation engine interface component (82) is used for interacting with an external translation engine to translate text; the speech synthesis engine interface component (83) is used for interacting with an external speech synthesis engine to achieve multi-language speech synthesis.
7. The human and animal situational language intelligent communication system of claim 6, wherein: language translation information (12) is also stored in the database (1).
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202311737461.2A | 2023-12-16 | 2023-12-16 | Intelligent communication system for human and animal situational language |
Publications (1)

| Publication Number | Publication Date |
|---|---|
| CN117711369A | 2024-03-15 |
Family

ID=90149431

Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202311737461.2A | Intelligent communication system for human and animal situational language | 2023-12-16 | 2023-12-16 |

Country Status (1)

| Country | Link |
|---|---|
| CN | CN117711369A |

2023-12-16: application CN202311737461.2A filed; published as CN117711369A; status Pending.
Legal Events

| Code | Title |
|---|---|
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |