CN117369681A - Digital human image interaction system based on artificial intelligence technology - Google Patents

Digital human image interaction system based on artificial intelligence technology

Info

Publication number
CN117369681A
Authority
CN
China
Prior art keywords
module
image
real
dimensional modeling
audio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311404134.5A
Other languages
Chinese (zh)
Inventor
陈章勤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Weimai Technology Co ltd
Original Assignee
Weimai Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Weimai Technology Co ltd filed Critical Weimai Technology Co ltd
Priority to CN202311404134.5A priority Critical patent/CN117369681A/en
Publication of CN117369681A publication Critical patent/CN117369681A/en
Pending legal-status Critical Current


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01: Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/048: Interaction techniques based on graphical user interfaces [GUI]
    • G06F 3/0481: Interaction techniques based on GUIs based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
    • G06F 3/04815: Interaction with a metaphor-based environment or interaction object displayed as three-dimensional, e.g. changing the user viewpoint with respect to the environment or object
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 13/00: Animation
    • G06T 13/20: 3D [Three Dimensional] animation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 13/00: Animation
    • G06T 13/20: 3D [Three Dimensional] animation
    • G06T 13/40: 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings

Abstract

The invention relates to a digital human image interaction system based on artificial intelligence technology, comprising a real-time three-dimensional modeling module. A facial expression database is arranged on one side of the real-time three-dimensional modeling module, a network connection module is arranged on one side of the facial expression database, and a processor is arranged between the three-dimensional modeling module and one side of the network connection module. According to the invention, the audio information of the voice signal is analyzed and a three-dimensional model is generated and displayed in real time, so that the facial expression of the model can be changed in real time according to different words; a facial expression database is provided to store various facial expressions, so that the digital human can form different mouth actions; and an audio synchronization module is provided so that the digital human's facial expression stays synchronized with the voice content. This greatly improves the fidelity and expressiveness of the digital human image, brings it closer to a real human's expressions and mouth shapes, and gives the user a better experience and a sense of immersion.

Description

Digital human image interaction system based on artificial intelligence technology
Technical Field
The invention relates to the technical field of digital human figure interaction systems, in particular to a digital human figure interaction system based on an artificial intelligence technology.
Background
A digital human interaction system uses computer technology to simulate and create virtual characters that interact with human users. These virtual characters are typically built on artificial intelligence and computer graphics technology, enabling them to hold conversations with users and to perform actions and expressions.
Digital human interaction systems are applied in many fields, including virtual assistants, virtual anchors, and virtual tour guides.
However, although existing digital humans can interact with users, they have obvious defects in actual use, specifically as follows:
for most digital humans, the audio is often out of sync with the speech, which creates an uncoordinated feeling, and the digital human's face often shows a fixed expression, so it cannot give the user a sense of immersion.
Accordingly, those skilled in the art have proposed a digital human image interaction system based on artificial intelligence techniques.
Disclosure of Invention
In view of the foregoing problems with the prior art, it is a primary object of the present invention to provide a digital human image interaction system based on artificial intelligence technology.
The technical scheme of the invention is as follows: the digital human figure interactive system based on the artificial intelligence technology comprises a real-time three-dimensional modeling module, wherein one side of the real-time three-dimensional modeling module is provided with a facial expression database, one side of the facial expression database is provided with a network connection module, a processor is arranged between one side of the three-dimensional modeling module and one side of the network connection module, one side of the processor, which is far away from the real-time three-dimensional modeling module, is provided with a dialect database, one side of the real-time three-dimensional modeling module, which is far away from the processor, is provided with a human voice simulation module, one side of the real-time three-dimensional modeling module, which is far away from the facial expression database, is provided with an image transmission module, and a control module is arranged between the image transmission module and one side of the processor;
the real-time three-dimensional modeling module can be implemented with modeling software such as Blender or ZBrush;
the facial expression database is set and stores specific parameters of facial expressions according to corresponding characters;
the network connection module is used for collecting facial expressions in a networking mode and storing different facial expression parameters in a facial expression database;
the processor is used for calling parameters in the facial expression database and controlling the use of the real-time three-dimensional modeling module;
the dialect database is used for storing local languages that can be called; it is suitable for users whose Mandarin is non-standard and can improve the recognition function of the system;
the human voice simulation module is used for matching with the facial expression database, and when the facial expression database calls specific facial data of a certain word, the human voice simulation module calls the corresponding word;
the image transmission module is used for transmitting the model constructed by the three-dimensional modeling module, so that the model is displayed to the outside through the carrier;
the control module is used for receiving the command sent by the processor and controlling the command.
As a preferred implementation manner, a display screen is arranged on the side of the image transmission module far away from the real-time three-dimensional modeling module, a microphone is arranged on the outer side of the display screen, a loudspeaker is arranged on the microphone side of the display screen, a communication module is arranged in the display screen, and an audio synchronization module is arranged on one side of the communication module in the display screen;
the display screen is used as a carrier, and the image transmission module can display the modeled model through the display screen;
the microphone is used for receiving voice commands of a user;
the loudspeaker is used for transmitting sound generated by the human voice simulation module;
the communication module is used for converting the information collected by the microphone to change the information into command information which can be recognized by the system;
the audio co-track module is used for matching the information transmitted by the image transmission module and the voice simulation module, so that the image information and the voice information of the audio co-track module are matched.
As a preferred implementation mode, the network connection module is in bidirectional electrical connection with the facial expression database, the facial expression database is in bidirectional electrical connection with the real-time three-dimensional modeling module, the processor, the human voice simulation module and the image transmission module are all in bidirectional electrical connection with the real-time three-dimensional modeling module, the image transmission module is in electrical connection with the display screen, the communication module and the audio synchronization module are both in bidirectional electrical connection with the control module, the microphone is in bidirectional electrical connection with the communication module, the image transmission module is in bidirectional electrical connection with the audio synchronization module, and the processor is in bidirectional electrical connection with the facial expression database.
As a preferred embodiment, the network connection module may be configured for wired connection, Wi-Fi connection, or hotspot connection.
As a preferred embodiment, the communication module is internally provided with a noise reduction function, and the noise reduction code is as follows:
import noisereduce as nr
import soundfile as sf
# read the audio file to be denoised
audio_data, sample_rate = sf.read('input.wav')
# take a noise-only span of the recording as the noise sample
noisy_part = audio_data[5000:15000]
# noise reduction (noisereduce 1.x API; version 2.x uses nr.reduce_noise(y=..., sr=...))
reduced_noise = nr.reduce_noise(audio_clip=audio_data, noise_clip=noisy_part, verbose=False)
# save the denoised audio file
sf.write('output.wav', reduced_noise, sample_rate)
As a preferred embodiment, the image transmission module is internally provided with an image enhancement function, and the code is set as follows:
import cv2
def enhance_image(image_path):
    # read the image
    image = cv2.imread(image_path)
    # convert the image to grayscale
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # sharpen by unsharp masking: subtract a Gaussian-blurred copy from the grayscale image
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)
    sharpened = cv2.addWeighted(gray, 1.5, blurred, -0.5, 0)
    # save the enhanced image and return its path
    output_path = 'enhanced_image.jpg'
    cv2.imwrite(output_path, sharpened)
    return output_path
As a preferred embodiment, the audio synchronization module has a voice synchronization function integrated therein, and the running code is configured to:
import soundfile as sf
def get_frame_timestamps(audio_path, fps=25):
    # read the audio file
    audio_data, sample_rate = sf.read(audio_path)
    # total duration of the audio in seconds
    duration = len(audio_data) / sample_rate
    # timestamp of each image frame, used to align the mouth shape with the audio
    return [i / fps for i in range(int(duration * fps))]
Compared with the prior art, the invention has the advantages and positive effects that,
according to the invention, the audio information of the voice signal is analyzed and a three-dimensional model is generated and displayed in real time, so that the facial expression of the model can be changed in real time according to different words; a facial expression database is provided to store various facial expressions so that the digital human can form different mouth actions; and an audio synchronization module is provided so that the digital human's facial expression stays synchronized with the voice content. This greatly improves the fidelity and expressiveness of the digital human image, brings it closer to a real human's expressions and mouth shapes, and lets the user hold vivid, interactive conversations with the digital human, achieving a more natural and realistic human-computer interaction experience. The system can be widely applied in fields such as online education, virtual tour guiding and intelligent companion care, providing users with a brand-new interaction mode and a more intelligent and convenient service experience.
Drawings
FIG. 1 is a schematic diagram of the overall structure of a digital human image interaction system based on artificial intelligence technology.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The invention will be further described with reference to the drawings and specific embodiments.
Examples
As shown in fig. 1, the present invention provides a technical solution: the system comprises a real-time three-dimensional modeling module and is characterized in that: a facial expression database is arranged on one side of the real-time three-dimensional modeling module, a network connection module is arranged on one side of the facial expression database, a processor is arranged between the three-dimensional modeling module and one side of the network connection module, a dialect database is arranged on the side of the processor far away from the real-time three-dimensional modeling module, a human voice simulation module is arranged on the side of the real-time three-dimensional modeling module far away from the processor, an image transmission module is arranged on the side of the real-time three-dimensional modeling module far away from the facial expression database, and a control module is arranged between the image transmission module and one side of the processor;
the real-time three-dimensional modeling module can be implemented with software such as Blender or ZBrush; it is mainly used for modeling the model's face and can call different face models for filling;
blender has wide functions including modeling, animation, rendering, physical simulation, video editing and the like, and is suitable for various fields from game development, movie production, building visualization and the like, so that the software can be suitable for the use requirements of the system.
The facial expression database is arranged, and can be used for storing specific parameters of the facial expression according to the corresponding characters, and the facial expression can be called in real time after being stored;
the facial expression database may be set as a local database or a cloud database;
the local database refers to a database system and data stored on a local computer or server, and both the data and database management systems are located on their own physical devices. It is commonly used for storing and managing data in a local environment, with higher data access speed and control rights;
and cloud databases refer to databases that store database systems and data on a cloud platform. The data is connected to the cloud server through a network and managed and maintained by a cloud service provider. The user may access and manage the data through the cloud service provider's interface.
The user can select according to the actual use requirement.
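For the local-database option described above, a minimal sketch using Python's built-in sqlite3 module is shown below; the table layout and the example parameter names (mouth_open, lip_corner_raise) are illustrative assumptions, not part of the patent:

```python
import json
import sqlite3

# Minimal sketch of a local facial expression database: each expression stores
# a name and a JSON blob of facial parameters (e.g. blend-shape weights) that
# the real-time three-dimensional modeling module can retrieve by name.
def create_expression_db(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS expressions (name TEXT PRIMARY KEY, params TEXT)"
    )
    return conn

def save_expression(conn, name, params):
    conn.execute(
        "INSERT OR REPLACE INTO expressions VALUES (?, ?)", (name, json.dumps(params))
    )
    conn.commit()

def load_expression(conn, name):
    row = conn.execute(
        "SELECT params FROM expressions WHERE name = ?", (name,)
    ).fetchone()
    return json.loads(row[0]) if row else None

conn = create_expression_db()
save_expression(conn, "smile", {"mouth_open": 0.2, "lip_corner_raise": 0.8})
print(load_expression(conn, "smile"))
```

A cloud database would expose the same save/load operations behind a network interface instead of a local file.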
The network connection module is used for collecting facial expressions in a networking way and storing different facial expression parameters in a facial expression database, and the network connection module can directly download corresponding data from the network or the database;
the processor is used for calling parameters in the facial expression database and controlling the use of the real-time three-dimensional modeling module; it can drive the three-dimensional modeling in combination with various data;
the dialect database is used for storing local languages that can be called; it is suitable for users whose Mandarin is non-standard and can improve the recognition function of the system, giving the system a dialect recognition capability;
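As an illustration of how such a dialect database might be consulted, the sketch below maps dialect words to their standard Mandarin equivalents before command recognition; the dialect names and dictionary entries are hypothetical examples, not the patent's actual data:

```python
# Illustrative dialect database: dialect words are replaced with their
# standard Mandarin equivalents so that downstream recognition can proceed
# on normalized text. Entries here are hypothetical examples.
DIALECT_DB = {
    "sichuan": {"要得": "好的", "啥子": "什么"},
    "cantonese": {"唔该": "谢谢", "几多": "多少"},
}

def normalize_dialect(text, dialect):
    # replace each known dialect word with its standard equivalent
    for word, standard in DIALECT_DB.get(dialect, {}).items():
        text = text.replace(word, standard)
    return text

print(normalize_dialect("啥子时候开始", "sichuan"))
```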
the voice simulation module is used for matching with the facial expression database, and when the facial expression database calls specific facial data of a certain word, the voice simulation module can call out the corresponding word, so that the system can have a voice playing function;
the image transmission module is used for transmitting the model constructed by the three-dimensional modeling module, so that the model is displayed to the outside through the carrier;
the control module is used for receiving the command sent by the processor and controlling the command;
the system can generate mouth actions of the digital human image in real time by analyzing the audio information of the voice signal, so that the mouth actions and the voice content are synchronously carried out. The technology can greatly improve the fidelity and expressive force of the digital human image, so that the digital human image is more close to the real human expression and mouth shape.
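One simple way the mouth action could be derived from the audio signal is to map the short-time energy (RMS) of each audio frame to a mouth-opening value in [0, 1]; the frame size and the scaling constant below are assumptions for illustration, not values taken from the patent:

```python
# Illustrative sketch: map the RMS energy of each audio frame to a mouth
# openness value that the modeling module could apply to the mouth shape.
def mouth_openness(samples, frame_size=160, max_rms=0.5):
    openness = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        # root-mean-square energy of this frame
        rms = (sum(s * s for s in frame) / frame_size) ** 0.5
        # normalize and clamp to [0, 1]
        openness.append(min(rms / max_rms, 1.0))
    return openness

# silence keeps the mouth closed; louder speech opens it
print(mouth_openness([0.0] * 320 + [0.5] * 320))
```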
A display screen is arranged on the side of the image transmission module far away from the real-time three-dimensional modeling module, a microphone is arranged on the outer side of the display screen, a loudspeaker is arranged on the microphone side of the display screen, a communication module is arranged in the display screen, and an audio synchronization module is arranged on one side of the communication module in the display screen;
the display screen is used as a carrier; the image transmission module can display the modeled model through the display screen, which can be a liquid crystal screen used for playback;
the microphone is used for receiving voice commands of a user, and can be an embedded microphone or an externally-hung microphone;
the loudspeaker is used for outputting the sound generated by the human voice simulation module;
the communication module is used for converting information collected by the microphone to enable the information to be changed into command information which can be identified by the system, and the communication module is used for detecting information transmitted by the microphone;
the audio synchronization module is used for matching the information transmitted by the image transmission module and the human voice simulation module, so that the image information and the voice information are matched with each other;
through the technology, the voice input of the user can be accurately identified and converted into the text form, so that the understanding and analysis of the language content of the user can be realized.
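The conversion of recognized text into command information can be sketched as a keyword-matching step; the keywords and command names below are hypothetical, and a real system would use a full natural-language understanding component:

```python
# Illustrative command conversion: recognized text is matched against keyword
# rules to produce command information the system can act on.
COMMAND_RULES = [
    ("weather", "QUERY_WEATHER"),
    ("music", "PLAY_MUSIC"),
    ("stop", "STOP"),
]

def text_to_command(text):
    lowered = text.lower()
    # return the first rule whose keyword appears in the text
    for keyword, command in COMMAND_RULES:
        if keyword in lowered:
            return command
    return "UNKNOWN"

print(text_to_command("Please play some music"))
```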
The network connection module is in bidirectional electrical connection with the facial expression database, the facial expression database is in bidirectional electrical connection with the real-time three-dimensional modeling module, the processor, the human voice simulation module and the image transmission module are all in bidirectional electrical connection with the real-time three-dimensional modeling module, the image transmission module is in electrical connection with the display screen, the communication module and the audio synchronization module are both in bidirectional electrical connection with the control module, the microphone is in bidirectional electrical connection with the communication module, the image transmission module is in bidirectional electrical connection with the audio synchronization module, and the processor is in bidirectional electrical connection with the facial expression database;
the network connection module can be used in a wired connection mode, a Wi-Fi connection mode or a hotspot connection mode;
the communication module is internally provided with a noise reduction function, and the noise reduction code is as follows:
import noisereduce as nr
import soundfile as sf
# read the audio file to be denoised
audio_data, sample_rate = sf.read('input.wav')
# take a noise-only span of the recording as the noise sample
noisy_part = audio_data[5000:15000]
# noise reduction (noisereduce 1.x API; version 2.x uses nr.reduce_noise(y=..., sr=...))
reduced_noise = nr.reduce_noise(audio_clip=audio_data, noise_clip=noisy_part, verbose=False)
# save the denoised audio file
sf.write('output.wav', reduced_noise, sample_rate)
This code reads the audio file to be denoised, selects a noise-containing span of the file as the noise sample, performs noise reduction on the whole recording, and finally saves the denoised audio data as a new audio file.
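The principle behind the snippet can be illustrated with a toy time-domain noise gate; real spectral gating (as performed by noisereduce) works per frequency band, so this sketch only demonstrates the idea of estimating a noise floor from a noise-only span and suppressing everything below it:

```python
# Toy noise gate: estimate the noise level from a noise-only span, then
# zero out samples that do not rise above that level. This is a simplified
# stand-in for spectral gating, shown here only to illustrate the principle.
def noise_gate(samples, noise_samples, margin=1.5):
    # noise floor = mean absolute amplitude of the noise-only span
    noise_floor = sum(abs(s) for s in noise_samples) / len(noise_samples)
    threshold = noise_floor * margin
    # keep samples above the threshold, silence the rest
    return [s if abs(s) > threshold else 0.0 for s in samples]

print(noise_gate([0.01, 0.8, -0.02, -0.9], [0.01, -0.02, 0.015]))
```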
The image transmission module is internally provided with an image enhancement function, and the code is set as follows:
import cv2
def enhance_image(image_path):
    # read the image
    image = cv2.imread(image_path)
    # convert the image to grayscale
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # sharpen by unsharp masking: subtract a Gaussian-blurred copy from the grayscale image
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)
    sharpened = cv2.addWeighted(gray, 1.5, blurred, -0.5, 0)
    # save the enhanced image and return its path
    output_path = 'enhanced_image.jpg'
    cv2.imwrite(output_path, sharpened)
    return output_path
This code reads an image file and converts it to grayscale, then sharpens it by unsharp masking: a Gaussian-blurred copy of the grayscale image is computed, and the grayscale image and the blurred copy are combined with weights 1.5 and -0.5 to obtain the enhanced image, which is finally saved; the path of the output image is returned.
A voice synchronization function is integrated in the audio synchronization module, and the running code is set as follows:
import soundfile as sf
def get_frame_timestamps(audio_path, fps=25):
    # read the audio file
    audio_data, sample_rate = sf.read(audio_path)
    # total duration of the audio in seconds
    duration = len(audio_data) / sample_rate
    # timestamp of each image frame, used to align the mouth shape with the audio
    return [i / fps for i in range(int(duration * fps))]
The audio file is first read. The audio timestamp information is then acquired, and these timestamps are used to align the audio with the image frames.
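The alignment step can be sketched as a mapping from video frame indices to the audio samples they should be synchronized with; the frame rate and sample rate below are assumed values, not parameters specified by the patent:

```python
# Illustrative alignment: given the video frame rate and the audio sample
# rate, compute the audio sample index that each video frame corresponds to,
# so mouth shapes can be rendered in step with the sound.
def frame_to_sample(frame_index, fps=25, sample_rate=16000):
    # time of the frame in seconds, converted to an audio sample index
    return int(frame_index / fps * sample_rate)

print([frame_to_sample(i) for i in range(3)])
```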
Working principle:
by analyzing the audio information of the voice signal, the system generates and displays a three-dimensional model in real time and can change the model's facial expression in real time according to different words; a facial expression database is provided to store various facial expressions so that the digital human can form different mouth actions; and an audio synchronization module is provided so that the digital human's facial expression stays synchronized with the voice content. This greatly improves the fidelity and expressiveness of the digital human image, brings it closer to a real human's expressions and mouth shapes, and lets the user hold vivid, interactive conversations with the digital human, achieving a more natural and realistic human-computer interaction experience. The system can be widely applied in fields such as online education, virtual tour guiding and intelligent companion care, providing users with a brand-new interaction mode and a more intelligent and convenient service experience.
Finally, it should be noted that: the embodiments described above are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced with equivalents; such modifications and substitutions do not depart from the spirit of the invention.

Claims (8)

1. The digital human image interaction system based on artificial intelligence technology comprises a real-time three-dimensional modeling module, and is characterized in that: a facial expression database is arranged on one side of the real-time three-dimensional modeling module, a network connection module is arranged on one side of the facial expression database, a processor is arranged between the three-dimensional modeling module and one side of the network connection module, a dialect database is arranged on the side of the processor far away from the real-time three-dimensional modeling module, a human voice simulation module is arranged on the side of the real-time three-dimensional modeling module far away from the processor, an image transmission module is arranged on the side of the real-time three-dimensional modeling module far away from the facial expression database, and a control module is arranged between the image transmission module and one side of the processor;
the real-time three-dimensional modeling module can be implemented with software such as Blender or ZBrush;
the facial expression database is set and stores specific parameters of facial expressions according to corresponding characters;
the network connection module is used for collecting facial expressions in a networking mode and storing different facial expression parameters in a facial expression database;
the processor is used for calling parameters in the facial expression database and controlling the use of the real-time three-dimensional modeling module;
the dialect database is used for storing local languages that can be called; it is suitable for users whose Mandarin is non-standard and can improve the recognition function of the system;
the voice simulation module is used for matching with the facial expression database, and when the facial expression database calls specific facial data of a certain word, the voice simulation module can call out the corresponding word.
2. The artificial intelligence technology-based digital human image interaction system according to claim 1, wherein: the image transmission module is used for transmitting the model constructed by the three-dimensional modeling module, so that the model is displayed to the outside through a carrier, the image transmission module compresses the acquired image data into smaller data packets, and the encoded image data is transmitted to a target place through a network or other communication channels;
the control module receives external input signals or data as input to the control module and sends control commands or interfaces for adjusting parameters to external devices, and the outputs can be electrical signals, digital signals, control signals and the like for controlling the execution mechanism or other controlled devices.
3. The artificial intelligence technology-based digital human image interaction system according to claim 1, wherein: a display screen is arranged on the side of the image transmission module far away from the real-time three-dimensional modeling module, a microphone is arranged on the outer side of the display screen, a loudspeaker is arranged on the microphone side of the display screen, a communication module is arranged in the display screen, and an audio synchronization module is arranged on one side of the communication module in the display screen;
the display screen is used as a carrier, and the image transmission module can display the modeled model through the display screen;
the microphone is used for receiving voice commands of a user;
the loudspeaker is used for transmitting sound generated by the human voice simulation module;
the communication module is used for converting the information collected by the microphone to change the information into command information which can be recognized by the system;
the audio synchronization module is used for matching the information transmitted by the image transmission module and the human voice simulation module, so that the image information and the voice information are matched with each other.
4. The artificial intelligence technology-based digital human image interaction system according to claim 1, wherein: the network connection module is in bidirectional electrical connection with the facial expression database, the facial expression database is in bidirectional electrical connection with the real-time three-dimensional modeling module, the processor, the human voice simulation module and the image transmission module are all in bidirectional electrical connection with the real-time three-dimensional modeling module, the image transmission module is in electrical connection with the display screen, the communication module and the audio synchronization module are both in bidirectional electrical connection with the control module, the microphone is in bidirectional electrical connection with the communication module, the image transmission module is in bidirectional electrical connection with the audio synchronization module, and the processor is in bidirectional electrical connection with the facial expression database.
5. The artificial intelligence technology-based digital human image interaction system according to claim 1, wherein: the network connection module can be used in a wired connection mode, a Wi-Fi connection mode or a hotspot connection mode.
6. The artificial intelligence technology-based digital human image interaction system according to claim 1, wherein: the communication module is internally provided with a noise reduction function, and the noise reduction code is as follows:
import noisereduce as nr
import soundfile as sf
# read the audio file to be denoised
audio_data, sample_rate = sf.read('input.wav')
# take a noise-only span of the recording as the noise sample
noisy_part = audio_data[5000:15000]
# noise reduction (noisereduce 1.x API; version 2.x uses nr.reduce_noise(y=..., sr=...))
reduced_noise = nr.reduce_noise(audio_clip=audio_data, noise_clip=noisy_part, verbose=False)
# save the denoised audio file
sf.write('output.wav', reduced_noise, sample_rate)
7. The artificial intelligence technology-based digital human image interaction system according to claim 1, wherein: the image transmission module has a built-in image enhancement function, whose code is as follows:
import cv2
def enhance_image(image_path):
    # Read the image
    image = cv2.imread(image_path)
    # Convert the image to grayscale
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Sharpen the image by unsharp masking: subtract a Gaussian blur
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)
    sharpened = cv2.addWeighted(gray, 1.5, blurred, -0.5, 0)
    # Save the enhanced image
    output_path = 'enhanced_image.jpg'
    cv2.imwrite(output_path, sharpened)
    return output_path
8. The artificial intelligence technology-based digital human image interaction system according to claim 1, wherein: a voice synchronization function is integrated in the audio co-track module, and its running code is as follows:
import subprocess
def sync_audio_video(video_path, audio_path, output_path):
    # Mux the synthesized voice track onto the video stream,
    # trimming to the shorter of the two so image and voice stay aligned
    subprocess.run([
        'ffmpeg', '-y', '-i', video_path, '-i', audio_path,
        '-c:v', 'copy', '-c:a', 'aac', '-shortest', output_path
    ], check=True)
    return output_path
CN202311404134.5A 2023-10-27 2023-10-27 Digital human image interaction system based on artificial intelligence technology Pending CN117369681A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311404134.5A CN117369681A (en) 2023-10-27 2023-10-27 Digital human image interaction system based on artificial intelligence technology


Publications (1)

Publication Number Publication Date
CN117369681A true CN117369681A (en) 2024-01-09

Family

ID=89394384


Country Status (1)

Country Link
CN (1) CN117369681A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination