CN117369681A - Digital human image interaction system based on artificial intelligence technology - Google Patents
- Publication number: CN117369681A
- Application number: CN202311404134.5A
- Authority
- CN
- China
- Prior art keywords
- module
- image
- real
- dimensional modeling
- audio
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0481—Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
- G06F3/04815—Interaction with a metaphor-based environment or interaction object displayed as three-dimensional, e.g. changing the user viewpoint with respect to the environment or object
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T13/00—Animation
- G06T13/20—3D [Three Dimensional] animation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T13/00—Animation
- G06T13/20—3D [Three Dimensional] animation
- G06T13/40—3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
Abstract
The invention relates to a digital human image interaction system based on artificial intelligence technology. The system comprises a real-time three-dimensional modeling module; a facial expression database is arranged on one side of the real-time three-dimensional modeling module, a network connection module is arranged on one side of the facial expression database, and a processor is arranged between the three-dimensional modeling module and the network connection module. By analysing the audio information of the voice signal, the system generates the three-dimensional model in real time for display and can replace the model's facial expression in real time for different characters. The facial expression database stores a variety of facial expressions, enabling the digital human to form different mouth actions, and an audio synchronization module keeps the digital human's facial expression in step with the voice content. This greatly improves the fidelity and expressiveness of the digital human image, bringing it closer to a real human's expressions and mouth shapes and giving the user a better, more immersive experience.
Description
Technical Field
The invention relates to the technical field of digital human figure interaction systems, in particular to a digital human figure interaction system based on an artificial intelligence technology.
Background
A digital human interaction system uses computer technology to simulate and create virtual characters that interact with human users. These virtual characters are typically built on artificial intelligence and computer graphics technology, enabling them to converse with users and to produce actions and expressions.
Digital human interaction systems can be applied in many fields, including virtual assistants, virtual anchors and virtual tour guides.
However, although existing digital humans can interact with users, they have obvious defects in actual use, specifically:
most digital humans suffer from audio that is out of sync when they speak, which feels uncoordinated, and their faces tend to hold a fixed expression, so they cannot give the user an immersive feeling.
Accordingly, those skilled in the art have proposed a digital human image interaction system based on artificial intelligence techniques.
Disclosure of Invention
In view of the foregoing problems with the prior art, it is a primary object of the present invention to provide a digital human image interaction system based on artificial intelligence technology.
The technical scheme of the invention is as follows: the digital human image interaction system based on artificial intelligence technology comprises a real-time three-dimensional modeling module, wherein a facial expression database is arranged on one side of the real-time three-dimensional modeling module, a network connection module is arranged on one side of the facial expression database, and a processor is arranged between the three-dimensional modeling module and the network connection module; a dialect database is arranged on the side of the processor far from the real-time three-dimensional modeling module, a human voice simulation module is arranged on the side of the real-time three-dimensional modeling module far from the processor, an image transmission module is arranged on the side of the real-time three-dimensional modeling module far from the facial expression database, and a control module is arranged between the image transmission module and the processor;
the real-time three-dimensional modeling module may be implemented with software such as Blender or ZBrush;
the facial expression database stores the specific parameters of facial expressions for the corresponding characters;
the network connection module collects facial expressions over the network and stores the different facial expression parameters in the facial expression database;
the processor calls parameters in the facial expression database and controls the real-time three-dimensional modeling module;
the dialect database stores local languages that can be called, which suits users whose Mandarin is non-standard and improves the system's recognition;
the human voice simulation module cooperates with the facial expression database: when the facial expression database calls the specific facial data of a word, the human voice simulation module calls the corresponding word;
the image transmission module transmits the model constructed by the three-dimensional modeling module so that the model is displayed externally through a carrier;
the control module receives and executes the commands sent by the processor.
As a preferred implementation, a display screen is arranged on the side of the image transmission module far from the real-time three-dimensional modeling module, a microphone is arranged on the outer side of the display screen, a loudspeaker is arranged on the display screen beside the microphone, a communication module is arranged in the display screen, and an audio synchronization module is arranged in the display screen beside the communication module;
the display screen serves as the carrier, and the image transmission module displays the built model through it;
the microphone receives the user's voice commands;
the loudspeaker plays the sound generated by the human voice simulation module;
the communication module converts the information collected by the microphone into command information the system can recognize;
the audio synchronization module matches the information transmitted by the image transmission module and the human voice simulation module, so that image information and voice information stay matched.
As a preferred implementation, the network connection module is bidirectionally electrically connected with the facial expression database, the facial expression database is bidirectionally electrically connected with the real-time three-dimensional modeling module, the processor, the human voice simulation module and the image transmission module are each bidirectionally electrically connected with the real-time three-dimensional modeling module, the image transmission module is electrically connected with the display screen, the communication module and the audio synchronization module are each bidirectionally electrically connected with the control module, the microphone is bidirectionally electrically connected with the communication module, the image transmission module is bidirectionally electrically connected with the audio synchronization module, and the processor is bidirectionally electrically connected with the facial expression database.
As a preferred embodiment, the network connection module supports wired, Wi-Fi and hotspot connections.
As a preferred embodiment, the communication module has a built-in noise reduction function, with the noise-reduction code as follows:
import noisereduce as nr
import soundfile as sf
# read the audio file
audio_data, sample_rate = sf.read('input.wav')
# extract a noise sample
noisy_part = audio_data[5000:15000]
# noise reduction (noisereduce v1 API)
reduced_noise = nr.reduce_noise(audio_clip=audio_data, noise_clip=noisy_part, verbose=False)
# save the denoised audio file
sf.write('output.wav', reduced_noise, sample_rate)
As a preferred embodiment, the image transmission module has a built-in image enhancement function, with the code set as follows:
import cv2
def enhance_image(image_path):
    # read the image
    image = cv2.imread(image_path)
    # convert the image to grayscale
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # sharpen by unsharp masking: blur, then subtract the weighted blur
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)
    sharpened = cv2.addWeighted(gray, 1.5, blurred, -0.5, 0)
    # save the enhanced image
    output_path = 'enhanced_image.jpg'
    cv2.imwrite(output_path, sharpened)
    return output_path
As a preferred embodiment, a voice synchronization function is integrated in the audio synchronization module; an illustrative running code (a sketch only — the use of Python's wave module and the 25 fps frame rate are assumptions) is:
import wave
def frame_timestamps(audio_path, fps=25):
    # read the audio file and compute its duration in seconds
    with wave.open(audio_path, 'rb') as wf:
        duration = wf.getnframes() / float(wf.getframerate())
    # one timestamp per video frame, used to keep mouth shapes
    # aligned with the audio timeline
    return [i / fps for i in range(int(duration * fps))]
Compared with the prior art, the invention has the following advantages and positive effects:
by analysing the audio information of the voice signal, the system generates the three-dimensional model in real time for display and can replace the model's facial expression in real time for different characters; the facial expression database stores a variety of facial expressions, so the digital human can form different mouth actions; and the audio synchronization module keeps the digital human's facial expression in step with the voice content. This greatly improves the fidelity and expressiveness of the digital human image, bringing it closer to a real human's expressions and mouth shapes, so the user can hold a vivid, interactive conversation with the digital human for a more natural and realistic human-computer interaction experience. The system is widely applicable to fields such as online education, virtual tour guiding and intelligent companion care, offering users a brand-new interaction mode and a more intelligent and convenient service experience.
Drawings
FIG. 1 is a schematic diagram of the overall structure of a digital human image interaction system based on artificial intelligence technology.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The invention will be further described with reference to the drawings and specific embodiments.
Examples
As shown in FIG. 1, the present invention provides the following technical solution: the system comprises a real-time three-dimensional modeling module, characterized in that: a facial expression database is arranged on one side of the real-time three-dimensional modeling module, a network connection module is arranged on one side of the facial expression database, a processor is arranged between the three-dimensional modeling module and the network connection module, a dialect database is arranged on the side of the processor far from the real-time three-dimensional modeling module, a human voice simulation module is arranged on the side of the real-time three-dimensional modeling module far from the processor, an image transmission module is arranged on the side far from the facial expression database, and a control module is arranged between the image transmission module and the processor;
the real-time three-dimensional modeling module may be implemented with software such as Blender or ZBrush; it is mainly used for modeling the model's face and can call different face models for filling;
Blender offers a wide range of functions, including modeling, animation, rendering, physical simulation and video editing, and serves fields from game development to film production and architectural visualization, so such software suits the requirements of this system.
The facial expression database stores the specific parameters of facial expressions for the corresponding characters, and the stored expressions can be called in real time;
the facial expression database may be set as a local database or a cloud database;
the local database refers to a database system and data stored on a local computer or server, and both the data and database management systems are located on their own physical devices. It is commonly used for storing and managing data in a local environment, with higher data access speed and control rights;
and cloud databases refer to databases that store database systems and data on a cloud platform. The data is connected to the cloud server through a network and managed and maintained by a cloud service provider. The user may access and manage the data through the cloud service provider's interface.
The user can select according to the actual use requirement.
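As an illustrative sketch of the local option (the table layout and function names here are assumptions for illustration, not part of the patent), per-character expression parameters could be kept in a local SQLite database:

```python
import json
import sqlite3

def make_expression_db(path=":memory:"):
    # create a small local store for per-character expression parameters
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS expressions ("
        "character TEXT, name TEXT, params TEXT, "
        "PRIMARY KEY (character, name))"
    )
    return conn

def save_expression(conn, character, name, params):
    # parameters are stored as a JSON blob so any dict shape fits
    conn.execute(
        "INSERT OR REPLACE INTO expressions VALUES (?, ?, ?)",
        (character, name, json.dumps(params)),
    )
    conn.commit()

def load_expression(conn, character, name):
    # return the stored parameter dict, or None when absent
    row = conn.execute(
        "SELECT params FROM expressions WHERE character = ? AND name = ?",
        (character, name),
    ).fetchone()
    return json.loads(row[0]) if row else None
```

Passing a file path instead of ":memory:" gives the persistent local store described above; the cloud option would replace the sqlite3 connection with a connection to the provider's service.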
The network connection module collects facial expressions over the network and stores the different facial expression parameters in the facial expression database; it can download the corresponding data directly from the network or from a database;
the processor calls parameters in the facial expression database and controls the real-time three-dimensional modeling module, combining various data to drive the three-dimensional modeling;
the dialect database stores local languages that can be called, which suits users whose Mandarin is non-standard and improves the system's language recognition;
the human voice simulation module cooperates with the facial expression database: when the facial expression database calls the specific facial data of a word, the human voice simulation module calls out the corresponding word, giving the system a voice playback function;
the image transmission module transmits the model constructed by the three-dimensional modeling module so that the model is displayed externally through the carrier;
the control module receives and executes the commands sent by the processor;
by analysing the audio information of the voice signal, the system generates the digital human's mouth actions in real time so that they stay synchronized with the voice content. This greatly improves the fidelity and expressiveness of the digital human image, bringing it closer to real human expressions and mouth shapes.
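A minimal sketch of the mouth-action generation just described, under the simplifying assumption that mouth openness is driven directly by per-frame signal energy (samples normalised to [-1, 1]; the frame size is also an assumption):

```python
import math

def mouth_openness(samples, frame_size=800):
    # split the audio samples into fixed-size frames and map each
    # frame's RMS amplitude to a mouth-openness value in [0, 1]
    openness = []
    for i in range(0, len(samples), frame_size):
        frame = samples[i:i + frame_size]
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        openness.append(min(1.0, rms))
    return openness
```

Each returned value would drive the jaw parameter of the three-dimensional model for one video frame, so louder speech opens the mouth wider.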
A display screen is arranged on the side of the image transmission module far from the real-time three-dimensional modeling module, a microphone is arranged on the outer side of the display screen, a loudspeaker is arranged on the display screen beside the microphone, a communication module is arranged in the display screen, and an audio synchronization module is arranged in the display screen beside the communication module;
the display screen serves as the carrier through which the image transmission module displays the built model; it may be set as a liquid crystal screen;
the microphone receives the user's voice commands and may be a built-in or an external microphone;
the loudspeaker plays the sound generated by the human voice simulation module;
the communication module converts the information collected by the microphone into command information the system can recognize, monitoring the information transmitted by the microphone;
the audio synchronization module matches the information transmitted by the image transmission module and the human voice simulation module, so that image information and voice information stay matched;
through this technology, the user's voice input can be accurately recognized and converted into text form, enabling the user's language content to be understood and analysed.
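The conversion of recognized text into system commands can be sketched as a keyword lookup (the phrase table and token names below are illustrative assumptions, not taken from the patent):

```python
def parse_command(text, commands=None):
    # map recognized speech text to a command token the system understands
    commands = commands or {
        "volume up": "VOLUME_UP",
        "hello": "GREET",
        "goodbye": "FAREWELL",
    }
    lowered = text.lower()
    for phrase, token in commands.items():
        if phrase in lowered:
            return token
    return "UNKNOWN"  # unrecognized input falls through
```

The returned token is the kind of command information the communication module hands to the control module.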
The network connection module is bidirectionally electrically connected with the facial expression database, the facial expression database is bidirectionally electrically connected with the real-time three-dimensional modeling module, the processor, the human voice simulation module and the image transmission module are each bidirectionally electrically connected with the real-time three-dimensional modeling module, the image transmission module is electrically connected with the display screen, the communication module and the audio synchronization module are each bidirectionally electrically connected with the control module, the microphone is bidirectionally electrically connected with the communication module, the image transmission module is bidirectionally electrically connected with the audio synchronization module, and the processor is bidirectionally electrically connected with the facial expression database;
the network connection module supports wired, Wi-Fi and hotspot connections;
the communication module has a built-in noise reduction function, with the noise-reduction code as follows:
import noisereduce as nr
import soundfile as sf
# read the audio file
audio_data, sample_rate = sf.read('input.wav')
# extract a noise sample
noisy_part = audio_data[5000:15000]
# noise reduction (noisereduce v1 API)
reduced_noise = nr.reduce_noise(audio_clip=audio_data, noise_clip=noisy_part, verbose=False)
# save the denoised audio file
sf.write('output.wav', reduced_noise, sample_rate)
The code first reads the audio file to be denoised, selects a noisy portion of it as the noise sample, performs noise reduction on the whole audio, and finally saves the denoised audio data as a new audio file.
The image transmission module has a built-in image enhancement function, with the code set as follows:
import cv2
def enhance_image(image_path):
    # read the image
    image = cv2.imread(image_path)
    # convert the image to grayscale
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # sharpen by unsharp masking: blur, then subtract the weighted blur
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)
    sharpened = cv2.addWeighted(gray, 1.5, blurred, -0.5, 0)
    # save the enhanced image
    output_path = 'enhanced_image.jpg'
    cv2.imwrite(output_path, sharpened)
    return output_path
The code first reads an image file and converts it to grayscale, then sharpens it by unsharp masking: the grayscale image is Gaussian-blurred, and the blurred image is subtracted with a weighting to obtain the enhanced image. Finally the enhanced image is saved and the path of the output image is returned.
A voice synchronization function is integrated in the audio synchronization module; an illustrative running code (a sketch only — the use of Python's wave module and the 25 fps frame rate are assumptions) is:
import wave
def frame_timestamps(audio_path, fps=25):
    # read the audio file and compute its duration in seconds
    with wave.open(audio_path, 'rb') as wf:
        duration = wf.getnframes() / float(wf.getframerate())
    # one timestamp per video frame, used to keep mouth shapes
    # aligned with the audio timeline
    return [i / fps for i in range(int(duration * fps))]
The audio file is first read; audio timestamp information is then acquired, and the image frames are aligned with it using those timestamps or other audio processing techniques.
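The alignment step can be sketched as a mapping between the audio timeline and video frame indices (the 25 fps default is an assumption, not from the patent):

```python
def frame_for_time(t, fps=25):
    # index of the video frame to display at audio time t (seconds)
    return int(t * fps)

def align_chunks(chunk_times, fps=25):
    # map each audio-chunk timestamp to its matching video frame index
    return [frame_for_time(t, fps) for t in chunk_times]
```

With such a mapping, the mouth shape computed for a given audio chunk is shown on exactly the frame that plays at that moment.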
Working principle:
by analysing the audio information of the voice signal, the system generates the three-dimensional model in real time for display and can change the model's facial expression in real time for different characters; the facial expression database stores a variety of facial expressions, so the digital human can form different mouth actions; and the audio synchronization module keeps the digital human's facial expression in step with the voice content. This greatly improves the fidelity and expressiveness of the digital human image, bringing it closer to a real human's expressions and mouth shapes, so the user can hold a vivid, interactive conversation with the digital human for a more natural and realistic human-computer interaction experience. The system is widely applicable to fields such as online education, virtual tour guiding and intelligent companion care, offering users a brand-new interaction mode and a more intelligent and convenient service experience.
Finally, it should be noted that: the embodiments described above are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced with equivalents; such modifications and substitutions do not depart from the spirit of the invention.
Claims (8)
1. A digital human image interaction system based on artificial intelligence technology, comprising a real-time three-dimensional modeling module, characterized in that: a facial expression database is arranged on one side of the real-time three-dimensional modeling module, a network connection module is arranged on one side of the facial expression database, a processor is arranged between the three-dimensional modeling module and the network connection module, a dialect database is arranged on the side of the processor far from the real-time three-dimensional modeling module, a human voice simulation module is arranged on the side of the real-time three-dimensional modeling module far from the processor, an image transmission module is arranged on the side far from the facial expression database, and a control module is arranged between the image transmission module and the processor;
the real-time three-dimensional modeling module may be implemented with software such as Blender or ZBrush;
the facial expression database stores the specific parameters of facial expressions for the corresponding characters;
the network connection module collects facial expressions over the network and stores the different facial expression parameters in the facial expression database;
the processor calls parameters in the facial expression database and controls the real-time three-dimensional modeling module;
the dialect database stores local languages that can be called, which suits users whose Mandarin is non-standard and improves the system's recognition;
the human voice simulation module cooperates with the facial expression database: when the facial expression database calls the specific facial data of a word, the human voice simulation module calls out the corresponding word.
2. The artificial intelligence technology-based digital human image interaction system according to claim 1, wherein: the image transmission module transmits the model constructed by the three-dimensional modeling module so that the model is displayed externally through a carrier; the image transmission module compresses the acquired image data into smaller data packets and transmits the encoded image data to the target location over a network or another communication channel;
the control module receives external input signals or data as its input and sends control commands or parameter-adjusting interfaces to external devices; its outputs can be electrical signals, digital signals, control signals and the like for controlling an actuator or other controlled devices.
3. The artificial intelligence technology-based digital human image interaction system according to claim 1, wherein: a display screen is arranged on the side of the image transmission module far from the real-time three-dimensional modeling module, a microphone is arranged on the outer side of the display screen, a loudspeaker is arranged on the display screen beside the microphone, a communication module is arranged in the display screen, and an audio synchronization module is arranged in the display screen beside the communication module;
the display screen serves as the carrier, and the image transmission module displays the built model through it;
the microphone receives the user's voice commands;
the loudspeaker plays the sound generated by the human voice simulation module;
the communication module converts the information collected by the microphone into command information the system can recognize;
the audio synchronization module matches the information transmitted by the image transmission module and the human voice simulation module, so that image information and voice information stay matched.
4. The artificial intelligence technology-based digital human image interaction system according to claim 1, wherein: the network connection module is in bidirectional electrical connection with the facial expression database, the facial expression database is in bidirectional electrical connection with the real-time three-dimensional modeling module, the processor, the voice simulation module and the image transmission module are in bidirectional electrical connection with the real-time three-dimensional modeling module, the image transmission module is in electrical connection with the display screen, the communication module and the audio co-track module are in bidirectional electrical connection with the control module, the microphone is in bidirectional electrical connection with the communication module, the image transmission module is in bidirectional electrical connection with the audio co-track module, and the processor is in bidirectional electrical connection with the facial expression database.
5. The artificial intelligence technology-based digital human image interaction system according to claim 1, wherein: the network connection module supports wired, WIFI and hotspot networking modes.
6. The artificial intelligence technology-based digital human image interaction system according to claim 1, wherein: the communication module is internally provided with a noise reduction function, and the noise reduction code is as follows:
import noisereduce as nr
import soundfile as sf
# read the audio file
audio_data, sample_rate = sf.read('input.wav')
# extract a noise-only sample (here, samples 5000-15000)
noisy_part = audio_data[5000:15000]
# noise reduction processing (noisereduce v1 API)
reduced_noise = nr.reduce_noise(audio_clip=audio_data, noise_clip=noisy_part, verbose=False)
# save the denoised audio file
sf.write('output.wav', reduced_noise, sample_rate)
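The noisereduce call above performs spectral gating. The underlying idea — estimate a noise floor from a clip known to contain only noise, then suppress content below it — can be illustrated with a simplified time-domain gate. This is a toy stand-in for the library's actual spectral algorithm; the margin parameter and function name are assumptions:

```python
def noise_gate(samples, noise_clip, margin=2.0):
    """Zero out samples whose amplitude falls below a noise-floor threshold.

    The noise floor is estimated as the peak amplitude of a clip known to
    contain only noise, scaled up by a safety margin.
    """
    noise_floor = max(abs(s) for s in noise_clip)
    threshold = margin * noise_floor
    return [s if abs(s) >= threshold else 0.0 for s in samples]
```

Samples quieter than twice the observed noise peak are treated as noise and silenced; louder samples pass through unchanged.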
7. The artificial intelligence technology-based digital human image interaction system according to claim 1, wherein: the image transmission module is internally provided with an image enhancement function, and the code is set as follows:
import cv2
def enhance_image(image_path):
    # read the image
    image = cv2.imread(image_path)
    # convert the image to gray-scale
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # sharpen via unsharp masking: subtract a Gaussian-blurred copy from the image
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)
    sharpened = cv2.addWeighted(gray, 1.5, blurred, -0.5, 0)
    # save the enhanced image
    output_path = 'enhanced_image.jpg'
    cv2.imwrite(output_path, sharpened)
    return output_path
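The addWeighted step above (image × 1.5 minus blur × 0.5) is unsharp masking. A pure-Python illustration on a 1-D signal with a simple box blur shows the edge overshoot this produces; the helper names are illustrative, not from the patent:

```python
def box_blur(signal, radius=1):
    """Average each sample with its neighbours (edges use a shrunken window)."""
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - radius), min(len(signal), i + radius + 1)
        window = signal[lo:hi]
        out.append(sum(window) / len(window))
    return out

def unsharp_mask(signal, amount=0.5):
    """Sharpen by subtracting a blurred copy: (1 + amount) * s - amount * blur."""
    blurred = box_blur(signal)
    return [(1 + amount) * s - amount * b for s, b in zip(signal, blurred)]
```

On a step edge such as [0, 0, 0, 10, 10, 10], the output overshoots above 10 just after the edge and undershoots below 0 just before it — the exaggerated local contrast that makes edges look sharper.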
8. The artificial intelligence technology-based digital human image interaction system according to claim 1, wherein: the voice synchronization function is integrated in the audio synchronization module, and the running code is set as follows:
import cv2
def enhance_image(image_path):
    # read the image
    image = cv2.imread(image_path)
    # convert the image to gray-scale
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # sharpen via unsharp masking: subtract a Gaussian-blurred copy from the image
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)
    sharpened = cv2.addWeighted(gray, 1.5, blurred, -0.5, 0)
    # save the enhanced image
    output_path = 'enhanced_image.jpg'
    cv2.imwrite(output_path, sharpened)
    return output_path
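The voice synchronization claim 8 describes amounts to keeping audio playback aligned with rendered video frames. A minimal sketch of one common approach — partitioning the audio stream into per-frame sample ranges — assuming a fixed sample rate and frame rate (the function name and parameters are hypothetical):

```python
def per_frame_sample_ranges(n_samples, sample_rate, fps):
    """Split an audio stream into (start, end) sample ranges, one per video
    frame, so each rendered frame plays exactly its share of the audio."""
    samples_per_frame = sample_rate / fps
    n_frames = int(n_samples // samples_per_frame)
    return [
        (round(i * samples_per_frame), round((i + 1) * samples_per_frame))
        for i in range(n_frames)
    ]
```

At 16 kHz audio and 25 fps, each frame owns 640 samples; comparing the playback position against the current frame's range gives a drift measure that keeps lip motion and speech in step.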
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311404134.5A CN117369681A (en) | 2023-10-27 | 2023-10-27 | Digital human image interaction system based on artificial intelligence technology |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117369681A true CN117369681A (en) | 2024-01-09 |
Family
ID=89394384
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117369681A (en) |
Legal Events
Date | Code | Title | Description
---|---|---|---
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |