CN117348736B - Digital interaction method, system and medium based on artificial intelligence

Info

Publication number
CN117348736B
CN117348736B (granted publication of application CN202311664192.1A)
Authority
CN
China
Prior art keywords
data
digital
feature
model
lip
Prior art date
Legal status
Active
Application number
CN202311664192.1A
Other languages
Chinese (zh)
Other versions
CN117348736A (en)
Inventor
杨良志
白琳
杨安培
吴海锋
崔寅
江梦玲
Current Assignee
Richinfo Technology Co., Ltd.
Original Assignee
Richinfo Technology Co., Ltd.
Priority date
Filing date
Publication date
Application filed by Richinfo Technology Co., Ltd.
Priority to CN202311664192.1A
Publication of CN117348736A
Application granted
Publication of CN117348736B
Status: Active

Classifications

    • G06F 3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F 3/017 Gesture based interaction, e.g. based on a set of recognized hand gestures
    • G06F 3/16 Sound input; sound output
    • G06T 13/40 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
    • G06T 17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T 19/20 Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
    • G06T 2200/04 Indexing scheme for image data processing or generation involving 3D image data

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Computer Graphics (AREA)
  • Software Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Geometry (AREA)
  • Architecture (AREA)
  • Computer Hardware Design (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The application provides a digital interaction method, system and medium based on artificial intelligence. The method comprises the following steps: generating a digital human model from natural person feature recognition data combined with application scene feature data and background environment feature data; generating emotion optimization factors from the natural person feature recognition data and the application scene feature data to perform a first optimization of the digital human model; performing lip synchronization according to the natural person feature recognition data to obtain a second optimized digital human model; performing personalized adjustment of the second optimized digital human model according to the basic data and interaction demand data of the interacting user to obtain a personalized digital human model; and performing recognition on the user interaction data with a preset artificial intelligence algorithm to generate action instruction parameters, according to which the personalized digital human model executes corresponding actions. Through personalized customization of the digital human model, the method and the device provide a more natural and comfortable intelligent interaction experience for the user.

Description

Digital interaction method, system and medium based on artificial intelligence
Technical Field
The application relates to the technical field of big data and artificial intelligence, in particular to a digital interaction method, a digital interaction system and a digital interaction medium based on artificial intelligence.
Background
A digital person is a digital character image created with digital technology to closely resemble a human. With the rapid development of virtual digital person theory and technology, digital persons are now widely applied in scenarios such as live streaming, film and television, advertising, news and personal interaction, and dialogue, action and behavior interaction between a person and a digital person can be realized through artificial intelligence algorithms. However, there is currently no technology that quickly generates a digital person model according to different application scenes and background environments, continuously optimizes it, and then customizes the digital person according to the user's interaction requirements.
In view of the above problems, an effective technical solution is currently needed.
Disclosure of Invention
The invention aims to provide a digital interaction method, system and medium based on artificial intelligence, realizing a technology that quickly generates a digital person model according to different application scenes and background environments, continuously optimizes it, and then performs personalized digital person customization according to user interaction requirements. The digital person model is first generated from natural person feature identification data combined with application scene feature data and background environment feature data, and emotion optimization factors generated from the natural person feature identification data and the application scene feature data optimize the digital person model a first time. Lip synchronization is then performed according to the natural person feature identification data to obtain a second optimized digital person model, which is personalized according to the basic data and interaction demand data of the interacting user to obtain a personalized digital person model. Finally, a preset artificial intelligence algorithm performs recognition on the user interaction data to generate action instruction parameters, and the personalized digital person model executes the corresponding actions according to those parameters.
In a first aspect, the application provides a digital interaction method based on artificial intelligence, which comprises the following steps:
acquiring application scene information and extracting application scene feature data, acquiring background environment information and extracting background environment feature data;
acquiring natural person feature information, extracting natural person feature identification data, and generating a digital person model according to the natural person feature identification data and combining the application scene feature data and the background environment feature data;
inputting the natural person feature recognition data and the application scene feature data into a preset emotion recognition model for processing to obtain emotion optimization factors, and optimizing the digital person model according to the emotion optimization factors to obtain a first optimized digital person model;
performing lip synchronization on the first optimized digital human model according to the natural person feature identification data to obtain a second optimized digital human model;
acquiring user basic information and extracting user basic data, acquiring user interaction information and extracting user interaction data, including: interaction demand data and user interaction data;
performing personalized adjustment on the second optimized digital human model according to the user basic data and the interaction demand data to obtain a personalized digital human model;
and identifying by utilizing a preset artificial intelligence algorithm according to the user interaction data to generate action instruction parameters, and executing corresponding actions by the personalized digital human model according to the action instruction parameters.
Optionally, in the artificial intelligence based digital interaction method described in the present application, the obtaining natural person feature information and extracting natural person feature identification data, and generating the digital person model according to the natural person feature identification data in combination with the application scene feature data and the background environment feature data includes:
acquiring natural person feature information and extracting natural person feature identification data, including: facial recognition data, gesture motion recognition data, voice recognition data, and lip recognition data;
and generating a digital human model through a preset PaaS platform tool according to the face recognition data, the gesture motion recognition data and the voice recognition data in combination with the application scene feature data and the background environment feature data.
Optionally, in the artificial intelligence based digital interaction method described in the present application, the inputting the natural person feature recognition data and the application scene feature data into a preset emotion recognition model to process to obtain an emotion optimization factor, optimizing the digital person model according to the emotion optimization factor, and obtaining a first optimized digital person model includes:
extracting speech rate data, tone data, intonation data and audio energy data from the voice recognition data;
inputting the face recognition data, in combination with the gesture motion recognition data, the speech rate data, the tone data, the intonation data and the audio energy data, into a preset emotion recognition model for processing to obtain emotion characteristic parameters;
and optimizing the digital human model according to the emotion characteristic parameters to obtain a first optimized digital human model.
Optionally, in the artificial intelligence based digital interaction method described in the present application, the performing lip synchronization on the first optimized digital person model according to the natural person feature identification data to obtain a second optimized digital person model includes:
respectively carrying out feature recognition according to the voice recognition data and the lip recognition data to obtain audio feature data and lip feature data;
aligning the audio feature data and the lip feature data on a time axis, inputting the audio feature data and the lip feature data into a preset lip optimization generation model for processing, and obtaining lip optimization parameters;
and optimizing the first optimized digital human model according to the lip optimization parameters to obtain a second optimized digital human model.
Optionally, in the artificial intelligence based digital interaction method described in the present application, the obtaining user basic information and extracting user basic data, and the obtaining user interaction information and extracting user interaction data (including interaction demand data and user interaction data), comprises:
acquiring user basic information and extracting user basic data, including: age data, occupation data, health status data, personality data, language style data, and dialect data;
acquiring user interaction information and extracting user interaction data, including: interaction demand data and user interaction data;
the user interaction data includes: facial expression data, limb motion data, gesture data, and voice data.
Optionally, in the artificial intelligence based digital interaction method described in the present application, the performing personalized adjustment on the second optimized digital human model according to the user basic data and the interaction requirement data to obtain a personalized digital human model includes:
inputting the age data, the occupation data, the health status data, the personality data, the language style data and the interaction demand data into a preset style adjustment model for processing to obtain style adjustment parameters;
inputting the dialect data into a preset lip optimization model for processing to obtain lip optimization parameters;
and performing personalized adjustment on the second optimized digital human model according to the style adjustment parameters and the lip optimization parameters to obtain a personalized digital human model.
Optionally, in the digital interaction method based on artificial intelligence, the identifying according to the user interaction data by using a preset artificial intelligence algorithm generates an action instruction parameter, and the personalized digital person model executes a corresponding action according to the action instruction parameter, including:
identifying by using a preset artificial intelligence algorithm according to the facial expression data, the limb motion data, the gesture data and the voice data, and generating action instruction parameters;
and the personalized digital human model executes corresponding actions according to the action instruction parameters.
In a second aspect, the present application provides an artificial intelligence based digital interaction system, comprising a memory and a processor, wherein the memory includes a program of an artificial intelligence-based digital interaction method which, when executed by the processor, implements the following steps:
acquiring application scene information and extracting application scene feature data, acquiring background environment information and extracting background environment feature data;
acquiring natural person feature information, extracting natural person feature identification data, and generating a digital person model according to the natural person feature identification data and combining the application scene feature data and the background environment feature data;
inputting the natural person feature recognition data and the application scene feature data into a preset emotion recognition model for processing to obtain emotion optimization factors, and optimizing the digital person model according to the emotion optimization factors to obtain a first optimized digital person model;
performing lip synchronization on the first optimized digital human model according to the natural person feature identification data to obtain a second optimized digital human model;
acquiring user basic information and extracting user basic data, acquiring user interaction information and extracting user interaction data, including: interaction demand data and user interaction data;
performing personalized adjustment on the second optimized digital human model according to the user basic data and the interaction demand data to obtain a personalized digital human model;
and identifying by utilizing a preset artificial intelligence algorithm according to the user interaction data to generate action instruction parameters, and executing corresponding actions by the personalized digital human model according to the action instruction parameters.
Optionally, in the artificial intelligence based digital interaction system described in the present application, the obtaining natural person feature information and extracting natural person feature identification data, and generating the digital person model according to the natural person feature identification data in combination with the application scene feature data and the background environment feature data includes:
acquiring natural person feature information and extracting natural person feature identification data, including: facial recognition data, gesture motion recognition data, voice recognition data, and lip recognition data;
and generating a digital human model through a preset PaaS platform tool according to the face recognition data, the gesture motion recognition data and the voice recognition data in combination with the application scene feature data and the background environment feature data.
In a third aspect, the present application further provides a computer readable storage medium, including an artificial intelligence based digital interaction method program, which when executed by a processor, implements the steps of the artificial intelligence based digital interaction method as described in any of the above.
In order to realize a technology that quickly generates a digital person model according to different application scenes and background environments, continuously optimizes it, and then performs personalized digital person customization according to user interaction requirements, the digital person model is first generated from natural person feature identification data combined with application scene feature data and background environment feature data, and emotion optimization factors generated from the natural person feature identification data and the application scene feature data optimize the digital person model a first time. Lip synchronization is then performed according to the natural person feature identification data to obtain a second optimized digital person model, which is personalized according to the basic data and interaction demand data of the interacting user to obtain a personalized digital person model. Finally, a preset artificial intelligence algorithm performs recognition on the user interaction data to generate action instruction parameters, and the personalized digital person model executes the corresponding actions according to those parameters.
Additional features and advantages of the application will be set forth in the description which follows, and in part will be apparent from the description, or may be learned by practice of the embodiments of the application. The objects and other advantages of the present application may be realized and attained by the structure particularly pointed out in the written description and drawings.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and should not be considered limiting of the scope; other related drawings may be obtained from these drawings by a person skilled in the art without inventive effort.
FIG. 1 is a flow chart of an artificial intelligence based digital interaction method provided in an embodiment of the present application;
FIG. 2 is a flow chart of generating a digital human model based on an artificial intelligence-based digital interaction method provided in an embodiment of the present application;
FIG. 3 is a flowchart of obtaining a first optimized digital human model based on an artificial intelligence-based digital interaction method provided in an embodiment of the present application;
fig. 4 is a flowchart of obtaining a second optimized digital human model according to an artificial intelligence-based digital interaction method provided in an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. The components of the embodiments of the present application, which are generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, as provided in the accompanying drawings, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present application without making any inventive effort, are intended to be within the scope of the present application.
It should be noted that like reference numerals and letters refer to like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only to distinguish the description, and are not to be construed as indicating or implying relative importance.
Referring to fig. 1, fig. 1 is a flow chart of an artificial intelligence based digital interaction method in some embodiments of the present application. The digital interaction method based on the artificial intelligence is used in terminal equipment, such as computers, mobile phone terminals and the like. The digital interaction method based on artificial intelligence comprises the following steps:
S11, acquiring application scene information and extracting application scene feature data, and acquiring background environment information and extracting background environment feature data;
S12, acquiring natural person feature information, extracting natural person feature identification data, and generating a digital person model according to the natural person feature identification data and combining the application scene feature data and the background environment feature data;
S13, inputting the natural person feature recognition data and the application scene feature data into a preset emotion recognition model for processing to obtain emotion optimization factors, and optimizing the digital person model according to the emotion optimization factors to obtain a first optimized digital person model;
S14, performing lip synchronization on the first optimized digital human model according to the natural person feature identification data to obtain a second optimized digital human model;
S15, acquiring user basic information and extracting user basic data, and acquiring user interaction information and extracting user interaction data, including: interaction demand data and user interaction data;
S16, performing personalized adjustment on the second optimized digital human model according to the user basic data and the interaction demand data to obtain a personalized digital human model;
S17, identifying by utilizing a preset artificial intelligence algorithm according to the user interaction data, generating action instruction parameters, and executing corresponding actions by the personalized digital human model according to the action instruction parameters.
In order to realize a technology that quickly generates a digital person model according to different application scenes and background environments, continuously optimizes it, and then performs personalized digital person customization according to user interaction requirements, the digital person model is first generated from natural person feature identification data combined with application scene feature data and background environment feature data, and emotion optimization factors generated from the natural person feature identification data and the application scene feature data optimize the digital person model a first time. Lip synchronization is then performed according to the natural person feature identification data to obtain a second optimized digital person model, which is personalized according to the basic data and interaction demand data of the interacting user to obtain a personalized digital person model. Finally, a preset artificial intelligence algorithm performs recognition on the user interaction data to generate action instruction parameters, and the personalized digital person model executes the corresponding actions according to those parameters.
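As a concrete reading of steps S11-S17, the following Python sketch chains the pipeline end to end. Every function here is a trivial stub invented for illustration; the patent does not specify the algorithms or interfaces behind these steps.

```python
# Hypothetical stubs: each stands in for an unspecified module of the patent.

def extract_features(info: dict, keys: list) -> dict:
    # S11/S12/S15: all feature-extraction steps reduced to key selection
    return {k: info.get(k) for k in keys}

def generate_digital_human(person: dict, scene: dict, env: dict) -> dict:
    # S12: base digital human model from person + scene + environment features
    return {"person": person, "scene": scene, "env": env, "optimizations": []}

def optimize(model: dict, stage: str, params: dict) -> dict:
    # S13/S14/S16: each optimization pass records its parameters on the model
    model["optimizations"].append((stage, params))
    return model

def run_pipeline(scene_info, env_info, person_info, user_info, interaction):
    scene = extract_features(scene_info, ["scene_type"])                         # S11
    env = extract_features(env_info, ["lighting", "backdrop"])                   # S11
    person = extract_features(person_info, ["face", "gesture", "voice", "lip"])  # S12
    model = generate_digital_human(person, scene, env)                           # S12
    model = optimize(model, "emotion", {"factor": 0.8})      # S13, placeholder factor
    model = optimize(model, "lip_sync", {"offset_ms": 0})    # S14, placeholder params
    basic = extract_features(user_info, ["age", "dialect"])                      # S15
    model = optimize(model, "personalize", basic)                                # S16
    # S17: placeholder mapping from interaction data to an action instruction
    action = {"action": "wave_back" if interaction.get("gesture") == "wave" else "idle"}
    return model, action

model, action = run_pipeline(
    {"scene_type": "news"}, {"lighting": "studio", "backdrop": "desk"},
    {"face": "...", "gesture": "...", "voice": "...", "lip": "..."},
    {"age": 30, "dialect": "cantonese"}, {"gesture": "wave"},
)
print(action)
```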
Referring to fig. 2, fig. 2 is a flow chart of generating a digital human model based on an artificial intelligence-based digital interaction method in some embodiments of the present application. According to the embodiment of the invention, the steps of obtaining the natural person feature information and extracting the natural person feature recognition data, and generating the digital person model according to the natural person feature recognition data and combining the application scene feature data and the background environment feature data comprise the following steps:
S21, acquiring natural person feature information and extracting natural person feature identification data, including: facial recognition data, gesture motion recognition data, voice recognition data, and lip recognition data;
S22, generating a digital human model through a preset PaaS platform tool according to the face recognition data, the gesture motion recognition data and the voice recognition data in combination with the application scene feature data and the background environment feature data.
It should be noted that, in the process of generating the digital human model, training is performed on the face recognition data, gesture motion recognition data and voice recognition data of a large number of users, in combination with application scene feature data and background environment feature data, so as to generate the digital human model.
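The description does not name a concrete platform, so as a purely illustrative sketch, generating the model through a PaaS tool might amount to assembling the recognition and feature data into a request such as the following; the endpoint URL and every field name are assumptions, not an actual API.

```python
import json
import urllib.request

# Hypothetical request to a PaaS digital-human generation service; the URL and
# all field names are invented for illustration.
payload = {
    "face_recognition": {"landmarks": [[0.31, 0.42], [0.58, 0.44]]},  # sample points
    "gesture_motion": {"keyframe_count": 24},
    "voice": {"sample_rate": 16000, "embedding_dim": 256},
    "scene_features": {"scene_type": "customer_service"},
    "environment_features": {"lighting": "indoor", "backdrop": "office"},
}

request = urllib.request.Request(
    "https://paas.example.com/v1/digital-human/generate",  # placeholder endpoint
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(request) would submit the generation job on a real platform.
print(json.dumps(payload, indent=2))
```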
Referring to fig. 3, fig. 3 is a flow chart of obtaining a first optimized digital human model based on an artificial intelligence-based digital interaction method in some embodiments of the present application. According to an embodiment of the present invention, the inputting the natural person feature recognition data and the application scene feature data into a preset emotion recognition model for processing to obtain an emotion optimization factor, optimizing the digital person model according to the emotion optimization factor, and obtaining a first optimized digital person model includes:
S31, extracting speech rate data, tone data, intonation data and audio energy data from the voice recognition data;
S32, inputting the face recognition data, in combination with the gesture motion recognition data, the speech rate data, the tone data, the intonation data and the audio energy data, into a preset emotion recognition model for processing to obtain emotion characteristic parameters;
S33, optimizing the digital human model according to the emotion characteristic parameters to obtain a first optimized digital human model.
It should be noted that, in order to impart emotion to the digital human model and thereby provide a better interactive experience for the user, the face recognition data needs to be input into a preset emotion recognition model, in combination with the gesture motion recognition data, speech rate data, tone data, intonation data and audio energy data, for processing to obtain the emotion characteristic parameters.
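For illustration only, the four voice-derived quantities named in S31 can be sketched with plain numpy; the autocorrelation pitch estimator, the frame count and the toy input below are assumptions, not the patent's actual feature extraction.

```python
import numpy as np

def pitch_hz(frame: np.ndarray, sr: int) -> float:
    # crude pitch ("tone") estimate: autocorrelation peak searched in 50-400 Hz
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = sr // 400, sr // 50
    return sr / (lo + int(np.argmax(ac[lo:hi])))

def voice_features(signal: np.ndarray, sr: int, word_count: int) -> dict:
    duration = len(signal) / sr
    pitches = [pitch_hz(f, sr) for f in np.array_split(signal, 4)]  # 4 coarse frames
    return {
        "speech_rate": word_count / duration,              # speech rate data (words/s)
        "tone": float(np.mean(pitches)),                   # tone data (mean pitch, Hz)
        "intonation": float(np.std(pitches)),              # intonation (pitch spread, Hz)
        "energy": float(np.sqrt(np.mean(signal ** 2))),    # audio energy data (RMS)
    }

# Toy input: one second of a 220 Hz tone at 16 kHz, "spoken" as 3 words.
sr = 16000
t = np.linspace(0.0, 1.0, sr, endpoint=False)
print(voice_features(0.1 * np.sin(2 * np.pi * 220 * t), sr, word_count=3))
```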
Referring to fig. 4, fig. 4 is a flow chart of obtaining a second optimized digital human model based on an artificial intelligence-based digital interaction method in some embodiments of the present application. According to an embodiment of the present invention, lip synchronization is performed on the first optimized digital person model according to the natural person feature identification data to obtain a second optimized digital person model, including:
S41, respectively performing feature recognition according to the voice recognition data and the lip recognition data to obtain audio feature data and lip feature data;
S42, aligning the audio feature data and the lip feature data on a time axis, and inputting them into a preset lip optimization generation model for processing to obtain lip optimization parameters;
S43, optimizing the first optimized digital human model according to the lip optimization parameters to obtain a second optimized digital human model.
It should be noted that, in order to enable the digital human model to achieve lip synchronization, the audio feature data and the lip feature data are aligned on a time axis and input into a preset lip optimization generation model for processing to obtain lip optimization parameters. The preset lip optimization generation model is trained on the audio feature data, lip feature data and lip optimization parameters of a large number of historical samples, so that inputting the relevant information for processing yields the corresponding output lip optimization parameters.
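As a sketch of the time-axis alignment in S41-S42 (the feature dimensions and frame rates below are assumptions): audio features typically arrive at a higher frame rate than video-derived lip features, so the lip track can be interpolated onto the audio timestamps before both are fed to the lip optimization generation model.

```python
import numpy as np

def align_on_time_axis(audio_feat: np.ndarray, audio_fps: float,
                       lip_feat: np.ndarray, lip_fps: float) -> np.ndarray:
    # timestamps of every audio frame and every lip (video) frame
    t_audio = np.arange(len(audio_feat)) / audio_fps
    t_lip = np.arange(len(lip_feat)) / lip_fps
    # linearly interpolate each lip-feature dimension onto the audio timestamps
    lip_resampled = np.stack(
        [np.interp(t_audio, t_lip, lip_feat[:, d]) for d in range(lip_feat.shape[1])],
        axis=1,
    )
    # concatenated (audio, lip) vectors, one per audio frame, ready for the model
    return np.concatenate([audio_feat, lip_resampled], axis=1)

audio = np.random.rand(100, 13)   # 1 s of audio features at 100 fps (13-dim, MFCC-like)
lip = np.random.rand(25, 8)       # 1 s of lip landmarks at 25 fps (8-dim)
aligned = align_on_time_axis(audio, 100.0, lip, 25.0)
print(aligned.shape)              # (100, 21): inputs for the lip optimization model
```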
According to an embodiment of the present invention, the steps of obtaining user basic information and extracting user basic data, and obtaining user interaction information and extracting user interaction data (including interaction demand data and user interaction data), comprise:
acquiring user basic information and extracting user basic data, including: age data, occupation data, health status data, personality data, language style data, and dialect data;
acquiring user interaction information and extracting user interaction data, including: interaction demand data and user interaction data;
the user interaction data includes: facial expression data, limb motion data, gesture data, and voice data.
It should be noted that, in order to customize the digital person according to the personalized requirements of different users, user basic information needs to be acquired and user basic data extracted, and user interaction information needs to be acquired and user interaction data extracted.
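Purely for illustration, the two groups of user data listed above map naturally onto containers like these; the field types are assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class UserBasicData:
    # mirrors the user basic data enumerated in the method
    age: int
    occupation: str
    health_status: str
    personality: str
    language_style: str
    dialect: str

@dataclass
class UserInteractionData:
    # mirrors the user interaction data enumerated in the method
    facial_expression: dict = field(default_factory=dict)
    limb_motion: dict = field(default_factory=dict)
    gesture: dict = field(default_factory=dict)
    voice: dict = field(default_factory=dict)

user = UserBasicData(30, "teacher", "good", "extroverted", "formal", "cantonese")
print(user.dialect)
```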
According to an embodiment of the present invention, the personalized adjustment of the second optimized digital human model according to the user basic data and the interaction requirement data, to obtain a personalized digital human model, includes:
inputting the age data, the occupation data, the health status data, the personality data, the language style data and the interaction demand data into a preset style adjustment model for processing to obtain style adjustment parameters;
inputting the dialect data into a preset lip optimization model for processing to obtain lip optimization parameters;
and performing personalized adjustment on the second optimized digital human model according to the style adjustment parameters and the lip optimization parameters to obtain a personalized digital human model.
It should be noted that, in order to customize the digital human model according to different user requirements, the age data, occupation data, health status data, personality data, language style data and interaction demand data are input into a preset style adjustment model for processing to obtain style adjustment parameters. The preset style adjustment model is trained on the age data, occupation data, health status data, personality data, language style data, interaction demand data and style adjustment parameters of a large number of historical samples, so that inputting the relevant information for processing yields the corresponding output style adjustment parameters. The dialect data is then input into a preset lip optimization model for processing to obtain lip optimization parameters; the preset lip optimization model is trained on the dialect data and lip optimization parameters of a large number of historical samples, so that inputting the relevant information for processing yields the corresponding output lip optimization parameters.
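The following sketch mirrors the two-model split described above, with trivial rules standing in for the trained style adjustment model and dialect lip optimization model; every threshold and parameter name is an assumption.

```python
def style_adjustment_model(basic: dict, demand: str) -> dict:
    # stand-in for the trained style model: maps user traits to style parameters
    pace = "slow" if basic["age"] >= 65 else "normal"
    register = "formal" if basic["language_style"] == "formal" else "casual"
    return {"speaking_pace": pace, "register": register, "demand": demand}

def lip_optimization_model(dialect: str) -> dict:
    # stand-in for the trained dialect lip model
    return {"viseme_set": f"{dialect}_visemes"}

def personalize(model: dict, basic: dict, demand: str) -> dict:
    # apply both parameter sets to the second optimized digital human model
    model["style"] = style_adjustment_model(basic, demand)
    model["lip"] = lip_optimization_model(basic["dialect"])
    return model

second_optimized = {"id": "dh-001"}
basic = {"age": 70, "language_style": "formal", "dialect": "cantonese"}
print(personalize(second_optimized, basic, demand="health consultation"))
```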
According to an embodiment of the present invention, the identifying according to the user interaction data by using a preset artificial intelligence algorithm generates an action instruction parameter, and the personalized digital human model performs a corresponding action according to the action instruction parameter, including:
identifying by using a preset artificial intelligence algorithm according to the facial expression data, the limb motion data, the gesture data and the voice data, and generating action instruction parameters;
and the personalized digital human model executes corresponding actions according to the action instruction parameters.
It should be noted that the facial expression data, limb motion data, gesture data and voice data are recognized by a preset artificial intelligence algorithm to generate the action instruction parameters, and the personalized digital human model executes the corresponding actions according to the action instruction parameters.
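A minimal sketch of this recognition-to-action step, with a lookup table standing in for the unspecified artificial intelligence algorithm; the action names and parameters are invented for illustration.

```python
# placeholder rule table: (facial expression, gesture) -> action instruction parameters
ACTION_TABLE = {
    ("smile", "wave"): {"action": "wave_back", "expression": "smile", "speed": 1.0},
    ("neutral", "point"): {"action": "look_at_target", "expression": "attentive", "speed": 0.8},
}

def recognize_action(facial_expression: str, gesture: str) -> dict:
    # fall back to an idle behavior when no rule matches
    return ACTION_TABLE.get((facial_expression, gesture),
                            {"action": "idle", "expression": "neutral", "speed": 0.5})

def execute(model_id: str, instruction: dict) -> str:
    # stand-in for the digital human model executing the instruction
    return (f"{model_id} performs {instruction['action']} "
            f"({instruction['expression']}, speed {instruction['speed']})")

print(execute("dh-001", recognize_action("smile", "wave")))
```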
The invention also discloses a digital interaction system based on artificial intelligence, which comprises a memory and a processor, wherein the memory comprises a digital interaction method program based on artificial intelligence, and the digital interaction method program based on artificial intelligence realizes the following steps when being executed by the processor:
acquiring application scene information and extracting application scene feature data, acquiring background environment information and extracting background environment feature data;
acquiring natural person feature information, extracting natural person feature identification data, and generating a digital person model according to the natural person feature identification data and combining the application scene feature data and the background environment feature data;
inputting the natural person feature recognition data and the application scene feature data into a preset emotion recognition model for processing to obtain emotion optimization factors, and optimizing the digital person model according to the emotion optimization factors to obtain a first optimized digital person model;
performing lip synchronization on the first optimized digital human model according to the natural person feature identification data to obtain a second optimized digital human model;
acquiring user basic information and extracting user basic data, acquiring user interaction information and extracting user interaction data, including: interaction demand data and user interaction data;
performing personalized adjustment on the second optimized digital human model according to the user basic data and the interaction demand data to obtain a personalized digital human model;
and identifying by utilizing a preset artificial intelligence algorithm according to the user interaction data to generate action instruction parameters, and executing corresponding actions by the personalized digital human model according to the action instruction parameters.
In order to realize a technology that quickly generates a digital person model according to different application scenes and background environments, continuously optimizes it, and then performs personalized digital person customization according to user interaction requirements, the digital person model is first generated from natural person feature identification data combined with application scene feature data and background environment feature data, and emotion optimization factors generated from the natural person feature identification data and the application scene feature data optimize the digital person model a first time. Lip synchronization is then performed according to the natural person feature identification data to obtain a second optimized digital person model, which is personalized according to the basic data and interaction demand data of the interacting user to obtain a personalized digital person model. Finally, a preset artificial intelligence algorithm performs recognition on the user interaction data to generate action instruction parameters, and the personalized digital person model executes the corresponding actions according to those parameters.
According to the embodiment of the invention, the steps of obtaining the natural person feature information and extracting the natural person feature recognition data, and generating the digital person model according to the natural person feature recognition data and combining the application scene feature data and the background environment feature data comprise the following steps:
acquiring natural person feature information and extracting natural person feature identification data, including: facial recognition data, gesture motion recognition data, voice recognition data, and lip recognition data;
and generating a digital human model through a preset PaaS platform tool according to the face recognition data, the gesture motion recognition data and the voice recognition data in combination with the application scene feature data and the background environment feature data.
It should be noted that, in the process of generating the digital human model, training is performed on the face recognition data, gesture motion recognition data and voice recognition data of a large number of users, in combination with application scene feature data and background environment feature data, so as to generate the digital human model.
According to an embodiment of the present invention, the inputting the natural person feature recognition data and the application scene feature data into a preset emotion recognition model for processing to obtain an emotion optimization factor, optimizing the digital person model according to the emotion optimization factor, and obtaining a first optimized digital person model includes:
extracting speech rate data, tone data, intonation data and audio energy data from the voice recognition data;
inputting the face recognition data, in combination with the gesture motion recognition data, the speech rate data, the tone data, the intonation data and the audio energy data, into a preset emotion recognition model for processing to obtain emotion characteristic parameters;
and optimizing the digital human model according to the emotion characteristic parameters to obtain a first optimized digital human model.
It should be noted that, in order to impart emotion to the digital human model and thereby provide a better interactive experience for the user, the face recognition data needs to be input into a preset emotion recognition model, in combination with the gesture motion recognition data, speech rate data, tone data, intonation data and audio energy data, for processing to obtain the emotion characteristic parameters.
According to an embodiment of the present invention, lip synchronization is performed on the first optimized digital person model according to the natural person feature identification data to obtain a second optimized digital person model, including:
respectively carrying out feature recognition according to the voice recognition data and the lip recognition data to obtain audio feature data and lip feature data;
aligning the audio feature data and the lip feature data on a time axis, inputting the audio feature data and the lip feature data into a preset lip optimization generation model for processing, and obtaining lip optimization parameters;
and optimizing the first optimized digital human model according to the lip optimization parameters to obtain a second optimized digital human model.
It should be noted that, in order to enable the digital human model to achieve lip synchronization, the audio feature data and the lip feature data are aligned on a time axis and input into a preset lip optimization generation model for processing to obtain lip optimization parameters. The preset lip optimization generation model is trained on the audio feature data, lip feature data and lip optimization parameters of a large number of historical samples, so that inputting the relevant information for processing yields the corresponding output lip optimization parameters.
According to an embodiment of the present invention, the steps of obtaining user basic information and extracting user basic data, and obtaining user interaction information and extracting user interaction data (including interaction demand data and user interaction data), comprise:
acquiring user basic information and extracting user basic data, including: age data, occupation data, health status data, personality data, language style data, and dialect data;
acquiring user interaction information and extracting user interaction data, including: interaction demand data and user interaction data;
the user interaction data includes: facial expression data, limb motion data, gesture data, and voice data.
It should be noted that, in order to customize the digital person according to the personalized requirements of different users, user basic information needs to be acquired and user basic data extracted, and user interaction information needs to be acquired and user interaction data extracted.
According to an embodiment of the present invention, the personalized adjustment of the second optimized digital human model according to the user basic data and the interaction requirement data, to obtain a personalized digital human model, includes:
inputting the age data, the occupation data, the health status data, the personality data, the language style data and the interaction demand data into a preset style adjustment model for processing to obtain style adjustment parameters;
inputting the dialect data into a preset lip optimization model for processing to obtain lip optimization parameters;
and performing personalized adjustment on the second optimized digital human model according to the style adjustment parameters and the lip optimization parameters to obtain a personalized digital human model.
It should be noted that, in order to customize the digital human model according to different user requirements, the age data, occupation data, health status data, personality data, language style data and interaction demand data are input into a preset style adjustment model for processing to obtain style adjustment parameters. The preset style adjustment model is trained on the age data, occupation data, health status data, personality data, language style data, interaction demand data and style adjustment parameters of a large number of historical samples, so that inputting the relevant information for processing yields the corresponding output style adjustment parameters. The dialect data is then input into a preset lip optimization model for processing to obtain lip optimization parameters; the preset lip optimization model is trained on the dialect data and lip optimization parameters of a large number of historical samples, so that inputting the relevant information for processing yields the corresponding output lip optimization parameters.
According to an embodiment of the present invention, the identifying according to the user interaction data by using a preset artificial intelligence algorithm generates an action instruction parameter, and the personalized digital human model performs a corresponding action according to the action instruction parameter, including:
identifying by using a preset artificial intelligence algorithm according to the facial expression data, the limb motion data, the gesture data and the voice data, and generating action instruction parameters;
and the personalized digital human model executes corresponding actions according to the action instruction parameters.
It should be noted that the facial expression data, limb motion data, gesture data and voice data are recognized by a preset artificial intelligence algorithm to generate the action instruction parameters, and the personalized digital human model executes the corresponding actions according to the action instruction parameters.
A third aspect of the present invention provides a readable storage medium having embodied therein an artificial intelligence based digital interaction method program which, when executed by a processor, implements the steps of an artificial intelligence based digital interaction method as described in any of the preceding claims.
The invention discloses a digital interaction method, system and medium based on artificial intelligence, realizing a technology that quickly generates a digital person model according to different application scenes and background environments, continuously optimizes it, and then performs personalized digital person customization according to user interaction requirements. The digital person model is first generated from natural person feature identification data combined with application scene feature data and background environment feature data, and emotion optimization factors generated from the natural person feature identification data and the application scene feature data optimize the digital person model a first time. Lip synchronization is then performed according to the natural person feature identification data to obtain a second optimized digital person model, which is personalized according to the basic data and interaction demand data of the interacting user to obtain a personalized digital person model. Finally, a preset artificial intelligence algorithm performs recognition on the user interaction data to generate action instruction parameters, and the personalized digital person model executes the corresponding actions according to those parameters.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above described device embodiments are only illustrative, e.g. the division of the units is only one logical function division, and there may be other divisions in practice, such as: multiple units or components may be combined or may be integrated into another system, or some features may be omitted, or not performed. In addition, the various components shown or discussed may be coupled or directly coupled or communicatively coupled to each other via some interface, whether indirectly coupled or communicatively coupled to devices or units, whether electrically, mechanically, or otherwise.
The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units; can be located in one place or distributed to a plurality of network units; some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present invention may be integrated in one processing unit, or each unit may be separately used as one unit, or two or more units may be integrated in one unit; the integrated units may be implemented in hardware or in hardware plus software functional units.
Those of ordinary skill in the art will appreciate that all or part of the steps for implementing the above method embodiments may be implemented by hardware related to program instructions, and the foregoing program may be stored in a readable storage medium where, when executed, it performs the steps of the above method embodiments; and the aforementioned storage medium includes: a removable storage device, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code.
Alternatively, the above-described integrated units of the present invention may be stored in a readable storage medium if implemented in the form of software functional modules and sold or used as separate products. Based on such understanding, the technical solution of the embodiments of the present invention may be embodied in essence or a part contributing to the prior art in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the methods described in the embodiments of the present invention. And the aforementioned storage medium includes: a removable storage device, ROM, RAM, magnetic or optical disk, or other medium capable of storing program code.

Claims (4)

1. A digital interaction method based on artificial intelligence, characterized by comprising the following steps:
acquiring application scene information and extracting application scene feature data, acquiring background environment information and extracting background environment feature data;
acquiring natural person feature information, extracting natural person feature identification data, and generating a digital person model according to the natural person feature identification data and combining the application scene feature data and the background environment feature data;
inputting the natural person feature recognition data and the application scene feature data into a preset emotion recognition model for processing to obtain emotion optimization factors, and optimizing the digital person model according to the emotion optimization factors to obtain a first optimized digital person model;
performing lip synchronization on the first optimized digital human model according to the natural person feature identification data to obtain a second optimized digital human model;
acquiring user basic information and extracting user basic data, acquiring user interaction information and extracting user interaction data, including: interaction demand data and user interaction data;
performing personalized adjustment on the second optimized digital human model according to the user basic data and the interaction demand data to obtain a personalized digital human model;
identifying by using a preset artificial intelligence algorithm according to the user interaction data, generating action instruction parameters, and executing corresponding actions by the personalized digital human model according to the action instruction parameters;
the step of obtaining natural person feature information and extracting natural person feature recognition data, and generating a digital person model by combining the application scene feature data and the background environment feature data according to the natural person feature recognition data comprises the following steps:
acquiring natural person feature information and extracting natural person feature identification data, including: facial recognition data, gesture motion recognition data, voice recognition data, and lip recognition data;
generating a digital human model through a preset PaaS platform tool according to the face recognition data, the gesture motion recognition data and the voice recognition data in combination with the application scene feature data and the background environment feature data;
inputting the natural person feature recognition data and the application scene feature data into a preset emotion recognition model for processing to obtain emotion optimization factors, optimizing the digital person model according to the emotion optimization factors to obtain a first optimized digital person model, wherein the method comprises the following steps of:
extracting speech rate data, tone data, intonation data and audio energy data from the voice recognition data;
inputting the face recognition data, in combination with the gesture motion recognition data, the speech rate data, the tone data, the intonation data and the audio energy data, into a preset emotion recognition model for processing to obtain emotion characteristic parameters;
optimizing the digital human model according to the emotion characteristic parameters to obtain a first optimized digital human model;
performing lip synchronization on the first optimized digital human model according to the natural person feature identification data to obtain a second optimized digital human model, including:
respectively carrying out feature recognition according to the voice recognition data and the lip recognition data to obtain audio feature data and lip feature data;
aligning the audio feature data and the lip feature data on a time axis, inputting the audio feature data and the lip feature data into a preset lip optimization generation model for processing, and obtaining lip optimization parameters;
optimizing the first optimized digital human model according to the lip optimization parameters to obtain a second optimized digital human model;
the steps of obtaining user basic information and extracting user basic data, and obtaining user interaction information and extracting user interaction data (including interaction demand data and user interaction data), comprise:
acquiring user basic information and extracting user basic data, including: age data, occupation data, health status data, personality data, language style data, and dialect data;
acquiring user interaction information and extracting user interaction data, including: interaction demand data and user interaction data;
the user interaction data includes: facial expression data, limb motion data, gesture data, and voice data;
the personalized adjustment is performed on the second optimized digital human model according to the user basic data and the interaction demand data, so as to obtain a personalized digital human model, which comprises the following steps:
inputting the age data, the occupation data, the health status data, the personality data, the language style data and the interaction demand data into a preset style adjustment model for processing to obtain style adjustment parameters;
inputting the dialect data into a preset lip optimization model for processing to obtain lip optimization parameters;
and performing personalized adjustment on the second optimized digital human model according to the style adjustment parameters and the lip optimization parameters to obtain a personalized digital human model.
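(Neither the preset style adjustment model nor the encoding of the user basic data is disclosed. The sketch below stands in for both preset models with simple lookup rules; every threshold, vocabulary and field name is a hypothetical illustration, not the patented method.)

    def style_adjustment_parameters(profile: dict) -> dict:
        """Stand-in for the preset style adjustment model (rule table, not learned)."""
        pace = "slow" if profile.get("age", 0) >= 65 else "normal"
        register = "formal" if profile.get("occupation") in {"doctor", "lawyer"} else "casual"
        return {"speaking_pace": pace, "register": register,
                "style": profile.get("language_style", "neutral")}

    def dialect_lip_parameters(dialect: str) -> dict:
        """Stand-in for the preset lip optimization model applied to dialect data."""
        viseme_tables = {"cantonese": "viseme_yue", "mandarin": "viseme_cmn"}
        return {"viseme_table": viseme_tables.get(dialect, "viseme_cmn")}

    def personalize(model: dict, style: dict, lip: dict) -> dict:
        """Merge both parameter sets into the second optimized digital human model."""
        return {**model, **style, **lip}

    persona = personalize({"id": "dh-001"},
                          style_adjustment_parameters({"age": 70, "occupation": "teacher",
                                                       "language_style": "gentle"}),
                          dialect_lip_parameters("cantonese"))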
2. The digital interaction method based on artificial intelligence according to claim 1, wherein the step of identifying by using a preset artificial intelligence algorithm according to the user interaction data to generate action instruction parameters, and the personalized digital human model executing corresponding actions according to the action instruction parameters, comprises the following steps:
identifying by using the preset artificial intelligence algorithm according to the facial expression data, the limb motion data, the gesture data and the voice data to generate action instruction parameters;
and the personalized digital human model executes corresponding actions according to the action instruction parameters.
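(Claim 2 does not name the recognition algorithm. The sketch below treats recognition as already-decoded expression and gesture labels and maps them to action instruction parameters that the model then executes; the label set and instruction names are hypothetical.)

    from typing import Optional

    # Hypothetical mapping from recognized cues to action instruction parameters.
    ACTION_TABLE = {
        ("smile", None): {"action": "smile_back", "intensity": 0.8},
        (None, "wave"): {"action": "wave", "intensity": 1.0},
    }

    def generate_action_instruction(expression: Optional[str],
                                    gesture: Optional[str]) -> dict:
        """Stand-in for the preset artificial intelligence algorithm of claim 2."""
        for key in ((expression, gesture), (expression, None), (None, gesture)):
            if key in ACTION_TABLE:
                return ACTION_TABLE[key]
        return {"action": "idle", "intensity": 0.0}

    def execute_action(model_id: str, instruction: dict) -> None:
        # A real personalized digital human model would render this downstream.
        print(f"{model_id} performs {instruction['action']}"
              f" (intensity {instruction['intensity']})")

    execute_action("dh-001", generate_action_instruction("smile", None))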
3. An artificial intelligence based digital interaction system, comprising a memory and a processor, wherein the memory stores an artificial intelligence based digital interaction method program which, when executed by the processor, implements the following steps:
acquiring application scene information and extracting application scene feature data, acquiring background environment information and extracting background environment feature data;
acquiring natural person feature information and extracting natural person feature recognition data, and generating a digital human model according to the natural person feature recognition data in combination with the application scene feature data and the background environment feature data;
inputting the natural person feature recognition data and the application scene feature data into a preset emotion recognition model for processing to obtain emotion optimization factors, and optimizing the digital human model according to the emotion optimization factors to obtain a first optimized digital human model;
performing lip synchronization on the first optimized digital human model according to the natural person feature recognition data to obtain a second optimized digital human model;
acquiring user basic information and extracting user basic data, acquiring user interaction information and extracting user interaction data, including: interaction demand data and user interaction data;
performing personalized adjustment on the second optimized digital human model according to the user basic data and the interaction demand data to obtain a personalized digital human model;
identifying by using a preset artificial intelligence algorithm according to the user interaction data, generating action instruction parameters, and executing corresponding actions by the personalized digital human model according to the action instruction parameters;
the step of acquiring natural person feature information and extracting natural person feature recognition data, and generating a digital human model according to the natural person feature recognition data in combination with the application scene feature data and the background environment feature data, comprises the following steps:
acquiring natural person feature information and extracting natural person feature recognition data, including: facial recognition data, gesture motion recognition data, voice recognition data, and lip recognition data;
generating a digital human model through a preset PaaS platform tool according to the facial recognition data, the gesture motion recognition data and the voice recognition data, in combination with the application scene feature data and the background environment feature data;
the step of inputting the natural person feature recognition data and the application scene feature data into a preset emotion recognition model for processing to obtain emotion optimization factors, and optimizing the digital human model according to the emotion optimization factors to obtain a first optimized digital human model, comprises the following steps:
extracting speech rate data, tone data, intonation data and audio energy data according to the voice recognition data;
inputting the facial recognition data, in combination with the gesture motion recognition data, the speech rate data, the tone data, the intonation data and the audio energy data, into the preset emotion recognition model for processing to obtain emotion characteristic parameters;
optimizing the digital human model according to the emotion characteristic parameters to obtain a first optimized digital human model;
the step of performing lip synchronization on the first optimized digital human model according to the natural person feature recognition data to obtain a second optimized digital human model comprises the following steps:
performing feature recognition on the voice recognition data and the lip recognition data respectively to obtain audio feature data and lip feature data;
aligning the audio feature data and the lip feature data on a time axis, and inputting the aligned data into a preset lip optimization generation model for processing to obtain lip optimization parameters;
optimizing the first optimized digital human model according to the lip optimization parameters to obtain a second optimized digital human model;
the step of acquiring user basic information and extracting user basic data, and acquiring user interaction information and extracting user interaction data including interaction demand data and user interaction data, comprises the following steps:
acquiring user basic information and extracting user basic data, including: age data, occupation data, health status data, personality data, language style data, and dialect data;
acquiring user interaction information and extracting user interaction data, including: interaction demand data and user interaction data;
the user interaction data includes: facial expression data, limb motion data, gesture data, and voice data;
the step of performing personalized adjustment on the second optimized digital human model according to the user basic data and the interaction demand data to obtain a personalized digital human model comprises the following steps:
inputting the age data, the occupation data, the health status data, the personality data, the language style data and the interaction demand data into a preset style adjustment model for processing to obtain style adjustment parameters;
inputting the dialect data into a preset lip optimization model for processing to obtain lip optimization parameters;
and performing personalized adjustment on the second optimized digital human model according to the style adjustment parameters and the lip optimization parameters to obtain a personalized digital human model.
4. A computer readable storage medium, wherein the computer readable storage medium comprises an artificial intelligence based digital interaction method program which, when executed by a processor, implements the steps of the artificial intelligence based digital interaction method according to any one of claims 1 to 2.
CN202311664192.1A 2023-12-06 2023-12-06 Digital interaction method, system and medium based on artificial intelligence Active CN117348736B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311664192.1A CN117348736B (en) 2023-12-06 2023-12-06 Digital interaction method, system and medium based on artificial intelligence

Publications (2)

Publication Number Publication Date
CN117348736A CN117348736A (en) 2024-01-05
CN117348736B true CN117348736B (en) 2024-03-19

Family

ID=89367261

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311664192.1A Active CN117348736B (en) 2023-12-06 2023-12-06 Digital interaction method, system and medium based on artificial intelligence

Country Status (1)

Country Link
CN (1) CN117348736B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20180126818A (en) * 2017-05-18 2018-11-28 이향진 Product design process method and its system reflecting user requirements
CN113873297A (en) * 2021-10-18 2021-12-31 深圳追一科技有限公司 Method and related device for generating digital character video
CN116311456A (en) * 2023-03-23 2023-06-23 应急管理部大数据中心 Personalized virtual human expression generating method based on multi-mode interaction information
CN116524924A (en) * 2023-04-23 2023-08-01 厦门黑镜科技有限公司 Digital human interaction control method, device, electronic equipment and storage medium
CN117058286A (en) * 2023-10-13 2023-11-14 北京蔚领时代科技有限公司 Method and device for generating video by using word driving digital person
CN117132711A (en) * 2023-08-29 2023-11-28 重庆长安汽车股份有限公司 Digital portrait customizing method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN117348736A (en) 2024-01-05

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant