CN117348736B - Digital interaction method, system and medium based on artificial intelligence

Info

Publication number
CN117348736B
CN117348736B (granted publication of application CN202311664192.1A)
Authority
CN
China
Prior art keywords
data
digital
feature
model
lip
Prior art date
Legal status
Active
Application number
CN202311664192.1A
Other languages
Chinese (zh)
Other versions
CN117348736A (en)
Inventor
杨良志
白琳
杨安培
吴海锋
崔寅
江梦玲
Current Assignee
Richinfo Technology Co., Ltd.
Original Assignee
Richinfo Technology Co., Ltd.
Priority date
Filing date
Publication date
Application filed by Richinfo Technology Co., Ltd.
Priority to CN202311664192.1A
Publication of CN117348736A
Application granted
Publication of CN117348736B
Status: Active

Classifications

    • G06F 3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F 3/017 Gesture based interaction, e.g. based on a set of recognized hand gestures
    • G06F 3/16 Sound input; sound output
    • G06T 13/40 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
    • G06T 17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T 19/20 Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
    • G06T 2200/04 Indexing scheme for image data processing or generation involving 3D image data

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Computer Graphics (AREA)
  • Software Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Geometry (AREA)
  • Architecture (AREA)
  • Computer Hardware Design (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The application provides a digital interaction method, system and medium based on artificial intelligence. The method comprises the following steps: generating a digital human model from natural person feature recognition data combined with application scene feature data and background environment feature data; generating emotion optimization factors from the natural person feature recognition data and the application scene feature data to perform a first optimization of the digital human model; performing lip synchronization according to the natural person feature recognition data to obtain a second optimized digital human model; performing personalized adjustment of the second optimized digital human model according to the basic data and interaction demand data of the interacting user to obtain a personalized digital human model; and performing recognition on the user interaction data with a preset artificial intelligence algorithm to generate action instruction parameters, according to which the personalized digital human model executes corresponding actions. Through personalized customization of the digital human model, the method and the device provide a more natural and comfortable intelligent interaction experience for the user.

Description

Digital interaction method, system and medium based on artificial intelligence
Technical Field
The application relates to the technical field of big data and artificial intelligence, in particular to a digital interaction method, a digital interaction system and a digital interaction medium based on artificial intelligence.
Background
A digital person is a digital character image created with digital technology to closely resemble a human. With the rapid development of virtual digital person theory and technology, digital persons are now widely applied in scenarios such as live streaming, film and television, advertising, news and personal interaction, and dialogue, action and behavior interaction between a person and a digital person can be realized through artificial intelligence algorithms. However, there is currently no technology that quickly generates a digital person model according to different application scenes and background environments, continuously optimizes it, and then customizes the digital person according to the user's interaction requirements.
In view of the above problems, an effective technical solution is currently needed.
Disclosure of Invention
The invention aims to provide a digital interaction method, system and medium based on artificial intelligence, realizing a technology that quickly generates a digital person model according to different application scenes and background environments, continuously optimizes it, and then performs personalized digital person customization according to user interaction requirements. The digital person model is first generated from natural person feature identification data combined with application scene feature data and background environment feature data, and emotion optimization factors generated from the natural person feature identification data and the application scene feature data optimize the digital person model a first time. Lip synchronization is then performed according to the natural person feature identification data to obtain a second optimized digital person model, which is personalized according to the basic data and interaction demand data of the interacting user to obtain a personalized digital person model. Finally, a preset artificial intelligence algorithm performs recognition on the user interaction data to generate action instruction parameters, and the personalized digital person model executes the corresponding actions according to those parameters.
In a first aspect, the application provides a digital interaction method based on artificial intelligence, which comprises the following steps:
acquiring application scene information and extracting application scene feature data, acquiring background environment information and extracting background environment feature data;
acquiring natural person feature information, extracting natural person feature identification data, and generating a digital person model according to the natural person feature identification data and combining the application scene feature data and the background environment feature data;
inputting the natural person feature recognition data and the application scene feature data into a preset emotion recognition model for processing to obtain emotion optimization factors, and optimizing the digital person model according to the emotion optimization factors to obtain a first optimized digital person model;
performing lip synchronization on the first optimized digital human model according to the natural person feature identification data to obtain a second optimized digital human model;
acquiring user basic information and extracting user basic data, acquiring user interaction information and extracting user interaction data, including: interaction demand data and user interaction data;
performing personalized adjustment on the second optimized digital human model according to the user basic data and the interaction demand data to obtain a personalized digital human model;
and identifying by utilizing a preset artificial intelligence algorithm according to the user interaction data to generate action instruction parameters, and executing corresponding actions by the personalized digital human model according to the action instruction parameters.
Optionally, in the artificial intelligence based digital interaction method described in the present application, the obtaining natural person feature information and extracting natural person feature identification data, and generating the digital person model according to the natural person feature identification data in combination with the application scene feature data and the background environment feature data includes:
acquiring natural person feature information and extracting natural person feature identification data, including: facial recognition data, gesture motion recognition data, voice recognition data, and lip recognition data;
and generating a digital human model through a preset PaaS platform tool according to the face recognition data, the gesture motion recognition data and the voice recognition data in combination with the application scene feature data and the background environment feature data.
Optionally, in the artificial intelligence based digital interaction method described in the present application, the inputting the natural person feature recognition data and the application scene feature data into a preset emotion recognition model to process to obtain an emotion optimization factor, optimizing the digital person model according to the emotion optimization factor, and obtaining a first optimized digital person model includes:
extracting speech rate data, tone data, intonation data and audio energy data from the voice recognition data;
inputting the face recognition data, in combination with the gesture motion recognition data, the speech rate data, the tone data, the intonation data and the audio energy data, into a preset emotion recognition model for processing to obtain emotion characteristic parameters;
and optimizing the digital human model according to the emotion characteristic parameters to obtain a first optimized digital human model.
Optionally, in the artificial intelligence based digital interaction method described in the present application, the performing lip synchronization on the first optimized digital person model according to the natural person feature identification data to obtain a second optimized digital person model includes:
respectively carrying out feature recognition according to the voice recognition data and the lip recognition data to obtain audio feature data and lip feature data;
aligning the audio feature data and the lip feature data on a time axis, inputting the audio feature data and the lip feature data into a preset lip optimization generation model for processing, and obtaining lip optimization parameters;
and optimizing the first optimized digital human model according to the lip optimization parameters to obtain a second optimized digital human model.
Optionally, in the artificial intelligence based digital interaction method described in the present application, the obtaining user basic information and extracting user basic data, and the obtaining user interaction information and extracting user interaction data (including interaction demand data and user interaction data), comprises:
acquiring user basic information and extracting user basic data, including: age data, occupation data, health status data, personality data, language style data, and dialect data;
acquiring user interaction information and extracting user interaction data, including: interaction demand data and user interaction data;
the user interaction data includes: facial expression data, limb motion data, gesture data, and voice data.
Optionally, in the artificial intelligence based digital interaction method described in the present application, the performing personalized adjustment on the second optimized digital human model according to the user basic data and the interaction requirement data to obtain a personalized digital human model includes:
inputting the age data, the occupation data, the health status data, the personality data, the language style data and the interaction demand data into a preset style adjustment model for processing to obtain style adjustment parameters;
inputting the dialect data into a preset lip optimization model for processing to obtain lip optimization parameters;
and performing personalized adjustment on the second optimized digital human model according to the style adjustment parameters and the lip optimization parameters to obtain a personalized digital human model.
Optionally, in the digital interaction method based on artificial intelligence, the identifying according to the user interaction data by using a preset artificial intelligence algorithm generates an action instruction parameter, and the personalized digital person model executes a corresponding action according to the action instruction parameter, including:
identifying by using a preset artificial intelligence algorithm according to the facial expression data, the limb motion data, the gesture data and the voice data, and generating action instruction parameters;
and the personalized digital human model executes corresponding actions according to the action instruction parameters.
In a second aspect, the present application provides an artificial intelligence based digital interaction system, comprising a memory and a processor, wherein the memory includes a program of an artificial intelligence-based digital interaction method which, when executed by the processor, implements the following steps:
acquiring application scene information and extracting application scene feature data, acquiring background environment information and extracting background environment feature data;
acquiring natural person feature information, extracting natural person feature identification data, and generating a digital person model according to the natural person feature identification data and combining the application scene feature data and the background environment feature data;
inputting the natural person feature recognition data and the application scene feature data into a preset emotion recognition model for processing to obtain emotion optimization factors, and optimizing the digital person model according to the emotion optimization factors to obtain a first optimized digital person model;
performing lip synchronization on the first optimized digital human model according to the natural person feature identification data to obtain a second optimized digital human model;
acquiring user basic information and extracting user basic data, acquiring user interaction information and extracting user interaction data, including: interaction demand data and user interaction data;
performing personalized adjustment on the second optimized digital human model according to the user basic data and the interaction demand data to obtain a personalized digital human model;
and identifying by utilizing a preset artificial intelligence algorithm according to the user interaction data to generate action instruction parameters, and executing corresponding actions by the personalized digital human model according to the action instruction parameters.
Optionally, in the artificial intelligence based digital interaction system described in the present application, the obtaining natural person feature information and extracting natural person feature identification data, and generating the digital person model according to the natural person feature identification data in combination with the application scene feature data and the background environment feature data includes:
acquiring natural person feature information and extracting natural person feature identification data, including: facial recognition data, gesture motion recognition data, voice recognition data, and lip recognition data;
and generating a digital human model through a preset PaaS platform tool according to the face recognition data, the gesture motion recognition data and the voice recognition data in combination with the application scene feature data and the background environment feature data.
In a third aspect, the present application further provides a computer readable storage medium, including an artificial intelligence based digital interaction method program, which when executed by a processor, implements the steps of the artificial intelligence based digital interaction method as described in any of the above.
In order to realize a technology that quickly generates a digital person model according to different application scenes and background environments, continuously optimizes it, and then performs personalized digital person customization according to user interaction requirements, the digital person model is first generated from natural person feature identification data combined with application scene feature data and background environment feature data, and emotion optimization factors generated from the natural person feature identification data and the application scene feature data optimize the digital person model a first time. Lip synchronization is then performed according to the natural person feature identification data to obtain a second optimized digital person model, which is personalized according to the basic data and interaction demand data of the interacting user to obtain a personalized digital person model. Finally, a preset artificial intelligence algorithm performs recognition on the user interaction data to generate action instruction parameters, and the personalized digital person model executes the corresponding actions according to those parameters.
Additional features and advantages of the application will be set forth in the description which follows, and in part will be apparent from the description, or may be learned by practice of the embodiments of the application. The objects and other advantages of the present application may be realized and attained by the structure particularly pointed out in the written description and drawings.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and should not be considered limiting of the scope; other related drawings may be obtained from these drawings by a person skilled in the art without inventive effort.
FIG. 1 is a flow chart of an artificial intelligence based digital interaction method provided in an embodiment of the present application;
FIG. 2 is a flow chart of generating a digital human model based on an artificial intelligence-based digital interaction method provided in an embodiment of the present application;
FIG. 3 is a flowchart of obtaining a first optimized digital human model based on an artificial intelligence-based digital interaction method provided in an embodiment of the present application;
fig. 4 is a flowchart of obtaining a second optimized digital human model according to an artificial intelligence-based digital interaction method provided in an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. The components of the embodiments of the present application, which are generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, as provided in the accompanying drawings, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present application without making any inventive effort, are intended to be within the scope of the present application.
It should be noted that like reference numerals and letters refer to like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only to distinguish the description, and are not to be construed as indicating or implying relative importance.
Referring to fig. 1, fig. 1 is a flow chart of an artificial intelligence based digital interaction method in some embodiments of the present application. The digital interaction method based on the artificial intelligence is used in terminal equipment, such as computers, mobile phone terminals and the like. The digital interaction method based on artificial intelligence comprises the following steps:
S11, acquiring application scene information and extracting application scene feature data, and acquiring background environment information and extracting background environment feature data;
S12, acquiring natural person feature information, extracting natural person feature identification data, and generating a digital person model according to the natural person feature identification data and combining the application scene feature data and the background environment feature data;
S13, inputting the natural person feature recognition data and the application scene feature data into a preset emotion recognition model for processing to obtain emotion optimization factors, and optimizing the digital person model according to the emotion optimization factors to obtain a first optimized digital person model;
S14, performing lip synchronization on the first optimized digital human model according to the natural person feature identification data to obtain a second optimized digital human model;
S15, acquiring user basic information and extracting user basic data, and acquiring user interaction information and extracting user interaction data, including: interaction demand data and user interaction data;
S16, performing personalized adjustment on the second optimized digital human model according to the user basic data and the interaction demand data to obtain a personalized digital human model;
S17, identifying by utilizing a preset artificial intelligence algorithm according to the user interaction data, generating action instruction parameters, and executing corresponding actions by the personalized digital human model according to the action instruction parameters.
In order to realize a technology that quickly generates a digital person model according to different application scenes and background environments, continuously optimizes it, and then performs personalized digital person customization according to user interaction requirements, the digital person model is first generated from natural person feature identification data combined with application scene feature data and background environment feature data, and emotion optimization factors generated from the natural person feature identification data and the application scene feature data optimize the digital person model a first time. Lip synchronization is then performed according to the natural person feature identification data to obtain a second optimized digital person model, which is personalized according to the basic data and interaction demand data of the interacting user to obtain a personalized digital person model. Finally, a preset artificial intelligence algorithm performs recognition on the user interaction data to generate action instruction parameters, and the personalized digital person model executes the corresponding actions according to those parameters.
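As a concrete reading of steps S11-S17, the following Python sketch chains the pipeline end to end. Every function here is a trivial stub invented for illustration; the patent does not specify the algorithms or interfaces behind these steps.

```python
# Hypothetical stubs: each stands in for an unspecified module of the patent.

def extract_features(info: dict, keys: list) -> dict:
    # S11/S12/S15: all feature-extraction steps reduced to key selection
    return {k: info.get(k) for k in keys}

def generate_digital_human(person: dict, scene: dict, env: dict) -> dict:
    # S12: base digital human model from person + scene + environment features
    return {"person": person, "scene": scene, "env": env, "optimizations": []}

def optimize(model: dict, stage: str, params: dict) -> dict:
    # S13/S14/S16: each optimization pass records its parameters on the model
    model["optimizations"].append((stage, params))
    return model

def run_pipeline(scene_info, env_info, person_info, user_info, interaction):
    scene = extract_features(scene_info, ["scene_type"])                         # S11
    env = extract_features(env_info, ["lighting", "backdrop"])                   # S11
    person = extract_features(person_info, ["face", "gesture", "voice", "lip"])  # S12
    model = generate_digital_human(person, scene, env)                           # S12
    model = optimize(model, "emotion", {"factor": 0.8})      # S13, placeholder factor
    model = optimize(model, "lip_sync", {"offset_ms": 0})    # S14, placeholder params
    basic = extract_features(user_info, ["age", "dialect"])                      # S15
    model = optimize(model, "personalize", basic)                                # S16
    # S17: placeholder mapping from interaction data to an action instruction
    action = {"action": "wave_back" if interaction.get("gesture") == "wave" else "idle"}
    return model, action

model, action = run_pipeline(
    {"scene_type": "news"}, {"lighting": "studio", "backdrop": "desk"},
    {"face": "...", "gesture": "...", "voice": "...", "lip": "..."},
    {"age": 30, "dialect": "cantonese"}, {"gesture": "wave"},
)
print(action)
```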
Referring to fig. 2, fig. 2 is a flow chart of generating a digital human model based on an artificial intelligence-based digital interaction method in some embodiments of the present application. According to the embodiment of the invention, the steps of obtaining the natural person feature information and extracting the natural person feature recognition data, and generating the digital person model according to the natural person feature recognition data and combining the application scene feature data and the background environment feature data comprise the following steps:
S21, acquiring natural person feature information and extracting natural person feature identification data, including: facial recognition data, gesture motion recognition data, voice recognition data, and lip recognition data;
S22, generating a digital human model through a preset PaaS platform tool according to the face recognition data, the gesture motion recognition data and the voice recognition data in combination with the application scene feature data and the background environment feature data.
It should be noted that, in the process of generating the digital human model, training is performed on the face recognition data, gesture motion recognition data and voice recognition data of a large number of users, in combination with application scene feature data and background environment feature data, so as to generate the digital human model.
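The description does not name a concrete platform, so as a purely illustrative sketch, generating the model through a PaaS tool might amount to assembling the recognition and feature data into a request such as the following; the endpoint URL and every field name are assumptions, not an actual API.

```python
import json
import urllib.request

# Hypothetical request to a PaaS digital-human generation service; the URL and
# all field names are invented for illustration.
payload = {
    "face_recognition": {"landmarks": [[0.31, 0.42], [0.58, 0.44]]},  # sample points
    "gesture_motion": {"keyframe_count": 24},
    "voice": {"sample_rate": 16000, "embedding_dim": 256},
    "scene_features": {"scene_type": "customer_service"},
    "environment_features": {"lighting": "indoor", "backdrop": "office"},
}

request = urllib.request.Request(
    "https://paas.example.com/v1/digital-human/generate",  # placeholder endpoint
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(request) would submit the generation job on a real platform.
print(json.dumps(payload, indent=2))
```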
Referring to fig. 3, fig. 3 is a flow chart of obtaining a first optimized digital human model based on an artificial intelligence-based digital interaction method in some embodiments of the present application. According to an embodiment of the present invention, the inputting the natural person feature recognition data and the application scene feature data into a preset emotion recognition model for processing to obtain an emotion optimization factor, optimizing the digital person model according to the emotion optimization factor, and obtaining a first optimized digital person model includes:
S31, extracting speech rate data, tone data, intonation data and audio energy data from the voice recognition data;
S32, inputting the face recognition data, in combination with the gesture motion recognition data, the speech rate data, the tone data, the intonation data and the audio energy data, into a preset emotion recognition model for processing to obtain emotion characteristic parameters;
S33, optimizing the digital human model according to the emotion characteristic parameters to obtain a first optimized digital human model.
It should be noted that, in order to impart emotion to the digital human model and thereby provide a better interactive experience for the user, the face recognition data needs to be input into a preset emotion recognition model, in combination with the gesture motion recognition data, speech rate data, tone data, intonation data and audio energy data, for processing to obtain the emotion characteristic parameters.
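For illustration only, the four voice-derived quantities named in S31 can be sketched with plain numpy; the autocorrelation pitch estimator, the frame count and the toy input below are assumptions, not the patent's actual feature extraction.

```python
import numpy as np

def pitch_hz(frame: np.ndarray, sr: int) -> float:
    # crude pitch ("tone") estimate: autocorrelation peak searched in 50-400 Hz
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = sr // 400, sr // 50
    return sr / (lo + int(np.argmax(ac[lo:hi])))

def voice_features(signal: np.ndarray, sr: int, word_count: int) -> dict:
    duration = len(signal) / sr
    pitches = [pitch_hz(f, sr) for f in np.array_split(signal, 4)]  # 4 coarse frames
    return {
        "speech_rate": word_count / duration,              # speech rate data (words/s)
        "tone": float(np.mean(pitches)),                   # tone data (mean pitch, Hz)
        "intonation": float(np.std(pitches)),              # intonation (pitch spread, Hz)
        "energy": float(np.sqrt(np.mean(signal ** 2))),    # audio energy data (RMS)
    }

# Toy input: one second of a 220 Hz tone at 16 kHz, "spoken" as 3 words.
sr = 16000
t = np.linspace(0.0, 1.0, sr, endpoint=False)
print(voice_features(0.1 * np.sin(2 * np.pi * 220 * t), sr, word_count=3))
```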
Referring to fig. 4, fig. 4 is a flow chart of obtaining a second optimized digital human model based on an artificial intelligence-based digital interaction method in some embodiments of the present application. According to an embodiment of the present invention, lip synchronization is performed on the first optimized digital person model according to the natural person feature identification data to obtain a second optimized digital person model, including:
S41, respectively performing feature recognition according to the voice recognition data and the lip recognition data to obtain audio feature data and lip feature data;
S42, aligning the audio feature data and the lip feature data on a time axis, and inputting them into a preset lip optimization generation model for processing to obtain lip optimization parameters;
S43, optimizing the first optimized digital human model according to the lip optimization parameters to obtain a second optimized digital human model.
It should be noted that, in order to enable the digital human model to achieve lip synchronization, the audio feature data and the lip feature data are aligned on a time axis and input into a preset lip optimization generation model for processing to obtain lip optimization parameters. The preset lip optimization generation model is trained on the audio feature data, lip feature data and lip optimization parameters of a large number of historical samples, so that inputting the relevant information for processing yields the corresponding output lip optimization parameters.
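As a sketch of the time-axis alignment in S41-S42 (the feature dimensions and frame rates below are assumptions): audio features typically arrive at a higher frame rate than video-derived lip features, so the lip track can be interpolated onto the audio timestamps before both are fed to the lip optimization generation model.

```python
import numpy as np

def align_on_time_axis(audio_feat: np.ndarray, audio_fps: float,
                       lip_feat: np.ndarray, lip_fps: float) -> np.ndarray:
    # timestamps of every audio frame and every lip (video) frame
    t_audio = np.arange(len(audio_feat)) / audio_fps
    t_lip = np.arange(len(lip_feat)) / lip_fps
    # linearly interpolate each lip-feature dimension onto the audio timestamps
    lip_resampled = np.stack(
        [np.interp(t_audio, t_lip, lip_feat[:, d]) for d in range(lip_feat.shape[1])],
        axis=1,
    )
    # concatenated (audio, lip) vectors, one per audio frame, ready for the model
    return np.concatenate([audio_feat, lip_resampled], axis=1)

audio = np.random.rand(100, 13)   # 1 s of audio features at 100 fps (13-dim, MFCC-like)
lip = np.random.rand(25, 8)       # 1 s of lip landmarks at 25 fps (8-dim)
aligned = align_on_time_axis(audio, 100.0, lip, 25.0)
print(aligned.shape)              # (100, 21): inputs for the lip optimization model
```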
According to an embodiment of the present invention, the steps of obtaining user basic information and extracting user basic data, and obtaining user interaction information and extracting user interaction data (including interaction demand data and user interaction data), comprise:
acquiring user basic information and extracting user basic data, including: age data, occupation data, health status data, personality data, language style data, and dialect data;
acquiring user interaction information and extracting user interaction data, including: interaction demand data and user interaction data;
the user interaction data includes: facial expression data, limb motion data, gesture data, and voice data.
It should be noted that, in order to customize the digital person according to the personalized requirements of different users, user basic information needs to be acquired and user basic data extracted, and user interaction information needs to be acquired and user interaction data extracted.
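Purely for illustration, the two groups of user data listed above map naturally onto containers like these; the field types are assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class UserBasicData:
    # mirrors the user basic data enumerated in the method
    age: int
    occupation: str
    health_status: str
    personality: str
    language_style: str
    dialect: str

@dataclass
class UserInteractionData:
    # mirrors the user interaction data enumerated in the method
    facial_expression: dict = field(default_factory=dict)
    limb_motion: dict = field(default_factory=dict)
    gesture: dict = field(default_factory=dict)
    voice: dict = field(default_factory=dict)

user = UserBasicData(30, "teacher", "good", "extroverted", "formal", "cantonese")
print(user.dialect)
```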
According to an embodiment of the present invention, the personalized adjustment of the second optimized digital human model according to the user basic data and the interaction requirement data, to obtain a personalized digital human model, includes:
inputting the age data, the occupation data, the health status data, the personality data, the language style data and the interaction demand data into a preset style adjustment model for processing to obtain style adjustment parameters;
inputting the dialect data into a preset lip optimization model for processing to obtain lip optimization parameters;
and performing personalized adjustment on the second optimized digital human model according to the style adjustment parameters and the lip optimization parameters to obtain a personalized digital human model.
It should be noted that, in order to customize the digital human model according to different user requirements, the age data, occupation data, health status data, personality data, language style data and interaction demand data are input into a preset style adjustment model for processing to obtain style adjustment parameters. The preset style adjustment model is trained on the age data, occupation data, health status data, personality data, language style data, interaction demand data and style adjustment parameters of a large number of historical samples, so that inputting the relevant information for processing yields the corresponding output style adjustment parameters. The dialect data is then input into a preset lip optimization model for processing to obtain lip optimization parameters; the preset lip optimization model is trained on the dialect data and lip optimization parameters of a large number of historical samples, so that inputting the relevant information for processing yields the corresponding output lip optimization parameters.
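The following sketch mirrors the two-model split described above, with trivial rules standing in for the trained style adjustment model and dialect lip optimization model; every threshold and parameter name is an assumption.

```python
def style_adjustment_model(basic: dict, demand: str) -> dict:
    # stand-in for the trained style model: maps user traits to style parameters
    pace = "slow" if basic["age"] >= 65 else "normal"
    register = "formal" if basic["language_style"] == "formal" else "casual"
    return {"speaking_pace": pace, "register": register, "demand": demand}

def lip_optimization_model(dialect: str) -> dict:
    # stand-in for the trained dialect lip model
    return {"viseme_set": f"{dialect}_visemes"}

def personalize(model: dict, basic: dict, demand: str) -> dict:
    # apply both parameter sets to the second optimized digital human model
    model["style"] = style_adjustment_model(basic, demand)
    model["lip"] = lip_optimization_model(basic["dialect"])
    return model

second_optimized = {"id": "dh-001"}
basic = {"age": 70, "language_style": "formal", "dialect": "cantonese"}
print(personalize(second_optimized, basic, demand="health consultation"))
```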
According to an embodiment of the present invention, the identifying according to the user interaction data by using a preset artificial intelligence algorithm generates an action instruction parameter, and the personalized digital human model performs a corresponding action according to the action instruction parameter, including:
identifying by using a preset artificial intelligence algorithm according to the facial expression data, the limb motion data, the gesture data and the voice data, and generating action instruction parameters;
and the personalized digital human model executes corresponding actions according to the action instruction parameters.
It should be noted that the facial expression data, limb motion data, gesture data and voice data are recognized by a preset artificial intelligence algorithm to generate the action instruction parameters, and the personalized digital human model executes the corresponding actions according to the action instruction parameters.
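A minimal sketch of this recognition-to-action step, with a lookup table standing in for the unspecified artificial intelligence algorithm; the action names and parameters are invented for illustration.

```python
# placeholder rule table: (facial expression, gesture) -> action instruction parameters
ACTION_TABLE = {
    ("smile", "wave"): {"action": "wave_back", "expression": "smile", "speed": 1.0},
    ("neutral", "point"): {"action": "look_at_target", "expression": "attentive", "speed": 0.8},
}

def recognize_action(facial_expression: str, gesture: str) -> dict:
    # fall back to an idle behavior when no rule matches
    return ACTION_TABLE.get((facial_expression, gesture),
                            {"action": "idle", "expression": "neutral", "speed": 0.5})

def execute(model_id: str, instruction: dict) -> str:
    # stand-in for the digital human model executing the instruction
    return (f"{model_id} performs {instruction['action']} "
            f"({instruction['expression']}, speed {instruction['speed']})")

print(execute("dh-001", recognize_action("smile", "wave")))
```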
The invention also discloses a digital interaction system based on artificial intelligence, which comprises a memory and a processor, wherein the memory comprises a digital interaction method program based on artificial intelligence, and the digital interaction method program based on artificial intelligence realizes the following steps when being executed by the processor:
acquiring application scene information and extracting application scene feature data, acquiring background environment information and extracting background environment feature data;
acquiring natural person feature information, extracting natural person feature identification data, and generating a digital person model according to the natural person feature identification data and combining the application scene feature data and the background environment feature data;
inputting the natural person feature recognition data and the application scene feature data into a preset emotion recognition model for processing to obtain emotion optimization factors, and optimizing the digital person model according to the emotion optimization factors to obtain a first optimized digital person model;
performing lip synchronization on the first optimized digital human model according to the natural person feature identification data to obtain a second optimized digital human model;
acquiring user basic information and extracting user basic data, acquiring user interaction information and extracting user interaction data, including: interaction demand data and user interaction data;
performing personalized adjustment on the second optimized digital human model according to the user basic data and the interaction demand data to obtain a personalized digital human model;
and identifying by utilizing a preset artificial intelligence algorithm according to the user interaction data to generate action instruction parameters, and executing corresponding actions by the personalized digital human model according to the action instruction parameters.
In order to realize a technology that quickly generates a digital person model according to different application scenes and background environments, continuously optimizes it, and then performs personalized digital person customization according to user interaction requirements, the digital person model is first generated from natural person feature identification data combined with application scene feature data and background environment feature data, and emotion optimization factors generated from the natural person feature identification data and the application scene feature data optimize the digital person model a first time. Lip synchronization is then performed according to the natural person feature identification data to obtain a second optimized digital person model, which is personalized according to the basic data and interaction demand data of the interacting user to obtain a personalized digital person model. Finally, a preset artificial intelligence algorithm performs recognition on the user interaction data to generate action instruction parameters, and the personalized digital person model executes the corresponding actions according to those parameters.
According to the embodiment of the invention, the steps of obtaining the natural person feature information and extracting the natural person feature recognition data, and generating the digital person model according to the natural person feature recognition data and combining the application scene feature data and the background environment feature data comprise the following steps:
acquiring natural person feature information and extracting natural person feature identification data, including: facial recognition data, gesture motion recognition data, voice recognition data, and lip recognition data;
and generating a digital human model through a preset PaaS platform tool according to the face recognition data, the gesture motion recognition data and the voice recognition data in combination with the application scene feature data and the background environment feature data.
It should be noted that, in the process of generating the digital human model, training is performed on the face recognition data, gesture motion recognition data and voice recognition data of a large number of users, in combination with application scene feature data and background environment feature data, so as to generate the digital human model.
According to an embodiment of the present invention, the inputting the natural person feature recognition data and the application scene feature data into a preset emotion recognition model for processing to obtain an emotion optimization factor, optimizing the digital person model according to the emotion optimization factor, and obtaining a first optimized digital person model includes:
extracting speech rate data, tone data, intonation data and audio energy data from the voice recognition data;
inputting the face recognition data, in combination with the gesture motion recognition data, the speech rate data, the tone data, the intonation data and the audio energy data, into a preset emotion recognition model for processing to obtain emotion characteristic parameters;
and optimizing the digital human model according to the emotion characteristic parameters to obtain a first optimized digital human model.
It should be noted that, in order to impart emotion to the digital human model and thereby provide a better interactive experience for the user, the face recognition data needs to be input into a preset emotion recognition model, in combination with the gesture motion recognition data, speech rate data, tone data, intonation data and audio energy data, for processing to obtain the emotion characteristic parameters.
According to an embodiment of the present invention, lip synchronization is performed on the first optimized digital person model according to the natural person feature identification data to obtain a second optimized digital person model, including:
respectively carrying out feature recognition according to the voice recognition data and the lip recognition data to obtain audio feature data and lip feature data;
aligning the audio feature data and the lip feature data on a time axis, inputting the audio feature data and the lip feature data into a preset lip optimization generation model for processing, and obtaining lip optimization parameters;
and optimizing the first optimized digital human model according to the lip optimization parameters to obtain a second optimized digital human model.
It should be noted that, in order to enable the digital human model to achieve lip synchronization, the audio feature data and the lip feature data are aligned on a time axis and input into a preset lip optimization generation model for processing to obtain lip optimization parameters. The preset lip optimization generation model is trained on the audio feature data, lip feature data and lip optimization parameters of a large number of historical samples, so that inputting the relevant information for processing yields the corresponding output lip optimization parameters.
According to an embodiment of the present invention, the steps of obtaining user basic information and extracting user basic data, and obtaining user interaction information and extracting user interaction data (including interaction demand data and user interaction data), comprise:
acquiring user basic information and extracting user basic data, including: age data, occupation data, health status data, personality data, language style data, and dialect data;
acquiring user interaction information and extracting user interaction data, including: interaction demand data and user interaction data;
the user interaction data includes: facial expression data, limb motion data, gesture data, and voice data.
It should be noted that, in order to customize the digital person according to the personalized requirements of different users, user basic information needs to be acquired and user basic data extracted, and user interaction information needs to be acquired and user interaction data extracted.
According to an embodiment of the present invention, the personalized adjustment of the second optimized digital human model according to the user basic data and the interaction requirement data, to obtain a personalized digital human model, includes:
inputting the age data, the occupation data, the health status data, the personality data, the language style data and the interaction demand data into a preset style adjustment model for processing to obtain style adjustment parameters;
inputting the dialect data into a preset lip optimization model for processing to obtain lip optimization parameters;
and performing personalized adjustment on the second optimized digital human model according to the style adjustment parameters and the lip optimization parameters to obtain a personalized digital human model.
It should be noted that, in order to customize the digital human model according to different user requirements, the age data, occupation data, health status data, personality data, language style data and interaction demand data are input into a preset style adjustment model for processing to obtain style adjustment parameters. The preset style adjustment model is trained on the age data, occupation data, health status data, personality data, language style data, interaction demand data and style adjustment parameters of a large number of historical samples, so that inputting the relevant information for processing yields the corresponding output style adjustment parameters. The dialect data is then input into a preset lip optimization model for processing to obtain lip optimization parameters; the preset lip optimization model is trained on the dialect data and lip optimization parameters of a large number of historical samples, so that inputting the relevant information for processing yields the corresponding output lip optimization parameters.
According to an embodiment of the present invention, the identifying according to the user interaction data by using a preset artificial intelligence algorithm generates an action instruction parameter, and the personalized digital human model performs a corresponding action according to the action instruction parameter, including:
identifying by using a preset artificial intelligence algorithm according to the facial expression data, the limb motion data, the gesture data and the voice data, and generating action instruction parameters;
and the personalized digital human model executes corresponding actions according to the action instruction parameters.
It should be noted that the facial expression data, limb motion data, gesture data and voice data are recognized by a preset artificial intelligence algorithm to generate the action instruction parameters, and the personalized digital human model executes the corresponding actions according to the action instruction parameters.
A third aspect of the present invention provides a readable storage medium having embodied therein an artificial intelligence based digital interaction method program which, when executed by a processor, implements the steps of an artificial intelligence based digital interaction method as described in any of the preceding claims.
The invention discloses a digital interaction method, system and medium based on artificial intelligence, realizing a technology that quickly generates a digital person model according to different application scenes and background environments, continuously optimizes it, and then performs personalized digital person customization according to user interaction requirements. The digital person model is first generated from natural person feature identification data combined with application scene feature data and background environment feature data, and emotion optimization factors generated from the natural person feature identification data and the application scene feature data optimize the digital person model a first time. Lip synchronization is then performed according to the natural person feature identification data to obtain a second optimized digital person model, which is personalized according to the basic data and interaction demand data of the interacting user to obtain a personalized digital person model. Finally, a preset artificial intelligence algorithm performs recognition on the user interaction data to generate action instruction parameters, and the personalized digital person model executes the corresponding actions according to those parameters.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above described device embodiments are only illustrative, e.g. the division of the units is only one logical function division, and there may be other divisions in practice, such as: multiple units or components may be combined or may be integrated into another system, or some features may be omitted, or not performed. In addition, the various components shown or discussed may be coupled or directly coupled or communicatively coupled to each other via some interface, whether indirectly coupled or communicatively coupled to devices or units, whether electrically, mechanically, or otherwise.
The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units; can be located in one place or distributed to a plurality of network units; some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present invention may be integrated in one processing unit, or each unit may be separately used as one unit, or two or more units may be integrated in one unit; the integrated units may be implemented in hardware or in hardware plus software functional units.
Those of ordinary skill in the art will appreciate that all or part of the steps for implementing the above method embodiments may be implemented by hardware related to program instructions, and the foregoing program may be stored in a readable storage medium where, when executed, it performs the steps of the above method embodiments; and the aforementioned storage medium includes: a removable storage device, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code.
Alternatively, the above-described integrated units of the present invention may be stored in a readable storage medium if implemented in the form of software functional modules and sold or used as separate products. Based on such understanding, the technical solution of the embodiments of the present invention may be embodied in essence or a part contributing to the prior art in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the methods described in the embodiments of the present invention. And the aforementioned storage medium includes: a removable storage device, ROM, RAM, magnetic or optical disk, or other medium capable of storing program code.

Claims (4)

1. A digital interaction method based on artificial intelligence, characterized by comprising the following steps:
acquiring application scene information and extracting application scene feature data, acquiring background environment information and extracting background environment feature data;
acquiring natural person feature information, extracting natural person feature identification data, and generating a digital person model according to the natural person feature identification data and combining the application scene feature data and the background environment feature data;
inputting the natural person feature recognition data and the application scene feature data into a preset emotion recognition model for processing to obtain emotion optimization factors, and optimizing the digital person model according to the emotion optimization factors to obtain a first optimized digital person model;
performing lip synchronization on the first optimized digital human model according to the natural person feature identification data to obtain a second optimized digital human model;
acquiring user basic information and extracting user basic data, acquiring user interaction information and extracting user interaction data, including: interaction demand data and user interaction data;
performing personalized adjustment on the second optimized digital human model according to the user basic data and the interaction demand data to obtain a personalized digital human model;
identifying by using a preset artificial intelligence algorithm according to the user interaction data, generating action instruction parameters, and executing corresponding actions by the personalized digital human model according to the action instruction parameters;
the step of obtaining natural person feature information and extracting natural person feature recognition data, and generating a digital person model by combining the application scene feature data and the background environment feature data according to the natural person feature recognition data comprises the following steps:
acquiring natural person feature information and extracting natural person feature identification data, including: facial recognition data, gesture motion recognition data, voice recognition data, and lip recognition data;
generating a digital human model through a preset PaaS platform tool according to the face recognition data, the gesture motion recognition data and the voice recognition data in combination with the application scene feature data and the background environment feature data;
inputting the natural person feature recognition data and the application scene feature data into a preset emotion recognition model for processing to obtain emotion optimization factors, optimizing the digital person model according to the emotion optimization factors to obtain a first optimized digital person model, wherein the method comprises the following steps of:
extracting speech rate data, tone data, intonation data and audio energy data from the voice recognition data;
inputting the face recognition data, in combination with the gesture motion recognition data, the speech rate data, the tone data, the intonation data and the audio energy data, into a preset emotion recognition model for processing to obtain emotion characteristic parameters;
optimizing the digital human model according to the emotion characteristic parameters to obtain a first optimized digital human model;
performing lip synchronization on the first optimized digital human model according to the natural person feature identification data to obtain a second optimized digital human model, including:
respectively carrying out feature recognition according to the voice recognition data and the lip recognition data to obtain audio feature data and lip feature data;
aligning the audio feature data and the lip feature data on a time axis, inputting the audio feature data and the lip feature data into a preset lip optimization generation model for processing, and obtaining lip optimization parameters;
optimizing the first optimized digital human model according to the lip optimization parameters to obtain a second optimized digital human model;
the steps of obtaining user basic information and extracting user basic data, and obtaining user interaction information and extracting user interaction data (including interaction demand data and user interaction data), comprise:
acquiring user basic information and extracting user basic data, including: age data, occupation data, health status data, personality data, language style data, and dialect data;
acquiring user interaction information and extracting user interaction data, including: interaction demand data and user interaction data;
the user interaction data includes: facial expression data, limb motion data, gesture data, and voice data;
the personalized adjustment is performed on the second optimized digital human model according to the user basic data and the interaction demand data, so as to obtain a personalized digital human model, which comprises the following steps:
inputting the age data, the occupation data, the health status data, the personality data, the language style data and the interaction demand data into a preset style adjustment model for processing to obtain style adjustment parameters;
inputting the dialect data into a preset lip optimization model for processing to obtain lip optimization parameters;
and performing personalized adjustment on the second optimized digital human model according to the style adjustment parameters and the lip optimization parameters to obtain a personalized digital human model.
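(Neither the preset style adjustment model nor the encoding of the user basic data is disclosed. The sketch below stands in for both preset models with simple lookup rules; every threshold, vocabulary and field name is a hypothetical illustration, not the patented method.)

    def style_adjustment_parameters(profile: dict) -> dict:
        """Stand-in for the preset style adjustment model (rule table, not learned)."""
        pace = "slow" if profile.get("age", 0) >= 65 else "normal"
        register = "formal" if profile.get("occupation") in {"doctor", "lawyer"} else "casual"
        return {"speaking_pace": pace, "register": register,
                "style": profile.get("language_style", "neutral")}

    def dialect_lip_parameters(dialect: str) -> dict:
        """Stand-in for the preset lip optimization model applied to dialect data."""
        viseme_tables = {"cantonese": "viseme_yue", "mandarin": "viseme_cmn"}
        return {"viseme_table": viseme_tables.get(dialect, "viseme_cmn")}

    def personalize(model: dict, style: dict, lip: dict) -> dict:
        """Merge both parameter sets into the second optimized digital human model."""
        return {**model, **style, **lip}

    persona = personalize({"id": "dh-001"},
                          style_adjustment_parameters({"age": 70, "occupation": "teacher",
                                                       "language_style": "gentle"}),
                          dialect_lip_parameters("cantonese"))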
2. The digital interaction method based on artificial intelligence according to claim 1, wherein the step of identifying by using a preset artificial intelligence algorithm according to the user interaction data to generate action instruction parameters, and the personalized digital human model executing corresponding actions according to the action instruction parameters, comprises the following steps:
identifying by using the preset artificial intelligence algorithm according to the facial expression data, the limb motion data, the gesture data and the voice data to generate action instruction parameters;
and the personalized digital human model executes corresponding actions according to the action instruction parameters.
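(Claim 2 does not name the recognition algorithm. The sketch below treats recognition as already-decoded expression and gesture labels and maps them to action instruction parameters that the model then executes; the label set and instruction names are hypothetical.)

    from typing import Optional

    # Hypothetical mapping from recognized cues to action instruction parameters.
    ACTION_TABLE = {
        ("smile", None): {"action": "smile_back", "intensity": 0.8},
        (None, "wave"): {"action": "wave", "intensity": 1.0},
    }

    def generate_action_instruction(expression: Optional[str],
                                    gesture: Optional[str]) -> dict:
        """Stand-in for the preset artificial intelligence algorithm of claim 2."""
        for key in ((expression, gesture), (expression, None), (None, gesture)):
            if key in ACTION_TABLE:
                return ACTION_TABLE[key]
        return {"action": "idle", "intensity": 0.0}

    def execute_action(model_id: str, instruction: dict) -> None:
        # A real personalized digital human model would render this downstream.
        print(f"{model_id} performs {instruction['action']}"
              f" (intensity {instruction['intensity']})")

    execute_action("dh-001", generate_action_instruction("smile", None))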
3. An artificial intelligence based digital interaction system, comprising a memory and a processor, wherein the memory stores an artificial intelligence based digital interaction method program which, when executed by the processor, implements the following steps:
acquiring application scene information and extracting application scene feature data, acquiring background environment information and extracting background environment feature data;
acquiring natural person feature information and extracting natural person feature recognition data, and generating a digital human model according to the natural person feature recognition data in combination with the application scene feature data and the background environment feature data;
inputting the natural person feature recognition data and the application scene feature data into a preset emotion recognition model for processing to obtain emotion optimization factors, and optimizing the digital human model according to the emotion optimization factors to obtain a first optimized digital human model;
performing lip synchronization on the first optimized digital human model according to the natural person feature recognition data to obtain a second optimized digital human model;
acquiring user basic information and extracting user basic data, acquiring user interaction information and extracting user interaction data, including: interaction demand data and user interaction data;
performing personalized adjustment on the second optimized digital human model according to the user basic data and the interaction demand data to obtain a personalized digital human model;
identifying by using a preset artificial intelligence algorithm according to the user interaction data, generating action instruction parameters, and executing corresponding actions by the personalized digital human model according to the action instruction parameters;
the step of acquiring natural person feature information and extracting natural person feature recognition data, and generating a digital human model according to the natural person feature recognition data in combination with the application scene feature data and the background environment feature data, comprises the following steps:
acquiring natural person feature information and extracting natural person feature recognition data, including: facial recognition data, gesture motion recognition data, voice recognition data, and lip recognition data;
generating a digital human model through a preset PaaS platform tool according to the facial recognition data, the gesture motion recognition data and the voice recognition data, in combination with the application scene feature data and the background environment feature data;
the step of inputting the natural person feature recognition data and the application scene feature data into a preset emotion recognition model for processing to obtain emotion optimization factors, and optimizing the digital human model according to the emotion optimization factors to obtain a first optimized digital human model, comprises the following steps:
extracting speech rate data, tone data, intonation data and audio energy data according to the voice recognition data;
inputting the facial recognition data, in combination with the gesture motion recognition data, the speech rate data, the tone data, the intonation data and the audio energy data, into the preset emotion recognition model for processing to obtain emotion characteristic parameters;
optimizing the digital human model according to the emotion characteristic parameters to obtain a first optimized digital human model;
the step of performing lip synchronization on the first optimized digital human model according to the natural person feature recognition data to obtain a second optimized digital human model comprises the following steps:
performing feature recognition on the voice recognition data and the lip recognition data respectively to obtain audio feature data and lip feature data;
aligning the audio feature data and the lip feature data on a time axis, and inputting the aligned data into a preset lip optimization generation model for processing to obtain lip optimization parameters;
optimizing the first optimized digital human model according to the lip optimization parameters to obtain a second optimized digital human model;
the step of acquiring user basic information and extracting user basic data, and acquiring user interaction information and extracting user interaction data including interaction demand data and user interaction data, comprises the following steps:
acquiring user basic information and extracting user basic data, including: age data, occupation data, health status data, personality data, language style data, and dialect data;
acquiring user interaction information and extracting user interaction data, including: interaction demand data and user interaction data;
the user interaction data includes: facial expression data, limb motion data, gesture data, and voice data;
the step of performing personalized adjustment on the second optimized digital human model according to the user basic data and the interaction demand data to obtain a personalized digital human model comprises the following steps:
inputting the age data, the occupation data, the health status data, the personality data, the language style data and the interaction demand data into a preset style adjustment model for processing to obtain style adjustment parameters;
inputting the dialect data into a preset lip optimization model for processing to obtain lip optimization parameters;
and performing personalized adjustment on the second optimized digital human model according to the style adjustment parameters and the lip optimization parameters to obtain a personalized digital human model.
4. A computer readable storage medium, wherein the computer readable storage medium comprises an artificial intelligence based digital interaction method program which, when executed by a processor, implements the steps of the artificial intelligence based digital interaction method according to any one of claims 1 to 2.
CN202311664192.1A 2023-12-06 2023-12-06 Digital interaction method, system and medium based on artificial intelligence Active CN117348736B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311664192.1A CN117348736B (en) 2023-12-06 2023-12-06 Digital interaction method, system and medium based on artificial intelligence

Publications (2)

Publication Number Publication Date
CN117348736A CN117348736A (en) 2024-01-05
CN117348736B true CN117348736B (en) 2024-03-19

Family

ID=89367261

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311664192.1A Active CN117348736B (en) 2023-12-06 2023-12-06 Digital interaction method, system and medium based on artificial intelligence

Country Status (1)

Country Link
CN (1) CN117348736B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20180126818A (en) * 2017-05-18 2018-11-28 이향진 Product design process method and its system reflecting user requirements
CN113873297A (en) * 2021-10-18 2021-12-31 深圳追一科技有限公司 Method and related device for generating digital character video
CN116311456A (en) * 2023-03-23 2023-06-23 应急管理部大数据中心 Personalized virtual human expression generating method based on multi-mode interaction information
CN116524924A (en) * 2023-04-23 2023-08-01 厦门黑镜科技有限公司 Digital human interaction control method, device, electronic equipment and storage medium
CN117058286A (en) * 2023-10-13 2023-11-14 北京蔚领时代科技有限公司 Method and device for generating video by using word driving digital person
CN117132711A (en) * 2023-08-29 2023-11-28 重庆长安汽车股份有限公司 Digital portrait customizing method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN117348736A (en) 2024-01-05

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant