CN109064532B - Automatic mouth shape generating method and device for cartoon character - Google Patents


Info

Publication number
CN109064532B
CN109064532B (granted publication of application CN201810597021.4A)
Authority
CN
China
Prior art keywords
mouth shape
audio
audio data
data
data unit
Prior art date
Legal status
Active
Application number
CN201810597021.4A
Other languages
Chinese (zh)
Other versions
CN109064532A (en)
Inventor
刘东东
沈晨
Current Assignee
Shenzhen Kapu Animation Design Co ltd
Original Assignee
Shenzhen Kapu Animation Design Co ltd
Priority date
Filing date
Publication date
Application filed by Shenzhen Kapu Animation Design Co ltd filed Critical Shenzhen Kapu Animation Design Co ltd
Priority to CN201810597021.4A priority Critical patent/CN109064532B/en
Publication of CN109064532A publication Critical patent/CN109064532A/en
Application granted granted Critical
Publication of CN109064532B publication Critical patent/CN109064532B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T13/00 Animation
    • G06T13/20 3D [Three Dimensional] animation
    • G06T13/205 3D [Three Dimensional] animation driven by audio data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T13/00 Animation
    • G06T13/20 3D [Three Dimensional] animation
    • G06T13/40 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Image Processing (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The application discloses a method and device for automatically generating mouth shapes for an animated character. The method comprises the following steps: generating a corresponding audio model data set for each preset basic mouth shape; acquiring audio data of a dubbing file; acquiring at least one audio data unit, and the time information of each unit, from the audio data; comparing each audio data unit with the audio model data in the audio model data sets to determine the basic mouth shape corresponding to each audio data unit; and generating mouth shape data corresponding to the dubbing file from the time information and the basic mouth shape of each audio data unit. The method and device solve the prior-art problems of heavy labor consumption and low production efficiency when adding dubbing-driven mouth shape effects to an animated character.

Description

Automatic mouth shape generating method and device for cartoon character
Technical Field
The present invention relates to the field of animation, and in particular, to a method and apparatus for automatically generating an animation character mouth shape.
Background
In animation production, adding dubbing-driven mouth shape effects to an animated character is an important task. In existing workflows, and especially in the mass production of animated series, this work typically requires many experienced animators working for a long time: the animators manually identify the mouth shape information in the character's dubbing, and then produce mouth shape effects matching that dubbing for the character. This manual mouth shape matching method consumes a great deal of labor, and its production efficiency is quite low.
Aiming at the problems of heavy labor consumption and low production efficiency when adding mouth shape effects to animated characters through dubbing in the related art, the inventors propose the following solution.
Summary of the application
The main purpose of the application is to provide an automatic mouth shape generating method for an animation character, so as to solve the problems of great labor consumption and low manufacturing efficiency in the process of adding mouth shape effects for the animation character through dubbing in the related technology.
In order to achieve the above object, according to one aspect of the present application, there is provided an automatic mouth shape generating method of an animated character.
The automatic mouth shape generating method for the animation roles comprises the following steps: generating a corresponding audio model data set for each preset basic mouth shape; acquiring audio data of a dubbing file; acquiring at least one audio data unit and time information of the audio data unit from the audio data; comparing each audio data unit with audio model data in the audio model data set to determine a basic mouth shape corresponding to each audio data unit; and generating mouth shape data corresponding to the dubbing file according to the time information of each audio data unit and the basic mouth shape corresponding to each audio data unit.
Further, the method for automatically generating the mouth shape of the animation role further comprises the following steps: generating a character mouth shape effect corresponding to each basic mouth shape for the animation character; and generating a mouth shape corresponding to the dubbing file for the animation role according to the mouth shape data and the role mouth shape effect.
Further, said comparing each of said audio data units with the audio model data in said audio model data set comprises: acquiring an audio data unit for which no matching audio model data can be found in the audio model data set; determining the basic mouth shape corresponding to that audio data unit; and adding the audio data unit to the audio model data set corresponding to that basic mouth shape.
Further, the method for automatically generating the mouth shape of the animation role further comprises the following steps: generating a characteristic mouth shape effect of each basic mouth shape under each preset characteristic for the animation character; receiving the characteristics selected by a user; and generating a mouth shape corresponding to the dubbing file for the animation role according to the mouth shape effect of the characteristics corresponding to the characteristics selected by the user and the mouth shape data.
Further, the obtaining at least one audio data unit from the audio data and the time information of the audio data unit includes: splitting the audio data into at least one audio data unit according to single word pronunciation; and acquiring time information of each audio data unit in the audio data.
Further, after comparing each audio data unit with the audio model data in the audio model data set to determine the basic mouth shape corresponding to each audio data unit, the method comprises: acquiring the waveform data of the audio data unit and of the audio model data matched to it; and calculating, through waveform comparison, mouth shape size data for the basic mouth shape corresponding to the audio data unit.
Further, the method for automatically generating the mouth shape of the animation role further comprises the following steps: converting the time of the audio data into a frame number, the mouth shape data further comprising: and the frame number information corresponding to each basic mouth shape in the dubbing file.
In order to achieve the above object, according to another aspect of the present application, there is provided an automatic mouth shape generating device for an animated character.
The automatic mouth shape generating device for the animation roles comprises: the audio set generation module is used for generating a corresponding audio model data set for each preset basic mouth shape; the audio data acquisition module is used for acquiring the audio data of the dubbing file; an audio data processing module, configured to obtain at least one audio data unit and time information of the audio data unit from the audio data; the data comparison module is used for comparing each audio data unit with the audio model data in the audio model data set to determine a basic mouth shape corresponding to each audio data unit; and the mouth shape data generation module is used for generating mouth shape data corresponding to the dubbing file according to the time information of each audio data unit and the basic mouth shape corresponding to each audio data unit.
Further, the automatic mouth shape generating device for the animation roles further comprises: the mouth shape effect generating module is used for generating a character mouth shape effect corresponding to each basic mouth shape for the animation character; and the character mouth shape generating module is used for generating mouth shapes corresponding to the dubbing files for the animation characters according to the mouth shape data and the character mouth shape effect.
Further, the mouth shape effect generating module is further configured to generate, for the animated character, a characteristic mouth shape effect of each basic mouth shape under each preset characteristic, and the automatic mouth shape generating device for the animated character further includes: the character selecting module is used for receiving the characteristics selected by the user, and the character mouth shape generating module is also used for generating mouth shapes corresponding to the dubbing files for the animation characters according to the characteristic mouth shape effect corresponding to the characteristics selected by the user and the mouth shape data.
In the embodiments of the application, the mouth shapes of Chinese pronunciation are divided into four types through statistical analysis, and a corresponding audio model database is built for each of the four basic mouth shapes. The audio data of the dubbing file is compared against the audio model data in each database, so that the mouth shape data corresponding to the audio file can be identified rapidly. On the one hand, summarizing and simplifying the mouth shapes of Chinese pronunciation reduces the difficulty of mouth shape matching; on the other hand, building the dubbing-to-mouth-shape comparison model allows the mouth shape data corresponding to a dubbing file to be identified rapidly. This reduces the workload of animation production personnel and accelerates the generation of mouth shapes matching the dubbing file for the animated character, thereby solving the related-art problems of heavy labor consumption and low production efficiency when adding mouth shape effects to animated characters through dubbing.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, are included to provide a further understanding of the application and of its other features, objects and advantages. The drawings of the illustrative embodiments of the present application and their descriptions serve to explain the present application and are not to be construed as unduly limiting it. In the drawings:
FIG. 1 is a first flow diagram of the automatic mouth shape generation method for an animated character according to an embodiment;
FIG. 2 is a second flow diagram of the automatic mouth shape generation method for an animated character according to an embodiment;
FIG. 3 is a third flow diagram of the automatic mouth shape generation method for an animated character according to an embodiment;
FIG. 4 is a fourth flow diagram of the automatic mouth shape generation method for an animated character according to an embodiment;
FIG. 5 is a first structural diagram of the automatic mouth shape generating device for an animated character according to an embodiment;
FIG. 6 is a second structural diagram of the automatic mouth shape generating device for an animated character according to an embodiment; and
FIG. 7 is a third structural diagram of the automatic mouth shape generating device for an animated character according to an embodiment.
Detailed Description
In order that the solution of the present application may be better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be described below clearly and completely with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present application. All other embodiments obtained by one of ordinary skill in the art based on the embodiments herein without inventive effort shall fall within the scope of protection of the present application.
It should be noted that the terms "first," "second," and the like in the description and claims of the present application and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate in order to describe the embodiments of the present application described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be noted that, in the case of no conflict, the embodiments and features in the embodiments may be combined with each other. The present application will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
As shown in fig. 1, the automatic mouth shape generating method for the animated character includes steps S101 to S106.
Step S101, generating a corresponding audio model data set for each preset basic mouth shape. In this step, the basic mouth shapes are first determined by statistically classifying the mouth shapes of Chinese pronunciation. In an alternative embodiment of the present application, the mouth shapes of Chinese pronunciation are divided into four types through statistical analysis, namely the mouth shapes corresponding to the pronunciations a, o, e and en. A corresponding audio model database is then established for each of the four basic mouth shapes. In Chinese, two characters may be pronounced differently yet still share the same mouth shape, so each basic mouth shape corresponds to the pronunciations of many characters. In this step, all audio files of the pronunciations corresponding to each basic mouth shape are first acquired; the audio files are then converted into digital audio data and stored in the audio model database of that basic mouth shape, where they are subsequently used to judge the basic mouth shape corresponding to each pronunciation in the dubbing audio.
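The patent does not specify a storage format for the audio model data sets; the following is a minimal sketch, assuming each model set is simply a list of peak-normalized sample arrays keyed by one of the four basic mouth shapes (a, o, e, en):

```python
# Hypothetical sketch of step S101: build an audio model database keyed by
# basic mouth shape. The four-shape classification (a, o, e, en) follows the
# patent; the storage format (lists of normalized float samples) is assumed.

BASIC_MOUTH_SHAPES = ("a", "o", "e", "en")

def normalize(samples):
    """Scale a sequence of raw samples so the peak amplitude is 1.0."""
    peak = max(abs(s) for s in samples) or 1.0
    return [s / peak for s in samples]

def build_model_database(labeled_clips):
    """labeled_clips: iterable of (mouth_shape, raw_samples) pairs."""
    db = {shape: [] for shape in BASIC_MOUTH_SHAPES}
    for shape, samples in labeled_clips:
        if shape not in db:
            raise ValueError(f"unknown basic mouth shape: {shape}")
        db[shape].append(normalize(samples))
    return db
```

Every model clip is normalized on entry so that later similarity comparisons are not dominated by recording volume.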
Step S102, obtaining the audio data of the dubbing file. In this step, an audio file of the animated character's dubbing is first obtained, and the audio file is then converted into digital data. To eliminate the influence of noise, the digital data of the dubbing audio file must be denoised before the mouth shape information corresponding to the dubbing is identified.
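The patent only states that noise is removed before recognition, without naming a method; a simple amplitude gate, with an assumed noise-floor threshold, is one way to sketch it:

```python
def remove_noise(samples, threshold=0.05):
    """Zero out samples whose absolute amplitude is below the noise floor.
    The amplitude-gate approach and the 0.05 default threshold are
    assumptions; the patent does not specify a denoising technique."""
    return [s if abs(s) > threshold else 0.0 for s in samples]
```

Gating quiet samples to exactly zero also makes the silence-based unit splitting of the next step straightforward.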
Step S103, obtaining at least one audio data unit, and the time information of each unit, from the audio data. In this step, in order to accurately identify the mouth shape corresponding to each pronunciation in the dubbing and the time at which each mouth shape occurs, the dubbing's digital audio data must be processed in segments. Dividing the audio data into audio data units according to the pronunciation length of each word effectively reduces the processing load of the subsequent comparison and recognition and improves its accuracy; in addition, by acquiring the time interval of each audio data unit within the audio data, the time information of each mouth shape in the dubbing is determined accurately.
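One plausible reading of the per-word split, sketched here under the assumption that syllables are separated by silent (zeroed) gaps left by the noise gate:

```python
def split_into_units(samples, rate, silence=0.0):
    """Split audio into per-syllable units separated by silent gaps.
    Returns (unit_samples, start_sec, end_sec) tuples. Silence-based
    segmentation is an assumed stand-in for the patent's split by
    single-word pronunciation; `rate` is the sample rate in Hz."""
    units, start = [], None
    for i, s in enumerate(samples):
        if abs(s) > silence:
            if start is None:
                start = i          # first voiced sample of a new unit
        elif start is not None:
            units.append((samples[start:i], start / rate, i / rate))
            start = None
    if start is not None:          # audio ends mid-unit
        units.append((samples[start:], start / rate, len(samples) / rate))
    return units
```

Each returned tuple carries the unit's samples together with its start and end time, which is exactly the time information the later steps consume.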
Step S104, comparing each audio data unit with the audio model data in the audio model data sets to determine the basic mouth shape corresponding to each audio data unit. In this step, each audio data unit of the dubbed audio is compared against the audio model data in each basic mouth shape's model library, and the basic mouth shape corresponding to the unit is identified. In the application, this is done through similarity calculation: if the similarity between an audio data unit and some audio model data in a model library exceeds a preset threshold, the unit is considered to match that model data, and the corresponding basic mouth shape is thereby determined. In an alternative embodiment of the present application, the similarity threshold is 70 percent, and the user can freely adjust the threshold as required.
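The patent does not define its similarity measure; as an illustration only, the sketch below uses mean absolute sample difference mapped into [0, 1], with the patent's adjustable 70% threshold as the default:

```python
def similarity(unit, model):
    """Assumed similarity measure: 1 minus the mean absolute difference
    over the overlapping length, clamped to [0, 1]. The patent does not
    specify how similarity is computed."""
    n = min(len(unit), len(model))
    if n == 0:
        return 0.0
    diff = sum(abs(unit[i] - model[i]) for i in range(n)) / n
    return max(0.0, 1.0 - diff)

def classify_unit(unit, db, threshold=0.70):
    """Return the basic mouth shape of the best-matching model whose
    similarity meets the (user-adjustable) threshold, else None."""
    best_shape, best_score = None, threshold
    for shape, models in db.items():
        for model in models:
            score = similarity(unit, model)
            if score >= best_score:
                best_shape, best_score = shape, score
    return best_shape
```

A `None` result signals the unmatched case handled later in steps S301 to S303.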
In the application, to accelerate comparison and recognition and reduce the computer's processing load, both the dubbing file and the audio models in each basic mouth shape's model library are converted into digital form and stored in the system; since comparing digital data is simple and efficient, recognition is accelerated.
Step S105, generating mouth shape data corresponding to the dubbing file according to the time information of each audio data unit and the basic mouth shape corresponding to each audio data unit. In this step, the basic mouth shape corresponding to each word's pronunciation in the dubbing, together with the time at which it occurs, has been determined through comparison and recognition, so the mouth shape data corresponding to the dubbing is identified accurately; an animation producer can then rapidly produce mouth shapes for the animated character from this data. In the method, identifying the mouth shape data by means of the mouth-shape audio comparison model effectively reduces manual labor and improves the efficiency of mouth shape production.
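Assembling the mouth shape data can be sketched as pairing each unit's time interval with its recognized shape; the classifier is passed in as a function so the sketch stays self-contained (its record layout is an assumption, not the patent's):

```python
def build_mouth_shape_data(units, classify):
    """Produce (start_sec, end_sec, basic_mouth_shape) records for a
    dubbing file. `units` is an iterable of (samples, start_sec, end_sec);
    `classify` maps a unit's samples to a shape name or None. Units that
    match no model are skipped here (they are handled by steps S301-S303)."""
    timeline = []
    for samples, start, end in units:
        shape = classify(samples)
        if shape is not None:
            timeline.append((start, end, shape))
    return timeline
```

The resulting timeline is the "mouth shape data" consumed by steps S106/S202 and by the frame conversion described later.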
Step S106, generating the mouth shapes corresponding to the dubbing file for the animated character according to the mouth shape data. In an alternative embodiment of the present application, the system automatically generates the mouth shapes corresponding to the dubbing for the animated character from the mouth shape data. In other embodiments of the present application, an animator may manually generate the mouth shapes corresponding to the dubbing from the mouth shape data.
In the method, the mouth shape data corresponding to the dubbing is first identified rapidly by establishing the mouth-shape audio comparison model; an animation producer can then quickly generate the mouth shapes corresponding to the dubbing file from that data. As shown in fig. 2, the method of rapidly generating mouth shapes for an animated character from the mouth shape data includes steps S201 to S202.
Step S201, generating a character mouth shape effect corresponding to each basic mouth shape for the animated character. Before the mouth shape data of a dubbing can be matched to a particular animated character, the character's mouth shape effects for the four basic mouth shapes must be produced. In an alternative embodiment of the application, depending on the type of animation, a character mouth shape effect may be two-dimensional or three-dimensional, and it may be presented as a texture map or as an animated image and stored in the system, making it convenient to retrieve and use.
Step S202, generating the mouth shapes corresponding to the dubbing file for the animated character according to the mouth shape data and the character mouth shape effects. In this step, the basic mouth shape corresponding to each word's pronunciation in the dubbing and its time interval, obtained in step S105, are combined with the character mouth shape effects for the four basic mouth shapes generated in step S201, so that the mouth shapes corresponding to the dubbing file can be generated quickly for the animated character. In the embodiments of the application, the character's mouth shapes can be generated automatically by the production software from the mouth shape data and the character mouth shape effect of each basic mouth shape. Since a character mouth shape effect may be two-dimensional or three-dimensional, the mouth shape data may be output directly to either the two-dimensional or the three-dimensional effect of the animated character.
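The combination in step S202 amounts to a lookup from recognized shape to prepared asset; a minimal sketch, in which the asset names are hypothetical placeholders for whatever texture maps or animated images the studio stores:

```python
def render_timeline(timeline, effects):
    """Replace each basic mouth shape in (start, end, shape) records with
    the character's prepared mouth-shape asset. `effects` maps shape name
    to an asset reference (the file names used below are hypothetical)."""
    return [(start, end, effects[shape]) for start, end, shape in timeline]
```

A `KeyError` here would indicate a shape for which no character effect was produced in step S201.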
In step S104, when each audio data unit is compared with the audio model data in the audio model database, it may happen that no matching data can be found for a unit because the audio model database is incomplete. To improve recognition accuracy, the application adopts the idea of machine learning: the audio model databases are continuously supplemented with the audio data that could not be identified.
As shown in fig. 3, the method for continuously supplementing the audio model data in the audio model database with unidentified audio data includes steps S301 to S303.
Step S301, acquiring an audio data unit for which no matching audio model data can be found in the audio model data set. Whether a unit is unmatched is likewise determined by similarity matching: if no audio model data with a similarity to the audio data unit above 70 percent can be found in the audio model database, the unit is regarded as unmatched.
Step S302, determining the basic mouth shape corresponding to the audio data unit. In this step, the basic mouth shape corresponding to the unmatched audio data unit is determined; in an alternative embodiment of the present application, it may be determined manually by an animator based on experience.
Step S303, adding the audio data unit to the audio model data set corresponding to the basic mouth shape. In this step, after determining the basic mouth shape corresponding to the audio data unit, the animation producer may add the audio data unit to the corresponding audio model database, so as to continuously supplement the audio model database, thereby being beneficial to improving the recognition accuracy.
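Steps S301 to S303 reduce to appending the manually labeled unit to the right model set; a sketch, assuming the same dict-of-lists database shape used above:

```python
def supplement_database(db, unit_samples, labeled_shape):
    """Add an audio unit that matched nothing to the model set of the
    basic mouth shape an animator assigned to it, so future comparisons
    can recognize similar pronunciations. The database layout (dict of
    shape name to list of sample arrays) is an assumption."""
    db.setdefault(labeled_shape, []).append(list(unit_samples))
    return db
```

Because the database grows with real dubbing data, recognition coverage improves over time, which is the "machine learning idea" the patent describes.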
In the application, considering that special mouth shapes often need to be produced in animation, different special-effect mouth shapes are also built on the basis of the basic mouth shapes, so that mouth shapes with specific special effects can be produced quickly for the animated character.
As shown in fig. 4, the method for adding a special effect mouth shape when generating a mouth shape corresponding to dubbing for an animated character includes steps S401 to S403.
Step S401, generating for the animated character a characteristic mouth shape effect for each basic mouth shape under each preset characteristic. In this step, the different characteristics the animated character should express are determined first. A characteristic may be an emotion, such as the character's mouth shape effect when happy, sad or angry, or an exaggerated special effect, such as an enlarged mouth shape, a reduced mouth shape, or a mouth shape with an added artistic effect. Depending on the type of animation, these mouth shape effects may be two-dimensional or three-dimensional. After the mouth shape characteristics required by the animated character are determined, the mouth shape effect of each basic mouth shape under each characteristic is produced, and all effects are stored by category so they can be retrieved quickly in use. In embodiments of the present application, these character mouth shape effects may be presented as texture maps or as animated images.
Step S402, receiving the characteristic selected by the user. In this step, when adding the mouth shapes corresponding to a dubbing to the animated character, the animator determines whether a characteristic mouth shape is required for the dubbing and, if so, which one. For example, if the dubbing's mouth shapes should carry a happy effect, the animator selects the happy characteristic before producing the mouth shapes, and the mouth shape effects corresponding to the character's happy characteristic are retrieved.
Step S403, generating the mouth shapes corresponding to the dubbing file for the animated character according to the mouth shape data and the characteristic mouth shape effects corresponding to the characteristic selected by the user. In this step, the dubbing's mouth shape data acquired in step S105 is combined with the characteristic mouth shape effects generated in step S401, so that characteristic mouth shapes matching the dubbing file can be generated quickly for the animated character. In the embodiments of the application, the character's mouth shapes can be generated automatically by the production software from the mouth shape data and the selected characteristic's mouth shape effects; since these effects may be two-dimensional or three-dimensional, the mouth shape data may be output directly to either the two-dimensional or the three-dimensional effect of the animated character. Through steps S401 to S403, the application adds special effects such as emotion when generating mouth shapes for an animated character, further improving the efficiency of mouth shape production.
According to the method, mouth shape size data is added when producing the mouth shapes for the animated character, making the character's mouth shapes more vivid; the mouth shape size data is calculated from the audio waveform data during the audio model comparison of step S104.
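The patent says only that mouth size is computed "through waveform comparison"; as one assumed interpretation, the sketch below scales mouth opening by the ratio of the unit's peak amplitude to the matched model's peak:

```python
def mouth_size(unit, model):
    """Assumed mouth-opening scale: ratio of the unit's peak amplitude to
    the matched model's peak amplitude. A louder-than-reference syllable
    yields a value above 1.0. The peak-ratio rule is an illustration, not
    the patent's stated formula."""
    model_peak = max(abs(s) for s in model) or 1.0
    return max(abs(s) for s in unit) / model_peak
```

The returned scale could then be applied to the character mouth shape effect when rendering each unit's interval.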
In the present application, dubbing time is typically measured in seconds, whereas animation is produced in frames, so the frame numbers of each basic mouth shape are needed when producing mouth shapes for the animated character. Specifically, when the audio data of the dubbing file is acquired in step S102, its time data is acquired as well, and the time unit is converted from seconds to frames at 24 or 25 frames per second. Further, the mouth shape data generated in step S105 is aligned against the frame-based audio data, converting the time of each basic mouth shape into frame units. Because different artistic styles call for different frame rates, and the frame is the most basic time unit in animation, the audio file's time units are converted from seconds to 24 or 25 frames per second. Obtaining the frame number corresponding to each basic mouth shape in the mouth shape data further improves mouth shape production efficiency and reduces manual labor.
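The seconds-to-frames conversion can be sketched directly; the record layout matches the timeline sketched earlier, and rounding to the nearest frame is an assumption:

```python
def seconds_to_frames(timeline, fps=24):
    """Convert (start_sec, end_sec, shape) records into frame numbers at
    the production frame rate (24 or 25 fps per the patent). Rounding to
    the nearest whole frame is an assumed convention."""
    return [(round(start * fps), round(end * fps), shape)
            for start, end, shape in timeline]
```

Choosing `fps=24` or `fps=25` corresponds to the two frame rates the patent names for different artistic styles.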
From the above description, it can be seen that the following technical effects are achieved:
1. The mouth shapes of Chinese pronunciation are divided into four types through statistical analysis, and a corresponding audio model database is established for each of the four basic mouth shapes; generalizing, summarizing and simplifying the mouth shapes of Chinese pronunciation reduces the difficulty of mouth shape matching.
2. The mouth shape data corresponding to the dubbing file is rapidly identified by establishing the dubbing mouth shape comparison model, so that the workload of animation staff is reduced, and the generation of the mouth shape corresponding to the dubbing file for the animation role by the animation staff is accelerated.
3. When the dubbing mouth shape is compared, firstly, the audio data is divided into a plurality of audio data units according to the pronunciation length of the word, so that the processing amount of subsequent comparison and identification is effectively reduced, the accuracy of the subsequent comparison and identification is also improved, and in addition, the time information of each mouth shape in dubbing is accurately determined by acquiring the corresponding time interval of each audio data unit in the audio data.
4. When the dubbing mouth shapes are compared, the audio data of the dubbing file and the audio models in the audio model library of each basic mouth shape are stored in the system in a digital mode, and the comparison and identification efficiency is quickened due to the fact that the digital data are simple and efficient in comparison, and the processing capacity of a computer is reduced.
5. The method adopts the idea of machine learning, and the unrecognized audio data is used for continuously supplementing and updating the audio model database of the basic mouth shape, so that the recognition accuracy is improved.
6. The character emotion effect is added into the mouth shape model, and in consideration of the fact that different mouth shape special effects are often required to be manufactured when the animation mouth shape is manufactured, the mouth shape with specific special effects is quickly manufactured for the animation character, and different emotion special effect mouth shapes are further built on the basis of the basic mouth shape.
7. The mouth shape of the person is more lifelike by adding the mouth shape size data, and the mouth shape size data is calculated according to the waveform data of the audio frequency when the mouth shapes are compared.
8. The time units of the audio file are converted from units of seconds to 24 frames per second or 25 frames per second. In the method, the frame number corresponding to each basic mouth shape in the mouth shape data is obtained, so that the mouth shape manufacturing efficiency is further improved, and the manual labor is reduced.
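Effects 3, 4 and 7 above can be illustrated with a minimal sketch. Everything in it — the silence-based word splitting, the mean-squared-error similarity, and the peak-amplitude mouth size — is an assumption for illustration only, since the patent does not fix the splitting rule, the similarity metric, or the size formula:

```python
import numpy as np

# The four basic mouth shapes of Chinese pronunciation named in the patent.
BASIC_MOUTH_SHAPES = ["a", "o", "e", "en"]

def split_by_silence(samples, rate, threshold=0.02, min_gap=0.08):
    """Split digitized audio into word-length units at silent gaps,
    returning (unit_samples, (start_s, end_s)) pairs."""
    units, start, quiet = [], None, 0
    gap = int(min_gap * rate)  # minimum silence, in samples, between words
    for i, loud in enumerate(np.abs(samples) > threshold):
        if loud:
            start = i if start is None else start
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= gap:  # a long enough silence ends the current word
                end = i - quiet + 1
                units.append((samples[start:end], (start / rate, end / rate)))
                start, quiet = None, 0
    if start is not None:  # audio ended mid-word
        units.append((samples[start:], (start / rate, len(samples) / rate)))
    return units

def match_basic_shape(unit, model_db):
    """Pick the basic mouth shape whose stored audio model is most similar
    (least mean-squared error over the overlapping prefix)."""
    def mse(a, b):
        n = min(len(a), len(b))
        return float(np.mean((a[:n] - b[:n]) ** 2))
    return min(model_db, key=lambda shape: min(mse(unit, m) for m in model_db[shape]))

def mouth_size(unit):
    """Mouth opening size (0..1) from the peak waveform amplitude."""
    return float(np.clip(np.max(np.abs(unit)), 0.0, 1.0))
```

Because every comparison here operates on plain numeric arrays, the digital storage described in effect 4 is what makes the unit-by-unit matching cheap.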
It should be noted that the steps illustrated in the flowcharts of the figures may be performed in a computer system, such as a set of computer-executable instructions, and that although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from the one shown here.
According to an embodiment of the present application, there is also provided an automatic mouth shape generating device for animated characters that implements the above automatic mouth shape generating method. As shown in fig. 5, the device includes: an audio set generating module 1, an audio data acquiring module 2, an audio data processing module 3, a data comparing module 4 and a mouth shape data generating module 5, wherein:
the audio set generating module 1 is used for generating a corresponding audio model data set for each preset basic mouth shape;
an audio data obtaining module 2, configured to obtain audio data of a dubbing file;
an audio data processing module 3, configured to obtain at least one audio data unit and time information of the audio data unit from the audio data;
the data comparison module 4 is used for comparing each audio data unit with the audio model data in the audio model data set to determine a basic mouth shape corresponding to each audio data unit;
and the mouth shape data generating module 5 is used for generating mouth shape data corresponding to the dubbing file according to the time information of each audio data unit and the basic mouth shape corresponding to each audio data unit.
As shown in fig. 6, the animated character automatic mouth shape generating device further includes: a mouth shape effect generation module 6 and a character mouth shape generation module 7, wherein:
a mouth shape effect generating module 6, for generating, for the animated character, a character mouth shape effect corresponding to each basic mouth shape;
and a character mouth shape generating module 7 for generating a mouth shape corresponding to the dubbing file for the animated character according to the mouth shape data and the character mouth shape effect.
As shown in fig. 7, the animated character automatic mouth shape generating device further includes: a feature selection module 8, wherein:
the mouth shape effect generating module 6 is further used for generating, for the animated character, a feature mouth shape effect of each basic mouth shape under each preset feature;
a feature selection module 8, for receiving the feature selected by the user;
and the character mouth shape generating module 7 is further used for generating, for the animated character, the mouth shape corresponding to the dubbing file according to the feature mouth shape effect corresponding to the feature selected by the user and the mouth shape data.
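The module pipeline of figs. 5-7 might be wired together as in the following sketch; all class, field and function names here are illustrative assumptions, not names from the patent:

```python
from dataclasses import dataclass, field

@dataclass
class MouthShapeEntry:
    shape: str      # a basic mouth shape, e.g. "a", "o", "e" or "en"
    start_s: float  # start time within the dubbing file
    end_s: float    # end time within the dubbing file

@dataclass
class MouthShapeGenerator:
    """Stand-in for modules 1-5 of fig. 5: it holds the audio model data
    sets (module 1) and turns matched audio units into mouth shape data
    (module 5); audio acquisition, splitting and comparison (modules 2-4)
    are assumed to have produced the (shape, interval) pairs."""
    model_db: dict = field(default_factory=dict)

    def generate(self, matched_units):
        # One mouth shape entry per audio data unit, carrying its timing.
        return [MouthShapeEntry(shape, t0, t1) for shape, (t0, t1) in matched_units]

def apply_feature(entries, feature, effect_table):
    """Stand-in for modules 6-8 of figs. 6-7: map each basic mouth shape,
    under the user-selected feature (e.g. an emotion), to the character's
    feature mouth shape effect."""
    return [effect_table[(e.shape, feature)] for e in entries]
```

For instance, `apply_feature(entries, "happy", table)` would look up the "happy" variant of each basic mouth shape in `table`, mirroring how the character mouth shape generating module 7 combines the user-selected feature with the mouth shape data.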
It will be apparent to those skilled in the art that the modules or steps of the present application described above may be implemented with a general-purpose computing device; they may be centralized on a single computing device or distributed across a network of computing devices. Alternatively, they may be implemented as program code executable by computing devices, so that they can be stored in a storage device and executed by the computing devices, or they may be fabricated separately as individual integrated circuit modules, or multiple modules or steps among them may be fabricated as a single integrated circuit module. Thus, the present application is not limited to any specific combination of hardware and software.
The foregoing description covers only the preferred embodiments of the present application and is not intended to limit it; various modifications and variations may be made by those skilled in the art. Any modification, equivalent replacement or improvement made within the spirit and principles of the present application shall fall within its protection scope.

Claims (5)

1. An automatic mouth shape generating method for an animated character, characterized by comprising the following steps:
generating a corresponding audio model data set for each preset basic mouth shape, wherein each basic mouth shape corresponds to a plurality of word pronunciations; all audio files of the pronunciations corresponding to each basic mouth shape are converted into digital audio data; and the mouth shapes of Chinese pronunciation are classified through statistical analysis into four basic mouth shapes, namely the mouth shapes corresponding to the pronunciations a, o, e and en respectively;
acquiring audio data of a dubbing file, wherein the audio file is converted into digital data;
acquiring at least one audio data unit and time information of the audio data unit from the audio data; comprising the following steps: splitting the audio data into at least one audio data unit according to single word pronunciation; acquiring time information of each audio data unit in the audio data;
comparing each audio data unit with the audio model data in the audio model data set to calculate the similarity of the audio data units and the audio model data in the audio model library, and determining the basic mouth shape corresponding to each audio data unit;
generating mouth shape data corresponding to the dubbing file according to the time information of each audio data unit and the basic mouth shape corresponding to each audio data unit;
the method for automatically generating the mouth shape of the animation role further comprises the following steps: generating a characteristic mouth shape effect of each basic mouth shape under each preset characteristic for the animation character; receiving the characteristics selected by a user; generating a mouth shape corresponding to the dubbing file for the animation role according to the characteristic mouth shape effect corresponding to the characteristic selected by the user and the mouth shape data;
after comparing each audio data unit with the audio model data in the audio model data set and determining the basic mouth shape corresponding to each audio data unit, the method further comprises: acquiring waveform data of the audio data unit and of the audio model data corresponding to the audio data unit; and calculating, through waveform comparison, the mouth shape size data of the basic mouth shape corresponding to the audio data unit.
2. The method for automatically generating an animated character mouth shape as in claim 1, further comprising:
generating a character mouth shape effect corresponding to each basic mouth shape for the animation character;
and generating a mouth shape corresponding to the dubbing file for the animation role according to the mouth shape data and the role mouth shape effect.
3. The method for automatically generating an animated character mouth shape according to claim 1, wherein said comparing each of said audio data units with the audio model data in said audio model data set comprises:
acquiring an audio data unit in which matched audio model data cannot be found in the audio model data set;
determining a basic mouth shape corresponding to the audio data unit;
and adding the audio data unit into an audio model data set corresponding to the basic mouth shape.
4. The method for automatically generating an animated character mouth shape as in claim 1, further comprising:
converting the time of the audio data into a frame number,
the mouth shape data further includes: frame number information corresponding to each basic mouth shape in the dubbing file.
5. An automatic mouth shape generating device for an animated character, comprising:
the audio set generation module, used for generating a corresponding audio model data set for each preset basic mouth shape, wherein each basic mouth shape corresponds to a plurality of word pronunciations; all audio files of the pronunciations corresponding to each basic mouth shape are converted into digital audio data; and the mouth shapes of Chinese pronunciation are classified through statistical analysis into four basic mouth shapes, namely the mouth shapes corresponding to the pronunciations a, o, e and en respectively;
the audio data acquisition module is used for acquiring the audio data of the dubbing file; wherein the audio file is converted into digital data;
an audio data processing module, configured to obtain at least one audio data unit and time information of the audio data unit from the audio data; comprising the following steps: splitting the audio data into at least one audio data unit according to single word pronunciation; acquiring time information of each audio data unit in the audio data;
the data comparison module is used for comparing each audio data unit with the audio model data in the audio model data set so as to calculate the similarity of the audio data units and the audio model data in the audio model library and determine the basic mouth shape corresponding to each audio data unit;
the mouth shape data generating module is used for generating mouth shape data corresponding to the dubbing file according to the time information of each audio data unit and the basic mouth shape corresponding to each audio data unit;
the automatic mouth shape generating device of the animation role further comprises: the mouth shape effect generating module is used for generating a character mouth shape effect corresponding to each basic mouth shape for the animation character; a character mouth shape generating module for generating mouth shapes corresponding to the dubbing files for the animation characters according to the mouth shape data and the character mouth shape effect;
the mouth shape effect generating module is further used for generating a characteristic mouth shape effect of each basic mouth shape under each preset characteristic for the animation role, and the automatic mouth shape generating device for the animation role further comprises: the feature selection module is used for receiving the features selected by the user, and the character mouth shape generation module is also used for generating mouth shapes corresponding to the dubbing files for the animation characters according to the feature mouth shape effects corresponding to the features selected by the user and the mouth shape data;
wherein, after comparing each audio data unit with the audio model data in the audio model data set and determining the basic mouth shape corresponding to each audio data unit, the device further: acquires waveform data of the audio data unit and of the audio model data corresponding to the audio data unit; and calculates, through waveform comparison, the mouth shape size data of the basic mouth shape corresponding to the audio data unit.
CN201810597021.4A 2018-06-11 2018-06-11 Automatic mouth shape generating method and device for cartoon character Active CN109064532B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810597021.4A CN109064532B (en) 2018-06-11 2018-06-11 Automatic mouth shape generating method and device for cartoon character


Publications (2)

Publication Number Publication Date
CN109064532A CN109064532A (en) 2018-12-21
CN109064532B (en) 2024-01-12

Family

ID=64820171

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810597021.4A Active CN109064532B (en) 2018-06-11 2018-06-11 Automatic mouth shape generating method and device for cartoon character

Country Status (1)

Country Link
CN (1) CN109064532B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110136698B (en) * 2019-04-11 2021-09-24 北京百度网讯科技有限公司 Method, apparatus, device and storage medium for determining mouth shape
CN110189394B (en) * 2019-05-14 2020-12-29 北京字节跳动网络技术有限公司 Mouth shape generation method and device and electronic equipment
CN112750184B (en) * 2019-10-30 2023-11-10 阿里巴巴集团控股有限公司 Method and equipment for data processing, action driving and man-machine interaction
CN110930481B (en) * 2019-12-11 2024-06-04 北京慧夜科技有限公司 Prediction method and system for mouth shape control parameters
CN112331184B (en) * 2020-10-29 2024-03-15 网易(杭州)网络有限公司 Voice mouth shape synchronization method and device, electronic equipment and storage medium
CN113112575B (en) * 2021-04-08 2024-04-30 深圳市山水原创动漫文化有限公司 Mouth shape generating method and device, computer equipment and storage medium
CN114782597A (en) * 2022-04-06 2022-07-22 北京达佳互联信息技术有限公司 Image processing method, device, equipment and storage medium

Citations (6)

Publication number Priority date Publication date Assignee Title
CN1787035A (en) * 2005-11-04 2006-06-14 黄中伟 Method for computer assisting learning of deaf-dumb Chinese language pronunciation
CN1851779A (en) * 2006-05-16 2006-10-25 黄中伟 Multi-language available deaf-mute language learning computer-aid method
CN1936889A (en) * 2005-09-20 2007-03-28 文化传信科技(澳门)有限公司 Cartoon generation system and method
JP2016173790A (en) * 2015-03-18 2016-09-29 カシオ計算機株式会社 Image processing apparatus, animation generation method, and program
CN106297792A (en) * 2016-09-14 2017-01-04 厦门幻世网络科技有限公司 The recognition methods of a kind of voice mouth shape cartoon and device
CN107998658A (en) * 2017-12-01 2018-05-08 苏州蜗牛数字科技股份有限公司 3D role's shape of the mouth as one speaks voice chatting system and method are realized in VR game


Also Published As

Publication number Publication date
CN109064532A (en) 2018-12-21

Similar Documents

Publication Publication Date Title
CN109064532B (en) Automatic mouth shape generating method and device for cartoon character
US10332507B2 (en) Method and device for waking up via speech based on artificial intelligence
CN108492817B (en) Song data processing method based on virtual idol and singing interaction system
CN103456314B (en) A kind of emotion identification method and device
CN111935537A (en) Music video generation method and device, electronic equipment and storage medium
CN110427610A (en) Text analyzing method, apparatus, computer installation and computer storage medium
CN109801349B (en) Sound-driven three-dimensional animation character real-time expression generation method and system
CN109408833A (en) A kind of interpretation method, device, equipment and readable storage medium storing program for executing
CN107589828A (en) The man-machine interaction method and system of knowledge based collection of illustrative plates
CN103793447A (en) Method and system for estimating semantic similarity among music and images
CN108664465A (en) One kind automatically generating text method and relevant apparatus
US20220375223A1 (en) Information generation method and apparatus
CN107665188B (en) Semantic understanding method and device
CN104504088A (en) Construction method of lip shape model library for identifying lip language
CN111462758A (en) Method, device and equipment for intelligent conference role classification and storage medium
CN111191503A (en) Pedestrian attribute identification method and device, storage medium and terminal
CN108833810A (en) The method and device of subtitle is generated in a kind of live streaming of three-dimensional idol in real time
CN110459200A (en) Phoneme synthesizing method, device, computer equipment and storage medium
CN114581567B (en) Method, device and medium for driving mouth shape of virtual image by sound
WO2023116122A1 (en) Subtitle generation method, electronic device, and computer-readable storage medium
CN111667557A (en) Animation production method and device, storage medium and terminal
Cosovic et al. Classification methods in cultural heritage
CN113705300A (en) Method, device and equipment for acquiring phonetic-to-text training corpus and storage medium
CN103544978A (en) Multimedia file manufacturing and playing method and intelligent terminal
CN116958342A (en) Method for generating actions of virtual image, method and device for constructing action library

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20231213

Address after: 518000, Building 302, Nanhai Yiku, Xinghua Road, Shekou, Shuiwan Community, Nanshan District, Shenzhen City, Guangdong Province, China

Applicant after: Shenzhen Kapu Animation Design Co.,Ltd.

Address before: Room 531, Building A, No. 68 Dongheyan, Chengqiao Town, Chongming District, Shanghai, 202155 (Shanghai Chengqiao Economic Development Zone)

Applicant before: SHANGHAI KAKA CULTURAL COMMUNICATION CO.,LTD.

GR01 Patent grant