CN113259778A - Method, system and storage medium for using virtual character for automatic video production - Google Patents

Method, system and storage medium for using virtual character for automatic video production

Info

Publication number
CN113259778A
Authority
CN
China
Prior art keywords
video
information
virtual character
pronunciation
attribute
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110434256.3A
Other languages
Chinese (zh)
Inventor
李�权
王伦基
叶俊杰
朱杰
成秋喜
韩蓝青
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CYAGEN BIOSCIENCES (GUANGZHOU) Inc
Research Institute Of Tsinghua Pearl River Delta
Original Assignee
CYAGEN BIOSCIENCES (GUANGZHOU) Inc
Research Institute Of Tsinghua Pearl River Delta
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CYAGEN BIOSCIENCES (GUANGZHOU) Inc and Research Institute Of Tsinghua Pearl River Delta
Priority to CN202110434256.3A
Publication of CN113259778A
Legal status: Pending

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/80 - Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N 21/81 - Monomedia components thereof
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 - Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/4302 - Content synchronisation processes, e.g. decoder synchronisation
    • H04N 21/4307 - Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/80 - Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N 21/81 - Monomedia components thereof
    • H04N 21/8106 - Monomedia components thereof involving special audio data, e.g. different tracks for different languages
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/80 - Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N 21/81 - Monomedia components thereof
    • H04N 21/816 - Monomedia components thereof involving special video data, e.g. 3D video

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The invention discloses a method, a system and a storage medium for using a virtual character for automatic video production. The method comprises the steps of synthesizing a pronunciation sound attribute and an explanation manuscript with a neural network to obtain voice information, generating a virtual character, generating video information according to image information, and embedding the virtual character into the video information. When the video information embedded with the virtual character is played, the image information it contains is displayed while the virtual character simulates the motions of a real person reading the explanation manuscript aloud and plays the synchronized voice information. The overall effect is that of a virtual presenter introducing the image information shown as the background, with lifelike qualities such as lip movements matched to the speech and rich facial expressions. This overcomes the shortcomings of the prior art, namely concatenative speech synthesis and the absence of a real-person or virtual cartoon presenter, and can greatly improve the efficiency of automatic video creation. The invention is widely applicable in the technical field of multimedia.

Description

Method, system and storage medium for using virtual character for automatic video production
Technical Field
The invention relates to the technical field of multimedia, in particular to a method, a system and a storage medium for using a virtual character for automatic video production.
Background
Videos frequently need to be produced in fields such as self-media, campus publicity, and tourism promotion. Video production has always had a high technical threshold, constrained by the quality of the camera equipment and the professional skill of the people doing the filming. For example, recording video requires equipment that is generally expensive, and during shooting a non-professional presenter is prone to logical mistakes, slips of the tongue, breaking into laughter, and disfluent sentences. In response to these shortcomings, some techniques have attempted to replace the filming of a live presenter with computer-synthesized speech when producing promotional material. However, the synthesized speech used in the prior art is mostly produced by concatenation: it is neither natural nor fluent, lacks prosody, seriously degrades video quality, and makes the video feel fake. Moreover, such methods put no real person on camera and synthesize only a cold, impersonal voice with little warmth, which is not engaging enough to hold the audience's attention. In short, the viewing experience produced by the prior art falls far short of that of professionally filmed work, and there remains considerable room for improvement.
Disclosure of Invention
In view of at least one of the above technical problems, it is an object of the present invention to provide a method, system and storage medium for using a virtual character for automatic video production.
In one aspect, an embodiment of the present invention includes a method for using a virtual character for automatic video production, including:
determining a pronunciation sound attribute, a pronunciation image attribute and a pronunciation action attribute;
acquiring image information and an explanation manuscript corresponding to the image information;
synthesizing the pronunciation sound attribute and the explanation manuscript by using a neural network to obtain voice information;
generating a virtual character; the virtual character plays the voice information under the driving of the pronunciation image attribute and the pronunciation action attribute;
generating video information according to the image information;
embedding the virtual character in the video information.
Further, the determining the attribute of the pronunciation sound, the attribute of the pronunciation image and the attribute of the pronunciation action comprises:
acquiring a voice sample, a picture sample and a video sample;
selecting and determining the pronunciation sound attribute matched with the voice sample from a database through voiceprint recognition;
manually selecting an image in a database or taking a photo image of a speaker uploaded by a user as the virtual character;
the actions of the virtual character are selected manually or according to the actions of the speaker uploaded by the user.
Further, the generating the virtual character comprises:
synthesizing the pronunciation action attribute and the voice information by using a character lip-shape generation model to obtain a lip-synchronized character action video;
synthesizing the lip-synchronized character action video and the pronunciation image attribute by using a video-driven virtual character model to obtain a virtual character explanation video; the virtual character explanation video includes the virtual character.
Further, the generating the virtual character further includes:
and carrying out cutout processing on the virtual character explanation video by using a video cutout model to obtain a background-free virtual character explanation video.
Further, the embedding the virtual character into the video information includes:
embedding the virtual character explanation video into the video information;
or
Embedding the virtual character explanation background-free video into the video information.
Further, the image information is information in a picture form; the generating of the video information according to the image information includes:
expanding the image information into the video information; the duration of the video information is equal to the duration of the voice information.
Further, the image information is information in the form of video clips; the generating of the video information according to the image information includes:
carrying out variable-speed processing so that the duration of the image information is equal to the duration of the voice information;
and after the variable-speed processing, taking the image information as the video information.
Further, the performing of the variable-speed processing includes:
performing variable-speed processing on only part or all of the image information;
or
performing variable-speed processing on only part or all of the voice information;
or
performing variable-speed processing on corresponding parts of the image information and the voice information.
In another aspect, an embodiment of the present invention further includes a system for using a virtual character for automatic video production, including:
the first module is used for determining the attribute of pronunciation sound, the attribute of pronunciation image and the attribute of pronunciation action;
the second module is used for acquiring image information and an explanation manuscript corresponding to the image information;
the third module is used for synthesizing the pronunciation sound attribute and the explanation manuscript by using a neural network to obtain voice information;
a fourth module for generating a virtual character; the virtual character plays the voice information under the driving of the pronunciation image attribute and the pronunciation action attribute;
a fifth module, configured to generate video information according to the image information;
a sixth module for embedding the virtual character into the video information.
In another aspect, embodiments of the present invention also include a storage medium having stored therein processor-executable instructions that, when executed by a processor, perform the method for using a virtual character for automatic video production.
The invention has the following beneficial effects: when the video information embedded with the virtual character in this embodiment is played, the image information contained in the video information is displayed while the virtual character simulates the motions of a real person reading the explanation manuscript aloud and plays the synchronized voice information. The overall effect is that of a virtual presenter introducing the image information shown as the background, with lifelike qualities such as lip movements matched to the speech and rich facial expressions. This removes the equipment requirements that constrain conventional video recording, overcomes the prior art's reliance on concatenative speech synthesis, remedies its lack of a real-person or virtual cartoon presenter in automatic video generation, and can greatly improve the efficiency of automatic video creation.
Drawings
FIG. 1 is a flowchart of a method for using a virtual character for automatic video production according to an embodiment;
FIG. 2 is a schematic diagram of a method for using a virtual character for automatic video production in an embodiment.
Detailed Description
In this embodiment, referring to fig. 1, the method for using a virtual character for automatic video production includes the following steps:
s1, determining pronunciation sound attribute, pronunciation image attribute and pronunciation action attribute;
s2, acquiring image information and an explanation manuscript corresponding to the image information;
s3, synthesizing the pronunciation sound attribute and the explanation manuscript by using a neural network to obtain voice information;
s4, generating a virtual character; the virtual character plays voice information under the drive of the attribute of the pronunciation image and the attribute of the pronunciation action;
s5, generating video information according to the image information;
and S6, embedding the virtual character into the video information.
The principle of steps S1-S6 is shown in FIG. 2.
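As an overview, the dataflow of steps S1-S6 can be sketched in code. The sketch below is illustrative only: the patent describes the stages abstractly, so each stage is injected as a callable, and every name and signature is an assumption rather than a real API.

```python
from typing import Callable, Tuple

def produce_video(
    script_text: str,                                          # explanation manuscript (S2)
    image_path: str,                                           # image information (S2)
    resolve_attributes: Callable[[], Tuple[object, object, object]],  # S1
    synthesize_speech: Callable[[object, str], str],           # S3: attribute + text -> wav path
    generate_character: Callable[[object, object, str], str],  # S4: -> character clip path
    make_background: Callable[[str, str], str],                # S5: image + wav -> clip path
    embed_character: Callable[[str, str], str],                # S6: -> final video path
) -> str:
    sound, image_attr, action = resolve_attributes()           # S1: pick the speaker profile
    wav = synthesize_speech(sound, script_text)                # S3: neural TTS
    character = generate_character(action, image_attr, wav)    # S4: lip-synced virtual character
    background = make_background(image_path, wav)              # S5: background timed to the speech
    return embed_character(background, character)              # S6: composite character over background
```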
In step S1, a number of pronunciation sound attributes, pronunciation image attributes and pronunciation action attributes may be stored in a database in advance and offered to the user for selection. The user generates a selection command by choosing a particular set of pronunciation sound, pronunciation image and pronunciation action attributes, and the system determines the attributes corresponding to that command from the database.
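As a minimal sketch of this selection mechanism, the snippet below models the database as an in-memory catalogue keyed by the selection command; the profile fields and identifiers are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class SpeakerProfile:
    sound: str   # pronunciation sound attribute (e.g. a TTS voice id)
    image: str   # pronunciation image attribute (e.g. a portrait path)
    action: str  # pronunciation action attribute (e.g. a motion clip path)

# Illustrative catalogue; a real system would back this with a database.
CATALOGUE = {
    "presenter_female_01": SpeakerProfile("voice/f01", "img/f01.png", "motion/f01.mp4"),
    "presenter_male_02": SpeakerProfile("voice/m02", "img/m02.png", "motion/m02.mp4"),
}

def resolve_selection(selection_command: str) -> SpeakerProfile:
    """Map the user's selection command to a stored attribute set."""
    try:
        return CATALOGUE[selection_command]
    except KeyError:
        raise ValueError(f"unknown speaker profile: {selection_command}")
```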
In step S1, the system may instead obtain a voice sample, a picture sample and a video sample uploaded by the user. The voice sample may be a recording of a person's speech made with the user's mobile phone, the picture sample a portrait photograph, and the video sample a video of a person's movements. Through voiceprint recognition and image recognition, the pronunciation sound attribute matching the voice sample is selected from the database, and the virtual character's actions are either chosen manually from the database or taken from the speaker's movements uploaded by the user. Here, a voice sample "matching" a pronunciation sound attribute means that the voiceprint analysis of the sample is close to that of the attribute, so the selected attribute resembles the pronunciation characteristics of the person who recorded the sample.
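The patent does not specify how the voiceprint matching is computed. A common realization is to compare fixed-length speaker embeddings by cosine similarity, as in the sketch below; the embedding extractor itself (for example an x-vector speaker encoder) is assumed and not shown.

```python
from typing import Dict
import numpy as np

def best_matching_voice(sample_embedding: np.ndarray,
                        database: Dict[str, np.ndarray]) -> str:
    """Return the id of the stored pronunciation sound attribute whose
    voiceprint embedding is closest, by cosine similarity, to the
    embedding of the uploaded voice sample."""
    def cosine(a: np.ndarray, b: np.ndarray) -> float:
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    return max(database, key=lambda vid: cosine(sample_embedding, database[vid]))
```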
In step S2, the system acquires image information input by the user together with the explanation manuscript corresponding to it. The image information may take the form of a picture or of a video clip. For example, it may be a high-definition photograph of a cultural relic taken by a museum, with an explanation manuscript recounting the relic's provenance and history; or it may be an aerial video clip of a spot within a scenic area, with a manuscript describing the area's opening hours, transport routes and sightseeing suggestions.
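For illustration, each unit of input from step S2 can be held in a simple record pairing the image material with its explanation manuscript; this representation is an assumption, not part of the patent.

```python
from dataclasses import dataclass
from typing import Literal

@dataclass
class Segment:
    media_path: str                         # a picture file or a video clip
    media_kind: Literal["picture", "clip"]  # which of the two forms it takes
    script: str                             # the corresponding explanation manuscript

# e.g. a museum exhibit paired with its narration (hypothetical values):
relic = Segment("relic_photo.jpg", "picture", "This bronze vessel dates from ...")
```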
In step S3, the system synthesizes the pronunciation sound attribute and the explanation manuscript using a neural network to obtain the voice information. When the voice information is played, it expresses the content of the explanation manuscript, with pronunciation characteristics such as timbre, speaking rate and pausing determined by the pronunciation sound attribute.
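The patent names only "a neural network" for this step. As one concrete stand-in, the sketch below uses the open-source Coqui TTS library (pip install TTS); the model choice, the cloning of the voice from a reference recording, and all file paths are assumptions for illustration.

```python
from TTS.api import TTS

def synthesize_speech(script_text: str, reference_wav: str, out_path: str) -> str:
    # XTTS v2 clones a voice from a short reference clip, standing in here
    # for the "pronunciation sound attribute" of the patent.
    tts = TTS(model_name="tts_models/multilingual/multi-dataset/xtts_v2")
    tts.tts_to_file(
        text=script_text,           # the explanation manuscript
        speaker_wav=reference_wav,  # reference recording for the voice
        language="zh-cn",
        file_path=out_path,
    )
    return out_path
```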
In step S4, referring to fig. 2, the system first synthesizes the pronunciation action attribute and the voice information using a character lip-shape generation model to obtain a lip-synchronized character action video, whose lip movements match those of a real person reading the voice information aloud. The system then synthesizes the lip-synchronized character action video with the pronunciation image attribute using a video-driven virtual character model to obtain the virtual character explanation video. When this video is played, it shows a virtual character whose appearance is given by the pronunciation image attribute and whose lips are synchronized with a real person reading the voice information; that is, the virtual character plays the voice information driven by the pronunciation image attribute and the pronunciation action attribute.
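The "character lip-shape generation model" is likewise unnamed in the patent; Wav2Lip is one published model of this kind. The sketch below shells out to that project's inference script; the checkpoint name and CLI flags are taken from the Wav2Lip repository and are quoted here as an assumption.

```python
import subprocess

def lip_sync(face_video: str, speech_wav: str, out_path: str) -> str:
    """Drive a Wav2Lip-style model: re-render the mouth region of the
    action video so its lips match the synthesized speech."""
    subprocess.run(
        [
            "python", "inference.py",
            "--checkpoint_path", "checkpoints/wav2lip_gan.pth",
            "--face", face_video,   # pronunciation action attribute (motion video)
            "--audio", speech_wav,  # synthesized voice information
            "--outfile", out_path,
        ],
        check=True,
    )
    return out_path
```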
In step S4, referring to fig. 2, the system may further apply a video matting model to the virtual character explanation video, separating the virtual character from the background and keeping only the character, thereby obtaining the background-free virtual character explanation video.
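Assuming a video matting model (published examples include RVM and MODNet; the patent names none) has produced a per-frame alpha matte in [0, 1], turning a character frame into a background-free frame is a simple channel operation:

```python
import numpy as np

def to_background_free(frame_bgr: np.ndarray, alpha: np.ndarray) -> np.ndarray:
    """Attach the matte as an alpha channel, yielding an RGBA frame in
    which the background is fully transparent.

    frame_bgr: HxWx3 uint8 character frame (OpenCV channel order).
    alpha:     HxW matte in [0, 1] produced by the matting model.
    """
    rgb = frame_bgr[..., ::-1]                             # BGR -> RGB
    a = (np.clip(alpha, 0.0, 1.0) * 255).astype(np.uint8)  # HxW uint8 alpha
    return np.dstack([rgb, a])                             # HxWx4 RGBA
```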
In step S5, referring to fig. 2, the image information is expanded or retimed to generate the video information. When the image information is a picture, it can be processed with effects such as zooming in, zooming out, fading in and fading out, and the results arranged along a time axis, thereby expanding the picture into video information.
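A minimal sketch of expanding a still picture into a clip of the required duration follows, here using a slow zoom-in; the frame rate, codec and zoom range are illustrative choices, not prescribed by the patent.

```python
import cv2

def picture_to_video(picture_path: str, duration_s: float, out_path: str,
                     fps: int = 25, zoom_to: float = 1.15) -> str:
    """Expand a still picture into a video whose duration equals the
    speech duration, by cropping and rescaling with a linear zoom-in."""
    img = cv2.imread(picture_path)
    h, w = img.shape[:2]
    writer = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))
    n = max(int(duration_s * fps), 1)
    for i in range(n):
        z = 1.0 + (zoom_to - 1.0) * i / max(n - 1, 1)  # zoom factor for this frame
        ch, cw = int(h / z), int(w / z)                # size of the centered crop
        y0, x0 = (h - ch) // 2, (w - cw) // 2
        writer.write(cv2.resize(img[y0:y0 + ch, x0:x0 + cw], (w, h)))
    writer.release()
    return out_path
```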
When the image information is a video clip whose duration does not match that of the voice information, variable-speed processing may be applied. Specifically, the processing may be applied to only part or all of the image information, to only part or all of the voice information, or to corresponding parts of both, so that the two durations coincide and playback remains synchronized. When the durations differ, "corresponding parts" means a first portion of the image information and a second portion of the voice information that occupy the same relative position on their respective time axes.
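One common way to realize the variable-speed processing is with ffmpeg's setpts (video) and atempo (audio) filters, as sketched below. This assumes ffmpeg is installed and the clip carries both a video and an audio stream; since each atempo stage only accepts factors in [0.5, 2.0], larger changes are chained.

```python
import subprocess

def retime(in_path: str, out_path: str, speed: float) -> str:
    """Change a clip's playback speed by the given factor
    (speed > 1 shortens the clip, speed < 1 lengthens it)."""
    # Build an atempo chain, since each atempo stage is limited to [0.5, 2.0].
    stages, s = [], speed
    while s > 2.0:
        stages.append("atempo=2.0")
        s /= 2.0
    while s < 0.5:
        stages.append("atempo=0.5")
        s /= 0.5
    stages.append(f"atempo={s:.4f}")
    subprocess.run([
        "ffmpeg", "-y", "-i", in_path,
        "-filter:v", f"setpts={1.0 / speed:.6f}*PTS",  # retime the video stream
        "-filter:a", ",".join(stages),                 # retime the audio stream
        out_path,
    ], check=True)
    return out_path
```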
In step S6, referring to fig. 2, either the virtual character explanation video or the background-free virtual character explanation video may be embedded in the video information, and the position of the virtual character within the frame can be chosen freely, thereby embedding the virtual character into the video information.
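Embedding the character then amounts to alpha-compositing each background-free character frame onto the corresponding background frame at the chosen position, e.g.:

```python
import numpy as np

def overlay(background: np.ndarray, character_rgba: np.ndarray,
            x: int, y: int) -> np.ndarray:
    """Alpha-composite a background-free (RGBA) character frame onto a
    background frame with its top-left corner at (x, y). Assumes the
    character fits inside the background at that position and that both
    frames share the same color channel order."""
    h, w = character_rgba.shape[:2]
    out = background.copy()
    region = out[y:y + h, x:x + w].astype(np.float32)
    fg = character_rgba[..., :3].astype(np.float32)
    a = character_rgba[..., 3:4].astype(np.float32) / 255.0
    out[y:y + h, x:x + w] = (a * fg + (1.0 - a) * region).astype(np.uint8)
    return out
```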
Referring to fig. 2, executing steps S1-S6 outputs video information with the virtual character embedded. Because the video information is derived from the image information, it carries the content to be introduced, such as a relic's provenance and history or a scenic area's opening hours, transport routes and sightseeing advice; and because the virtual character is derived from the explanation manuscript, its display simulates a real person reading the manuscript aloud while playing the synchronized voice information. The overall effect is that of a virtual presenter introducing the image information shown as the background, with lifelike qualities such as lip movements matched to the speech and rich facial expressions. This removes the equipment requirements that constrain conventional video recording, overcomes the prior art's reliance on concatenative speech synthesis, remedies its lack of a real-person or virtual cartoon presenter in automatic video generation, and can greatly improve the efficiency of automatic video creation. Beyond virtual-character explanation itself, the method in this embodiment can be applied in fields such as online education, cultural relic interpretation, audiobook reading, intelligent virtual-human media, tour-guide robots, intelligent customer service, catering robots and home robots.
In this embodiment, the method for using a virtual character for automatic video production may be performed by a system for using a virtual character for automatic video production. The system comprises:
the first module is used for determining the attribute of pronunciation sound, the attribute of pronunciation image and the attribute of pronunciation action;
the second module is used for acquiring the image information and the explanation manuscript corresponding to the image information;
the third module is used for synthesizing the pronunciation sound attribute and the explanation manuscript by using a neural network to obtain voice information;
a fourth module for generating a virtual character; the virtual character plays voice information under the drive of the attribute of the pronunciation image and the attribute of the pronunciation action;
a fifth module for generating video information according to the image information;
a sixth module for embedding the virtual character into the video information.
The first to sixth modules may be computer software modules, hardware modules, or combinations of software and hardware modules having the corresponding technical features; when the system runs, it achieves the technical effects of the method for using a virtual character for automatic video production described in this embodiment.
In this embodiment, a storage medium has stored therein processor-executable instructions which, when executed by a processor, perform the method for using a virtual character for automatic video production described in this embodiment, achieving the same technical effects.
It should be noted that, unless otherwise specified, when a feature is referred to as being "fixed" or "connected" to another feature, it may be directly fixed or connected to the other feature or indirectly fixed or connected to the other feature. Furthermore, the descriptions of upper, lower, left, right, etc. used in the present disclosure are only relative to the mutual positional relationship of the constituent parts of the present disclosure in the drawings. As used in this disclosure, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. In addition, unless defined otherwise, all technical and scientific terms used in this example have the same meaning as commonly understood by one of ordinary skill in the art. The terminology used in the description of the embodiments herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this embodiment, the term "and/or" includes any combination of one or more of the associated listed items.
It will be understood that, although the terms first, second, third, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element of the same type from another. For example, a first element could be termed a second element, and, similarly, a second element could be termed a first element, without departing from the scope of the present disclosure. The use of any and all examples, or exemplary language ("e.g.," such as "or the like") provided with this embodiment is intended merely to better illuminate embodiments of the invention and does not pose a limitation on the scope of the invention unless otherwise claimed.
It should be recognized that embodiments of the present invention can be realized and implemented by computer hardware, a combination of hardware and software, or by computer instructions stored in a non-transitory computer-readable memory. The methods may be implemented in a computer program using standard programming techniques, including a non-transitory computer-readable storage medium configured with the computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner, according to the methods and figures described in the detailed description. Each program may be implemented in a high-level procedural or object-oriented programming language to communicate with a computer system. However, the programs can be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language. Furthermore, the program can be run on an application-specific integrated circuit programmed for this purpose.
Further, operations of processes described in this embodiment can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The processes described in this embodiment (or variations and/or combinations thereof) may be performed under the control of one or more computer systems configured with executable instructions, and may be implemented as code (e.g., executable instructions, one or more computer programs, or one or more applications) collectively executed on one or more processors, by hardware, or combinations thereof. The computer program includes a plurality of instructions executable by one or more processors.
Further, the method may be implemented in any type of computing platform operatively connected to a suitable interface, including but not limited to a personal computer, mini computer, mainframe, workstation, networked or distributed computing environment, separate or integrated computer platform, or in communication with a charged particle tool or other imaging device, and the like. Aspects of the invention may be embodied in machine-readable code stored on a non-transitory storage medium or device, whether removable or integrated into a computing platform, such as a hard disk, optically read and/or write storage medium, RAM, ROM, or the like, such that it may be read by a programmable computer, which when read by the storage medium or device, is operative to configure and operate the computer to perform the procedures described herein. Further, the machine-readable code, or portions thereof, may be transmitted over a wired or wireless network. The invention described in this embodiment includes these and other different types of non-transitory computer-readable storage media when such media include instructions or programs that implement the steps described above in conjunction with a microprocessor or other data processor. The invention also includes the computer itself when programmed according to the methods and techniques described herein.
A computer program can be applied to input data to perform the functions described in this embodiment, transforming the input data to generate output data that is stored in a non-volatile memory. The output information may also be applied to one or more output devices, such as a display. In a preferred embodiment of the present invention, the transformed data represents a physical and tangible object, including a particular visual depiction of that object produced on a display.
The above description is only a preferred embodiment of the present invention, and the present invention is not limited to the above embodiment, and any modifications, equivalent substitutions, improvements, etc. within the spirit and principle of the present invention should be included in the protection scope of the present invention as long as the technical effects of the present invention are achieved by the same means. The invention is capable of other modifications and variations in its technical solution and/or its implementation, within the scope of protection of the invention.

Claims (10)

1. A method for using a virtual character for automatic video production, comprising:
determining a pronunciation sound attribute, a pronunciation image attribute and a pronunciation action attribute;
acquiring image information and an explanation manuscript corresponding to the image information;
synthesizing the pronunciation sound attribute and the explanation manuscript by using a neural network to obtain voice information;
generating a virtual character; the virtual character plays the voice information under the driving of the pronunciation image attribute and the pronunciation action attribute;
generating video information according to the image information;
embedding the virtual character in the video information.
2. The method of claim 1, wherein determining the attributes of the pronunciation sound, the pronunciation image and the pronunciation action comprises:
acquiring a voice sample, a picture sample and a video sample;
selecting and determining the pronunciation sound attribute matched with the voice sample from a database through voiceprint recognition;
manually selecting an image in a database or taking a photo image of a speaker uploaded by a user as the virtual character;
the actions of the virtual character are selected manually or according to the actions of the speaker uploaded by the user.
3. The method of claim 1, wherein the generating a virtual character comprises:
synthesizing the pronunciation action attribute and the voice information by using a character lip-shape generation model to obtain a lip-synchronized character action video;
synthesizing the lip-synchronized character action video and the pronunciation image attribute by using a video-driven virtual character model to obtain a virtual character explanation video; the virtual character explanation video includes the virtual character.
4. The method of claim 3, wherein the generating a virtual character further comprises:
and carrying out cutout processing on the virtual character explanation video by using a video cutout model to obtain a background-free virtual character explanation video.
5. The method of claim 4, wherein the embedding the virtual character into the video information comprises:
embedding the virtual character explanation video into the video information;
or
Embedding the virtual character explanation background-free video into the video information.
6. The method of claim 1, wherein the image information is information in the form of a picture; the generating of the video information according to the image information includes:
expanding the image information into the video information; the duration of the video information is equal to the duration of the voice information.
7. The method of claim 1, wherein the image information is information in the form of a video clip; the generating of the video information according to the image information includes:
carrying out variable-speed processing so that the duration of the image information is equal to the duration of the voice information;
and after the variable-speed processing, taking the image information as the video information.
8. The method of claim 7, wherein the performing of the variable-speed processing comprises:
performing variable-speed processing on only part or all of the image information;
or
performing variable-speed processing on only part or all of the voice information;
or
performing variable-speed processing on corresponding parts of the image information and the voice information.
9. A system for using a virtual character for automatic video production, comprising:
the first module is used for determining the attribute of pronunciation sound, the attribute of pronunciation image and the attribute of pronunciation action;
the second module is used for acquiring image information and an explanation manuscript corresponding to the image information;
the third module is used for synthesizing the pronunciation sound attribute and the explanation manuscript by using a neural network to obtain voice information;
a fourth module for generating a virtual character; the virtual character plays the voice information under the driving of the pronunciation image attribute and the pronunciation action attribute;
a fifth module, configured to generate video information according to the image information;
a sixth module for embedding the virtual character into the video information.
10. A storage medium having stored therein processor-executable instructions, which when executed by a processor, are configured to perform the method of any one of claims 1-8.
CN202110434256.3A 2021-04-22 2021-04-22 Method, system and storage medium for using virtual character for automatic video production Pending CN113259778A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110434256.3A CN113259778A (en) 2021-04-22 2021-04-22 Method, system and storage medium for using virtual character for automatic video production

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110434256.3A CN113259778A (en) 2021-04-22 2021-04-22 Method, system and storage medium for using virtual character for automatic video production

Publications (1)

Publication Number Publication Date
CN113259778A 2021-08-13

Family

ID=77221259

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110434256.3A Pending CN113259778A (en) 2021-04-22 2021-04-22 Method, system and storage medium for using virtual character for automatic video production

Country Status (1)

Country Link
CN (1) CN113259778A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115016648A (en) * 2022-07-15 2022-09-06 大爱全息(北京)科技有限公司 Holographic interaction device and processing method thereof

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107213642A (en) * 2017-05-12 2017-09-29 北京小米移动软件有限公司 Virtual portrait outward appearance change method and device
CN109118562A (en) * 2018-08-31 2019-01-01 百度在线网络技术(北京)有限公司 Explanation video creating method, device and the terminal of virtual image
CN110381266A (en) * 2019-07-31 2019-10-25 百度在线网络技术(北京)有限公司 A kind of video generation method, device and terminal
CN110913267A (en) * 2019-11-29 2020-03-24 上海赛连信息科技有限公司 Image processing method, device, system, interface, medium and computing equipment
CN111739507A (en) * 2020-05-07 2020-10-02 广东康云科技有限公司 AI-based speech synthesis method, system, device and storage medium
CN112562721A (en) * 2020-11-30 2021-03-26 清华珠三角研究院 Video translation method, system, device and storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210813