CN110493613B - Video lip synchronization synthesis method and system - Google Patents
Video lip synchronization synthesis method and system
- Publication number
- CN110493613B (application CN201910758080.XA)
- Authority
- CN
- China
- Prior art keywords
- lip
- video
- codes
- cloud server
- prototype
- Prior art date
- Legal status: Active (assumption, not a legal conclusion)
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/236—Assembling of a multiplex stream, e.g. transport stream, by combining a video stream with other content or additional data, e.g. inserting a URL [Uniform Resource Locator] into a video stream, multiplexing software data into a video stream; Remultiplexing of multiplex streams; Insertion of stuffing bits into the multiplex stream, e.g. to obtain a constant bit-rate; Assembling of a packetised elementary stream
- H04N21/2368—Multiplexing of audio and video streams
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/242—Synchronization processes, e.g. processing of PCR [Program Clock References]
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Electrically Operated Instructional Devices (AREA)
Abstract
The invention discloses a method and system for synthesizing lip-synchronized video, belonging to the technical field of lip synchronization. The method comprises the following steps: a cloud server receives a pronunciation script through a terminal device and splits it into sentences at punctuation marks; the cloud server converts each split sentence into an arrangement of lip-shape codes and matches that arrangement against prototype videos; the successfully matched prototype videos are spliced into a composite video; the playing time of the composite video is calculated; and the cloud server sets the speech rate for the script from that time so that the pronunciation duration equals the playing duration of the text. By assigning different codes to the different lip shapes used when the characters of the script are pronounced, and selecting and synthesizing the prototype videos corresponding to those lip shapes, the invention keeps the lip shapes consistent with the sounds while the video of the speaker plays, increasing realism.
Description
Technical Field
The invention belongs to the technical field of audio-visual lip synchronization, and particularly relates to a method and system for synthesizing lip-synchronized video.
Background
To strengthen communication with customers and prospective customers and to provide better products and technical services, many merchants and organizations maintain their own customer-service and after-sales technical departments. The staff of these departments carry a heavy daily workload of online and offline communication, much of it repetitive question answering and guidance, and they cannot serve users around the clock. The virtual real-person robot emerged to meet this need: a large number of real-person videos and answer recordings are stored behind a display screen, and corresponding feedback is played in response to customers' questions.
However, because the lip movements of the person in the video and the answer voice are combined in post-production, the lips in the video are not synchronized with the voice: the words the customer hears do not match the lip shapes on screen, the effect of face-to-face communication with a real customer-service agent is lost, and the customer may mentally reject the service.
Disclosure of Invention
To solve the technical problems described in the background art, the present invention provides a method and system for synthesizing lip-synchronized video that gives viewers a sense of realism.
The invention is realized by the following technical scheme: in the video lip synchronization synthesis method, prototype video files of the various lip shapes used by a virtual robot are stored in a cloud server;
the video lip synchronization synthesis method specifically comprises the following steps:
step 1: the cloud server receives the pronunciation script through a terminal device and splits it into sentences at punctuation marks;
step 2: the cloud server converts each split sentence into an arrangement of lip-shape codes and matches that arrangement against the prototype videos;
step 3: the successfully matched prototype videos for each sentence are spliced to form a composite video;
step 4: the playing time of the composite video formed in step 3 is calculated;
step 5: the cloud server sets the speech rate from the time obtained in step 4 so that the pronunciation duration equals the playing duration of the text, and sends the text to a voice gateway, which converts it into a sound file and returns the file to the cloud server;
step 6: the composite video generated in step 3 and the sound generated in step 5 are combined into the final composite video;
step 7: the composite video generated in step 6 is played on the specified terminal, and the system exits.
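The seven steps above can be outlined in code. The following Python sketch is illustrative only; the patent specifies no implementation, and the helper callables (`match_sentence`, `concat`, `to_speech`, `mux`) stand in for the matching, splicing, voice-gateway and muxing components it describes:

```python
import re

def split_script(script: str) -> list[str]:
    """Step 1: split the pronunciation script into sentences at punctuation."""
    parts = re.split(r"[。！？，；.!?,;]", script)
    return [s for s in (p.strip() for p in parts) if s]

def synthesize(script, match_sentence, concat, to_speech, mux):
    """Steps 1-7, with the heavy lifting delegated to callables:
    match_sentence(sentence) -> clip, concat(clips) -> video,
    to_speech(text, rate) -> audio, mux(video, audio) -> final video."""
    sentences = split_script(script)                 # step 1
    clips = [match_sentence(s) for s in sentences]   # step 2
    video = concat(clips)                            # step 3
    play_time = sum(c.duration for c in clips)       # step 4
    n_chars = sum(len(s) for s in sentences)
    rate = n_chars / play_time                       # step 5: speech rate
    audio = to_speech(" ".join(sentences), rate)     # voice gateway
    return mux(video, audio)                         # steps 6-7
```

The punctuation set passed to `re.split` covers both full-width Chinese and ASCII marks, since the patent splits on "punctuation marks" without enumerating them.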
In a further embodiment, step 2 specifically includes the following steps:
step 2.1: each Chinese character in the sentence is converted into pinyin, and lip-shape codes are assigned from the pinyin: when a consonant (initial) is pronounced with the lips slightly open the code is set to 1, and with the lips wide open to 2; when a simple vowel (final) is pronounced with the lips slightly open the code is set to 3, and with the lips wide open to 4; when a compound vowel is pronounced with the lips slightly open the code is set to 5, and with the lips wide open to 6; this yields the sentence's string of lip-shape arrangement codes;
step 2.2: the prototype video library is searched for a prototype video whose lip-shape arrangement code is identical or close to the sentence's, where the lip-shape code of the last word of the sentence must match exactly;
step 2.3: if such a video is found, go to step 3;
step 2.4: if no identical or close lip-shape arrangement code exists in the prototype video library, the arrangement code is split a limited number of times until a prototype video with an identical or close code is found for every segment after splitting, with the lip-shape code of the last word of the sentence still matching exactly; the prototype videos are spliced into a sentence video, and the method proceeds to step 3;
step 2.5: if no identical or close prototype video can be found even after the limited splitting, the system reports that a prototype video covering the missing lip-shape arrangement code should be added, the match fails, and the system exits.
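Step 2.1's mapping from pinyin to the six lip-shape codes can be sketched as a pair of lookup tables. The concrete assignment of particular initials and finals to "slightly open" versus "wide open" below is assumed for illustration; the patent does not enumerate it:

```python
# Illustrative code tables (assumed, not from the patent):
# 1/2 = consonant (initial) slightly / wide open,
# 3/4 = simple vowel (final), 5/6 = compound vowel.
INITIAL_CODES = {"b": 1, "m": 1, "f": 1, "d": 2, "k": 2, "h": 2}
FINAL_CODES = {"i": 3, "u": 3, "a": 4, "o": 4, "ai": 5, "ao": 6}

def lip_codes(pinyin_syllables):
    """Turn a sentence's pinyin (initial, final) pairs into the
    lip-shape arrangement code string used for prototype matching."""
    codes = []
    for initial, final in pinyin_syllables:
        if initial in INITIAL_CODES:
            codes.append(INITIAL_CODES[initial])
        if final in FINAL_CODES:
            codes.append(FINAL_CODES[final])
    return "".join(map(str, codes))
```

In practice the (initial, final) pairs would come from a pinyin library; here they are passed in directly, e.g. `lip_codes([("d", "a"), ("b", "ai")])` yields `"2415"`.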
A video lip synchronization synthesis system using the above method comprises: a robot terminal, configured to receive the customer's question voice and to play the returned composite video;
and a cloud server, configured to receive, via the Internet, the question voice sent by the robot terminal and to feed back the corresponding composite video to the robot terminal over the Internet, where the robot terminal plays the composite video;
in a further embodiment, the cloud server comprises: the device comprises a processor, a recording unit, a touch display unit, a communication unit and a lip arrangement unit, wherein the processor is respectively connected with the recording unit, the touch display unit, the communication unit and the lip arrangement unit;
the recording unit is used for acquiring the question voice of the client; the touch display unit is used for customer operation and video playing; the communication unit is used for carrying out data transmission with the cloud server; the lip shape arrangement unit is used for corresponding to the arrangement combination of different lip shapes of each sentence of characters and endowing each prototype video file with different lip shape arrangement combination codes, and the lip shape arrangement codes comprise: when consonants are sounded, lip codes are set to 1 when lip is slightly opened, and are set to 2 when lip is greatly opened, when vowels are sounded, lip codes are set to 3 when lip is slightly opened, and are set to 4 when lip is greatly opened, and when vowels are sounded, lip codes are set to 5 when lip is slightly opened, and are set to 6 when lip is greatly opened.
In a further embodiment, the cloud server comprises:
a receiving and pushing module, configured to receive data sent by the robot terminal and to send data to the robot terminal;
a voice conversion module, configured to convert the question voice received via the Internet into question text and feed the text back, and to convert the received pronunciation script into answer voice and feed the voice back via the Internet;
a matching module, configured to match the question text against a question bank in the cloud server to retrieve the corresponding answer voice or answer video;
and a storage module, configured to store the customer's question voice, answer voice, pronunciation scripts, composite videos and keywords.
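The matching module's lookup can be sketched as a keyword search over the question bank. The bank's structure (keyword sets paired with answers) and the overlap-count scoring rule are assumptions for illustration; the patent does not specify them:

```python
def match_question(question_text, question_bank):
    """Return the answer whose bank entry shares the most keywords
    with the customer's question text, or None if nothing matches.
    question_bank is a list of (keyword_set, answer) pairs."""
    best, best_score = None, 0
    for keywords, answer in question_bank:
        score = sum(1 for kw in keywords if kw in question_text)
        if score > best_score:
            best, best_score = answer, score
    return best
```

For example, with a bank entry keyed on "退款" (refund) and "订单" (order), the question "我的订单怎么退款" would match that entry with score 2.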
The invention has the following beneficial effects: different codes are assigned to the different lip shapes used when the characters of the pronunciation script are spoken, and the prototype videos corresponding to those lip shapes are selected and synthesized, so that the lip shapes remain consistent with the sounds while the video of the person plays, improving realism.
Drawings
Fig. 1 is a flow chart of a video lip synchronization synthesizing method.
Fig. 2 is a block flow diagram of step 2 in fig. 1.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Although the steps in the present invention are arranged by using reference numbers, the order of the steps is not limited, and the relative order of the steps can be adjusted unless the order of the steps is explicitly stated or other steps are required for the execution of a certain step. It is to be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.
The applicant addresses a problem in the existing service industry: because the lip movements of the person in the video and the answer voice are combined in post-production, they fall out of sync, the words the customer hears do not match the lip shapes on screen, the effect of face-to-face communication with real customer service is lost, and the customer may mentally reject the service.
To solve these technical problems, the applicant designed a real-person online help-desk service system, together with the video lip synchronization synthesis method and system described here, which improve its realism.
First, prototype video files of the various lip shapes used by the virtual robot are stored in the cloud server.
As shown in fig. 1, the method for synthesizing video lip synchronization specifically includes the following steps:
step 1: the cloud server receives the pronunciation script through a terminal device and splits it into sentences at punctuation marks;
step 2: the cloud server converts each split sentence into an arrangement of lip-shape codes and matches that arrangement against the prototype videos;
step 3: the successfully matched prototype videos for each sentence are spliced to form a composite video;
step 4: the playing time of the composite video formed in step 3 is calculated;
step 5: the cloud server sets the speech rate from the time obtained in step 4 so that the pronunciation duration equals the playing duration of the text, and sends the text to a voice gateway, which converts it into a sound file and returns the file to the cloud server;
step 6: the composite video generated in step 3 and the sound generated in step 5 are combined into the final composite video;
step 7: the composite video generated in step 6 is played on the specified terminal, and the system exits.
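Step 5's pacing constraint (pronunciation duration equal to the playing duration of the text) reduces to deriving a speech rate from the character count and the composite video's play time. A minimal sketch, where the characters-per-second unit is an illustrative choice, not the patent's:

```python
def speech_rate(n_chars: int, play_seconds: float) -> float:
    """Speech rate, in characters per second, such that speaking
    n_chars takes exactly play_seconds (step 5's constraint)."""
    if play_seconds <= 0:
        raise ValueError("playing time must be positive")
    return n_chars / play_seconds

# e.g. a 20-character sentence over an 8-second composite clip:
rate = speech_rate(20, 8.0)   # 2.5 characters per second
```

The voice gateway would then be asked to synthesize the text at this rate, so audio and video end together.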
As shown in fig. 2, the step 2 specifically includes the following steps:
step 2.1: each Chinese character in the sentence is converted into pinyin, and lip-shape codes are assigned from the pinyin: when a consonant (initial) is pronounced with the lips slightly open the code is set to 1, and with the lips wide open to 2; when a simple vowel (final) is pronounced with the lips slightly open the code is set to 3, and with the lips wide open to 4; when a compound vowel is pronounced with the lips slightly open the code is set to 5, and with the lips wide open to 6; this yields the sentence's string of lip-shape arrangement codes;
step 2.2: the prototype video library is searched for a prototype video whose lip-shape arrangement code is identical or close to the sentence's, where the lip-shape code of the last word of the sentence must match exactly;
step 2.3: if such a video is found, go to step 3;
step 2.4: if no identical or close lip-shape arrangement code exists in the prototype video library, the arrangement code is split a limited number of times until a prototype video with an identical or close code is found for every segment after splitting, with the lip-shape code of the last word of the sentence still matching exactly; the prototype videos are spliced into a sentence video, and the method proceeds to step 3;
step 2.5: if no identical or close prototype video can be found even after the limited splitting, the system reports that a prototype video covering the missing lip-shape arrangement code should be added, the match fails, and the system exits.
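Steps 2.2 to 2.5, with their limited splitting, behave like a greedy longest-prefix match over a library keyed by lip-shape arrangement codes. The dict-based library and the greedy strategy below are assumptions, not the patent's prescribed algorithm, and the exact-key lookup stands in for its "identical or close" comparison:

```python
def match_arrangement(code: str, library: dict[str, str]):
    """Greedily split the lip-shape arrangement code into the longest
    prefixes present in the prototype library (steps 2.2-2.4).
    Returns the matched clip ids, or None if some segment cannot be
    matched (step 2.5: report the missing code and fail)."""
    clips, i = [], 0
    while i < len(code):
        for j in range(len(code), i, -1):     # longest segment first
            seg = code[i:j]
            if seg in library:                # exact code match; the
                clips.append(library[seg])    # segment's last code then
                i = j                         # matches the sentence's
                break
        else:
            return None                       # matching failed
    return clips
```

A whole-sentence match is preferred when available, and splitting only happens when it is not, mirroring the order of steps 2.2 and 2.4.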
A video lip synchronization synthesis system comprises: a robot terminal, configured to receive the customer's question voice and to play the returned composite video;
and a cloud server, configured to receive, via the Internet, the question voice sent by the robot terminal and to feed back the corresponding composite video to the robot terminal over the Internet, where the robot terminal plays the composite video.
In a further embodiment, the robot terminal comprises: a processor, a recording unit, a touch display unit, a communication unit and a lip arrangement unit, the processor being connected to each of the recording unit, the touch display unit, the communication unit and the lip arrangement unit.
The recording unit is configured to acquire the customer's question voice; the touch display unit is configured for customer operation and video playback; the communication unit is configured to exchange data with the cloud server; the lip arrangement unit is configured to map each sentence of text to its combination of lip shapes and to assign each prototype video file a lip-shape arrangement code, wherein: when a consonant is sounded, the lip code is set to 1 with the lips slightly open and to 2 with the lips wide open; when a simple vowel is sounded, the lip code is set to 3 with the lips slightly open and to 4 with the lips wide open; and when a compound vowel is sounded, the lip code is set to 5 with the lips slightly open and to 6 with the lips wide open.
The cloud server comprises: a receiving and pushing module, configured to receive data sent by the robot terminal and to send data to the robot terminal; a voice conversion module, configured to convert the question voice received via the Internet into question text and feed the text back, and to convert the received pronunciation script into answer voice and feed the voice back via the Internet; a matching module, configured to match the question text against a question bank in the cloud server to retrieve the corresponding answer voice or answer video; and a storage module, configured to store the customer's question voice, answer voice, pronunciation scripts, composite videos and keywords.
Because the video is synthesized to match the answer voice, the virtual robot can play the answer voice and the composite video simultaneously, keeping sound and picture consistent: the pronunciation of the audio and the lip shapes in the picture reach a high degree of synchronization, improving the customer's viewing comfort.
Furthermore, it should be understood that although this description is organized by embodiments, not every embodiment contains only a single technical solution; the description is written this way only for clarity. Those skilled in the art should read the description as a whole, and the embodiments may be combined as appropriate to form other embodiments understood by those skilled in the art.
Claims (3)
1. A video lip synchronization synthesis method, characterized in that prototype video files of the various lip shapes used by a virtual robot are stored in a cloud server;
the video lip synchronization synthesis method specifically comprises the following steps:
step 1: the cloud server receives the pronunciation script through a terminal device and splits it into sentences at punctuation marks;
step 2: the cloud server converts each split sentence into an arrangement of lip-shape codes and matches that arrangement against the prototype videos;
step 3: the successfully matched prototype videos for each sentence are spliced to form a composite video;
step 4: the playing time of the composite video formed in step 3 is calculated;
step 5: the cloud server sets the speech rate from the time obtained in step 4 so that the pronunciation duration equals the playing duration of the text, and sends the text to a voice gateway, which converts it into a sound file and returns the file to the cloud server;
step 6: the composite video generated in step 3 and the sound generated in step 5 are combined into the final composite video;
step 7: the composite video generated in step 6 is played on the specified terminal, and the system exits;
the step 2 specifically comprises the following steps:
step 2.1: each Chinese character in the sentence is converted into pinyin, and lip-shape codes are assigned from the pinyin: when a consonant (initial) is pronounced with the lips slightly open the code is set to 1, and with the lips wide open to 2; when a simple vowel (final) is pronounced with the lips slightly open the code is set to 3, and with the lips wide open to 4; when a compound vowel is pronounced with the lips slightly open the code is set to 5, and with the lips wide open to 6; this yields the sentence's string of lip-shape arrangement codes;
step 2.2: the prototype video library is searched for a prototype video whose lip-shape arrangement code is identical or close to the sentence's, where the lip-shape code of the last word of the sentence must match exactly;
step 2.3: if such a video is found, go to step 3;
step 2.4: if no identical or close lip-shape arrangement code exists in the prototype video library, the arrangement code is split a limited number of times until a prototype video with an identical or close code is found for every segment after splitting, with the lip-shape code of the last word of the sentence still matching exactly; the prototype videos are spliced into a sentence video, and the method proceeds to step 3;
step 2.5: if no identical or close prototype video can be found even after the limited splitting, the system reports that a prototype video covering the missing lip-shape arrangement code should be added, the match fails, and the system exits.
2. A video lip synchronization synthesis system using the video lip synchronization synthesis method according to claim 1, characterized by comprising: a robot terminal, configured to receive the customer's question voice and to play the returned composite video;
and a cloud server, configured to receive, via the Internet, the question voice sent by the robot terminal and to feed back the corresponding composite video to the robot terminal over the Internet, where the robot terminal plays the composite video;
the robot terminal comprises: a processor, a recording unit, a touch display unit, a communication unit and a lip arrangement unit, the processor being connected to each of the recording unit, the touch display unit, the communication unit and the lip arrangement unit;
the recording unit is configured to acquire the customer's question voice; the touch display unit is configured for customer operation and video playback; the communication unit is configured to exchange data with the cloud server; the lip arrangement unit is configured to map each sentence of text to its combination of lip shapes and to assign each prototype video file a lip-shape arrangement code, wherein: when a consonant is sounded, the lip code is set to 1 with the lips slightly open and to 2 with the lips wide open; when a simple vowel is sounded, the lip code is set to 3 with the lips slightly open and to 4 with the lips wide open; and when a compound vowel is sounded, the lip code is set to 5 with the lips slightly open and to 6 with the lips wide open.
3. The system of claim 2, wherein the cloud server comprises:
a receiving and pushing module, configured to receive data sent by the robot terminal and to send data to the robot terminal;
a voice conversion module, configured to convert the question voice received via the Internet into question text and feed the text back, and to convert the received pronunciation script into answer voice and feed the voice back via the Internet;
a matching module, configured to match the question text against a question bank in the cloud server to retrieve the corresponding answer voice or answer video;
and a storage module, configured to store the customer's question voice, answer voice, pronunciation scripts, composite videos and keywords.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910758080.XA CN110493613B (en) | 2019-08-16 | 2019-08-16 | Video lip synchronization synthesis method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910758080.XA CN110493613B (en) | 2019-08-16 | 2019-08-16 | Video lip synchronization synthesis method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110493613A CN110493613A (en) | 2019-11-22 |
CN110493613B true CN110493613B (en) | 2020-05-19 |
Family
ID=68551356
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910758080.XA Active CN110493613B (en) | 2019-08-16 | 2019-08-16 | Video lip synchronization synthesis method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110493613B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111325817B (en) * | 2020-02-04 | 2023-07-18 | 清华珠三角研究院 | Virtual character scene video generation method, terminal equipment and medium |
CN111225237B (en) | 2020-04-23 | 2020-08-21 | 腾讯科技(深圳)有限公司 | Sound and picture matching method of video, related device and storage medium |
CN113178206B (en) * | 2021-04-22 | 2022-05-31 | 内蒙古大学 | AI (Artificial intelligence) composite anchor generation method, electronic equipment and readable storage medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101482975A (en) * | 2008-01-07 | 2009-07-15 | 丰达软件(苏州)有限公司 | Method and apparatus for converting words into animation |
CN101796812A (en) * | 2006-03-31 | 2010-08-04 | 莱切技术国际公司 | Lip synchronization system and method |
CN106791539A (en) * | 2016-12-26 | 2017-05-31 | 国家新闻出版广电总局电影数字节目管理中心 | A kind of storage of film digital program and extracting method |
CN108010531A (en) * | 2017-12-14 | 2018-05-08 | 南京美桥信息科技有限公司 | A kind of visible intelligent inquiry method and system |
CN108038206A (en) * | 2017-12-14 | 2018-05-15 | 南京美桥信息科技有限公司 | A kind of visible intelligent method of servicing and system |
CN108090170A (en) * | 2017-12-14 | 2018-05-29 | 南京美桥信息科技有限公司 | A kind of intelligence inquiry method for recognizing semantics and visible intelligent interrogation system |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN100396091C (en) * | 2006-04-03 | 2008-06-18 | 北京和声创景音频技术有限公司 | Commandos dubbing system and dubbing making method thereof |
CN100476877C (en) * | 2006-11-10 | 2009-04-08 | 中国科学院计算技术研究所 | Generating method of cartoon face driven by voice and text together |
CN107786889A (en) * | 2017-11-13 | 2018-03-09 | 北海威德电子科技有限公司 | Can synchronous sign language interpreter DTV |
CN109308731B (en) * | 2018-08-24 | 2023-04-25 | 浙江大学 | Speech driving lip-shaped synchronous face video synthesis algorithm of cascade convolution LSTM |
CN109637518B (en) * | 2018-11-07 | 2022-05-24 | 北京搜狗科技发展有限公司 | Virtual anchor implementation method and device |
- 2019-08-16: CN201910758080.XA filed; granted as CN110493613B (status: Active)
Also Published As
Publication number | Publication date |
---|---|
CN110493613A (en) | 2019-11-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Crystal | The language revolution | |
CN110493613B (en) | Video lip synchronization synthesis method and system | |
CN110033659B (en) | Remote teaching interaction method, server, terminal and system | |
US7913155B2 (en) | Synchronizing method and system | |
CN110405791B (en) | Method and system for simulating and learning speech by robot | |
US20180203830A1 (en) | Synchronized consumption modes for e-books | |
WO2018108013A1 (en) | Medium displaying method and terminal | |
CN104735480B (en) | Method for sending information and system between mobile terminal and TV | |
CN111866529A (en) | Method and system for hybrid use of virtual real person during video live broadcast | |
US7613613B2 (en) | Method and system for converting text to lip-synchronized speech in real time | |
US11968433B2 (en) | Systems and methods for generating synthetic videos based on audio contents | |
CN109326151A (en) | Implementation method, client and server based on semantics-driven virtual image | |
CN114793300A (en) | Virtual video customer service robot synthesis method and system based on generation countermeasure network | |
CN113850898A (en) | Scene rendering method and device, storage medium and electronic equipment | |
CN112447073A (en) | Explanation video generation method, explanation video display method and device | |
CN111160051B (en) | Data processing method, device, electronic equipment and storage medium | |
US20160247500A1 (en) | Content delivery system | |
Kadam et al. | A Survey of Audio Synthesis and Lip-syncing for Synthetic Video Generation | |
KR20100115003A (en) | Method for generating talking heads from text and system thereof | |
KR101675049B1 (en) | Global communication system | |
CN109902311A (en) | A kind of synchronous English of video signal and multilingual translation system | |
CN108174123A (en) | Data processing method, apparatus and system | |
US20240153397A1 (en) | Virtual meeting coaching with content-based evaluation | |
US20240153398A1 (en) | Virtual meeting coaching with dynamically extracted content | |
CN111580614A (en) | Wearable intelligent device and sign language learning method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |