CN110493613A - Video lip synchronization synthesis method and system - Google Patents

Video lip synchronization synthesis method and system

Info

Publication number
CN110493613A
Authority
CN
China
Prior art keywords
lip shape
lip
video
cloud server
code
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910758080.XA
Other languages
Chinese (zh)
Other versions
CN110493613B (en)
Inventor
郭志扬
乔健
吴鹏程
陈起航
朱西锋
丁航
陆佳莉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Aoxin Technology Co Ltd
Original Assignee
Jiangsu Aoxin Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Aoxin Technology Co Ltd
Priority to CN201910758080.XA
Publication of CN110493613A
Application granted
Publication of CN110493613B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/236 Assembling of a multiplex stream, e.g. transport stream, by combining a video stream with other content or additional data, e.g. inserting a URL [Uniform Resource Locator] into a video stream, multiplexing software data into a video stream; Remultiplexing of multiplex streams; Insertion of stuffing bits into the multiplex stream, e.g. to obtain a constant bit-rate; Assembling of a packetised elementary stream
    • H04N21/2368 Multiplexing of audio and video streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/242 Synchronization processes, e.g. processing of PCR [Program Clock References]

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

The invention discloses a synthesis method and system for video lip synchronization, belonging to the technical field of lip synchronization. The method comprises the following steps: a cloud server receives a pronunciation manuscript through a terminal device and splits the manuscript into several sentences at punctuation marks; the cloud server derives the lip shape arrangement of each split sentence and matches the lip shapes against prototype videos; the successfully matched prototype videos are spliced into a synthesized video; the playing time of the synthesized video is calculated; and the cloud server sets the speech rate of the pronunciation manuscript according to that time, so that the pronunciation duration equals the broadcast duration of the text. The invention assigns different codes to the different lip shapes that occur when the words of the manuscript are pronounced, selects the prototype videos with the corresponding lip shapes, and synthesizes them, so that the lip shape of the on-screen character stays consistent with the sound while the voice plays, which increases realism.

Description

Video lip synchronization synthesis method and system
Technical field
The invention belongs to the technical field of lip synchronization, and in particular relates to a synthesis method and system for video lip synchronization.
Background
To strengthen communication with existing and prospective customers and to provide them with better products and technical services, many businesses and institutions have set up their own customer service and after-sales technical service departments. The staff of these departments handle a very heavy daily workload of offline and online customer communication, answer repetitive and tedious questions, and cannot be online or on duty 24 hours a day. Virtual real-person robots emerged in response: a large number of real-person videos and answer voices are stored behind a display screen, and corresponding feedback is given to the customer's questions.
However, because the lip shape of the person in the video and the answer voice are synthesized afterwards, the lip shape of the person in the video can fall out of sync with the voice. The words the customer hears do not match the lip shape the customer sees, the effect of face-to-face communication with a human agent is not achieved, and customers may come to reject the service.
Summary of the invention
To solve the technical problem described in the background above, the invention provides a synthesis method and system for video lip synchronization that conveys a sense of realism.
The invention is realized through the following technical solution: a synthesis method for video lip synchronization, in which the cloud server holds prototype video files of the various lip shapes suitable for use by a virtual robot;
The synthesis method for video lip synchronization specifically comprises the following steps:
Step 1: the cloud server receives a pronunciation manuscript through a terminal device and splits the manuscript into several sentences at punctuation marks;
Step 2: the cloud server derives the lip shape arrangement of each split sentence and matches the lip shapes against the prototype videos;
Step 3: the successfully matched prototype videos are spliced to form a synthesized video;
Step 4: the playing time of the synthesized video formed in step 3 is calculated;
Step 5: the cloud server sets the speech rate of the pronunciation manuscript received through the terminal device according to the time from step 4, ensuring that the pronunciation duration equals the broadcast duration of the text, and sends the manuscript to a voice gateway; the voice gateway converts the text into a sound file and returns it to the cloud server;
Step 6: the synthesized video generated in step 3 and the sound generated in step 5 are combined to form the final synthesized video;
Step 7: the synthesized video generated in step 6 is played on the designated terminal, and the system exits.
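Step 1 can be illustrated with a short sketch. The source says only that the manuscript is split "according to punctuation marks", so the punctuation set and the function name `split_manuscript` below are assumptions made for illustration, not part of the disclosed method.

```python
import re

# Assumed punctuation set (Chinese and ASCII). The source does not list
# which marks are used, so adjust this to the actual manuscripts.
PUNCTUATION = "。！？；，、.!?;,"


def split_manuscript(manuscript: str) -> list[str]:
    """Split a pronunciation manuscript into sentences at punctuation marks."""
    # One or more consecutive punctuation marks act as a single separator.
    parts = re.split(f"[{re.escape(PUNCTUATION)}]+", manuscript)
    # Drop empty fragments, e.g. the one after the final full stop.
    return [part.strip() for part in parts if part.strip()]


sentences = split_manuscript("您好，欢迎光临。请问需要什么帮助？")
print(sentences)
```

Because the ASCII full stop is in the assumed set, this sketch would also split decimal numbers such as "3.5"; a production splitter would need to handle that case.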
In a further embodiment, step 2 specifically comprises the following steps:
Step 2.1: each Chinese character in the sentence is converted to pinyin and assigned a lip shape code. When the lips are not closed for the vowel pronunciation of the pinyin, the lip shape code is 1 if the lips open slightly during the consonant pronunciation and 2 if they open wide; when the lips are closed for the vowel pronunciation, the code is 3 if the lips open slightly during the consonant pronunciation and 4 if they open wide; when the lip is bitten for the vowel pronunciation, the code is 5 if the lips open slightly during the consonant pronunciation and 6 if they open wide. This yields a string of lip shape arrangement codes for the sentence;
Step 2.2: the prototype video library is searched for a prototype video whose lip shape arrangement code is identical or similar; the lip shape code of the last character of the sentence must be equal;
Step 2.3: if one is found, go to step 3;
Step 2.4: if the prototype video library contains no prototype video with a similar lip shape arrangement code, the lip shape arrangement code is split a limited number of times until every segment has a prototype video with an identical or similar lip shape, where the lip shape code of the last character of the sentence must be equal; these prototype videos are spliced into a sentence video, and the method goes to step 3;
Step 2.5: if no prototype video with an identical or similar lip shape can be found even after limited splitting, the matching fails; the system is notified to add a prototype video for that lip shape arrangement code, the failure is reported, and the system exits.
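The six-value coding of step 2.1 can be sketched as a lookup table. The source does not say how each pinyin syllable is classified, so the sketch below takes the classification as input: the labels "open", "closed", "bitten", "slight" and "wide", and the names `lip_code` and `arrangement_code`, are hypothetical.

```python
# Lip shape codes from step 2.1, keyed by (lip contact, lip opening).
# The label strings are illustrative; only the code values 1 to 6 come
# from the source text.
LIP_CODES = {
    ("open", "slight"): 1,    # lips not closed, opening slightly
    ("open", "wide"): 2,      # lips not closed, opening wide
    ("closed", "slight"): 3,  # lips closed, opening slightly
    ("closed", "wide"): 4,    # lips closed, opening wide
    ("bitten", "slight"): 5,  # lip bitten, opening slightly
    ("bitten", "wide"): 6,    # lip bitten, opening wide
}


def lip_code(contact: str, opening: str) -> int:
    """Return the lip shape code for one character."""
    return LIP_CODES[(contact, opening)]


def arrangement_code(characters: list[tuple[str, str]]) -> str:
    """Concatenate per-character codes into a sentence arrangement code."""
    return "".join(str(lip_code(contact, opening)) for contact, opening in characters)


# A three-character sentence, already classified by some upstream step.
print(arrangement_code([("open", "wide"), ("closed", "slight"), ("bitten", "wide")]))
```

Storing the arrangement code as a string of digits makes the later comparison and splitting steps simple string operations.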
A synthesis system using the synthesis method for video lip synchronization described above comprises: a robot terminal, for receiving the customer's question voice and sending the synthesized video;
A cloud server, for receiving over the Internet the question voice sent by the robot terminal and, according to the question voice, returning the corresponding synthesized video to the robot terminal over the Internet; the robot terminal plays the synthesized video;
In a further embodiment, the cloud server comprises a processor, a recording unit, a touch display unit, a communication unit and a lip shape arrangement unit; the processor is connected to the recording unit, the touch display unit, the communication unit and the lip shape arrangement unit respectively;
The recording unit is used to capture the customer's question voice; the touch display unit is used for customer operation and video playback; the communication unit is used for data transmission with the cloud server; the lip shape arrangement unit is used to derive the lip shape arrangement of each piece of text and to assign each prototype video file its lip shape arrangement code. The lip shape arrangement codes are as follows: when the lips are not closed for the vowel pronunciation, the lip shape code is 1 if the lips open slightly during the consonant pronunciation and 2 if they open wide; when the lips are closed for the vowel pronunciation, the code is 3 if the lips open slightly during the consonant pronunciation and 4 if they open wide; when the lip is bitten for the vowel pronunciation, the code is 5 if the lips open slightly during the consonant pronunciation and 6 if they open wide.
In a further embodiment, the cloud server comprises:
A receiving and pushing module, for receiving data sent by the robot terminal and sending data to the robot terminal;
A voice conversion module, for converting the question voice received by the cloud server over the Internet into question text and returning it to the cloud server, and for converting the pronunciation manuscript received by the cloud server over the Internet into answer voice and returning it to the cloud server over the Internet;
A matching module, for matching the question text to the corresponding answer voice or answer video in the question bank on the cloud server;
A storage module, for storing the customer's question voice, the answer voice, the pronunciation manuscript, the synthesized video and keywords.
Beneficial effects of the invention: different codes are assigned to the different lip shapes that occur when the words of the manuscript are pronounced, and the prototype videos with the corresponding lip shapes are selected and synthesized, so that the lip shape of the on-screen character stays consistent with the sound while the voice plays, which increases realism.
Brief description of the drawings
Fig. 1 is a flow diagram of the synthesis method for video lip synchronization.
Fig. 2 is a flow diagram of step 2 in Fig. 1.
Detailed description
To make the objectives, technical solutions and advantages of the invention clearer, the invention is further described below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only illustrate the invention and do not limit it.
Although the steps of the invention are labeled with numbers, the numbers do not fix their order; unless an order is explicitly specified or the execution of a step depends on other steps, the relative order of the steps may be adjusted. The term "and/or" as used herein covers any one of the associated listed items and all possible combinations of one or more of them.
It will be apparent to those skilled in the art that the invention is not limited to the details of the exemplary embodiments above and can be realized in other specific forms without departing from its spirit or essential attributes. The embodiments are therefore to be regarded in every respect as illustrative and not restrictive; the scope of the invention is defined by the appended claims rather than by the description above, and all changes that fall within the meaning and range of equivalency of the claims are intended to be embraced by the invention. No reference sign in a claim should be construed as limiting that claim.
The applicant addresses a problem in the existing service industry: because the lip shape of the person in the video and the answer voice are synthesized afterwards, the lip shape of the person in the video can fall out of sync with the voice. The words the customer hears do not match the lip shape the customer sees, the effect of face-to-face communication with a human agent is not achieved, and customers may come to reject the service.
To solve this technical problem, the applicant has designed a synthesis method and system for video lip synchronization that improves the realism of a real-person online help robot service system.
First, the cloud server holds prototype video files of the various lip shapes suitable for use by a virtual robot.
As shown in Fig. 1, the synthesis method for video lip synchronization specifically comprises the following steps:
Step 1: the cloud server receives a pronunciation manuscript through a terminal device and splits the manuscript into several sentences at punctuation marks;
Step 2: the cloud server derives the lip shape arrangement of each split sentence and matches the lip shapes against the prototype videos;
Step 3: the successfully matched prototype videos are spliced to form a synthesized video;
Step 4: the playing time of the synthesized video formed in step 3 is calculated;
Step 5: the cloud server sets the speech rate of the pronunciation manuscript received through the terminal device according to the time from step 4, ensuring that the pronunciation duration equals the broadcast duration of the text, and sends the manuscript to a voice gateway; the voice gateway converts the text into a sound file and returns it to the cloud server;
Step 6: the synthesized video generated in step 3 and the sound generated in step 5 are combined to form the final synthesized video;
Step 7: the synthesized video generated in step 6 is played on the designated terminal, and the system exits.
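The timing relationship in steps 4 and 5 can be sketched as follows, assuming the playing time is the sum of the spliced clip durations and the speech rate is expressed in spoken characters per second. Both assumptions, and the names `play_time` and `speech_rate`, are illustrative; the source does not state the unit the voice gateway expects.

```python
def play_time(clip_durations: list[float]) -> float:
    """Step 4 (assumed form): total playing time of the spliced clips, in seconds."""
    return sum(clip_durations)


def speech_rate(manuscript: str, total_seconds: float) -> float:
    """Step 5 (assumed form): characters per second so speech lasts total_seconds."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    # Count only letters, digits and Chinese characters; punctuation is not spoken.
    spoken = sum(1 for ch in manuscript if ch.isalnum())
    return spoken / total_seconds


duration = play_time([1.5, 1.5])            # two matched prototype clips
rate = speech_rate("您好，欢迎光临。", duration)  # six spoken characters
print(duration, rate)
```

Deriving the rate from the video duration, rather than stretching the video to fit the audio, matches the order of the steps: the video is fixed first and the speech is fitted to it.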
As shown in Fig. 2, step 2 specifically comprises the following steps:
Step 2.1: each Chinese character in the sentence is converted to pinyin and assigned a lip shape code. When the lips are not closed for the vowel pronunciation of the pinyin, the lip shape code is 1 if the lips open slightly during the consonant pronunciation and 2 if they open wide; when the lips are closed for the vowel pronunciation, the code is 3 if the lips open slightly during the consonant pronunciation and 4 if they open wide; when the lip is bitten for the vowel pronunciation, the code is 5 if the lips open slightly during the consonant pronunciation and 6 if they open wide. This yields a string of lip shape arrangement codes for the sentence;
Step 2.2: the prototype video library is searched for a prototype video whose lip shape arrangement code is identical or similar; the lip shape code of the last character of the sentence must be equal;
Step 2.3: if one is found, go to step 3;
Step 2.4: if the prototype video library contains no prototype video with a similar lip shape arrangement code, the lip shape arrangement code is split a limited number of times until every segment has a prototype video with an identical or similar lip shape, where the lip shape code of the last character of the sentence must be equal; these prototype videos are spliced into a sentence video, and the method goes to step 3;
Step 2.5: if no prototype video with an identical or similar lip shape can be found even after limited splitting, the matching fails; the system is notified to add a prototype video for that lip shape arrangement code, the failure is reported, and the system exits.
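Steps 2.2 to 2.5 can be sketched as a lookup with limited splitting. The source does not define "similar", so this sketch assumes a simple rule: same length, equal final code, and at most `max_diff` differing positions. That rule, the split limit, the dictionary-based library and the names `find_clip` and `match_sentence` are all hypothetical.

```python
from typing import Optional


def find_clip(code: str, library: dict[str, str], max_diff: int = 1) -> Optional[str]:
    """Steps 2.2/2.3: return a clip whose code is identical or (assumed) similar."""
    if code in library:
        return library[code]
    for candidate, clip in library.items():
        # The final lip shape code must be equal, as the source requires.
        if len(candidate) != len(code) or candidate[-1] != code[-1]:
            continue
        if sum(a != b for a, b in zip(candidate, code)) <= max_diff:
            return clip
    return None


def match_sentence(code: str, library: dict[str, str], max_splits: int = 3) -> Optional[list[str]]:
    """Steps 2.4/2.5: split the code a limited number of times until every
    segment has a clip; return None when matching fails."""
    clip = find_clip(code, library)
    if clip is not None:
        return [clip]
    if max_splits == 0 or len(code) < 2:
        return None
    # Prefer the longest matching head so fewer clips need splicing.
    for cut in range(len(code) - 1, 0, -1):
        head = find_clip(code[:cut], library)
        if head is None:
            continue
        tail = match_sentence(code[cut:], library, max_splits - 1)
        if tail is not None:
            return [head] + tail
    return None


library = {"12": "clip_a", "34": "clip_b"}
print(match_sentence("1234", library))  # spliced from two clips
print(match_sentence("1256", library))  # no match: report and add a prototype
```

A `None` result corresponds to step 2.5, where the system reports the failure so that a prototype video for the missing arrangement code can be added.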
A synthesis system for video lip synchronization comprises: a robot terminal, for receiving the customer's question voice and sending the synthesized video;
A cloud server, for receiving over the Internet the question voice sent by the robot terminal and, according to the question voice, returning the corresponding synthesized video to the robot terminal over the Internet; the robot terminal plays the synthesized video;
In a further embodiment, the cloud server comprises a processor, a recording unit, a touch display unit, a communication unit and a lip shape arrangement unit; the processor is connected to the recording unit, the touch display unit, the communication unit and the lip shape arrangement unit respectively;
The recording unit is used to capture the customer's question voice; the touch display unit is used for customer operation and video playback; the communication unit is used for data transmission with the cloud server; the lip shape arrangement unit is used to derive the lip shape arrangement of each piece of text and to assign each prototype video file its lip shape arrangement code. The lip shape arrangement codes are as follows: when the lips are not closed for the vowel pronunciation, the lip shape code is 1 if the lips open slightly during the consonant pronunciation and 2 if they open wide; when the lips are closed for the vowel pronunciation, the code is 3 if the lips open slightly during the consonant pronunciation and 4 if they open wide; when the lip is bitten for the vowel pronunciation, the code is 5 if the lips open slightly during the consonant pronunciation and 6 if they open wide.
The cloud server comprises: a receiving and pushing module, for receiving data sent by the robot terminal and sending data to the robot terminal; a voice conversion module, for converting the question voice received by the cloud server over the Internet into question text and returning it to the cloud server, and for converting the pronunciation manuscript received by the cloud server over the Internet into answer voice and returning it to the cloud server over the Internet; a matching module, for matching the question text to the corresponding answer voice or answer video in the question bank on the cloud server; and a storage module, for storing the customer's question voice, the answer voice, the pronunciation manuscript, the synthesized video and keywords.
The synthesized video is matched to the answer voice, so that during the virtual robot's presentation the answer voice and the synthesized video play simultaneously and audio and picture remain consistent. The pronunciation in the audio and the lip shape in the picture reach a high degree of realism, which increases the customer's viewing comfort.
In addition, it should be understood that although this specification is described in terms of embodiments, not every embodiment contains only one independent technical solution. The specification is written this way only for clarity; those skilled in the art should treat it as a whole, and the technical solutions of the embodiments may be suitably combined to form other embodiments that those skilled in the art can understand.

Claims (4)

1. A synthesis method for video lip synchronization, characterized in that a cloud server holds prototype video files of the various lip shapes suitable for use by a virtual robot;
The synthesis method for video lip synchronization specifically comprises the following steps:
Step 1: the cloud server receives a pronunciation manuscript through a terminal device and splits the manuscript into several sentences at punctuation marks;
Step 2: the cloud server derives the lip shape arrangement of each split sentence and matches the lip shapes against the prototype videos;
Step 3: the successfully matched prototype videos are spliced to form a synthesized video;
Step 4: the playing time of the synthesized video formed in step 3 is calculated;
Step 5: the cloud server sets the speech rate of the pronunciation manuscript received through the terminal device according to the time from step 4, ensuring that the pronunciation duration equals the broadcast duration of the text, and sends the manuscript to a voice gateway; the voice gateway converts the text into a sound file and returns it to the cloud server;
Step 6: the synthesized video generated in step 3 and the sound generated in step 5 are combined to form the final synthesized video;
Step 7: the synthesized video generated in step 6 is played on the designated terminal, and the system exits.
2. The synthesis method for video lip synchronization according to claim 1, characterized in that step 2 specifically comprises the following steps:
Step 2.1: each Chinese character in the sentence is converted to pinyin and assigned a lip shape code. When the lips are not closed for the vowel pronunciation of the pinyin, the lip shape code is 1 if the lips open slightly during the consonant pronunciation and 2 if they open wide; when the lips are closed for the vowel pronunciation, the code is 3 if the lips open slightly during the consonant pronunciation and 4 if they open wide; when the lip is bitten for the vowel pronunciation, the code is 5 if the lips open slightly during the consonant pronunciation and 6 if they open wide. This yields a string of lip shape arrangement codes for the sentence;
Step 2.2: the prototype video library is searched for a prototype video whose lip shape arrangement code is identical or similar; the lip shape code of the last character of the sentence must be equal;
Step 2.3: if one is found, go to step 3;
Step 2.4: if the prototype video library contains no prototype video with a similar lip shape arrangement code, the lip shape arrangement code is split a limited number of times until every segment has a prototype video with an identical or similar lip shape, where the lip shape code of the last character of the sentence must be equal; these prototype videos are spliced into a sentence video, and the method goes to step 3;
Step 2.5: if no prototype video with an identical or similar lip shape can be found even after limited splitting, the matching fails; the system is notified to add a prototype video for that lip shape arrangement code, the failure is reported, and the system exits.
3. A synthesis system using the synthesis method for video lip synchronization according to any one of claims 1 to 2, characterized by comprising: a robot terminal, for receiving the customer's question voice and sending the synthesized video;
A cloud server, for receiving over the Internet the question voice sent by the robot terminal and, according to the question voice, returning the corresponding synthesized video to the robot terminal over the Internet; the robot terminal plays the synthesized video;
The synthesis system for video lip synchronization according to claim 3, characterized in that the cloud server comprises a processor, a recording unit, a touch display unit, a communication unit and a lip shape arrangement unit; the processor is connected to the recording unit, the touch display unit, the communication unit and the lip shape arrangement unit respectively;
The recording unit is used to capture the customer's question voice; the touch display unit is used for customer operation and video playback; the communication unit is used for data transmission with the cloud server; the lip shape arrangement unit is used to derive the lip shape arrangement of each piece of text and to assign each prototype video file its lip shape arrangement code. The lip shape arrangement codes are as follows: when the lips are not closed for the vowel pronunciation, the lip shape code is 1 if the lips open slightly during the consonant pronunciation and 2 if they open wide; when the lips are closed for the vowel pronunciation, the code is 3 if the lips open slightly during the consonant pronunciation and 4 if they open wide; when the lip is bitten for the vowel pronunciation, the code is 5 if the lips open slightly during the consonant pronunciation and 6 if they open wide.
4. The synthesis system for video lip synchronization according to claim 4, characterized in that the cloud server comprises:
A receiving and pushing module, for receiving data sent by the robot terminal and sending data to the robot terminal;
A voice conversion module, for converting the question voice received by the cloud server over the Internet into question text and returning it to the cloud server, and for converting the pronunciation manuscript received by the cloud server over the Internet into answer voice and returning it to the cloud server over the Internet;
A matching module, for matching the question text to the corresponding answer voice or answer video in the question bank on the cloud server;
A storage module, for storing the customer's question voice, the answer voice, the pronunciation manuscript, the synthesized video and keywords.
CN201910758080.XA 2019-08-16 2019-08-16 Video lip synchronization synthesis method and system Active CN110493613B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910758080.XA CN110493613B (en) 2019-08-16 2019-08-16 Video lip synchronization synthesis method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910758080.XA CN110493613B (en) 2019-08-16 2019-08-16 Video lip synchronization synthesis method and system

Publications (2)

Publication Number Publication Date
CN110493613A (en) 2019-11-22
CN110493613B (en) 2020-05-19

Family

ID=68551356

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910758080.XA Active CN110493613B (en) 2019-08-16 2019-08-16 Video lip synchronization synthesis method and system

Country Status (1)

Country Link
CN (1) CN110493613B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111225237A (en) * 2020-04-23 2020-06-02 腾讯科技(深圳)有限公司 Sound and picture matching method of video, related device and storage medium
CN111325817A (en) * 2020-02-04 2020-06-23 清华珠三角研究院 Virtual character scene video generation method, terminal device and medium
CN113178206A (en) * 2021-04-22 2021-07-27 内蒙古大学 AI (Artificial intelligence) composite anchor generation method, electronic equipment and readable storage medium

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1825911A (en) * 2006-04-03 2006-08-30 北京和声创景音频技术有限公司 Commandos dubbing system and dubbing making method thereof
CN1971621A (en) * 2006-11-10 2007-05-30 中国科学院计算技术研究所 Generating method of cartoon face driven by voice and text together
CN101482975A (en) * 2008-01-07 2009-07-15 丰达软件(苏州)有限公司 Method and apparatus for converting words into animation
CN101796812A (en) * 2006-03-31 2010-08-04 莱切技术国际公司 Lip synchronization system and method
CN106791539A (en) * 2016-12-26 2017-05-31 国家新闻出版广电总局电影数字节目管理中心 A kind of storage of film digital program and extracting method
CN107786889A (en) * 2017-11-13 2018-03-09 北海威德电子科技有限公司 Can synchronous sign language interpreter DTV
CN108010531A (en) * 2017-12-14 2018-05-08 南京美桥信息科技有限公司 A kind of visible intelligent inquiry method and system
CN108038206A (en) * 2017-12-14 2018-05-15 南京美桥信息科技有限公司 A kind of visible intelligent method of servicing and system
CN108090170A (en) * 2017-12-14 2018-05-29 南京美桥信息科技有限公司 A kind of intelligence inquiry method for recognizing semantics and visible intelligent interrogation system
CN109308731A (en) * 2018-08-24 2019-02-05 浙江大学 The synchronous face video composition algorithm of the voice-driven lip of concatenated convolutional LSTM
CN109637518A (en) * 2018-11-07 2019-04-16 北京搜狗科技发展有限公司 Virtual newscaster's implementation method and device

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101796812A (en) * 2006-03-31 2010-08-04 莱切技术国际公司 Lip synchronization system and method
CN1825911A (en) * 2006-04-03 2006-08-30 北京和声创景音频技术有限公司 Commandos dubbing system and dubbing making method thereof
CN1971621A (en) * 2006-11-10 2007-05-30 中国科学院计算技术研究所 Generating method of cartoon face driven by voice and text together
CN101482975A (en) * 2008-01-07 2009-07-15 丰达软件(苏州)有限公司 Method and apparatus for converting words into animation
CN106791539A (en) * 2016-12-26 2017-05-31 国家新闻出版广电总局电影数字节目管理中心 A kind of storage of film digital program and extracting method
CN107786889A (en) * 2017-11-13 2018-03-09 北海威德电子科技有限公司 Can synchronous sign language interpreter DTV
CN108010531A (en) * 2017-12-14 2018-05-08 南京美桥信息科技有限公司 A kind of visible intelligent inquiry method and system
CN108038206A (en) * 2017-12-14 2018-05-15 南京美桥信息科技有限公司 A kind of visible intelligent method of servicing and system
CN108090170A (en) * 2017-12-14 2018-05-29 南京美桥信息科技有限公司 A kind of intelligence inquiry method for recognizing semantics and visible intelligent interrogation system
CN109308731A (en) * 2018-08-24 2019-02-05 浙江大学 The synchronous face video composition algorithm of the voice-driven lip of concatenated convolutional LSTM
CN109637518A (en) * 2018-11-07 2019-04-16 北京搜狗科技发展有限公司 Virtual newscaster's implementation method and device

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111325817A (en) * 2020-02-04 2020-06-23 清华珠三角研究院 Virtual character scene video generation method, terminal device and medium
CN111325817B (en) * 2020-02-04 2023-07-18 清华珠三角研究院 Virtual character scene video generation method, terminal equipment and medium
CN111225237A (en) * 2020-04-23 2020-06-02 腾讯科技(深圳)有限公司 Sound and picture matching method of video, related device and storage medium
CN111225237B (en) * 2020-04-23 2020-08-21 腾讯科技(深圳)有限公司 Sound and picture matching method of video, related device and storage medium
US11972778B2 (en) 2020-04-23 2024-04-30 Tencent Technology (Shenzhen) Company Limited Sound-picture matching method of video, related apparatus, and storage medium
CN113178206A (en) * 2021-04-22 2021-07-27 内蒙古大学 AI (Artificial intelligence) composite anchor generation method, electronic equipment and readable storage medium
CN113178206B (en) * 2021-04-22 2022-05-31 内蒙古大学 AI (Artificial intelligence) composite anchor generation method, electronic equipment and readable storage medium

Also Published As

Publication number Publication date
CN110493613B (en) 2020-05-19

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant