CN107172449A - Multi-medium play method, device and multimedia storage method - Google Patents

Multi-medium play method, device and multimedia storage method

Publication number
CN107172449A
Authority
CN
China
Prior art keywords
audio
multimedia
dub
dubbing
configuration
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710462699.7A
Other languages
Chinese (zh)
Inventor
陈凌奇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Whaley Technology Co Ltd
Original Assignee
Whaley Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Whaley Technology Co Ltd filed Critical Whaley Technology Co Ltd
Priority to CN201710462699.7A priority Critical patent/CN107172449A/en
Publication of CN107172449A publication Critical patent/CN107172449A/en
Pending legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 Processing of audio elementary streams
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/08 Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/233 Processing of audio elementary streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41 Structure of client; Structure of client peripherals
    • H04N21/4104 Peripherals receiving signals from specially adapted client devices
    • H04N21/4126 The peripheral being portable, e.g. PDAs or mobile phones
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302 Content synchronisation processes, e.g. decoder synchronisation
    • H04N21/4307 Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 Processing of audio elementary streams
    • H04N21/4394 Processing of audio elementary streams involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81 Monomedia components thereof
    • H04N21/8106 Monomedia components thereof involving special audio data, e.g. different tracks for different languages

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

The invention discloses a multimedia playback method. The method comprises: S1, obtaining the file information of the multimedia and a dubbing configuration determined by the user, where the file information includes storage information for the video, background audio and dubbing text of the multimedia, and the dubbing configuration includes the voiceprint features of each role; S2, obtaining the video, background audio and dubbing text of the multimedia according to the file information; S3, generating dubbing audio from the dubbing text and the dubbing configuration, such that the voiceprint features of each role in the dubbing audio match the voiceprint features of that role in the dubbing configuration; S4, synthesizing the dubbing audio and the background audio into the audio of the multimedia; S5, playing the video and audio of the multimedia synchronously. The invention also discloses a multimedia playback device and a multimedia storage method. On the one hand, the invention can greatly reduce the storage resources occupied by multimedia; on the other hand, it can adjust the dubbing of each role according to user requirements, thereby meeting users' personalized viewing needs.

Description

Multimedia playback method, device and multimedia storage method
Technical field
The present invention relates to a multimedia playback method and device.
Background art
A multimedia on-demand system (Demand Multimedia System) is a common form of multimedia network application. Its main applications include video on demand (VOD), movie on demand (MOD), news on demand (NOD), and so on. With the rapid development of network, computer, and audio/video processing technologies, multimedia on-demand services have come into wide use.
Multimedia service systems mostly use the client/server (C/S) model. In fact, it is precisely the characteristics of multimedia, such as its large data volume (which requires large storage capacity or high throughput), that have driven the adoption of the client/server model; a multimedia server is therefore a computer system that provides multimedia services to other systems (multimedia clients). Existing multimedia service systems typically store multimedia such as films and TV dramas as shown in Fig. 1: the video and audio files are stored separately and, when a user requests the program, are played synchronously in real time. A film or musical work usually has several audio versions (most commonly in several languages), so multiple copies of the audio data must be stored. On the one hand, this occupies a large amount of storage. On the other hand, only the original dubbing can be heard during playback, and the original dubbing does not necessarily suit every user, so it is difficult to meet users' personalized viewing needs.
Summary of the invention
The technical problem to be solved by the invention is to overcome the deficiencies of the prior art and provide a multimedia playback method and device that, on the one hand, can greatly reduce the storage resources occupied by multimedia and, on the other hand, can adjust the dubbing of each role according to user requirements, thereby meeting users' personalized viewing needs.
The invention adopts the following technical solution to solve the above technical problem:
A multimedia playback method, comprising the following steps:
S1, obtaining the file information of the multimedia and a dubbing configuration determined by the user, where the file information includes storage information for the video, background audio and dubbing text of the multimedia, and the dubbing configuration includes the voiceprint features of each role;
S2, obtaining the video, background audio and dubbing text of the multimedia according to the file information;
S3, generating dubbing audio from the dubbing text and the dubbing configuration, such that the voiceprint features of each role in the dubbing audio match the voiceprint features of that role in the dubbing configuration;
S4, synthesizing the dubbing audio and the background audio into the audio of the multimedia;
S5, playing the video and audio of the multimedia synchronously.
Further, the dubbing configuration also includes the language used for dubbing. Further, the dubbing configuration also includes the dialect type used for dubbing.
Preferably, steps S1 to S4 are performed by a remote server and step S5 is performed by a local intelligent terminal; the server and the intelligent terminal can exchange information.
A multimedia playback device, comprising:
an information acquisition module, for obtaining the file information of the multimedia and the dubbing configuration determined by the user, where the file information includes storage information for the video, background audio and dubbing text of the multimedia, and the dubbing configuration includes the voiceprint features of each role;
a file acquisition module, for obtaining the video, background audio and dubbing text of the multimedia according to the file information;
a dubbing audio generation module, for generating dubbing audio from the dubbing text and the dubbing configuration, such that the voiceprint features of each role in the dubbing audio match the voiceprint features of that role in the dubbing configuration;
an audio synthesis module, for synthesizing the dubbing audio and the background audio into the audio of the multimedia;
a playback module, for playing the video and audio of the multimedia synchronously.
Further, the dubbing configuration also includes the language used for dubbing. Further, the dubbing configuration also includes the dialect type used for dubbing.
Preferably, the information acquisition module, file acquisition module, dubbing audio generation module and audio synthesis module are located in a remote server, and the playback module is located in a local intelligent terminal; the server and the intelligent terminal can exchange information.
The following technical solution can also be obtained from the same inventive concept:
A multimedia storage method: first, the video and audio of the original multimedia file are extracted; then the extracted audio is separated into background audio and dubbing audio; the separated dubbing audio is converted into dubbing text; and the video, background audio and dubbing text are stored separately.
Further, the method also comprises the following step: extracting the voiceprint features of each role from the separated dubbing audio, and adding text information that records the voiceprint features of each role to the dubbing text.
Compared with the prior art, the invention has the following advantages:
The invention stores the video, background audio and dubbing text of the multimedia separately and synthesizes them in real time during playback. Because text data occupies far less storage than audio data, the storage consumed by a large collection of multimedia resources can be greatly reduced. In addition, when synthesizing the dubbing audio of the multimedia, the invention can select the voiceprint features used to dub each role according to user preference, which meets users' personalized viewing needs and improves the user experience.
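The storage argument above can be illustrated with a rough back-of-the-envelope calculation. The sketch below is not from the patent: the film length, bitrate, number of audio versions and text file size are all assumed values chosen only to show the shape of the comparison.

```python
def audio_bytes(duration_s: int, bitrate_kbps: int) -> int:
    """Size in bytes of a constant-bitrate audio track."""
    return duration_s * bitrate_kbps * 1000 // 8


# All figures below are illustrative assumptions, not values from the patent.
DURATION_S = 90 * 60        # a 90-minute film
BITRATE_KBPS = 128          # assumed compressed audio bitrate
N_VERSIONS = 4              # assumed number of dubbed versions
TEXT_BYTES = 100 * 1024     # assumed size of one dubbing text file

# Conventional scheme: one full audio track per dubbed version.
conventional = N_VERSIONS * audio_bytes(DURATION_S, BITRATE_KBPS)

# Scheme described above: one background track plus one dubbing text per version.
proposed = audio_bytes(DURATION_S, BITRATE_KBPS) + N_VERSIONS * TEXT_BYTES

print(f"conventional: {conventional} bytes, proposed: {proposed} bytes")
```

Under these assumptions the audio-related storage is dominated by the single background track, and each extra dubbed version adds only the size of a text file.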
Brief description of the drawings
Fig. 1 is a schematic diagram of the existing multimedia storage scheme;
Fig. 2 is a schematic diagram of the multimedia storage scheme of the invention;
Fig. 3 is a schematic structural diagram of a specific embodiment of the multimedia playback device of the invention;
Fig. 4 is an example user interface for determining the dubbing configuration;
Fig. 5 is a flow diagram of the audio server synthesizing the audio.
Detailed description
The prior art consumes a large amount of storage and cannot meet users' personalized viewing needs. To address these deficiencies, the idea of the invention is to store the video, background audio and dubbing text of the multimedia separately and synthesize them in real time during playback. Because text data occupies far less storage than audio data, the storage consumed by a large collection of multimedia resources can be greatly reduced. In addition, when synthesizing the dubbing audio of the multimedia, the invention can select the voiceprint features used to dub each role according to user preference, which meets users' personalized viewing needs and improves the user experience.
A voiceprint is the sound-wave spectrum, carrying speech information, that is displayed by an electroacoustic instrument. Modern scientific research shows that a voiceprint is not only specific to an individual but also relatively stable. In real life, everyone's speech has its own characteristics; people who know each other well can recognize one another by voice alone, which reflects the fact that voices differ from person to person. Small differences in the human vocal organs change the airflow during phonation and so produce differences in voice quality and timbre. In addition, people's speaking habits differ in speed and force, which produces differences in intensity and duration. Pitch, intensity, duration and timbre are called the "four elements" of speech in linguistics, and these factors can be further decomposed into more than 90 features. These features correspond to the different wavelengths, frequencies, intensities and rhythms of different sounds. Changes in the sound wave can be converted into changes in the intensity, wavelength, frequency and rhythm of an electrical signal, and an instrument plots these changes as a spectrogram, which is the voiceprint. Characteristic parameters that characterize the individual speaker, namely voiceprint features, can be extracted from the voiceprint signal (for example linear prediction cepstral coefficients, LPCC, or Mel-frequency cepstral coefficients, MFCC). Thanks to speech processing technology (especially its four major branches: speech recognition, speech synthesis, speech coding and voiceprint recognition) and the rapid development of computer and network technology, real-time online dubbing of multimedia has become possible.
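As a rough illustration of how speech "elements" such as intensity and pitch can be turned into numbers, the sketch below computes an RMS intensity and a crude zero-crossing pitch estimate for a synthetic tone. It is deliberately much simpler than the LPCC or MFCC features mentioned above and is not a voiceprint extractor; the function names and the test tone are assumptions made for this example.

```python
import math
from typing import List, Sequence


def rms_intensity(samples: Sequence[float]) -> float:
    """Root-mean-square amplitude, a simple measure of intensity."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def zero_crossing_pitch(samples: Sequence[float], sample_rate: int) -> float:
    """Crude pitch estimate in Hz: a periodic tone crosses zero twice per cycle."""
    crossings = sum(
        1 for prev, cur in zip(samples, samples[1:]) if (prev < 0) != (cur < 0)
    )
    duration_s = len(samples) / sample_rate
    return crossings / (2 * duration_s)


def sine(freq_hz: float, amplitude: float, sample_rate: int, duration_s: float) -> List[float]:
    """Generate a pure tone to stand in for a voiced sound."""
    n = int(sample_rate * duration_s)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


tone = sine(220.0, 0.5, 16000, 1.0)
features = {
    "intensity": rms_intensity(tone),
    "pitch_hz": zero_crossing_pitch(tone, 16000),
}
```

Real voiceprint features are computed frame by frame on recorded speech and capture timbre as well; this sketch only shows the general idea of mapping a waveform to a small feature set.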
The invention stores the multimedia in advance in the manner shown in Fig. 2. The specific storage method is as follows:
Step 1, extract the video and audio of the original multimedia file;
Multimedia is a combination of media, generally including media forms such as text, sound and images. In computer systems, multimedia refers to a human-computer interactive medium for information exchange and dissemination that combines two or more media. The media used include text, pictures, photographs, sound, animation and film, as well as the interactive functions provided by programs. Depending on the encoding and the specific application, original multimedia files are usually stored in formats such as MVO, AVI, MP3, MP4, WMV, MPG, RAM, RA and DVD. The video data and audio data of the original multimedia file are extracted separately; the specific extraction methods are existing mature technology and are not described here.
Step 2, separate the extracted audio into background audio and dubbing audio;
Many existing techniques can do this; for example, currently available commercial software such as Pazera Free Audio Extractor or Adobe Audition can be used directly. Alternatively, the background audio can be provided by the film production company, because film companies generally produce the background audio and the dubbing separately when making a film.
Step 3, convert the separated dubbing audio into dubbing text;
The conversion can be done manually or automatically using speech recognition. The specific format of the dubbing text can be defined freely. Because the original dubbing of a film or TV drama is often what most users choose, the original dubbing needs to be kept as a user option (usually set as the default option). The invention specifically uses the following method: the voiceprint features of each role are extracted from the separated dubbing audio, and text information recording the voiceprint features of each role is added to the dubbing text. The following is an example of the dubbing text of the invention:
<Film information>
<Duration>01:30:00</Duration>
<Language>Chinese</Language>
</Film information>
<Role labels>
<Lead role 1>
<Name>Guan Yu</Name>
<Age>31</Age>
<Personality>Bold and generous</Personality>
<Default voiceprint>Voiceprint of actor Lu Shuming</Default voiceprint>
</Lead role 1>
……
</Role labels>
<Text>
00:00:01-00:00:07 Guan Yu (arrogant | medium speed | medium): To my eyes, Yan Liang is like one who has stuck up a sign to sell his own head ...
……
</Text>
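A line in the text section of the example above appears to follow the pattern "start-end role (style hints separated by |): spoken text". Since the patent leaves the format open, the parser below is only a sketch under that assumption; the `DubLine` type, its field names and the sample line are made up for this example.

```python
import re
from dataclasses import dataclass
from typing import List

# Assumed line layout: "HH:MM:SS-HH:MM:SS role (hint | hint | ...): text"
LINE_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2})-(\d{2}:\d{2}:\d{2})\s+"  # start and end timestamps
    r"(.+?)\s*\(([^)]*)\)\s*[:：]\s*"                # role name and (style hints)
    r"(.*)$"                                         # spoken text
)


@dataclass
class DubLine:
    start_s: int
    end_s: int
    role: str
    style: List[str]
    text: str


def to_seconds(hms: str) -> int:
    """Convert "HH:MM:SS" to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def parse_dub_line(line: str) -> DubLine:
    """Parse one line of the dubbing text body into a structured record."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a dubbing line: {line!r}")
    start, end, role, style, text = match.groups()
    return DubLine(
        start_s=to_seconds(start),
        end_s=to_seconds(end),
        role=role.strip(),
        style=[hint.strip() for hint in style.split("|")],
        text=text.strip(),
    )


sample = parse_dub_line("00:00:01-00:00:07 Guan Yu (arrogant | medium speed | medium): sample line")
```

Keeping the timestamps and style hints alongside the text is what later allows each synthesized line to be placed at the right position over the background audio.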
Step 4, store the video, background audio and dubbing text separately;
The video, background audio and dubbing text can be stored locally, or stored in the same or in different cloud databases and servers.
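The four storage steps can be sketched as a small pipeline. The patent relies on existing tools for demultiplexing, source separation and speech recognition, so in this sketch those stages are passed in as callables; `store_multimedia`, `StoredMultimedia` and the key naming scheme are hypothetical names introduced for illustration, and a plain dictionary stands in for the databases.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class StoredMultimedia:
    video_key: str
    background_key: str
    dub_text_key: str


def store_multimedia(
    title: str,
    original: bytes,
    demux: Callable[[bytes], Tuple[bytes, bytes]],      # file -> (video, audio)
    separate: Callable[[bytes], Tuple[bytes, bytes]],   # audio -> (background, dubbing)
    transcribe: Callable[[bytes], str],                 # dubbing audio -> dubbing text
    store: Dict[str, bytes],
) -> StoredMultimedia:
    """Run steps 1 to 4 and return the keys under which the parts were stored."""
    video, audio = demux(original)             # step 1: extract video and audio
    background, dubbing = separate(audio)      # step 2: split background and dubbing
    dub_text = transcribe(dubbing)             # step 3: dubbing audio to text
    keys = StoredMultimedia(
        video_key=f"{title}/video",
        background_key=f"{title}/background",
        dub_text_key=f"{title}/dubtext",
    )
    store[keys.video_key] = video              # step 4: store the three parts separately
    store[keys.background_key] = background
    store[keys.dub_text_key] = dub_text.encode("utf-8")
    return keys
```

Note that the dubbing audio itself is not stored in this sketch; only its text form is kept, which is where the storage saving comes from.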
Fig. 3 shows the structure of a specific embodiment of the multimedia playback device of the invention, which is essentially a multimedia on-demand system. As shown in Fig. 3, the device includes four cloud servers (an on-demand server, a dubbing text server, an audio server and a video server) and three cloud databases that store the video, background audio and dubbing text respectively. The device provides the multimedia on-demand service as follows:
S1, obtaining the file information of the multimedia and a dubbing configuration determined by the user, where the file information includes storage information for the video, background audio and dubbing text of the multimedia, and the dubbing configuration includes the voiceprint features of each role;
The on-demand server obtains the user's on-demand request through information exchange with the intelligent terminal and, according to the request, looks up the file information of the requested film or TV drama in the multimedia file index that it stores. The file information includes storage information such as the storage addresses and file sizes of the video, background audio and dubbing text, and may also include information such as the duration and the roles of the film or TV drama.
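The multimedia file index kept by the on-demand server can be pictured as a mapping from a program title to its file information. The sketch below is one possible shape for it; `FileInfo`, `FileIndex`, the field names and the example addresses are assumptions, since the patent does not define a schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FileInfo:
    video_url: str                      # storage address of the video
    background_audio_url: str           # storage address of the background audio
    dub_text_url: str                   # storage address of the dubbing text
    duration_s: Optional[int] = None    # optional extra information
    roles: List[str] = field(default_factory=list)


class FileIndex:
    """In-memory stand-in for the on-demand server's multimedia file index."""

    def __init__(self) -> None:
        self._entries: Dict[str, FileInfo] = {}

    def register(self, title: str, info: FileInfo) -> None:
        self._entries[title] = info

    def lookup(self, title: str) -> FileInfo:
        """Return the file information for a requested title."""
        try:
            return self._entries[title]
        except KeyError:
            raise KeyError(f"no multimedia registered under {title!r}") from None


index = FileIndex()
index.register(
    "example-film",
    FileInfo(
        video_url="video-db://example-film",
        background_audio_url="audio-db://example-film",
        dub_text_url="text-db://example-film",
        duration_s=5400,
        roles=["Guan Yu"],
    ),
)
info = index.lookup("example-film")
```

Because the index stores only addresses, the three parts can live in one database or in several, as the storage step allows.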
The on-demand server also obtains the dubbing configuration determined by the user through information exchange with the intelligent terminal; the dubbing configuration includes the voiceprint features of each role. Fig. 4 shows an example user interface for determining the dubbing configuration: by clicking the corresponding button in the interface, the user can choose a preferred voiceprint feature for each role. If the user does not click, the default voiceprint feature is used (usually the voiceprint feature of the original dubbing). After the user clicks "replace", the next-level options pop up:
A. Local voiceprint library B. Network voiceprint library
If the local voiceprint library is selected, a list of local voiceprint features pops up for the user to choose from. If the network voiceprint library is selected, an input box pops up for the user to enter a voiceprint feature name. For example, widely known names such as "Liu Dehua", "Donald Duck" or "Zhao Benshan" can be used to name the corresponding voiceprint features, or each voiceprint feature can be given a short sample audio clip for the user to audition. Options for the dubbing language, such as Chinese, English or French, can also be added to the dubbing configuration, and even dialect options such as Cantonese, Minnan (Hokkien) or Sichuanese.
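The default-versus-replacement behavior of the dubbing configuration can be sketched as follows. `DubbingConfig` and `resolve_voiceprints` are hypothetical names, and voiceprint features are represented simply by their names; the patent does not prescribe a data structure.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class DubbingConfig:
    # role name -> name of the voiceprint feature the user picked for that role
    role_voiceprints: Dict[str, str] = field(default_factory=dict)
    language: Optional[str] = None   # optional dubbing language
    dialect: Optional[str] = None    # optional dialect type


def resolve_voiceprints(defaults: Dict[str, str], config: DubbingConfig) -> Dict[str, str]:
    """Use the user's choice for a role where one exists, else the default voiceprint."""
    return {
        role: config.role_voiceprints.get(role, default)
        for role, default in defaults.items()
    }


# Default voiceprints as they might be recorded in the dubbing text (assumed names).
defaults = {"Guan Yu": "original-guan-yu", "Zhang Fei": "original-zhang-fei"}
config = DubbingConfig(role_voiceprints={"Guan Yu": "Donald Duck"}, language="Chinese")
resolved = resolve_voiceprints(defaults, config)
```

Roles the user leaves untouched keep the voiceprint of the original dubbing, which matches the default option described above.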
S2, obtaining the video, background audio and dubbing text of the multimedia according to the file information;
The on-demand server sends the corresponding file information to the dubbing text server, the audio server and the video server respectively, and sends the dubbing configuration determined by the user to the dubbing text server. The dubbing text server, audio server and video server retrieve the corresponding dubbing text, background audio and video from their respective databases. The dubbing text server then sends the dubbing text, together with the user's dubbing configuration, to the audio server.
S3, generating dubbing audio from the dubbing text and the dubbing configuration, such that the voiceprint features of each role in the dubbing audio match the voiceprint features of that role in the dubbing configuration;
The audio server uses speech synthesis to convert the dubbing text into the corresponding dubbing audio and, according to the dubbing configuration, gives each role's dubbing audio the corresponding voiceprint features, so that the voiceprint features of each role in the dubbing audio match the voiceprint features of that role in the dubbing configuration. Any existing speech synthesis technique can be used, for example those disclosed in Chinese invention patents CN104485099A, CN105023570A and CN102117614B. A translation engine can also be used in combination to convert between languages.
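Step S3 can be sketched as a loop that picks the configured voiceprint for each line's role and hands the text to a speech synthesizer. The synthesizer here is an injected callable, because the patent defers to existing synthesis techniques; `Line`, `Segment` and `generate_dubbing` are names assumed for this example.

```python
from typing import Callable, Dict, List, NamedTuple, Sequence


class Line(NamedTuple):
    start_s: float   # when the line starts in the program
    role: str        # which role speaks it
    text: str        # what is said


class Segment(NamedTuple):
    start_s: float
    samples: List[float]


# (text, voiceprint name) -> synthesized audio samples; supplied by the caller.
Synthesizer = Callable[[str, str], List[float]]


def generate_dubbing(
    lines: Sequence[Line],
    voiceprints: Dict[str, str],
    synthesize: Synthesizer,
) -> List[Segment]:
    """Synthesize every line with the voiceprint configured for its role."""
    segments: List[Segment] = []
    for line in lines:
        if line.role not in voiceprints:
            raise KeyError(f"no voiceprint configured for role {line.role!r}")
        voiceprint = voiceprints[line.role]
        segments.append(Segment(line.start_s, synthesize(line.text, voiceprint)))
    return segments
```

Each returned segment keeps its start time so that it can later be aligned with the background audio.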
S4, synthesizing the dubbing audio and the background audio into the audio of the multimedia;
The audio server uses means such as timestamps to combine the generated dubbing audio with the background audio, obtaining the personalized audio of the multimedia requested by the user. Fig. 5 shows the basic flow by which the audio server synthesizes the audio in this embodiment.
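One simple way to combine dubbing and background audio by timestamp is to add each dubbing segment onto the background samples starting at the offset given by its start time. The sketch below assumes mono audio as floats in the range -1.0 to 1.0 at a shared sample rate, and clips the sum; the patent does not specify the mixing method, so this is only an illustration.

```python
from typing import List, Sequence, Tuple


def mix_dubbing(
    background: Sequence[float],
    segments: Sequence[Tuple[float, Sequence[float]]],  # (start time in s, samples)
    sample_rate: int,
    dub_gain: float = 1.0,
) -> List[float]:
    """Overlay timestamped dubbing segments onto a background track."""
    mixed = list(background)
    for start_s, samples in segments:
        offset = int(round(start_s * sample_rate))   # timestamp -> sample index
        for i, sample in enumerate(samples):
            pos = offset + i
            if pos < 0:
                continue      # ignore samples before the start of the track
            if pos >= len(mixed):
                break         # drop dubbing that runs past the end of the track
            value = mixed[pos] + dub_gain * sample
            mixed[pos] = max(-1.0, min(1.0, value))  # clip to the valid range
    return mixed
```

A production mixer would more likely lower the background level under speech rather than rely on clipping, but the timestamp-to-offset alignment is the part this sketch is meant to show.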
S5, playing the video and audio of the multimedia synchronously;
The video server and the audio server transmit the video and audio synchronously to the intelligent terminal for playback.
The above is only one specific embodiment of the invention. In practice the on-demand server, dubbing text server, audio server and video server can be the same server, and the corresponding databases can be the same database. With the further development of storage and computing technology, the above multimedia playback method can also be implemented independently on a local intelligent terminal.

Claims (10)

1. A multimedia playback method, characterized by comprising the following steps:
S1, obtaining the file information of the multimedia and a dubbing configuration determined by the user, where the file information includes storage information for the video, background audio and dubbing text of the multimedia, and the dubbing configuration includes the voiceprint features of each role;
S2, obtaining the video, background audio and dubbing text of the multimedia according to the file information;
S3, generating dubbing audio from the dubbing text and the dubbing configuration, such that the voiceprint features of each role in the dubbing audio match the voiceprint features of that role in the dubbing configuration;
S4, synthesizing the dubbing audio and the background audio into the audio of the multimedia;
S5, playing the video and audio of the multimedia synchronously.
2. The method of claim 1, characterized in that the dubbing configuration also includes the language used for dubbing.
3. The method of claim 2, characterized in that the dubbing configuration also includes the dialect type used for dubbing.
4. The method of claim 1, characterized in that steps S1 to S4 are performed by a remote server and step S5 is performed by a local intelligent terminal, and the server and the intelligent terminal can exchange information.
5. A multimedia playback device, characterized by comprising:
an information acquisition module, for obtaining the file information of the multimedia and the dubbing configuration determined by the user, where the file information includes storage information for the video, background audio and dubbing text of the multimedia, and the dubbing configuration includes the voiceprint features of each role;
a file acquisition module, for obtaining the video, background audio and dubbing text of the multimedia according to the file information;
a dubbing audio generation module, for generating dubbing audio from the dubbing text and the dubbing configuration, such that the voiceprint features of each role in the dubbing audio match the voiceprint features of that role in the dubbing configuration;
an audio synthesis module, for synthesizing the dubbing audio and the background audio into the audio of the multimedia;
a playback module, for playing the video and audio of the multimedia synchronously.
6. The device of claim 5, characterized in that the dubbing configuration also includes the language used for dubbing.
7. The device of claim 6, characterized in that the dubbing configuration also includes the dialect type used for dubbing.
8. The device of claim 5, characterized in that the information acquisition module, file acquisition module, dubbing audio generation module and audio synthesis module are located in a remote server, the playback module is located in a local intelligent terminal, and the server and the intelligent terminal can exchange information.
9. A multimedia storage method, characterized in that the video and audio of the original multimedia file are first extracted; the extracted audio is then separated into background audio and dubbing audio; the separated dubbing audio is converted into dubbing text; and the video, background audio and dubbing text are stored separately.
10. The method of claim 9, characterized in that the method further comprises the following step: extracting the voiceprint features of each role from the separated dubbing audio, and adding text information that records the voiceprint features of each role to the dubbing text.
CN201710462699.7A 2017-06-19 2017-06-19 Multi-medium play method, device and multimedia storage method Pending CN107172449A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710462699.7A CN107172449A (en) 2017-06-19 2017-06-19 Multi-medium play method, device and multimedia storage method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710462699.7A CN107172449A (en) 2017-06-19 2017-06-19 Multi-medium play method, device and multimedia storage method

Publications (1)

Publication Number Publication Date
CN107172449A true CN107172449A (en) 2017-09-15

Family

ID=59819862

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710462699.7A Pending CN107172449A (en) 2017-06-19 2017-06-19 Multi-medium play method, device and multimedia storage method

Country Status (1)

Country Link
CN (1) CN107172449A (en)

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107396177A (en) * 2017-08-28 2017-11-24 北京小米移动软件有限公司 Video broadcasting method, device and storage medium
CN107707974A (en) * 2017-09-18 2018-02-16 广东九联科技股份有限公司 A kind of realization method and system of special efficacy voice function
CN107770474A (en) * 2017-09-27 2018-03-06 北京金山安全软件有限公司 Sound processing method and device, terminal equipment and storage medium
CN108231059A (en) * 2017-11-27 2018-06-29 北京搜狗科技发展有限公司 Treating method and apparatus, the device for processing
CN108744521A (en) * 2018-06-28 2018-11-06 网易(杭州)网络有限公司 The method and device of game speech production, electronic equipment, storage medium
CN108900886A (en) * 2018-07-18 2018-11-27 深圳市前海手绘科技文化有限公司 A kind of Freehandhand-drawing video intelligent dubs generation and synchronous method
CN109346057A (en) * 2018-10-29 2019-02-15 深圳市友杰智新科技有限公司 A kind of speech processing system of intelligence toy for children
CN109391842A (en) * 2018-11-16 2019-02-26 维沃移动通信有限公司 A kind of dubbing method, mobile terminal
CN109584858A (en) * 2019-01-08 2019-04-05 武汉西山艺创文化有限公司 A kind of virtual dubbing method and its device based on AI artificial intelligence
CN109903757A (en) * 2017-12-08 2019-06-18 佛山市顺德区美的电热电器制造有限公司 Method of speech processing, device, computer readable storage medium and server
CN110366032A (en) * 2019-08-09 2019-10-22 腾讯科技(深圳)有限公司 Video data handling procedure, device and video broadcasting method, device
CN110390925A (en) * 2019-08-02 2019-10-29 湖南国声声学科技股份有限公司深圳分公司 Voice and accompaniment synchronous method, terminal, bluetooth equipment and storage medium
CN110415678A (en) * 2019-06-13 2019-11-05 百度时代网络技术(北京)有限公司 Customized voice broadcast client, server, system and method
CN110459200A (en) * 2019-07-05 2019-11-15 深圳壹账通智能科技有限公司 Phoneme synthesizing method, device, computer equipment and storage medium
CN110534131A (en) * 2019-08-30 2019-12-03 广州华多网络科技有限公司 A kind of audio frequency playing method and system
CN110753263A (en) * 2019-10-29 2020-02-04 腾讯科技(深圳)有限公司 Video dubbing method, device, terminal and storage medium
CN110933330A (en) * 2019-12-09 2020-03-27 广州酷狗计算机科技有限公司 Video dubbing method and device, computer equipment and computer-readable storage medium
CN111031386A (en) * 2019-12-17 2020-04-17 腾讯科技(深圳)有限公司 Video dubbing method and device based on voice synthesis, computer equipment and medium
CN111367870A (en) * 2018-12-25 2020-07-03 深圳市优必选科技有限公司 Method, device and system for sharing picture book
CN111916052A (en) * 2020-07-30 2020-11-10 北京声智科技有限公司 Voice synthesis method and device

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1774715A (en) * 2003-04-14 2006-05-17 皇家飞利浦电子股份有限公司 System and method for performing automatic dubbing on an audio-visual stream
CN101189657A (en) * 2005-05-31 2008-05-28 皇家飞利浦电子股份有限公司 A method and a device for performing an automatic dubbing on a multimedia signal
US20120259630A1 (en) * 2011-04-11 2012-10-11 Samsung Electronics Co., Ltd. Display apparatus and voice conversion method thereof
CN103065620A (en) * 2012-12-27 2013-04-24 安徽科大讯飞信息科技股份有限公司 Method for receiving text input by a user on a mobile phone or webpage and synthesizing it into personalized voice in real time
CN103117057A (en) * 2012-12-27 2013-05-22 安徽科大讯飞信息科技股份有限公司 Application method of special human voice synthesis technique in mobile phone cartoon dubbing
CN103650002A (en) * 2011-05-06 2014-03-19 西尔股份有限公司 Video generation based on text
CN105227966A (en) * 2015-09-29 2016-01-06 深圳Tcl新技术有限公司 Television playback control method, server and television playback control system
CN105763923A (en) * 2014-12-15 2016-07-13 乐视致新电子科技(天津)有限公司 Video and video template editing methods and device thereof

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107396177A (en) * 2017-08-28 2017-11-24 北京小米移动软件有限公司 Video playing method, device and storage medium
CN107707974A (en) * 2017-09-18 2018-02-16 广东九联科技股份有限公司 Method and system for implementing a special-effect voice function
CN107770474A (en) * 2017-09-27 2018-03-06 北京金山安全软件有限公司 Sound processing method and device, terminal equipment and storage medium
CN108231059A (en) * 2017-11-27 2018-06-29 北京搜狗科技发展有限公司 Processing method and apparatus, and device for processing
CN108231059B (en) * 2017-11-27 2021-06-22 北京搜狗科技发展有限公司 Processing method and device for processing
CN109903757A (en) * 2017-12-08 2019-06-18 佛山市顺德区美的电热电器制造有限公司 Speech processing method, device, computer-readable storage medium and server
CN108744521A (en) * 2018-06-28 2018-11-06 网易(杭州)网络有限公司 Method and device for game speech generation, electronic equipment, storage medium
CN108900886A (en) * 2018-07-18 2018-11-27 深圳市前海手绘科技文化有限公司 Intelligent dubbing generation and synchronization method for hand-drawn video
CN109346057A (en) * 2018-10-29 2019-02-15 深圳市友杰智新科技有限公司 Speech processing system for an intelligent children's toy
CN109391842A (en) * 2018-11-16 2019-02-26 维沃移动通信有限公司 Dubbing method and mobile terminal
CN109391842B (en) * 2018-11-16 2021-01-26 维沃移动通信有限公司 Dubbing method and mobile terminal
CN111367870A (en) * 2018-12-25 2020-07-03 深圳市优必选科技有限公司 Method, device and system for sharing picture book
CN109584858A (en) * 2019-01-08 2019-04-05 武汉西山艺创文化有限公司 Virtual dubbing method and device based on AI artificial intelligence
CN110415678A (en) * 2019-06-13 2019-11-05 百度时代网络技术(北京)有限公司 Customized voice broadcast client, server, system and method
CN110459200A (en) * 2019-07-05 2019-11-15 深圳壹账通智能科技有限公司 Speech synthesis method, device, computer equipment and storage medium
CN110390925A (en) * 2019-08-02 2019-10-29 湖南国声声学科技股份有限公司深圳分公司 Method for synchronizing voice and accompaniment, terminal, Bluetooth device and storage medium
CN110390925B (en) * 2019-08-02 2021-08-10 湖南国声声学科技股份有限公司深圳分公司 Method for synchronizing voice and accompaniment, terminal, Bluetooth device and storage medium
CN110366032A (en) * 2019-08-09 2019-10-22 腾讯科技(深圳)有限公司 Video data processing method and device and video playing method and device
CN110366032B (en) * 2019-08-09 2020-12-15 腾讯科技(深圳)有限公司 Video data processing method and device and video playing method and device
CN110534131A (en) * 2019-08-30 2019-12-03 广州华多网络科技有限公司 Audio playing method and system
CN110753263A (en) * 2019-10-29 2020-02-04 腾讯科技(深圳)有限公司 Video dubbing method, device, terminal and storage medium
CN110933330A (en) * 2019-12-09 2020-03-27 广州酷狗计算机科技有限公司 Video dubbing method and device, computer equipment and computer-readable storage medium
CN111031386A (en) * 2019-12-17 2020-04-17 腾讯科技(深圳)有限公司 Video dubbing method and device based on voice synthesis, computer equipment and medium
CN111031386B (en) * 2019-12-17 2021-07-30 腾讯科技(深圳)有限公司 Video dubbing method and device based on voice synthesis, computer equipment and medium
CN111916052A (en) * 2020-07-30 2020-11-10 北京声智科技有限公司 Voice synthesis method and device

Similar Documents

Publication Publication Date Title
CN107172449A (en) Multi-medium play method, device and multimedia storage method
US6556972B1 (en) Method and apparatus for time-synchronized translation and synthesis of natural-language speech
EP3872806B1 (en) Text-to-speech from media content item snippets
US9190052B2 (en) Systems and methods for providing information discovery and retrieval
US6859778B1 (en) Method and apparatus for translating natural-language speech using multiple output phrases
CN110675886B (en) Audio signal processing method, device, electronic equipment and storage medium
US20090299748A1 (en) Multiple audio file processing method and system
CN109147800A (en) Answer method and device
US20140258858A1 (en) Content customization
AU2013259799A1 (en) Content customization
US20140258462A1 (en) Content customization
US11687576B1 (en) Summarizing content of live media programs
US11922931B2 (en) Systems and methods for phonetic-based natural language understanding
CN113691909B (en) Digital audio workstation with audio processing recommendations
CN109800296A (en) Semantic fuzzy recognition method based on user's true intention
Sylvanus A brief history of TV and TV music practice in Nigeria
JP2013029684A (en) Web site system for voice data transcription
CN113851140A (en) Voice conversion correlation method, system and device
Bozonnet et al. A multimodal approach to initialisation for top-down speaker diarization of television shows
Liu Language, identity and unintelligibility: A case study of the rap group Higher Brothers
WO2024103383A1 (en) Audio processing method and apparatus, and device, storage medium and program product
CN113056908A (en) Video subtitle synthesis method and device, storage medium and electronic equipment
KR20210017730A (en) Method and server for providing music service in store
De Poli et al. From audio to content
US12026199B1 (en) Generating description pages for media entities

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170915