CN101521853A

CN101521853A - Method for converting multimedia with personalized speech and service end

Info

Publication number: CN101521853A
Application number: CN200810020316A
Authority: CN
Inventors: Li Jiahui (李嘉辉)
Original assignee: AVANTOUCH SOFTWARE Co Ltd
Current assignee: Li Jiahui
Priority date: 2008-02-29
Filing date: 2008-02-29
Publication date: 2009-09-02

Abstract

The invention discloses a method for multimedia conversion with personalized speech, comprising the following steps: a sending end sends a text message, which includes sentence information, to a service end; the service end receives the text message and converts it into a multimedia file; the service end then converts the sentence information in the text message into a personalized voice file that corresponds to a voice file uploaded in advance by the sending end; finally, the converted voice file replaces the audio part of the multimedia file. The invention also provides a service end for implementing the method. With this technical scheme, the service end can better customize multimedia in a personalized way, which can further improve the user experience.

Description

Method and service end for multimedia conversion with personalized speech

Technical field

The present invention relates to the field of network communication technology, and in particular to a method for multimedia conversion with personalized speech and to a service end implementing the method.

Background technology

With the growth of the mobile phone entertainment industry, multimedia transmission over wireless networks is becoming increasingly common, and people's requirements for multimedia communication content are rising by the day. People are no longer satisfied with single, passive, traditional forms of media entertainment; they want more personalized and interactive modes of multimedia communication. In particular, with the further development of mobile network communication technology and the growing popularity of mobile multimedia services, mobile multimedia is gaining ever broader and more mature user awareness and acceptance.

At present, the timbre and prosodic features of the voice in multimedia supplied by a service provider are fixed by the service end and cannot satisfy users' individual needs.

Summary of the invention

The problem to be solved by the present invention is to provide a method for multimedia conversion with personalized speech, so that users can customize multimedia in a personalized way, effectively improving the user experience.

To achieve the above object, the invention provides a method for multimedia conversion with personalized speech, in which a sending end sends a text message to a service end and the service end, after receiving it, converts the text message into a multimedia file, characterized in that: the text message includes sentence information; after receiving the text message and converting it into a multimedia file, the service end converts the sentence information in the text message, according to a voice file uploaded in advance by the sending end, into a personalized voice file corresponding to that uploaded voice file, and then replaces the audio part of the multimedia file with the converted voice file.

Further, in the above method for multimedia conversion with personalized speech, the service end converts the sentence information into a personalized voice file as follows: the service end uses existing, mature personalized speech generation technology to convert the sentence information into a personalized voice file by means of the voice file corresponding to the sending end.

Still further, in the above method for multimedia conversion with personalized speech, the service end converts the text message into a multimedia file as follows: the service end directly obtains from a multimedia file library the multimedia file that matches the sentence information in the text message, and this file serves as the converted multimedia file.

Further, in the above method for multimedia conversion with personalized speech, the service end converts the text message into a multimedia file as follows: the service end first decomposes the sentence information in the text message into individual characters or words, then obtains from the multimedia file library the multimedia file that matches each individual character or word, and finally merges the obtained multimedia files into the converted multimedia file.

The present invention also provides a service end for implementing the above method, comprising:

A receiving unit, configured to receive the text message from the sending end and pass it to the processing unit for processing, and to receive the voice file corresponding to the sending end and pass it to the storage unit for storage;

A storage unit, configured to store the voice file from the receiving unit;

A processing unit, configured to convert the text message from the receiving unit into a multimedia file and, according to the voice file corresponding to the sending end held in the storage unit, to convert the sentence information into a personalized voice file and replace the audio part of the multimedia file with that voice file.

The above service end may further comprise a sending unit, configured to send the multimedia file.
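The division of the service end into units can be pictured with a minimal Python sketch. It is only an illustration under stated assumptions: the class names mirror the units named above, while the method names, the dictionary layout of a multimedia file, and the two injected callables `to_multimedia` and `to_personal_voice` are hypothetical placeholders and are not defined by the invention.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class StorageUnit:
    """Stores voice files handed over by the receiving unit, keyed by sender."""
    voice_files: Dict[str, str] = field(default_factory=dict)

    def store(self, sender: str, voice_file: str) -> None:
        self.voice_files[sender] = voice_file

    def lookup(self, sender: str) -> Optional[str]:
        return self.voice_files.get(sender)


@dataclass
class ProcessingUnit:
    """Converts text to multimedia and swaps in a personalized audio part."""
    storage: StorageUnit
    to_multimedia: Callable[[str], Dict[str, str]]   # hypothetical converter
    to_personal_voice: Callable[[str, str], str]     # hypothetical personalized TTS

    def process(self, sender: str, sentence: str) -> Dict[str, str]:
        media = self.to_multimedia(sentence)
        voice_file = self.storage.lookup(sender)
        if voice_file is not None:
            # Replace only the audio part; the rest of the file is kept.
            media = {**media, "audio": self.to_personal_voice(sentence, voice_file)}
        return media


@dataclass
class SendingUnit:
    """Optional unit that sends the finished multimedia file."""
    outbox: List[Tuple[str, Dict[str, str]]] = field(default_factory=list)

    def send(self, receiver: str, media: Dict[str, str]) -> None:
        self.outbox.append((receiver, media))


@dataclass
class ReceivingUnit:
    """Routes voice files to storage and text messages to processing."""
    storage: StorageUnit
    processing: ProcessingUnit

    def receive_voice_file(self, sender: str, voice_file: str) -> None:
        self.storage.store(sender, voice_file)

    def receive_text(self, sender: str, sentence: str) -> Dict[str, str]:
        return self.processing.process(sender, sentence)
```

In this sketch the receiving unit never touches the conversion logic itself; it only routes, which matches the split of responsibilities described above.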

In short, the present invention offers a new approach for network communication technology. By implementing the disclosed technical scheme, the service end can better customize multimedia in a personalized way: when the receiving end receives multimedia that the sending end sent as text and the service end converted, the personal characteristics of its audio part match those of the sender, as if the sender were speaking to the recipient, which greatly improves the user experience of network communication technology.

Description of drawings

Fig. 1 is a schematic flowchart of the method for multimedia conversion with personalized speech provided by the invention;

Fig. 2 is a schematic flowchart of one embodiment of the invention;

Fig. 3 is a schematic diagram of the service end disclosed by the invention.

Embodiment

With personalized speech generation technology, given any text and a sample voice file of a target person, a voice file corresponding to that text can be obtained, and it sounds as if the target person said it. The technology works by analyzing the input text to obtain text-to-speech (TTS) synthesis parameters, converting them into the target person's speech parameters, and finally synthesizing speech that approximates the target person's voice.
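The three stages named above (text analysis, parameter conversion, synthesis) can be laid out as a toy Python pipeline. This is a sketch under heavy assumptions and not a working speech synthesizer: the `SpeechParams` and `TargetProfile` types, the constant pitch and duration values, and the simple scaling rule are all hypothetical stand-ins chosen only to show how data could flow between the stages.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SpeechParams:
    unit: str          # the text unit to be spoken
    pitch_hz: float    # placeholder pitch value
    duration_s: float  # placeholder duration value


@dataclass
class TargetProfile:
    """Hypothetical summary of the target person's uploaded voice file."""
    mean_pitch_hz: float
    tempo: float  # values above 1 mean faster speech


def analyze_text(text: str) -> List[SpeechParams]:
    """Stage 1: derive generic TTS parameters per non-space character."""
    return [SpeechParams(ch, pitch_hz=200.0, duration_s=0.25)
            for ch in text if not ch.isspace()]


def convert_to_target(params: List[SpeechParams], profile: TargetProfile,
                      reference_pitch_hz: float = 200.0) -> List[SpeechParams]:
    """Stage 2: map generic parameters onto the target person's parameters."""
    ratio = profile.mean_pitch_hz / reference_pitch_hz
    return [SpeechParams(p.unit, p.pitch_hz * ratio, p.duration_s / profile.tempo)
            for p in params]


def synthesize(params: List[SpeechParams]) -> Dict[str, object]:
    """Stage 3 stand-in: summarize what a real synthesizer would render."""
    return {
        "units": [p.unit for p in params],
        "total_duration_s": sum(p.duration_s for p in params),
    }
```

A real system would model timbre and prosody in far more detail; the point here is only the order of the stages and that the target person's data enters at the second stage.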

The embodiments of the invention provide a method for multimedia conversion with personalized speech, so that users can customize multimedia in a personalized way, effectively improving the user experience. Specific embodiments of the invention are described in further detail below.

Embodiment one:

The method for multimedia conversion with personalized speech disclosed in this embodiment, as shown in Fig. 1, includes the following steps:

Step 101: the service end receives a text message from the sending end;

The text message may originate from text entered on the sending terminal's keyboard, or from speech captured by the terminal's microphone and converted to text by speech recognition software.

The text message referred to in this embodiment may include:

Receiving end information, which may specifically be the recipient's mobile phone number; the receiving end may also be the sending end itself, and when the receiving end information is omitted the system may default the receiving end to the sending end itself;

Type indication information, which indicates the multimedia type into which the text message is to be converted;

Sentence information, i.e. the information the user sends to the recipient, such as the text "happy birthday to you".
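The three parts of the text message listed above can be modeled as a small record. The following Python sketch is illustrative only: the class name `TextMessage`, the field names, and the default media type `"video"` are assumptions made for the example; the embodiment specifies only that the receiving end defaults to the sending end when omitted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TextMessage:
    sender: str                      # sender's mobile phone number
    sentence: str                    # sentence information, e.g. a greeting
    media_type: str = "video"        # type indication; this default is assumed
    receiver: Optional[str] = None   # receiving end information, may be omitted

    @property
    def effective_receiver(self) -> str:
        """Fall back to the sender when no receiving end is given."""
        return self.receiver if self.receiver else self.sender
```

For example, `TextMessage(sender="13800000000", sentence="happy birthday to you").effective_receiver` yields the sender's own number, reflecting the default described above.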

Step 102: the service end converts the text message into a multimedia file;

In this step, one specific approach is: the service end obtains from the multimedia file library the multimedia file that matches the sentence information and uses it as the converted multimedia file; that is, the sentence information, such as "happy birthday to you", is matched against the multimedia file library as a whole, and the corresponding multimedia file is taken as the converted multimedia file; or,

In this step, another specific approach is:

The service end decomposes the sentence information into individual characters or words, for example splitting "happy birthday to you" into separate characters and words; the service end then obtains from the multimedia file library the multimedia file that matches each individual character or word; the obtained multimedia files are merged to give the converted multimedia file.
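The two matching approaches of step 102 can be sketched in Python as follows. Several details are assumptions made for the example and are not stated in the embodiment: the library is a plain dictionary from text to a clip name, a merged multimedia file is represented as an ordered list of clip names, words are split on whitespace (Chinese text would need per-character splitting or a word segmenter), and words without a match are skipped. Trying the whole-sentence match first and falling back to decomposition is one possible way to combine the two approaches, which the embodiment presents as alternatives.

```python
from typing import Dict, List, Optional

MediaLibrary = Dict[str, str]  # hypothetical: text -> clip name


def match_whole(sentence: str, library: MediaLibrary) -> Optional[str]:
    """First approach: match the sentence against the library as a whole."""
    return library.get(sentence)


def match_by_parts(sentence: str, library: MediaLibrary) -> List[str]:
    """Second approach: decompose into words and collect one clip per match."""
    return [library[word] for word in sentence.split() if word in library]


def convert(sentence: str, library: MediaLibrary) -> List[str]:
    """Return the clips making up the converted multimedia file, in order."""
    whole = match_whole(sentence, library)
    if whole is not None:
        return [whole]
    return match_by_parts(sentence, library)
```

With a library containing both the full greeting and individual words, a sentence found verbatim yields a single clip, while any other sentence yields the clips of its matched words in their original order.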

Further, the service end may determine from the sender's number whether the sender has uploaded a personal voice file. If the sender has never uploaded one, the current multimedia file is the final multimedia file and is sent directly to the recipient. If the sender has previously uploaded a personal voice file, the service end can use personalized speech generation technology to process the multimedia file further, i.e. proceed to step 103, so that the voice in the multimedia file is very close, in both timbre and prosodic features, to the personal voice file uploaded in advance.

The sending user can supply a personal voice file in several ways, such as handing it to staff at the service provider's counter, or logging in to the relevant WAP or Web site to upload a voice file or record one online, so that the service end obtains the personal voice file; the mobile phone number provided by the user is placed in one-to-one correspondence with that personal voice file.

Step 103: when a voice file corresponding to the sending end exists, the service end converts the sentence information into a personalized voice file according to that voice file;

In this step, converting the sentence information into a personalized voice file may be done as follows:

The service end uses existing, mature personalized speech generation technology to convert the sentence information into a personalized voice file by means of the voice file corresponding to the sending end.

Step 104: the service end replaces the audio part of the multimedia file with the converted voice file.
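The decision made before step 103 and the replacement in step 104 can be summarized in a short Python sketch. It assumes, purely for illustration, that a multimedia file has one visual part and one audio part, that voice files are looked up by the sender's phone number in a dictionary, and that the personalized speech generation technology is available as an injected function `synthesize`; none of these names come from the embodiment.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict


@dataclass(frozen=True)
class MultimediaFile:
    visual: str  # hypothetical reference to the picture or video part
    audio: str   # hypothetical reference to the audio part


def personalize(
    media: MultimediaFile,
    sentence: str,
    sender: str,
    voice_files: Dict[str, str],
    synthesize: Callable[[str, str], str],
) -> MultimediaFile:
    """Apply steps 103 and 104 when the sender has uploaded a voice file."""
    sample = voice_files.get(sender)
    if sample is None:
        # No upload: the current multimedia file is already final.
        return media
    personalized = synthesize(sentence, sample)  # step 103
    return replace(media, audio=personalized)    # step 104
```

Only the audio part changes; the visual part obtained in step 102 is carried over untouched.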

The terminal, sending end, or receiving end referred to in this embodiment may be a wired terminal, such as a PC connected to the Internet, or a wireless terminal, such as a mobile phone. The message may be sent through application software installed on the mobile phone, or by entering the WAP input interface via the Wireless Application Protocol (WAP), editing the information, and then sending it.

The multimedia referred to in this embodiment includes, but is not limited to: MPEG, AVI, RMVB, WMV, SWF, VIV, ASF, RM, RA, RP, RT, MOV, QT, 3GPP, MP4, 3D, JPEG, PNG, GIF, BMP, AMR, MMF, 3GPP, MP4, RM, AVI, WAV, APE, MP3/MP2/MP1/MPGA, WMA/ASF, MIDI/MID, VQF, AIF/AIFF, AU, VOC, AAC, VOX, etc.

By implementing the technical scheme disclosed in this embodiment, the service end can customize multimedia in a personalized way, which greatly enhances its entertainment value and effectively improves the user experience.

Embodiment two:

The method for multimedia conversion with personalized speech disclosed in this embodiment may be based on embodiment one. As shown in Fig. 2, an example process is as follows:

Xiao Zhang wants to send Xiao Li a multimedia file with a birthday greeting. On the sending mobile phone 201, Xiao Zhang enters a birthday greeting, for example "wish you a happy birthday today", enters Xiao Li's mobile phone number, and sends them as a short message to a designated short code. The message passes through the SMS center 202 and the MMS/SMS server 203 and is forwarded to the processing server 204. On receiving the short message, the processing server 204 decomposes "wish you a happy birthday today" into individual characters, matches a corresponding multimedia file to each individual character, and merges these multimedia files into one multimedia file. It then uses the sender Xiao Zhang's mobile phone number to look up whether Xiao Zhang has uploaded a voice file in advance. If so, it uses personalized speech generation technology to convert the text "wish you a happy birthday today" into a personalized voice file that sounds as if Xiao Zhang said it. This voice file then replaces the audio part of the multimedia file.

Further, the final multimedia file is processed by the MMS/SMS server and the SMS center and then delivered to Xiao Li's mobile phone 205.

By implementing the technical scheme disclosed in this embodiment, the service end can customize multimedia in a personalized way, which greatly enhances its entertainment value and effectively improves the user experience.

Embodiment three:

This embodiment provides a service end, as shown in Fig. 3, comprising:

Further, the service end may also comprise a sending unit, configured to send the multimedia file.

In summary, by implementing the technical scheme disclosed in this embodiment, the service end can better customize multimedia in a personalized way: when the receiving end receives multimedia that the sending end sent as text and the service end converted, the personal characteristics of its audio part match those of the sender, as if the sender were speaking to the recipient, which greatly improves the user experience of network communication technology.

The specific embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network elements. Those of ordinary skill in the art can understand and implement all other specific modes based on the technical scheme disclosed by the invention without creative effort.

From the description of the above embodiments, those skilled in the art can clearly understand that the invention can be implemented by software plus a necessary general-purpose hardware platform, or of course by hardware, although the former is the better choice in many cases. On this understanding, the technical scheme of the invention in essence, or the part of it that contributes over the background art, can be embodied as a software product; this computer software product may be stored in a storage medium and includes instructions that, when run, cause a computer device to carry out the methods described in the embodiments of the invention.

The above are only embodiments of the invention. It should be noted that those skilled in the art can make improvements and modifications without departing from the principle of the invention, and such improvements and modifications should also be regarded as falling within the protection scope of the invention.

Claims

1. A method for multimedia conversion with personalized speech, in which a sending end sends a text message to a service end and the service end, after receiving it, converts the text message into a multimedia file, characterized in that: the text message includes sentence information; after receiving the text message and converting it into a multimedia file, the service end converts the sentence information in the text message, according to a voice file uploaded in advance by the sending end, into a personalized voice file corresponding to that uploaded voice file, and then replaces the audio part of the multimedia file with the converted voice file.

2. The method for multimedia conversion with personalized speech according to claim 1, characterized in that: the service end converts the sentence information into a personalized voice file as follows: the service end uses existing, mature personalized speech generation technology to convert the sentence information into a personalized voice file by means of the voice file corresponding to the sending end.

3. The method for multimedia conversion with personalized speech according to claim 1 or 2, characterized in that: the service end converts the text message into a multimedia file as follows: the service end directly obtains from a multimedia file library the multimedia file that matches the sentence information in the text message, and this file serves as the converted multimedia file.

4. The method for multimedia conversion with personalized speech according to claim 1 or 2, characterized in that: the service end converts the text message into a multimedia file as follows: the service end first decomposes the sentence information in the text message into individual characters or words, then obtains from the multimedia file library the multimedia file that matches each individual character or word, and finally merges the obtained multimedia files into the converted multimedia file.

5. A service end, characterized by comprising:

A processing unit, configured to convert the text message from the receiving unit into a multimedia file and, according to the voice file corresponding to the sending end held in the storage unit, to convert the sentence information into a personalized voice file and replace the audio part of the multimedia file with that voice file.

6. The service end according to claim 5, characterized in that it further comprises a sending unit, configured to send the multimedia file.