Summary of the invention
Embodiments of the invention provide a talking picture playing method, a talking picture playing device, and a method for generating a talking picture index file, in order to solve the problems in the prior art that talking pictures occupy a large amount of storage space and are inflexible to use.
The talking picture playing method provided by an embodiment of the invention comprises:
Obtaining the talking picture index file corresponding to the talking picture to be played, wherein the talking picture index file comprises: the correspondence between a picture and a text file, and the attribute information of the picture and of the text file;
Parsing the talking picture index file to obtain the picture and the text file;
Converting the text file into audio and playing it synchronously with the picture.
The talking picture index file further comprises the correspondence between the picture and an audio file, and the attribute information of the audio file;
After the talking picture index file is parsed, the audio file is also obtained;
And the picture, the text file and the audio file are played synchronously.
When the text file is played, it is first converted into an audio file by a speech synthesis engine and then played.
The method further comprises the following steps:
Establishing a correspondence between a picture and a text file;
Storing the correspondence, the attribute information of the picture and the attribute information of the text file as a talking picture index file.
The talking picture index file generation method provided by an embodiment of the invention further comprises: establishing a correspondence between the picture and an audio file;
The correspondence between the picture and the audio file, and the attribute information of the audio file, are also stored in the talking picture index file.
Establishing the correspondence between a picture and a text file comprises:
Establishing a one-to-one relationship between pictures and text files; or establishing a many-to-one relationship between pictures and text files; or establishing a many-to-many relationship between pictures and text files;
Establishing the correspondence between a picture and an audio file comprises:
Establishing a one-to-one relationship between pictures and audio files; or establishing a many-to-one relationship between pictures and audio files; or establishing a many-to-many relationship between pictures and audio files.
The attribute information of the picture comprises at least the storage location of the picture;
The attribute information of the text file comprises at least the storage location of the text file;
The attribute information of the audio file comprises at least the storage location of the audio file.
The attribute information of the picture further comprises any one or any combination of the storage format, size and encoding of the picture;
The attribute information of the text file further comprises any one or any combination of the storage format, size and encoding of the text file;
The attribute information of the audio file further comprises any one or any combination of the storage format, size and encoding of the audio file.
The talking picture playing device provided by an embodiment of the invention comprises an acquiring unit, a preprocessing unit and a playback unit, wherein:
The acquiring unit is configured to obtain the talking picture index file corresponding to the talking picture to be played; the talking picture index file comprises: the correspondence between a picture and a text file, and the attribute information of the picture and of the text file;
The preprocessing unit is configured to parse the talking picture index file, obtain the picture and the text file, and send each of them to the playback unit;
The playback unit is configured to convert the text file into audio and play it synchronously with the picture.
The talking picture playing device provided by an embodiment of the invention further comprises: an index storage unit, configured to store the talking picture index file;
The acquiring unit obtains the talking picture index file from the index storage unit.
The talking picture playing device provided by an embodiment of the invention further comprises: an information storage unit, configured to store the picture and the text file;
The preprocessing unit obtains the picture and the text file from the information storage unit and sends each of them to the playback unit.
The information storage unit also stores audio files; the preprocessing unit also obtains the audio file from the information storage unit and sends it to the playback unit.
The playback unit further comprises a picture display subunit, a speech synthesis engine subunit and a voice playback subunit, wherein:
The picture display subunit is configured to receive the picture and display it;
The speech synthesis engine subunit is configured to receive the text file, convert it into an audio file, and send the audio file to the voice playback subunit;
The voice playback subunit is configured to receive and play the audio file sent by the speech synthesis engine subunit, or to receive and play the audio files sent by the preprocessing unit and the speech synthesis engine subunit.
The talking picture playing device provided by an embodiment of the invention further comprises a control subunit, configured to control the picture display subunit and the voice playback subunit so that the picture and the audio are played synchronously.
In the embodiments of the invention, a correspondence between a picture and a text file is established, and the correspondence, the attribute information of the picture and the attribute information of the text file are stored as a talking picture index file. When a talking picture is played, the talking picture index file is parsed to obtain the correspondence between the picture and the text file; the picture and the text file are looked up according to the correspondence and obtained according to their attribute information; and the picture and the text file are then played synchronously. With this scheme, the user only needs to create a talking picture index file for each talking picture in advance; the picture and the text file can then be obtained through that index file and played synchronously, which makes talking pictures convenient to create and use. Moreover, because a text file occupies little storage space, the storage space required for talking pictures can be greatly reduced.
Detailed description of the embodiments
The main implementation principle and specific implementations of the technical solution of the embodiments of the invention, and the beneficial effects that can be achieved, are explained in detail below with reference to the accompanying drawings.
As shown in Figure 1, an embodiment of the invention first provides a method for generating a talking picture index file. Its main flow is as follows:
Step 11: establish a correspondence between a picture and its associated synchronous files.
A talking picture comprises a picture and its associated synchronous files. In the embodiments of the invention, the synchronous files cover the following two cases:
1. The synchronous file is a text file only;
2. The synchronous files comprise a text file and also an audio file.
The synchronous files mentioned below may be either of these two cases.
To remain compatible with existing file systems, when a talking picture is stored in memory, the picture file and the synchronous files are still stored separately. One picture file can be associated with several synchronous files, several picture files can be associated with one synchronous file, and several picture files can be associated with several synchronous files. This improves the reuse of picture files and synchronous files and effectively saves the storage space they occupy.
A text file associated with a picture can be played as part of the talking picture by a speech synthesis engine with text-to-speech (TTS, Text To Speech) technology, which converts the text information in the file into speech.
When a talking picture is stored, the correspondence between the picture and its associated synchronous files is established according to the association information that has been set between them.
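The one-to-many, many-to-one and many-to-many associations described in step 11 can be sketched as a pair of lookup tables. This is only an illustration: the file names and the `build_correspondence` helper are hypothetical and do not come from the specification.

```python
from collections import defaultdict

# Hypothetical file names; each pair links one picture to one synchronous file.
links = [
    ("cat.jpg", "cat_en.txt"),    # one picture with several synchronous files
    ("cat.jpg", "cat_meow.mp3"),
    ("dog.jpg", "pets.txt"),      # several pictures sharing one synchronous file
    ("bird.jpg", "pets.txt"),
]


def build_correspondence(pairs):
    """Return (picture -> files, file -> pictures) lookup tables."""
    by_picture = defaultdict(list)
    by_file = defaultdict(list)
    for picture, sync_file in pairs:
        by_picture[picture].append(sync_file)
        by_file[sync_file].append(picture)
    return dict(by_picture), dict(by_file)


by_picture, by_file = build_correspondence(links)
```

Keeping both directions makes it cheap to answer either question: which files accompany a picture, and which pictures reuse a given file.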
Step 12: store the correspondence between the picture and its associated synchronous files, together with the attribute information of the picture and of the associated synchronous files, as a talking picture index file.
After the correspondence between the picture and its associated synchronous files has been established, index information must be created for the talking picture. The index information is usually stored in the form of a talking picture index file, which must contain not only the correspondence between the picture and its associated synchronous files but also the attribute information of the picture file and of the associated synchronous files.
The attribute information of the picture file includes, for example, the storage format, storage location, size and encoding of the picture file; the attribute information of an associated synchronous file includes, for example, the storage format, storage location, size and encoding of that synchronous file.
The talking picture index file can be generated manually by the user, or generated automatically when the talking picture is created.
The talking picture index file logically merges the picture of a talking picture with its associated synchronous files; the association between the picture and the synchronous files exists in the form of the index file. In practice, as shown in Figure 2, the picture file and the associated synchronous files are still stored separately as ordinary files, with no association between the files themselves. The attribute information of the picture file and of the associated synchronous files is stored, in a corresponding format, as the talking picture index file. Because the index file contains the attribute information and the correspondence of the picture and its associated synchronous files, the picture and synchronous files belonging to a talking picture can be obtained through the index file, and the talking picture can then be played.
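A minimal sketch of step 12 follows, assuming the index is serialized as JSON. The specification does not prescribe an on-disk format, so the JSON layout, the field names and the helpers `file_attributes`, `build_index` and `save_index` are all hypothetical.

```python
import json
import os
import tempfile


def file_attributes(path, encoding=None):
    """Collect the attributes named in the text: location, format, size, encoding."""
    return {
        "location": path,
        "format": os.path.splitext(path)[1].lstrip(".").lower(),
        "size": os.path.getsize(path),
        "encoding": encoding,
    }


def build_index(picture_path, sync_files):
    """sync_files is a list of (path, encoding) tuples associated with the picture."""
    return {
        "picture": file_attributes(picture_path),
        "synchronous_files": [file_attributes(p, enc) for p, enc in sync_files],
    }


def save_index(index, index_path):
    with open(index_path, "w", encoding="utf-8") as fh:
        json.dump(index, fh, indent=2)


# Demo with throwaway files; the picture holds placeholder bytes, not a real image.
workdir = tempfile.mkdtemp()
picture = os.path.join(workdir, "cat.jpg")
caption = os.path.join(workdir, "cat.txt")
with open(picture, "wb") as fh:
    fh.write(b"\xff\xd8\xff")
with open(caption, "w", encoding="utf-8") as fh:
    fh.write("A cat sleeping on a sofa.")

index = build_index(picture, [(caption, "utf-8")])
index_path = os.path.join(workdir, "cat.index.json")
save_index(index, index_path)
```

The picture and the text file stay where they are; only their attributes and the association are written to the index, which mirrors the logical merge described above.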
Preferably, multiple talking picture index files can be collected into an index information library; by searching the index information library, the user can easily select the talking picture to be played.
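As a rough illustration of searching an index information library, the sketch below assumes the library is a mapping from a talking picture title to the location of its index file. Both that structure and the `search_library` helper are assumptions made for the example.

```python
# Hypothetical library: talking picture title -> location of its index file.
library = {
    "Sleeping cat": "indexes/cat.index.json",
    "Cat and dog": "indexes/pets.index.json",
    "Harbour at dusk": "indexes/harbour.index.json",
}


def search_library(library, keyword):
    """Return the titles that contain the keyword, ignoring case."""
    keyword = keyword.lower()
    return sorted(title for title in library if keyword in title.lower())


matches = search_library(library, "cat")
```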
Correspondingly, an embodiment of the invention also provides a talking picture playing method. As shown in Figure 3, the method is as follows:
Step 21: parse the talking picture index file corresponding to the talking picture to be played, and obtain the correspondence between the picture and its associated synchronous files.
When a talking picture is played, the corresponding talking picture index file must first be obtained. The correspondence between the picture and its associated synchronous files is then obtained from the content of the index file.
The talking picture index file can be obtained by searching the index information library.
Step 22: look up the picture and its associated synchronous files according to the correspondence.
Using the correspondence obtained between the picture and its associated synchronous files, the corresponding picture and synchronous files are located.
Here, one picture file can be associated with several synchronous files, several picture files can be associated with one synchronous file, and several picture files can be associated with several synchronous files.
Step 23: obtain the picture and its associated synchronous files according to their attribute information.
Once the specific picture and associated synchronous files have been located, they must be obtained; this is done on the basis of the attribute information of the picture and of the associated synchronous files stored in the talking picture index file.
In the attribute information stored in the talking picture index file, the attribute information of the picture can include the storage format, storage location, size and encoding of the picture file; the attribute information of an associated synchronous file can include the storage format, storage location, size and encoding of that synchronous file.
Step 24: play the picture and its associated synchronous files synchronously.
The attribute information of the picture and of the associated synchronous files not only allows the picture and the synchronous files to be fetched easily from their storage locations, but also gives their file size, storage format and encoding. Based on this attribute information, a suitable player can be used to play the talking picture synchronously.
For example, the attribute information of the synchronous files shows whether they consist of an audio file and a text file or of a text file only. If the synchronous files include an audio file, that audio file can be played directly by an audio player. A text file among the synchronous files must instead be passed to a speech synthesis engine with TTS technology, which converts the text file into an audio file that is then played by the audio player.
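The dispatch described in step 24 can be sketched as follows. The index layout, the format sets and the `plan_playback` helper are hypothetical, and `fake_tts` is a stand-in that merely names an output file; it is not a real speech synthesis engine.

```python
TEXT_FORMATS = {"txt"}           # assumed text format
AUDIO_FORMATS = {"mp3", "wav"}   # assumed audio formats


def plan_playback(index, synthesize):
    """Return (picture location, audio queue) for one talking picture.

    synthesize stands in for a TTS engine: it takes the location of a text
    file and returns the location of the audio it produced.
    """
    audio_queue = []
    for entry in index["synchronous_files"]:
        fmt = entry["format"]
        if fmt in AUDIO_FORMATS:
            # Audio files go straight to the audio player.
            audio_queue.append(("direct", entry["location"]))
        elif fmt in TEXT_FORMATS:
            # Text files are converted to audio first.
            audio_queue.append(("tts", synthesize(entry["location"])))
        else:
            raise ValueError(f"unsupported synchronous file format: {fmt}")
    return index["picture"]["location"], audio_queue


def fake_tts(text_location):
    """Placeholder engine: pretends to write speech next to the text file."""
    return text_location + ".wav"


index = {
    "picture": {"location": "cat.jpg", "format": "jpg"},
    "synchronous_files": [
        {"location": "cat.txt", "format": "txt"},
        {"location": "meow.mp3", "format": "mp3"},
    ],
}
picture, queue = plan_playback(index, fake_tts)
```

Passing the engine in as a callable keeps the sketch independent of any particular TTS product.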
Here, when the synchronous file associated with the picture in a talking picture is a text file, a speech synthesis engine with TTS technology must be called to convert the text file into an audio file. Depending on the language in which the text itself is written and on the languages the speech synthesis engine supports, the engine can convert the text into speech in multiple languages, which is then played by the audio player. For example, according to the user's needs, the text in the text file can be played in any language the speech synthesis engine supports, such as English, Russian or French; the text can of course also be played in a dialect (such as Sichuanese or Cantonese).
Preferably, when a talking picture is played, the user can customize what is played. That is, the user can select the type and content of the synchronous files associated with the picture in the talking picture, and define the playback form of the talking picture to suit his or her own needs. For example: play only the audio file, play only the text file, or play the audio file and the text file together.
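One way to picture the user-defined playback choice is as a filter over the synchronous files. The mode names `audio`, `text` and `both`, the format sets and the `select_files` helper are hypothetical.

```python
AUDIO_FORMATS = {"mp3", "wav"}   # assumed audio formats
TEXT_FORMATS = {"txt"}           # assumed text format


def select_files(sync_files, mode="both"):
    """Keep only the synchronous files that match the chosen playback mode."""
    wanted = {
        "audio": AUDIO_FORMATS,
        "text": TEXT_FORMATS,
        "both": AUDIO_FORMATS | TEXT_FORMATS,
    }
    if mode not in wanted:
        raise ValueError(f"unknown playback mode: {mode}")
    return [f for f in sync_files if f["format"] in wanted[mode]]


sync_files = [
    {"location": "cat.txt", "format": "txt"},
    {"location": "meow.mp3", "format": "mp3"},
]
audio_only = select_files(sync_files, "audio")
```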
Preferably, when the synchronous file associated with the picture in a talking picture is a text file, one concrete flow for creating and playing the talking picture is as follows:
1. To store the talking picture, first save the picture in the memory, then enter the text information associated with the picture and save it;
2. The saved picture and text information are preprocessed to generate a talking picture index file, which is kept in the memory or uploaded to a network and stored on a network storage server. Preprocessing means building the talking picture index file for the talking picture; the index file contains the attribute information of the picture and of the associated text files, which may be text 1, text 2, ..., text n.
3. When the talking picture is to be played, the talking picture index file in the index information library is called. Specifically, the index file built in the previous step is retrieved, and the files are downloaded from the memory or the network storage server according to the information in the index file. In this way the picture file and the associated text file are downloaded.
4. The picture file is decoded and output to the display device.
5. At the same time, the text information is fed into the TTS speech synthesis engine, converted into audio information, and played by the audio player.
Preferably, steps 4 and 5 use direct memory access (DMA, Direct Memory Access) technology, so that picture display and voice playback proceed synchronously while CPU resources are saved.
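Steps 4 and 5 run at the same time. The sketch below is a software analogy only: it uses two threads released together by a barrier, and it does not model DMA, which is a hardware mechanism. The `show` and `speak` callables are hypothetical stand-ins for the display device and for the TTS engine plus audio player.

```python
import threading


def play_talking_picture(picture, text, show, speak):
    """Start picture display and speech output together and wait for both."""
    start = threading.Barrier(2)   # both workers begin at the same moment
    lock = threading.Lock()
    events = []

    def run(action, payload):
        start.wait()
        result = action(payload)
        with lock:
            events.append(result)

    workers = [
        threading.Thread(target=run, args=(show, picture)),
        threading.Thread(target=run, args=(speak, text)),
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return sorted(events)


# Placeholder actions that only report what they would have done.
result = play_talking_picture(
    "cat.jpg",
    "A sleeping cat",
    show=lambda p: f"shown {p}",
    speak=lambda t: f"spoke {t}",
)
```

The results are sorted before being returned because the order in which the two threads finish is not deterministic.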
Because the embodiments of the invention introduce TTS technology, a stored text file can be converted into speech and played. Accordingly, in the talking picture of the embodiments, the synchronous files associated with a picture can include text files in addition to ordinary audio files (recordings or music). When the talking picture is played, a speech synthesis engine with TTS technology converts the text file associated with the picture into the corresponding speech and plays it, so voice output can be achieved simply by associating relevant text with the picture. This avoids recording a large number of audio files and greatly reduces the storage space that talking pictures occupy. In addition, because the text can be written in any language the system supports, and the speech synthesis engine can likewise play the corresponding text in any language the system supports, talking pictures become very flexible to use.
Correspondingly, an embodiment of the invention also provides a talking picture playing device whose functional structure is shown in Figure 4. The device comprises an acquiring unit 31, a preprocessing unit 32 and a playback unit 33, as follows:
The acquiring unit 31 obtains the talking picture index file corresponding to the talking picture to be played.
The talking picture index file here comprises: the correspondence between the picture and its associated synchronous files, and the attribute information of the picture and of the associated synchronous files. The associated synchronous files comprise a text file, or a text file and an audio file.
The preprocessing unit 32 is configured to parse the talking picture index file, obtain the picture and its associated synchronous files, and send each of them to the playback unit.
The playback unit 33 is configured to convert a received text file into audio and play it synchronously with the picture; a received audio file is played synchronously as it is.
Preferably, as shown in Figure 5, the talking picture playing device further comprises an index storage unit 34 and an information storage unit 35, as follows:
The index storage unit 34 is configured to store the talking picture index file.
The information storage unit 35 is configured to store the picture information and the associated synchronous files.
The acquiring unit 31 obtains the talking picture index file from the index storage unit 34.
The preprocessing unit 32 obtains the picture and the associated synchronous files from the information storage unit 35.
Preferably, as shown in Figure 6, the playback unit 33 in the talking picture playing device further comprises a picture display subunit 331, a speech synthesis engine subunit 332 and a voice playback subunit 333, as follows:
The picture display subunit 331 is configured to receive the picture and display it.
The speech synthesis engine subunit 332 is configured to receive the text file, convert the text information in it into audio, generate the corresponding audio file, and send it to the voice playback subunit 333.
The voice playback subunit 333 is configured to receive and play the audio files sent by the preprocessing unit 32 and/or the speech synthesis engine subunit 332.
Preferably, the device can also comprise a control subunit (not shown in Figure 6), configured to control the picture display subunit 331 and the voice playback subunit 333 so that the picture and the audio are played synchronously.
Preferably, starting from the device shown in Figure 4, the additional units of Figure 5 and Figure 6 can be combined with one another to obtain a talking picture playing device with more complete functionality.
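The division of work among the units of Figures 4 to 6 can be sketched with three small classes. This is an illustrative model only: the class names, the in-memory dictionaries standing in for the index storage unit 34 and the information storage unit 35, and the `synthesize` callable standing in for the speech synthesis engine subunit 332 are all assumptions.

```python
class AcquiringUnit:
    """Fetches the index file of the requested talking picture (unit 31)."""

    def __init__(self, index_storage):
        self.index_storage = index_storage

    def get_index(self, title):
        return self.index_storage[title]


class PreprocessingUnit:
    """Resolves an index into the picture and its synchronous files (unit 32)."""

    def __init__(self, info_storage):
        self.info_storage = info_storage

    def load(self, index):
        picture = self.info_storage[index["picture"]]
        texts = [self.info_storage[name] for name in index.get("texts", [])]
        audio = [self.info_storage[name] for name in index.get("audio", [])]
        return picture, texts, audio


class PlaybackUnit:
    """Shows the picture and plays synthesized and stored audio (unit 33)."""

    def __init__(self, synthesize):
        self.synthesize = synthesize
        self.log = []

    def play(self, picture, texts, audio):
        self.log.append(("show", picture))
        for clip in [self.synthesize(t) for t in texts] + list(audio):
            self.log.append(("play", clip))
        return self.log


# Hypothetical storage contents.
index_storage = {"cat": {"picture": "cat.jpg", "texts": ["cat.txt"], "audio": ["meow.mp3"]}}
info_storage = {
    "cat.jpg": "<picture bytes>",
    "cat.txt": "A sleeping cat",
    "meow.mp3": "<meow audio>",
}

acquiring = AcquiringUnit(index_storage)
preprocessing = PreprocessingUnit(info_storage)
playback = PlaybackUnit(synthesize=lambda text: f"<speech: {text}>")

log = playback.play(*preprocessing.load(acquiring.get_index("cat")))
```

The sketch plays items in sequence and records them in a log; real synchronization between display and audio would be the job of the control subunit.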
As shown in Figure 7, a preferred concrete implementation structure of the talking picture playing device is as follows:
A memory provides the functions of the index storage unit 34 and the information storage unit 35 described above, and stores the talking picture, including the picture and the associated synchronous files.
An acquisition/preprocessing module provides the functions of the acquiring unit 31 and the preprocessing unit 32 described above. It builds the talking picture index file for a talking picture, logically combining the picture, the text file and the audio file. The text file and the audio file here are the synchronous files associated with the picture in the talking picture. The preprocessing module is also responsible for retrieving the talking picture index file, finding the synchronous files associated with the picture, and separating out the relevant picture, text file and audio file. It sends the picture to the picture driver module in the playback unit, sends the text file to the speech synthesis engine driver module in the playback unit, and sends the audio file directly to the audio conversion driver module in the playback unit.
The playback unit comprises a picture driver module, a speech synthesis engine driver module and an audio conversion driver module, wherein:
The picture driver module provides the function of the picture display subunit 331 described above and displays the picture.
The speech synthesis engine driver module provides the function of the speech synthesis engine subunit 332 described above: it converts the text information in the text file into audio information, generates the corresponding audio file, and sends it to the audio conversion driver module.
The audio conversion driver module provides the function of the voice playback subunit 333 described above: it performs digital-to-analog conversion on the audio information in the audio files sent by the acquisition/preprocessing module and by the speech synthesis engine driver module, and plays it synchronously with the picture information.
As shown in Figure 8, the hardware design principle of a preferred talking picture playing device is as follows:
The memory interface obtains the talking picture data from the memory. The CPU separates the talking picture data by format into image, text and audio files, and invokes, respectively: picture processing and the display driver; text information analysis and the speech synthesis driver, which perform speech synthesis and encoding; and decoding and digital-to-analog conversion of the audio file. Finally, the image data and the audio data are passed to the display interface and to the audio codec interface, for display and playback respectively. Further, DMA technology is used to keep the whole process well synchronized.
The memory interface needs to support various storage devices, for example various kinds of FLASH, various memory cards, hard disks and portable hard drives. The CPU performs functions such as system control, analysis and decoding of image data and text, and audio synthesis. The display interface receives the image data and displays it. The audio codec interface performs digital-to-analog conversion on the raw audio data and plays it. The DMA interface is added to keep the picture and the audio smoothly synchronized while also saving CPU resources.
In summary, the scheme provided by the embodiments of the invention reduces the storage space occupied by talking pictures and meets users' need for flexible use.
Obviously, those skilled in the art can make various changes and variations to the invention without departing from its spirit and scope. Thus, if these modifications and variations fall within the scope of the claims of the invention and their technical equivalents, the invention is intended to cover them as well.