
CN1560816A - Method and device for sync controlling voice frequency and text information

Info

Publication number: CN1560816A
Application number: CNA200410015393XA
Authority: CN
Inventors: 陈德卫; 李涛; 殷明
Original assignee: 陈德卫
Current assignee: Shenzhen Legend Technology Co Ltd
Priority date: 2004-02-18
Filing date: 2004-02-18
Publication date: 2005-01-05
Anticipated expiration: 2024-02-18
Also published as: CN1332365C

Abstract

The invention provides a method and device for realizing synchronous control of audio and text information. A multimedia format file containing audio information, text information, time tags, and control tags is stored in a memory device. Under the control of a microprocessor, the multimedia format file is read; its audio information is played through a decoder and an audio reading device while its text information is shown on a display device. The method and device synchronize and combine audio information and text information that were originally processed separately, so that the consumer can listen to and read the content at the same time.

Description

A method and device for realizing synchronous control of audio and text information

Technical field

The present invention relates to the processing of audio and text information in digital products, and in particular to a method and device for realizing synchronous control of audio and text information.

Background technology

Traditional digital products such as portable language learning machines can support only a single medium, that is, either audio alone or text alone, so they cannot offer an integrated multimedia learning mode and their functionality is limited. Moreover, these devices depend on conventional information carriers such as books, tapes, and CDs and cannot work independently. The reasons are as follows. First, the various non-digitized learning materials (particularly books and tapes) are isolated from the computer and cannot be stored or directly used on computers, the Internet, or other digital products. Second, foreign language learning materials produced with multimedia technology are made separately and cannot be used directly on existing non-digital portable devices. Third, in the prior art, audio information and text information are processed separately and cannot be processed synchronously for the consumer.

At present, audio information is processed mainly with the lossy compression algorithms MP3 and WMA, which compress audio at a high ratio.

MP3 is the abbreviation of MPEG-1 Layer 3 (MPEG: Moving Picture Experts Group). It is a compression and decompression scheme defined by the International Organization for Standardization (ISO) for processing audio at a high compression ratio. The audio files it produces have near-CD sound quality, while the file size is only about one twelfth of the original.

WMA (Windows Media Audio) comes from Microsoft, with sound quality better than that of the MP3 format. It achieves a higher compression ratio than MP3 by reducing the data rate while preserving sound quality, and its compression ratio can generally reach about 1:18. With large-scale promotion by Microsoft, it has gained recognition and strong support from more and more websites and is increasingly used in network audio processing.

However, MP3 and WMA process only audio information and do not handle the semantic content of the speech; on an MP3 player, the content can only be listened to, not read.

Furthermore, an MP3 player plays a whole file (or an entire text) continuously and generally supports only a simple "A-B" section repeat, in which the user marks the passage to repeat by hand. This is impractical: the user has to operate by feel and often captures only part of a sentence rather than the whole sentence, so the repetition is of little value. Particularly in foreign language study, many users report that they cannot accurately capture the sentence or paragraph they want to repeat, which makes such players unsuitable for foreign language learning.

In short, current digital products, especially portable ones, have no way of processing audio and text information together in synchrony and cannot use the two in combination, which leaves most existing digital products confined to entertainment such as listening to music.

Summary of the invention

The object of the present invention is to solve the problem that existing digital products cannot synchronously control audio and text information, by providing a method and device for realizing such synchronous control.

Building on the MP3 and WMA processing schemes, the present invention provides a method and device for realizing synchronous control of audio and text information. Audio information and text information that were originally processed separately are synchronized and combined, so that the corresponding text can be displayed while the audio is played, and a high compression ratio is retained.

The method of the present invention is realized as follows:

A method for realizing synchronous control of audio and text information, the method being implemented by a digital device having a microprocessor. The digital device comprises a microprocessor, a decoder, a memory device, a display device, and an audio reading device, wherein the audio reading device is connected to the decoder and communicates bidirectionally with it, and the microprocessor is connected to the decoder, the memory device, and the display device. The method is characterized in that it comprises the following steps:

(1) editing the audio information and text information materials to generate a multimedia format file containing audio information, text information, time tags, and control tags, and storing the file in the memory device;

(2) under the control of the microprocessor, reading the multimedia format file stored in the memory device and, according to the time tags and control tags in the file, playing the audio information in the file through the decoder and the audio reading device while displaying the text information in the file on the display device.

The multimedia format file is a dedicated file format developed on the basis of the existing LRC format (LRC, from the English word "lyrics", is a file format with the suffix .lrc designed for displaying song lyrics in synchrony with playback). Its format is as follows:

Audio information + time tag + control tag 1 + text 1 + control tag 2 + text 2 + ... + control tag N + text N, where N ≥ 1.

It differs from the LRC format in that control tags are provided, comprising switch tags and paragraph tags, whereas an LRC file defines only basic time tags for synchronization.

Text information immediately follows a switch tag; when a switch tag is read, the text after it is displayed. There may be two or more switch tags, so that several kinds of text information can be carried under a single time tag. The first switch tag may be omitted.

The paragraph tag controls segmentation, making it convenient to loop playback segment by segment.

In step (1) of the method, generating the multimedia format file comprises the following three steps:

(1.1) Preparing the original materials:

Voice information and text information are collected separately and stored in a computer. The voice information may be a voice file in MP3, WMA, or WAV format, and the text information may be a text file in TXT or LRC format;

(1.2) Adding synchronization tags and control tags:

The voice file and the text file are opened at the same time, and time tags and the corresponding control tags are added during playback;

(1.3) Synthesizing the multimedia format file:

With the voice file and the text file open at the same time, the multimedia format file authoring software is used to save them as a multimedia format file, which is stored in the computer.

The present invention also provides a device for carrying out the method, comprising a microprocessor, a decoder, a memory device, a display device, and an audio reading device, wherein the audio reading device is connected to the decoder and communicates bidirectionally with it, and the microprocessor is connected to the decoder, the memory device, and the display device. The device is characterized in that a multimedia format file containing audio information, text information, time tags, and control tags is stored in the memory device; the microprocessor controls the reading of the multimedia format file and, according to the time tags and control tags in the file, the audio information in the file is played through the decoder and the audio reading device while the text information in the file is shown on the display device.

The device may be a portable learning machine or a personal computer on which the relevant software is installed.

With the method and device of the present invention, audio information and text information that were originally processed separately are synchronized and combined, so that the consumer can listen to and read both parts of the content in synchrony, which is very convenient. The control method also allows audio and text information to be stored together in existing digital products, making them easy to carry and convenient for study and entertainment. The invention is particularly suited to language learning.

Description of drawings

Fig. 1 is a schematic diagram of the format of the multimedia format file used in the method of the present invention for realizing synchronous control of audio and text information;

Fig. 2 is a structural diagram of a device of the present invention for realizing synchronous control of audio and text information;

Fig. 3 is a schematic diagram of a system in which the device shown in Fig. 2 is applied to foreign language learning.

Embodiment

The embodiments of the present invention are directed mainly at foreign language learning.

This embodiment uses a dedicated multimedia format file (SMP3) developed on the basis of the existing LRC format. An SMP3 file retains the general definitions of an LRC file and adds a series of dedicated tags that extend LRC to make it better suited to producing multimedia foreign language learning materials.

As shown in Fig. 1, an SMP3 file contains elements such as an audio stream, time tags, text information tags, and text information. These elements correspond to one another in time, which makes synchronized operation possible.

The audio stream is produced by an MP3 encoder, which converts the analog sound signal into a digital signal stored in binary form; encoding and decoding follow the MPEG standard of the International Organization for Standardization (ISO). In practice, other audio streams that likewise convert an analog sound signal into a binary digital signal can also be used, including the WMA, ADPCM, and AAC formats, whose encoding and decoding follow the corresponding international standards.

The time tag has the form [mm:ss.ff], where mm is the number of minutes, ss the number of seconds, and ff the number of 10-millisecond units (hundredths of a second). The time tag follows the usual way of writing time, counts from the start of the audio stream, and acts as a pointer.
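The time tag definition above can be illustrated with a minimal Python sketch that converts a tag into a millisecond offset from the start of the audio stream. The function name `time_tag_to_ms` is hypothetical, and accepting a tag without the `.ff` part is an assumption based on the examples in this document, which write tags such as [00:01].

```python
import re

# Matches [mm:ss] or [mm:ss.ff]. Treating the fractional part as optional is
# an assumption, made because the examples in this document omit it.
TIME_TAG = re.compile(r"\[(\d{1,3}):(\d{2})(?:\.(\d{2}))?\]")


def time_tag_to_ms(tag: str) -> int:
    """Convert a time tag to milliseconds from the start of the audio stream."""
    m = TIME_TAG.fullmatch(tag.strip())
    if m is None:
        raise ValueError(f"not a time tag: {tag!r}")
    minutes, seconds = int(m.group(1)), int(m.group(2))
    hundredths = int(m.group(3) or 0)  # ff counts units of 10 ms
    return (minutes * 60 + seconds) * 1000 + hundredths * 10
```

For instance, `time_tag_to_ms("[01:02.50]")` gives 62500, that is, one minute, two seconds, and fifty hundredths of a second.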

The control tags comprise switch tags and paragraph tags.

Switch tags take the form [Tag 01], [Tag 02], [Tag 03], ..., or may be expressed in another way. Text information immediately follows a switch tag. When the computer or digital device reads a switch tag, it displays everything after that tag up to the next tag, whatever the content and whatever the language; the program does not interpret the content in any way. There may be two or more switch tags, so that several kinds of text information can be carried under a single time tag.
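The switch tag rule (display everything after a tag up to the next tag, without interpreting it) can be sketched as a small splitter. This is an illustration only: the function name `split_switch_tags` is hypothetical, and treating text that precedes any tag as belonging to tag 01 is an assumption drawn from the statement that the first switch tag may be omitted. The sketch does not treat the paragraph tag specially.

```python
import re

# Matches tags written as [Tag 01], [Tag 02], ... as in this document.
SWITCH_TAG = re.compile(r"\[Tag\s*(\d{2})\]")


def split_switch_tags(body: str, default_tag: str = "01") -> list[tuple[str, str]]:
    """Split the text that follows a time tag into (tag number, text) pairs.

    Text is copied verbatim up to the next tag; its content is never inspected.
    """
    segments = []
    current = default_tag  # assumed tag for text before the first explicit tag
    pos = 0
    for m in SWITCH_TAG.finditer(body):
        text = body[pos:m.start()].strip()
        if text:
            segments.append((current, text))
        current = m.group(1)
        pos = m.end()
    tail = body[pos:].strip()
    if tail:
        segments.append((current, tail))
    return segments
```

A player could then choose which segments to show, for example only those carrying tag 01 while the user is practicing without a translation.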

According to the characteristics and needs of audio foreign language materials, we have defined four switch tags, as follows:

(1) English tag [Tag 01]: the text after it is English.

(2) Chinese tag [Tag 02]: the text after it is Chinese. In some cases it is defined as the "original text tag".

(3) Word tag [Tag 03]: the text after it is a word.

(4) Answer tag [Tag 04]: the text after it is an answer, or a reply in a dialogue.

The paragraph tag has the form [Tag 05]. Its purpose is segmentation: particularly for a very long article, adding paragraph tags lets the program operate segment by segment. In practice this shows up in loop playback, which can then loop by segment.
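Segment-by-segment operation can be sketched as grouping tagged lines at each paragraph tag. The source does not say whether [Tag 05] marks the start or the end of a segment; the sketch below assumes it marks the start of a new one, and the function name `split_into_segments` is hypothetical.

```python
PARAGRAPH_TAG = "[Tag 05]"


def split_into_segments(lines: list[str]) -> list[list[str]]:
    """Group tagged lines into segments.

    Assumption: a line containing [Tag 05] begins a new segment.
    """
    segments: list[list[str]] = []
    current: list[str] = []
    for line in lines:
        if PARAGRAPH_TAG in line:
            if current:
                segments.append(current)
            current = []
            line = line.replace(PARAGRAPH_TAG, "")
        if line.strip():
            current.append(line)
    if current:
        segments.append(current)
    return segments
```

To loop one segment, a player could replay the audio from the time tag of the segment's first line up to the time tag of the next segment's first line.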

According to the different characteristics and usage requirements of foreign language learning materials, we have defined six file types. The file type is written into the file so that the program can invoke different routines for different learning materials and thus support different learning methods. The six kinds of learning material and their tag composition are as follows:

Article learning material: the tags used are [Tag 01], [Tag 03], and [Tag 02]; the first tag may be omitted.

The content after [Tag 01] is the English text synchronized with the audio.

The content after [Tag 02] is the Chinese translation synchronized with the audio.

The content after [Tag 03] is the words appearing in the sentence and their annotations.

For example:

[Tag 03] David, hello for David's (name) [Tag 02]! David!

[Tag 02] hello! Old!

Word learning material: the tags used are [Tag 01] and [Tag 02]; the first tag may be omitted.

The content after [Tag 01] is a word or an English example sentence synchronized with the audio.

The content after [Tag 02] is the translation of the English example sentence synchronized with the audio.

Example:

[00:01] [Tag 01] Apple, apple.

[00:02][Tag 01]I am eating an apple.[Tag 02]I am eating an apple.

Listening test material: the tags used are [Tag 01], [Tag 04], and [Tag 02]; the first tag may be omitted.

The content after [Tag 01] is the options of a multiple-choice question, or the prompt of a listening fill-in-the-blank question.

The content after [Tag 02] is the original text of the dialogue or passage in the listening question.

The content after [Tag 04] is the answer to the question.

Example:

[00:01][Tag 01]A. Apple B. Coin[Tag 02]What's on your hand? It's a coin.

[00:02][Tag 01]Listen to the question.[Tag 02]What's on your hand? It's a coin.

[00:03][Tag 01]A. Apple B. Coin[Tag 04]Answer: (B)

Dialogue learning material: the tags used are [Tag 01], [Tag 02], and [Tag 04]; the first tag may be omitted.

The content after [Tag 01] is the other speaker's words, generally a question.

The content after [Tag 02] is the translation of the other speaker's words.

The content after [Tag 04] is the reference answer to the question.

Example:

[00:01] [Tag 01] How are you? [Tag 02] how do you do? [Tag 04] I am fine!

WMA music with lyrics: the tags used are [Tag 01] and [Tag 02]; the first tag may be omitted.

The content after [Tag 01] is the lyrics corresponding to the music.

The content after [Tag 02] is the translation of the lyrics corresponding to the music.

Example:

[00:01][Tag 01]I am eating an apple.[Tag 02]I am eating an apple.

[00:02]I am eating an apple.[Tag 02]I am eating an apple.

MP3 music with lyrics: the tags used are [Tag 01] and [Tag 02]; the first tag may be omitted.

The content after [Tag 01] is the lyrics corresponding to the music.

Example:

[00:01][Tag 01]I am eating an apple.[Tag 02]I am eating an apple.

[00:02]I am eating an apple.[Tag 02]I am eating an apple.
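The six material types and the tags each one uses can be collected into a lookup table, which a program might consult when checking a file or choosing a routine. The sketch below is only an illustration: the type names (`article`, `word`, and so on) are hypothetical, since the source does not give the identifiers actually written into the file, and allowing the paragraph tag [Tag 05] in every type is likewise an assumption.

```python
import re

# Hypothetical type names; the tag sets follow the six descriptions above.
ALLOWED_TAGS = {
    "article": {"01", "02", "03"},
    "word": {"01", "02"},
    "listening_test": {"01", "02", "04"},
    "dialogue": {"01", "02", "04"},
    "wma_lyrics": {"01", "02"},
    "mp3_lyrics": {"01", "02"},
}

PARAGRAPH_TAG_NUMBER = "05"  # assumed to be permitted in every material type


def unexpected_tags(material_type: str, line: str) -> set[str]:
    """Return the tag numbers in a line that the material type does not use."""
    used = set(re.findall(r"\[Tag\s*(\d{2})\]", line))
    return used - ALLOWED_TAGS[material_type] - {PARAGRAPH_TAG_NUMBER}
```

An empty result means every tag on the line belongs to the declared material type.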

The layout of an SMP3 file is as follows:

Block area	Address	Content
Voice content	From the start of the file	May be any kind of audio; at present mainly WMA and MP3
Text content	Immediately after the end of the voice content	Various text information, mainly from the LRC file
Material type flag	8 bytes + (0-7)	Definition of the material type
Voice content length	8 bytes + (8-15)	
LRC content length	8 bytes + (16-31)	
Reserved area	8 bytes + (32-39)	

An SMP3 file has no fixed length; its actual length is determined by the lengths of the voice content and the text content, which are not limited.

The addresses in the table above are address offsets. The numbers of bytes occupied by the material type flag, the voice content length, the text content length, and the reserved area are predefined and can be set to different values as required.
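One possible reading of this layout can be sketched in Python. Everything concrete here is an assumption rather than the format defined by the source: the sketch places the four fixed fields in a 40-byte trailer after the text content, reads the offsets 0-7, 8-15, 16-31, and 32-39 relative to the start of that trailer, stores lengths as little-endian integers, and encodes the text as UTF-8. The names `pack_smp3` and `unpack_smp3` are hypothetical.

```python
TRAILER_SIZE = 40  # assumed: type (8) + voice length (8) + text length (16) + reserved (8)


def pack_smp3(audio: bytes, text: str, material_type: bytes) -> bytes:
    """Concatenate voice content, text content, and an assumed 40-byte trailer."""
    text_bytes = text.encode("utf-8")
    trailer = (
        material_type.ljust(8, b"\x00")[:8]       # offsets 0-7
        + len(audio).to_bytes(8, "little")        # offsets 8-15
        + len(text_bytes).to_bytes(16, "little")  # offsets 16-31
        + bytes(8)                                # offsets 32-39, reserved
    )
    return audio + text_bytes + trailer


def unpack_smp3(blob: bytes) -> tuple[bytes, str, bytes]:
    """Recover (audio, text, material type) from a blob built by pack_smp3."""
    trailer = blob[-TRAILER_SIZE:]
    material_type = trailer[0:8].rstrip(b"\x00")
    audio_len = int.from_bytes(trailer[8:16], "little")
    text_len = int.from_bytes(trailer[16:32], "little")
    if audio_len + text_len + TRAILER_SIZE != len(blob):
        raise ValueError("length fields do not match file size")
    audio = blob[:audio_len]
    text = blob[audio_len:audio_len + text_len].decode("utf-8")
    return audio, text, material_type
```

Because the voice content sits at the very start of the file, a player that ignores the trailing text could, under these assumptions, still hand the leading bytes to an ordinary MP3 or WMA decoder.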

Generating an SMP3 file comprises three steps:

(1) Preparing the original materials:

An SMP3 file contains voice information and text information. Since the two currently exist separately, the two kinds of material must be prepared separately.

Voice material comes mainly from two sources. One is ready-made material in digital audio formats such as MP3, WMA, or WAV. Audio material in WAV or other formats can easily be converted to MP3 or WMA with various audio conversion programs; audio material already in MP3 or WMA format can be used as soon as it is collected.

The text information needed for an SMP3 file can be edited manually with Windows Notepad or Microsoft Word and then saved on the computer in .TXT or .LRC format. The text information here includes English, Chinese, new words, and so on.

(2) Adding synchronization tags and control tags:

Using the SMP3 authoring software, open the audio file (.MP3 or .WMA) and the text file (.TXT or .LRC) at the same time, and add time tags and the corresponding control tags (switch tags and paragraph tags) during playback.

(3) Synthesizing the SMP3 file:

Using the SMP3 authoring software, with the audio file and the text file open at the same time, select "Save as SMP3" to generate an SMP3 file with the suffix .SP3, which is stored on the computer. The user downloads the file to an SMP3 foreign language learning machine, or uses SMP3 playback software, and can then study efficiently.
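Step (2) above, adding time tags during playback, can be sketched as follows. The source does not describe how the authoring software records positions; this sketch assumes the editor marks one playback position (in seconds) per text line, and the names `format_time_tag` and `tag_lines` are hypothetical.

```python
def format_time_tag(seconds: float) -> str:
    """Format a playback position as a [mm:ss.ff] time tag (ff = hundredths)."""
    total_hundredths = round(seconds * 100)
    minutes, rest = divmod(total_hundredths, 6000)
    secs, hundredths = divmod(rest, 100)
    return f"[{minutes:02d}:{secs:02d}.{hundredths:02d}]"


def tag_lines(marks: list[float], lines: list[str]) -> list[str]:
    """Prefix each text line with the time tag for its marked position."""
    if len(marks) != len(lines):
        raise ValueError("need exactly one mark per text line")
    return [format_time_tag(t) + line for t, line in zip(marks, lines)]
```

For example, a mark at 62.5 seconds produces the tag [01:02.50], which is then placed in front of the line's switch tags and text.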

Fig. 2 is a structural diagram of a portable learning machine, one form of the device for realizing synchronous control of audio and text information.

The learning machine consists of a microprocessor, a decoder, a memory, a display device, and an audio reading device. The audio reading device comprises an earphone, a crossover, and a microphone; the earphone and the microphone are each connected to the crossover, which is in turn connected to the decoder and communicates bidirectionally with it, and the decoder is connected to the microprocessor. The display device is an LCD screen connected to the microprocessor through an LCD control module. The memory has a memory card for storing information and an information interface, the information interface being connected to the microprocessor.

The earphone may also be replaced by a loudspeaker.

The microprocessor is connected to the decoder, the information interface, and the LCD control module. It controls program execution and reads the data in the memory, playing the corresponding audio information through the decoder and the audio reading device while showing the text information on the display device. The consumer can thus use the audio and text information together, understand the material promptly and accurately, and learn more efficiently.
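The synchronization performed during playback amounts to finding, for the current audio position, the most recent time tag and showing its text. A minimal sketch follows, assuming the file's entries have already been parsed into (time in milliseconds, text) pairs sorted by time; the function name `current_entry` is hypothetical, and the source does not specify the lookup method used on the device.

```python
from bisect import bisect_right
from typing import Optional


def current_entry(entries: list[tuple[int, str]], position_ms: int) -> Optional[str]:
    """Return the text whose time tag is the latest one at or before position_ms.

    entries must be sorted by time; None is returned before the first tag.
    """
    times = [t for t, _ in entries]
    i = bisect_right(times, position_ms)
    return entries[i - 1][1] if i > 0 else None
```

A playback loop could call this with the decoder's current position and redraw the display only when the returned text changes.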

The microprocessor is connected through I/O to a buffer ROM, from which it reads the control program for information processing, and through I/O to FLASH memory, from which it reads data.

The device of the present invention may also be connected to an FM module through an FM interface.

As shown in Fig. 3, multimedia foreign language learning materials can be produced in the SMP3 file format and stored on computers, on the Internet, and on portable multimedia foreign language learning machines. To study, the consumer can download content from a multimedia learning website over the Internet, study on a computer with dedicated multimedia learning software, or use the portable multimedia language learning machine shown in Fig. 2. The learning mode is the same in each case: audio and text information are available together, which is convenient for the consumer's study.

Claims

1. A method for realizing synchronous control of audio and text information, the method being implemented by a digital device having a microprocessor; the digital device comprises a microprocessor, a decoder, a memory device, a display device, and an audio reading device, wherein the audio reading device is connected to the decoder and communicates bidirectionally with it, and the microprocessor is connected to the decoder, the memory device, and the display device; characterized in that the method comprises the following steps:

wherein the format of the multimedia format file is as follows:

2. The method for realizing synchronous control of audio and text information according to claim 1, characterized in that the control tags comprise switch tags and paragraph tags; text information immediately follows a switch tag; when a switch tag is read, the text after it is displayed; and the paragraph tag controls segmentation.

3. The method for realizing synchronous control of audio and text information according to claim 2, characterized in that there may be two or more switch tags, so that several kinds of text information can be carried under a single time tag; wherein the first switch tag may be omitted.

4. The method for realizing synchronous control of audio and text information according to claim 1, characterized in that, in step (1), generating the multimedia format file comprises the following three steps:

(1.1) preparing the original materials:

(1.2) adding synchronization tags and control tags:

(1.3) synthesizing the multimedia format file:

5. A device for carrying out the method of claim 1 for realizing synchronous control of audio and text information, comprising a microprocessor, a decoder, a memory device, a display device, and an audio reading device, wherein the audio reading device is connected to the decoder and communicates bidirectionally with it, and the microprocessor is connected to the decoder, the memory device, and the display device; characterized in that a multimedia format file containing audio information, text information, time tags, and control tags is stored in the memory device; the microprocessor controls the reading of the multimedia format file and, according to the time tags and control tags in the file, the audio information in the file is played through the decoder and the audio reading device while the text information in the file is shown on the display device.

6. The device according to claim 5 for realizing synchronous control of audio and text information, characterized in that the memory device may consist of a memory card and an information interface, the information interface is connected to the microprocessor, and the multimedia file is stored in the memory card.

7. The device according to claim 5 for realizing synchronous control of audio and text information, characterized in that the audio reading device comprises a loudspeaker, a crossover, and a microphone; the loudspeaker and the microphone are each connected to the crossover, which is in turn connected to the decoder.

8. The device according to claim 7 for realizing synchronous control of audio and text information, characterized in that the loudspeaker may be replaced by an earphone.

9. The device according to claim 5 for realizing synchronous control of audio and text information, characterized in that the device may also be connected to an FM module through an FM interface.