CN1755665A - Voice file generating system and method

Info

Publication number: CN1755665A
Application number: CN 200410081060
Authority: CN
Inventors: 徐晓燕; 邱全成
Original assignee: Inventec Corp
Current assignee: Inventec Corp
Priority date: 2004-09-30
Filing date: 2004-09-30
Publication date: 2006-04-05

Abstract

The invention relates to a voice file generating system and method for data processing equipment. A resource access mechanism connects to a voice resource generator according to a set resource path and accesses voice resources according to an access condition. A file format conversion mechanism then converts the accessed voice resources into a preset file format. Finally, the production interface and tools provided by a post-production mechanism are used to post-process the voice resources, which are then stored in a database.

Description

Voice file generating system and method

Technical field

The invention relates to a voice file generating system and method, and in particular to a voice file generating system and method applied to a data processing device.

Background technology

With the rapid development of the electronics and information industries, powerful and inexpensive consumer electronic information products appear one after another. For example, to help users communicate with speakers of foreign languages, data processing devices with language learning functions have sprung up in the consumer market like bamboo shoots after spring rain. When spoken-language learning is carried out on a data processing device such as a computer or an electronic dictionary, developers must face the problem of how to give the learner a learning environment close to that offered by a real person, so that spoken-language learning is effective through interaction with the data processing device alone, without interaction with a real person.

Providing a voice learning function amounts to emulating teaching by a real person. Because the data processing efficiency and storage capacity of data processing devices have increased significantly, processing voice audio that approaches the original sound no longer troubles developers. Existing language learning systems and methods play a prerecorded voice file; after hearing a certain paragraph or the whole file, the learner reads it aloud. With this mode of learning, however, users cannot judge their own learning results. Developers therefore proposed another kind of language learning system with a recognition function: it records the learner's read-along voice, and a recognition mechanism then determines the degree of difference between the prerecorded voice and the read-along voice, which serves as an evaluation of the learner's results.

The existing voice learning systems described above can indeed provide the learner with an emulated listening environment. However, the voice data are all prerecorded in the system by the manufacturer of the language learning system, even where the user is allowed to obtain updated or expanded voice data from a network or another data storage unit. Moreover, the learner cannot set up the voice learning environment according to his or her own learning conditions or needs, for example by selecting specific paragraphs to study or by setting original-text subtitles and/or translated subtitles. The efficiency of voice learning is therefore difficult to improve effectively.

In summary, how to provide a voice file generating system and method that lets the learner set up the relevant voice learning environment according to his or her own learning conditions or needs has become a problem in urgent need of a solution.

Summary of the invention

To overcome the shortcomings of the prior art described above, the main object of the present invention is to provide a voice file generating system and method with which the learner can set up the relevant voice learning environment according to his or her own learning conditions or needs.

To achieve the above and other objects, the voice file generating system of the present invention comprises: a resource access module, which connects to a voice resource generator according to a set resource path and accesses voice resources according to an access condition; a file format conversion module, which converts the accessed voice resources into a preset file format; a post-production module, which provides a production interface and tools for post-production processing of the voice resources that conform to the preset format; and a database, which stores the voice resources that have undergone post-production processing.

The method of generating a voice file with this voice file generating system is as follows: providing a resource access module that connects to a voice resource generator according to a set resource path and accesses voice resources according to an access condition; providing a file format conversion module that converts the accessed voice resources into a preset file format; providing a post-production module that supplies a production interface and tools for post-production processing of the voice resources that conform to the preset format; and providing a database that stores the voice resources that have undergone post-production processing.

Compared with existing voice file generation techniques, the voice file generating system and method of the present invention provide a post-production mechanism for voice files, with which the learner can set up the relevant voice learning environment according to his or her own learning conditions or needs.

Description of drawings

Fig. 1 is a basic block diagram of the voice file generating system of the present invention; and

Fig. 2 is a flowchart of the voice file generating method of the present invention.

Embodiment

Referring to Fig. 1, which is a basic block diagram of the voice file generating system 1 of the present invention, the voice file generating system 1 comprises a resource access module 12, a file format conversion module 14, a post-production module 16 and a database 18.

In this embodiment, the voice file generating system 1 is applied in a personal computer 2, and more specifically is used to provide the personal computer 2 with a language pronunciation learning function. It should be noted that the personal computer 2 in fact also includes other software, hardware and/or firmware for performing data operations; to avoid obscuring the technical features of the invention, only the parts relevant to implementing the voice file generating system 1 and method are shown. The personal computer 2 may also be replaced by another data processing device that supports voice input and output, such as an electronic dictionary, a personal digital assistant or a mobile phone. Preferably, the personal computer 2 also has a network connection function and connects through a network system 3 to other voice resource generators 4, such as server devices, to access voice resources.

The resource access module 12 connects to a voice resource generator according to a set resource path and accesses voice resources according to an access condition. In this embodiment, the resource path used by the resource access module 12 may, for example, lead to a hard disk device or disc storage device in the personal computer 2, or to an external storage unit such as a USB flash drive or a card reader device. It may also be a resource address conforming to a Uniform Resource Locator (URL) scheme, pointing to a resource generator 4 such as a web server or file server; the scheme may be, for example, HTTP, Gopher, News, FTP or Telnet. In that case the resource access module 12 connects to the voice resource generator 4 through the network system 3.

In addition, the resource access module 12 may provide an input interface. When the user enters one of the above resource paths into this input interface through the personal computer 2, the module connects, according to that resource path, to the hard disk device, disc storage device, external storage unit and/or a resource generator such as a web server or file server, and accesses the resources, in particular the voice resources, that the resource generator provides. The resource access module 12 may also store the accessed voice resources in the hard disk device, disc storage device and/or external storage unit of the personal computer 2.
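The path handling described above can be illustrated with a minimal Python sketch. It is not taken from the patent: the function names, the exact set of schemes treated as remote, and the idea of modelling the access condition as a file-extension filter are all assumptions made for illustration, since the description does not define the access condition.

```python
from urllib.parse import urlparse

# URL schemes named in the description. Treating exactly this set as
# "remote" is an assumption of this sketch.
REMOTE_SCHEMES = {"http", "gopher", "news", "ftp", "telnet"}


def classify_resource_path(resource_path: str) -> str:
    """Return "remote" for a URL with a listed scheme, otherwise "local".

    A Windows drive letter such as "C:" parses as the scheme "c", which is
    not in REMOTE_SCHEMES, so drive paths are treated as local storage.
    """
    scheme = urlparse(resource_path).scheme.lower()
    return "remote" if scheme in REMOTE_SCHEMES else "local"


def matches_access_condition(name: str,
                             extensions=(".wav", ".mp3", ".wma", ".rm")) -> bool:
    """Hypothetical access condition: accept only known voice file types."""
    return name.lower().endswith(extensions)
```

A caller would first classify the path entered in the input interface, then fetch or open the resource accordingly and keep only the entries that satisfy the access condition.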

The file format conversion module 14 converts the accessed voice resources into a preset file format. In this embodiment, the preset voice resource file format is ".WAV", a digital audio file format commonly used on personal computers. Therefore, when the resource access module 12 accesses a voice resource in a voice file format other than ".WAV", such as ".mp3", ".wma" or ".rm", the file format conversion module 14 converts that voice resource into the ".WAV" file format.

In addition, when the file format conversion module 14 converts the original audio and the input audio into waveform signals, it can apply the sampling frequency (44 kHz, 22 kHz or 11 kHz), the bit depth (8 or 16 bits) and the mono/stereo setting selected through a sampling frequency setting module. It should be noted that the file format conversion module 14 may also use other audio waveform formats, such as ".au", ".snd", ".voc", ".aiff", ".afc", ".iff" or ".mat". Because these audio waveform formats are prior art, they are not described further here.
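To make the three conversion parameters concrete, the following Python sketch writes an in-memory ".WAV" file with a chosen sampling frequency, bit depth and channel count using the standard `wave` module, then reads the header back. It is only an illustration under stated assumptions: the function names are hypothetical, the audio is silence rather than decoded speech, and decoding ".mp3", ".wma" or ".rm" input (which needs an external decoder) is outside its scope. The nominal rates 44 kHz, 22 kHz and 11 kHz are assumed here to mean the common PCM rates 44100 Hz, 22050 Hz and 11025 Hz.

```python
import io
import wave


def write_silent_wav(sample_rate: int, bits: int, channels: int,
                     seconds: float) -> bytes:
    """Build an in-memory .WAV file of silence with the given parameters."""
    if bits not in (8, 16):
        raise ValueError("bits must be 8 or 16")
    if channels not in (1, 2):
        raise ValueError("channels must be 1 (mono) or 2 (stereo)")
    sample_width = bits // 8                 # bytes per sample
    frame_count = int(sample_rate * seconds)
    # 8-bit WAV PCM is unsigned (silence is 0x80); 16-bit is signed (0x00).
    silence = b"\x80" if bits == 8 else b"\x00"
    frames = silence * (frame_count * sample_width * channels)
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(frames)
    return buffer.getvalue()


def describe_wav(data: bytes):
    """Return (sample rate in Hz, bit depth, channels, frame count)."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        return (wav.getframerate(), wav.getsampwidth() * 8,
                wav.getnchannels(), wav.getnframes())
```

The uncompressed data rate follows directly from the three settings: sampling frequency × (bit depth / 8) × channels bytes per second, so lower settings trade audio quality for storage.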

The post-production module 16 provides a production interface and tools for post-production processing of the voice resources after the file format conversion module 14 has converted them into the preset format. In this embodiment, the post-production module 16 lets the user perform, through the personal computer 2, post-production processing that includes at least breakpoint indexes, time intervals, original-text subtitles and translated subtitles. The time interval is used to cut a voice resource into at least one segment. The breakpoint index is used to set an index name for each segment produced by the cutting, for the user to retrieve. The original-text subtitles allow the user to import and set original-text subtitles corresponding to the voice data, which are displayed synchronously during playback of the voice resource for the user's reference. The translated subtitles allow the user to import and set translated subtitles corresponding to the voice data, which are likewise displayed synchronously during playback for the user's reference. Preferably, the original-text subtitles and the translated subtitles can be set to be displayed together during playback of the voice resource, to increase the learning efficiency of learners, beginners in particular.
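One possible way to represent the result of this post-production step is sketched below in Python. The patent does not specify a data structure, so the `Segment` class, its field names and both helper functions are hypothetical; the sketch only shows how time intervals, breakpoint index names and the two subtitle tracks could be tied together.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    index_title: str          # breakpoint index: name used to retrieve the segment
    start: float              # segment start, in seconds from the beginning
    end: float                # segment end, in seconds
    original_text: str = ""   # original-text subtitle shown during playback
    translation: str = ""     # translated subtitle shown during playback


def cut_by_breakpoints(duration: float, breakpoints: List[float],
                       titles: List[str]) -> List[Segment]:
    """Cut a voice resource of the given duration at the breakpoints."""
    inner = sorted(b for b in breakpoints if 0.0 < b < duration)
    bounds = [0.0] + inner + [duration]
    spans = list(zip(bounds, bounds[1:]))
    if len(titles) != len(spans):
        raise ValueError("one index title is needed per segment")
    return [Segment(title, start, end)
            for title, (start, end) in zip(titles, spans)]


def segment_at(segments: List[Segment], position: float) -> Optional[Segment]:
    """Find the segment whose subtitles should be shown at a playback position."""
    for segment in segments:
        if segment.start <= position < segment.end:
            return segment
    return None
```

During playback, a player could call `segment_at` with the current position and display `original_text` and `translation` side by side, which corresponds to the preferred synchronized display of both subtitle tracks.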

The database 18 stores the voice resources that have undergone post-production processing. In this embodiment, after the post-production module 16 has post-processed a voice resource, the database 18 may be set up in the hard disk device, disc storage device or external storage unit of the personal computer 2 to store the processed voice resource, so that it is not confused with the original voice resource that the resource access module 12 accessed from the voice resource generator. The stored voice resource may, for example, have undergone post-production processing such as breakpoint indexes, time intervals, original-text subtitles and translated subtitles.
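As an illustration of keeping post-processed resources apart from the raw ones, the sketch below stores segment records in a separate SQLite database through Python's standard `sqlite3` module. The patent names no database technology, so the choice of SQLite, the table layout and the function names are all assumptions.

```python
import sqlite3


def open_database(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the store for post-processed voice resources."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS segments (
               resource      TEXT NOT NULL,
               index_title   TEXT NOT NULL,
               start_s       REAL NOT NULL,
               end_s         REAL NOT NULL,
               original_text TEXT,
               translation   TEXT,
               PRIMARY KEY (resource, index_title)
           )"""
    )
    return conn


def store_segment(conn, resource, index_title, start_s, end_s,
                  original_text="", translation=""):
    """Insert or overwrite one post-processed segment record."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO segments VALUES (?, ?, ?, ?, ?, ?)",
            (resource, index_title, start_s, end_s, original_text, translation),
        )


def find_segment(conn, resource, index_title):
    """Retrieve a segment by its breakpoint index name, or None."""
    return conn.execute(
        "SELECT start_s, end_s, original_text, translation "
        "FROM segments WHERE resource = ? AND index_title = ?",
        (resource, index_title),
    ).fetchone()
```

Passing a file path on the hard disk or an external storage unit instead of `":memory:"` would place the database on that device.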

Fig. 2 shows the flow of the voice file generating method of the present invention.

In step S201, the resource access module 12 connects to a voice resource generator according to a set resource path and accesses voice resources according to an access condition. In this embodiment, the resource path used by the resource access module 12 may, for example, lead to a hard disk device or disc storage device in the personal computer 2, or to an external storage unit such as a USB flash drive or a card reader device. It may also be a resource address conforming to a Uniform Resource Locator (URL) scheme, pointing to a resource generator such as a web server or file server.

In addition, the resource access module 12 may provide an input interface. When the user enters one of the above resource paths into this input interface through the personal computer 2, the module connects to the resource generator according to that resource path and accesses the resources, in particular the voice resources, that the resource generator provides. The resource access module 12 may also store the accessed voice resources in the hard disk device, disc storage device and/or external storage unit of the personal computer 2. The method then proceeds to step S202.

In step S202, the file format conversion module 14 converts the accessed voice resources into a preset file format. In this embodiment, the preset voice resource file format is ".WAV", a digital audio file format commonly used on personal computers. Therefore, when the resource access module 12 accesses a voice resource in a voice file format other than ".WAV", that voice resource is converted into the ".WAV" file format.

In addition, when the file format conversion module 14 converts the original audio and the input audio into waveform signals, it can apply the sampling frequency (44 kHz, 22 kHz or 11 kHz), the bit depth (8 or 16 bits) and the mono/stereo setting selected through a sampling frequency setting module. The method then proceeds to step S203.

In step S203, the post-production module 16 provides a production interface and tools, and the voice resources that the file format conversion module 14 has converted into the preset format undergo post-production processing. In this embodiment, the post-production module 16 lets the user perform, through the personal computer 2, post-production processing that includes at least breakpoint indexes, time intervals, original-text subtitles and translated subtitles. The time interval is used to cut a voice resource into at least one segment. The breakpoint index is used to set an index name for each segment produced by the cutting, for the user to retrieve. The original-text subtitles allow the user to import and set original-text subtitles corresponding to the voice data, which are displayed synchronously during playback of the voice resource for the user's reference. The translated subtitles allow the user to import and set translated subtitles corresponding to the voice data, which are likewise displayed synchronously during playback for the user's reference. Preferably, the original-text subtitles and the translated subtitles can be set to be displayed together during playback of the voice resource, to increase the learning efficiency of learners, beginners in particular. The method then proceeds to step S204.

In step S204, the database 18 stores the voice resources that have undergone post-production processing. In this embodiment, after the post-production module 16 has post-processed a voice resource, the database 18 may be set up in the hard disk device, disc storage device or external storage unit of the personal computer 2 to store the processed voice resource, so that it is not confused with the original voice resource that the resource access module 12 accessed from the voice resource generator. The stored voice resource may, for example, have undergone post-production processing such as breakpoint indexes, time intervals, original-text subtitles and translated subtitles.
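The order of steps S201 to S204 can be summarized in a short Python sketch. It is a schematic outline only: the function `generate_voice_file` and its four callable parameters are hypothetical stand-ins for the four modules, and the lambdas in the example are stubs rather than real access, conversion, post-production or storage logic.

```python
from typing import Callable, List


def generate_voice_file(resource_path: str,
                        access: Callable[[str], bytes],
                        convert: Callable[[bytes], bytes],
                        post_produce: Callable[[bytes], dict],
                        store: Callable[[dict], None]) -> dict:
    """Run the four steps in order, passing each result to the next step."""
    raw = access(resource_path)      # S201: access the voice resource
    wav = convert(raw)               # S202: convert to the preset format
    produced = post_produce(wav)     # S203: post-production processing
    store(produced)                  # S204: store in the database
    return produced


# Example wiring with stub callables (placeholders, not real implementations).
saved: List[dict] = []
result = generate_voice_file(
    "/media/usb/lesson1.mp3",
    access=lambda path: b"raw-audio",
    convert=lambda raw: b"WAV:" + raw,
    post_produce=lambda wav: {"audio": wav, "segments": ["Greeting"]},
    store=saved.append,
)
```

Passing the modules in as callables keeps each step replaceable, which mirrors the way the description treats the four modules as separate components.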

In summary, the voice file generating system and method of the present invention provide a post-production mechanism for voice files, with which the learner can set up the relevant voice learning environment according to his or her own learning conditions or needs. The user can turn the accessed voice resources into voice learning resources that meet particular requirements, obtaining a personalized voice learning environment and thereby increasing learning efficiency.

Claims

1. A voice file generating system applied in a data processing device, characterized in that the voice file generating system comprises:

a resource access module, which connects to a voice resource generator according to a set resource path and accesses voice resources according to an access condition;

a file format conversion module, which converts the accessed voice resources into a preset file format;

a post-production module, which provides a production interface and tools for post-production processing of the voice resources that conform to the preset format; and

a database, which stores the voice resources that have undergone post-production processing.

2. The system as claimed in claim 1, characterized in that the resource path connects to one of the following resource generators: a hard disk device, a disc storage device, an external storage unit, and a data processing device at a resource address conforming to a Uniform Resource Locator (URL) scheme.

3. The system as claimed in claim 1, characterized in that the resource access module also provides an input interface, and the resource path is entered into the input interface through the data processing device.

4. The system as claimed in claim 1, characterized in that the resource access module also stores the accessed voice resources in one of a hard disk device, a disc storage device and an external storage unit of the data processing device.

5. The system as claimed in claim 1, characterized in that the preset file format is one of the ".WAV", ".au", ".snd", ".voc", ".aiff", ".afc", ".iff" and ".mat" formats.

6. The system as claimed in claim 5, characterized in that the file format conversion module converts voice resources in a voice file format other than the preset file format into the preset file format.

7. The system as claimed in claim 6, characterized in that the voice file format other than the preset file format is one of ".mp3", ".wma" and ".rm".

8. The system as claimed in claim 1, characterized in that the post-production module provides the user, through the data processing device, with post-production processing that includes at least one of a breakpoint index, a time interval, original-text subtitles and translated subtitles.

9. The system as claimed in claim 2, characterized in that the storage unit is arranged in one of the hard disk device, the disc storage device and the external storage unit.

10. A voice file generating method applied in a data processing device, the voice file generating method comprising:

providing a resource access module that connects to a voice resource generator according to a set resource path and accesses voice resources according to an access condition;

providing a file format conversion module that converts the accessed voice resources into a preset file format;

providing a post-production module that supplies a production interface and tools for post-production processing of the voice resources that conform to the preset format; and

providing a database that stores the voice resources that have undergone post-production processing.

11. The method as claimed in claim 10, characterized in that the resource path connects to one of the following resource generators: a hard disk device, a disc storage device, an external storage unit, and a resource generator at a resource address conforming to a Uniform Resource Locator (URL) scheme.

12. The method as claimed in claim 10, characterized in that the resource access module also provides an input interface, and the resource path is entered into the input interface through the data processing device.

13. The method as claimed in claim 10, characterized in that the resource access module also stores the accessed voice resources in one of a hard disk device, a disc storage device and an external storage unit of the data processing device.

14. The method as claimed in claim 10, characterized in that the preset file format is one of the ".WAV", ".au", ".snd", ".voc", ".aiff", ".afc", ".iff" and ".mat" formats.

15. The method as claimed in claim 14, characterized in that the file format conversion module converts voice resources in a voice file format other than the preset file format into the preset file format.

16. The method as claimed in claim 15, characterized in that the voice file format other than the preset file format is one of ".mp3", ".wma" and ".rm".

17. The method as claimed in claim 10, characterized in that the post-production module provides the user, through the data processing device, with post-production processing that includes at least one of a breakpoint index, a time interval, original-text subtitles and translated subtitles.

18. The method as claimed in claim 11, characterized in that the storage unit is arranged in one of the hard disk device, the disc storage device and the external storage unit.