CN105261356A - Voice recognition system and method

Info

Publication number: CN105261356A
Application number: CN201510728467.2A
Authority: CN
Inventors: 范浩
Original assignee: GUILIN XINTONG TECHNOLOGY Co Ltd
Current assignee: GUILIN XINTONG TECHNOLOGY Co Ltd
Priority date: 2015-10-30
Filing date: 2015-10-30
Publication date: 2016-01-20

Abstract

The invention relates to a voice recognition system and method. The system comprises an acquisition module, a conversion module, an extraction module, a matching module, an execution module, and a voice database. The acquisition module collects the voice information to be recognized; the conversion module converts the voice information to be recognized into first standard audio information that the extraction module can recognize; the extraction module analyzes the first standard audio information and extracts a keyword from it; the matching module calls a target command word and matches it against the keyword in the first standard audio information and, if the match succeeds, sends the corresponding target command word to the execution module; and the execution module receives the target command word and executes the corresponding target action. With this voice recognition system and method, the recognition rate and recognition accuracy of a voice signal are improved, the corresponding target action is executed by the execution module, and voice control is automated. User experience and operating efficiency are improved while the quality of voice recognition is maintained.

Description

Speech recognition system and method

Technical field

The present invention relates to the technical field of speech recognition, and in particular to a speech recognition system and method.

Background art

Speech recognition technology enables a machine to identify and understand the sounds, syllables, or phrases uttered by a person and convert them into corresponding text or symbols, or to respond to them, for example by executing a control action or giving an answer. It is applied very widely and touches almost every area of daily life, such as computer control, industrial control, and information network queries.

Speech recognition systems can be divided into many types according to the requirements placed on them. By recognition object, they are divided into isolated-word recognition, connected-word recognition, and continuous speech recognition. By the range of speakers, they are divided into speaker-dependent and speaker-independent systems. By recognition method, they mainly include template matching, probabilistic models, and artificial neural networks. A speech recognition system usually has a vocabulary and recognizes only the entries contained in that vocabulary. The prior art is essentially semi-automatic: the follow-up action requires manual participation, so efficiency is low. In addition, most prior art performs recognition only once, which lowers the recognition rate and also affects recognition accuracy.

Summary of the invention

The technical problem to be solved by the present invention is to provide a speech recognition system and method that address the above deficiencies of the prior art.

The technical solution by which the present invention solves the above technical problem is as follows:

According to one aspect of the present invention, a speech recognition system is provided, comprising an acquisition module, a conversion module, an extraction module, a matching module, an execution module, and a speech database. The acquisition module collects the voice information to be recognized. The conversion module converts the voice information to be recognized into first standard audio information that the extraction module can recognize. The extraction module parses the first standard audio information and extracts the keyword from it. The matching module calls the target command words pre-stored in the speech database and matches them against the keyword in the first standard audio information; if the match succeeds, it sends the corresponding target command word to the execution module. The execution module receives the target command word and executes the corresponding target action. The speech database stores the configured target command words.

According to another aspect of the present invention, a speech recognition method is provided, comprising:

Step 1: collecting voice information to be recognized;

Step 2: converting the voice information to be recognized into recognizable first standard audio information;

Step 3: parsing the first standard audio information and extracting the keyword from it;

Step 4: calling the target command words pre-stored in the speech database and matching them against the keyword in the first standard audio information and, if the match succeeds, sending the corresponding target command word to the execution module;

Step 5: the execution module receiving the target command word and executing the corresponding target action.
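The five steps above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the implementation described in the patent: a text string stands in for the first standard audio information, and the names `COMMAND_WORDS`, `extract_keywords`, `match_command`, and `recognize` are hypothetical.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical speech database: command word -> action to execute.
COMMAND_WORDS: Dict[str, Callable[[], str]] = {
    "open": lambda: "door opened",
    "close": lambda: "door closed",
}


def extract_keywords(standard_audio: str) -> List[str]:
    """Step 3 stand-in: treat the converted input as text and split it into lowercase words."""
    return standard_audio.lower().split()


def match_command(
    keywords: List[str], database: Dict[str, Callable[[], str]]
) -> Optional[str]:
    """Step 4: return the first stored command word found among the keywords."""
    for command in database:
        if command in keywords:
            return command
    return None


def recognize(standard_audio: str) -> Optional[str]:
    """Steps 3 to 5: extract keywords, match, and execute the action on success."""
    command = match_command(extract_keywords(standard_audio), COMMAND_WORDS)
    if command is None:
        return None  # matching failed, so nothing is executed
    return COMMAND_WORDS[command]()
```

Keeping the database as a mapping from command word to callable lets the execution step stay a single lookup; a real system would match on acoustic features rather than on text.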

The beneficial effects of the present invention are as follows. By converting the collected voice signal to be recognized and extracting keywords from it, the speech recognition system and method improve the recognition rate and recognition accuracy of the voice signal. The corresponding target action is executed by the execution module, which makes voice control automated and intelligent. While speech recognition quality is maintained, the flexibility of the recognition system is greatly increased, and user experience and operating efficiency are improved.

Brief description of the drawings

Fig. 1 is a structural diagram of the speech recognition system of the present invention;

Fig. 2 is a flowchart of the speech recognition method of the present invention.

Detailed description of the embodiments

The principles and features of the present invention are described below with reference to the accompanying drawings. The examples serve only to explain the invention and are not intended to limit its scope.

Embodiment one: a speech recognition system. The speech recognition system of the present invention is described in detail below with reference to Fig. 1.

As shown in Fig. 1, the speech recognition system comprises an acquisition module, a conversion module, an extraction module, a matching module, an execution module, and a speech database.

The acquisition module collects the voice information to be recognized. The conversion module converts the voice information to be recognized into first standard audio information that the extraction module can recognize. The extraction module parses the first standard audio information and extracts the keyword from it. The matching module calls the target command words pre-stored in the speech database and matches them against the keyword in the first standard audio information; if the match succeeds, it sends the corresponding target command word to the execution module. The execution module receives the target command word and executes the corresponding target action. The speech database stores the configured target command words.

The speech recognition system of this embodiment further comprises a pre-processing module. After the acquisition module collects the voice information to be recognized, the pre-processing module performs analog-to-digital conversion, amplification, anti-aliasing filtering, and pre-emphasis on it and sends the pre-processed signal to the conversion module. The pre-processing module optimizes the voice signal collected by the acquisition module and removes its unwanted components, which makes recognition by the subsequent conversion module easier and improves recognition efficiency and accuracy.
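Of the pre-processing operations named above, pre-emphasis is the easiest to show in code. The sketch below uses the common first-order filter y[n] = x[n] - alpha * x[n-1]; the patent does not specify a filter or coefficient, so both the formula and the default `alpha = 0.97` are assumptions, and the function name `pre_emphasis` is hypothetical.

```python
from typing import List, Sequence


def pre_emphasis(samples: Sequence[float], alpha: float = 0.97) -> List[float]:
    """Apply y[n] = x[n] - alpha * x[n-1]; the first sample passes through unchanged.

    The filter form and the default alpha are assumptions, not values from the patent.
    """
    if not samples:
        return []
    out = [float(samples[0])]
    # Pair each sample with its predecessor and subtract the scaled predecessor.
    for prev, cur in zip(samples, samples[1:]):
        out.append(cur - alpha * prev)
    return out
```

A constant input is attenuated after the first sample, which shows the high-pass character of the filter: it boosts rapid changes relative to slowly varying content.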

Preferably, the speech recognition system of this embodiment further comprises a supplementary acquisition module, which collects supplementary voice information when the matching module fails to find a match. The pre-processing module pre-processes the supplementary voice information, the conversion module converts the pre-processed supplementary voice information into second standard audio information that the extraction module can recognize, and the extraction module and the matching module are then called in turn. The supplementary acquisition module raises the success rate of speech recognition: unlike a traditional recognition system, the system of this embodiment can perform supplementary recognition when the matching module fails, which is of considerable value in practical applications.

Preferably, if the matching module fails to match the keyword in the second standard audio information against the target command words pre-stored in the speech database, the next round of matching is repeated; when the number of matching failures reaches a preset threshold, the system reports that the input cannot be recognized. This further raises the success rate of speech recognition. In practice, supplementary speech recognition can itself fail, and this arrangement greatly improves the recognition success rate for supplementary voice signals.
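The retry behaviour described above amounts to a bounded loop: keep collecting supplementary input until a match succeeds or the failure count reaches the threshold. The sketch below is illustrative only; `recognize_with_retries`, the `match` callback, and the default threshold of 3 are assumptions, since the patent gives no concrete threshold value.

```python
from typing import Callable, Iterable, Optional


def recognize_with_retries(
    utterances: Iterable[str],
    match: Callable[[str], Optional[str]],
    max_failures: int = 3,  # hypothetical preset threshold
) -> Optional[str]:
    """Try each utterance in turn; stop at the first match or after max_failures failures."""
    failures = 0
    for utterance in utterances:
        command = match(utterance)
        if command is not None:
            return command
        failures += 1
        if failures >= max_failures:
            break
    return None  # the caller should report that the input cannot be recognized
```

Passing the utterances as an iterable means a generator that records from a microphone on demand could be supplied, so no supplementary input is collected once a match has succeeded.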

Preferably, the target command words stored in the speech database are sorted in descending order of the number of successful matches. For a specific speech recognition system, analysis of earlier recognition data shows that certain target command words are matched successfully more often than others, that is, the user performs certain target actions more frequently. Sorting the stored target command words in descending order of successful matches therefore improves the recognition efficiency of the system, shortens recognition time, and improves the user experience.
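One way to picture the descending sort is a store that counts successful matches and reorders its command words after each hit, so that frequently used commands are tested first. This is a sketch under assumptions: the class `CommandStore` and its fields are hypothetical, and the patent does not say when or how the sort is performed.

```python
from collections import Counter
from typing import List, Optional


class CommandStore:
    """Hypothetical speech database that keeps command words ordered by successful matches."""

    def __init__(self, commands: List[str]) -> None:
        self.commands = list(commands)
        self.hits: Counter = Counter()

    def match(self, keywords: List[str]) -> Optional[str]:
        """Scan commands in their current order; on success, count the hit and re-sort."""
        for command in self.commands:
            if command in keywords:
                self.hits[command] += 1
                # Stable sort: commands with equal counts keep their previous relative order.
                self.commands.sort(key=lambda c: self.hits[c], reverse=True)
                return command
        return None
```

With a linear scan, the expected number of comparisons drops when the most frequent commands sit at the front, which is the efficiency gain the paragraph describes.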

Embodiment two: a speech recognition method. The speech recognition method of the present invention is described in detail below with reference to Fig. 2.

As shown in Fig. 2, the speech recognition method comprises:

Step 1: collecting voice information to be recognized;

In this embodiment, before step 2 is executed, analog-to-digital conversion, amplification, anti-aliasing filtering, and pre-emphasis are also performed on the voice information to be recognized. This pre-processing optimizes the collected voice signal and removes its unwanted components, which makes the subsequent conversion easier and improves recognition efficiency and accuracy.

Preferably, in step 4, when matching the target command words pre-stored in the speech database against the keyword in the first standard audio information fails, supplementary voice information is collected and pre-processed, and the method returns to step 2: the pre-processed supplementary voice information is converted into recognizable second standard audio information, and the extraction of step 3 and the matching of step 4 are then performed in turn. Collecting supplementary voice information raises the success rate of speech recognition: unlike a traditional recognition system, the method of this embodiment can perform supplementary matching when the first match fails, which is of considerable value in practical applications.

Preferably, if matching the keyword in the second standard audio information against the target command words pre-stored in the speech database fails, the next round of matching is repeated; when the number of matching failures reaches a preset threshold, the method reports that the input cannot be recognized. This further raises the success rate of speech recognition. In practice, supplementary speech recognition can itself fail, and this arrangement greatly improves the recognition success rate for supplementary voice signals.

By converting the collected voice signal to be recognized and extracting keywords from it, the speech recognition system and method of the present invention improve the recognition rate and recognition accuracy of the voice signal. The corresponding target action is executed by the execution module, which makes voice control automated and intelligent. While speech recognition quality is maintained, the flexibility of the recognition system is greatly increased, and user experience and operating efficiency are improved.

The foregoing describes only preferred embodiments of the present invention and is not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims

1. A speech recognition system, characterized in that it comprises an acquisition module, a conversion module, an extraction module, a matching module, an execution module, and a speech database;

the acquisition module is used for collecting voice information to be recognized;

the conversion module is used for converting the voice information to be recognized into first standard audio information that the extraction module can recognize;

the extraction module is used for parsing the first standard audio information and extracting the keyword from the first standard audio information;

the matching module is used for calling the target command words pre-stored in the speech database and matching them against the keyword in the first standard audio information and, if the match succeeds, sending the corresponding target command word to the execution module;

the execution module receives the target command word and executes the corresponding target action;

the speech database is used for storing the configured target command words.

2. The speech recognition system according to claim 1, characterized in that it further comprises a pre-processing module, the pre-processing module being used for performing analog-to-digital conversion, amplification, anti-aliasing filtering, and pre-emphasis on the voice information to be recognized after the acquisition module collects it, and for sending the pre-processed signal to the conversion module.

3. The speech recognition system according to claim 2, characterized in that it further comprises a supplementary acquisition module, the supplementary acquisition module being used for collecting supplementary voice information when the matching module fails to find a match; the pre-processing module pre-processes the supplementary voice information, the conversion module then converts the pre-processed supplementary voice information into second standard audio information that the extraction module can recognize, and the extraction module and the matching module are called in turn.

4. The speech recognition system according to claim 3, characterized in that, if the matching module fails to match the keyword in the second standard audio information against the target command words pre-stored in the speech database, the next round of matching is repeated, and when the number of matching failures reaches a preset threshold, a prompt is given that the input cannot be recognized.

5. The speech recognition system according to any one of claims 1 to 4, characterized in that the target command words stored in the speech database are sorted in descending order of the number of successful matches.

6. A speech recognition method, characterized in that it comprises:

Step 1: collecting voice information to be recognized;

7. The speech recognition method according to claim 6, characterized in that, before step 2 is executed, analog-to-digital conversion, amplification, anti-aliasing filtering, and pre-emphasis are also performed on the voice information to be recognized.

8. The speech recognition method according to claim 7, characterized in that, in step 4, when matching the target command words pre-stored in the speech database against the keyword in the first standard audio information fails, supplementary voice information is collected and pre-processed, and the method returns to step 2, where the pre-processed supplementary voice information is converted into recognizable second standard audio information, after which the extraction of step 3 and the matching of step 4 are performed in turn.

9. The speech recognition method according to claim 8, characterized in that, if matching the keyword in the second standard audio information against the target command words pre-stored in the speech database fails, the next round of matching is repeated, and when the number of matching failures reaches a preset threshold, a prompt is given that the input cannot be recognized.

10. The speech recognition method according to any one of claims 6 to 9, characterized in that the target command words stored in the speech database are sorted in descending order of the number of successful matches.