CN104092829A

CN104092829A - Voice calling method and access gateway based on voice recognition

Info

Publication number: CN104092829A
Application number: CN201410347198.0A
Authority: CN
Inventors: 商琦; 曹纪清; 伏玉琛
Original assignee: Suzhou Industrial Park Institute of Services Outsourcing
Current assignee: Suzhou Industrial Park Institute of Services Outsourcing
Priority date: 2014-07-21
Filing date: 2014-07-21
Publication date: 2014-10-08

Abstract

The embodiments of the invention provide a voice calling method and an access gateway based on voice recognition. In the method, the access gateway acquires a calling voice input by a user, the calling voice comprising called party information; the access gateway then recognizes a called number from the calling voice and executes a voice call. Because the access gateway recognizes the user's calling voice and obtains the called number from it, voice communication with the called party can be carried out on the basis of the recognized number. Since the user does not need to press, one by one, the series of telephone keys corresponding to the called number, the voice calling process is simplified and the called number can be called quickly. At the same time, because the user does not have to operate the keys, the method can meet the needs of specific groups such as people with disabilities or elderly people with limited dexterity or mobility, and user experience is improved.

Description

Voice calling method and access gateway based on voice recognition

Technical field

The embodiments of the present invention relate to the field of communication technology, and in particular to a voice calling method and an access gateway based on voice recognition.

Background art

In recent years, with the "Broadband China" and "last kilometer" construction vigorously promoted by the state, large numbers of access gateways have come into use. In 2013 the Ministry of Industry and Information Technology required that newly built residential communities implement fiber to the home, and the access gateway, as the last kilometer, has become ever more closely tied to end users. Voice service in particular, as a basic service provided by operators, is currently the most widely used. Take the home gateway as an example: it is closely connected with the user's terminal equipment and connects all terminal devices inside the home with all external access networks. For example, if a household wants to make a voice call to an outside terminal using a telephone in the home, the telephone must first access the network through the home gateway.

In the prior art, when a telephone makes a voice call through a home gateway, the user first goes off-hook and dials the called number by pressing the telephone keys; the telephone then initiates a voice call request through the home gateway; finally the called terminal is connected, so that a voice call with the called party is established.

The above voice calling technique has the following defects. Because the user must dial the called number with the telephone keys, pressing one by one the series of keys corresponding to the called number, the operation is tedious and error-prone. This is especially so when the called party is a mobile phone user or a user in another city or abroad, whose number typically has 11 or more digits and requires 11 or more key presses. Moreover, once a wrong key is pressed, the user must hang up, go off-hook and dial again, which is inconvenient and time-consuming. In addition, key-based dialing cannot meet the needs of specific groups such as people with disabilities or elderly people with limited dexterity or mobility.

Summary of the invention

The embodiments of the present invention provide a voice calling method and an access gateway based on voice recognition, so as to simplify the voice calling process, achieve fast calling of the called number, and improve user experience.

In a first aspect, an embodiment of the present invention provides a voice calling method based on voice recognition, comprising:

acquiring a calling voice input by a user, the calling voice comprising called party information;

recognizing a called number according to the calling voice, and executing a voice call.

In a second aspect, an embodiment of the present invention further provides an access gateway based on voice recognition, comprising:

a calling voice acquisition module, configured to acquire a calling voice input by a user, the calling voice comprising called party information;

a voice call module, configured to recognize a called number according to the calling voice and execute a voice call.

In the voice calling method and access gateway based on voice recognition provided by the embodiments of the present invention, the access gateway recognizes the user's calling voice and obtains the called number from it, so that a voice call with the called party can be carried out on the basis of the recognized number. Because the user does not need to press, one by one, the series of telephone keys corresponding to the called number, the voice calling process is simplified and the called number can be called quickly. At the same time, because key operation is avoided, the method can meet the needs of specific groups such as people with disabilities or elderly people with limited dexterity or mobility, improving user experience.

Brief description of the drawings

To illustrate the present invention more clearly, the accompanying drawings used in the description are briefly introduced below. Evidently, the drawings described below show some embodiments of the present invention, and those of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a flowchart of a voice calling method based on voice recognition provided by Embodiment 1 of the present invention;

Fig. 2 is a flowchart of a voice calling method based on voice recognition provided by Embodiment 2 of the present invention;

Fig. 3 is a flowchart of a voice calling method based on voice recognition provided by Embodiment 3 of the present invention;

Fig. 4 is a flowchart of a voice calling method based on voice recognition provided by Embodiment 4 of the present invention;

Fig. 5 is a flowchart of a voice calling method based on voice recognition provided by Embodiment 5 of the present invention;

Fig. 6 is a schematic structural diagram of an access gateway based on voice recognition provided by Embodiment 6 of the present invention.

Detailed description

To make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments are described in further detail below with reference to the accompanying drawings. Evidently, the described embodiments are some rather than all of the embodiments of the present invention. It should be understood that the specific embodiments described here serve only to explain the present invention and do not limit it; all other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention. It should also be noted that, for ease of description, the drawings show only the parts relevant to the present invention rather than the full content.

Embodiment 1

Referring to Fig. 1, a flowchart of a voice calling method based on voice recognition provided by Embodiment 1 of the present invention is shown. The method of this embodiment may be executed by a voice calling apparatus implemented in hardware and/or software, which is typically configured in an access gateway device such as a home gateway.

As shown in Fig. 1, the method comprises:

Step 110: the access gateway acquires a calling voice input by a user, the calling voice comprising called party information.

Specifically, this step acquires the calling voice, containing called party information, that is input by the calling user. The calling user may input the calling voice through the telephone's microphone or by using the hands-free key, and the telephone sends the calling voice to the access gateway, which thereby receives it. The calling user may also input the calling voice through a microphone or audio amplifier device built into the access gateway.

The calling voice may take various forms, for example at least one of the following: natural speech of the called number, natural speech of the called name, and natural speech of the called short number.

Specifically, the natural speech of the called number contains the called party information, namely the called number. It is a common form of calling voice. For example, after going off-hook the calling user speaks the called party's mobile number "13012345678" into the telephone microphone; the spoken "13012345678" is the natural speech of the called number.

When the calling voice is natural speech of the called name, an electronic address book is configured in the access gateway in advance, and the electronic address book comprises a first mapping relation between called names and called numbers. Specifically, the natural speech of the called name contains the called party information, namely the called name.

For example, after going off-hook the calling user speaks the called party's name "Huang Xiaoming" into the telephone microphone; the spoken "Huang Xiaoming" is the natural speech of the called name.

The electronic address book is the carrier of called party information and reflects the association between the items of information about each called party. It may further include the called party's home address, e-mail address, and so on.

As an optional implementation, pre-configuring the electronic address book in the access gateway may specifically comprise:

importing the electronic address book into the access gateway; specifically, the electronic address book in a smartphone may be imported into the access gateway;

the access gateway parsing the electronic address book to obtain the first mapping relation, and storing the first mapping relation in a data area of the access gateway.

Optionally, the format of the electronic address book imported into the gateway may be VCF, vCard, CSV, doc or Excel. Preferably, the format is VCF, vCard or CSV, to increase the versatility of the electronic address book.
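The parsing step can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `parse_vcard` and `parse_csv`, the choice of the vCard `FN` and first `TEL` properties, and the CSV column names `name` and `number` are all assumptions made for illustration.

```python
import csv
import io


def parse_vcard(text):
    """Return {name: number} from vCard text, using the FN and first TEL of each card."""
    mapping = {}
    name = number = None
    for raw in text.splitlines():
        key, _, value = raw.strip().partition(":")
        prop = key.split(";")[0].upper()  # drop parameters such as TYPE=CELL
        if prop == "BEGIN":
            name = number = None
        elif prop == "FN":
            name = value
        elif prop == "TEL" and number is None:
            number = value.replace("-", "").replace(" ", "")
        elif prop == "END" and name and number:
            mapping[name] = number
    return mapping


def parse_csv(text):
    """Return {name: number} from CSV text with hypothetical 'name' and 'number' columns."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["name"]: row["number"] for row in reader}


# Illustrative input only: one contact exported as a vCard.
sample_vcf = (
    "BEGIN:VCARD\nVERSION:3.0\nFN:Huang Xiaoming\n"
    "TEL;TYPE=CELL:130-1234-5678\nEND:VCARD\n"
)
first_mapping = parse_vcard(sample_vcf)
```

The resulting dictionary plays the role of the first mapping relation; a real gateway would persist it in its data area rather than keep it in memory.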

Preferably, after the access gateway parses the electronic address book and obtains the first mapping relation, the method may further comprise: the access gateway sending the parsed electronic address book to the telephone in advance.

In other words, after going off-hook the calling user can browse the electronic address book shown on the telephone's display screen using the page-up and/or page-down keys to determine the called party, and then input the natural speech of the called name, for example "Huang Xiaoming", through the telephone handset.

When the calling voice is natural speech of the called short number, a second mapping relation between called numbers and called short numbers is preset in the access gateway; the called short number may have 1 to 3 digits.

The called short number is suited to abbreviated dialing. Specifically, the user may assign a called short number to a called number according to how frequently that number is called. For example, based on the user's habits, the frequently called number "13012345678" may be assigned the called short number "01"; when the user goes off-hook and speaks "01" into the telephone microphone, the spoken "01" is the natural speech of the called short number.

The above description takes a 2-digit called short number as an example. Note that a 3-digit called short number must not conflict with the 3-digit codes of existing common services; that is, it must not be set to a number such as "110", "119" or "120". Note also that when the called party is a mobile phone user or a user in another city or abroad, the called number typically has 11 or more digits, so a called short number of preferably 1 to 3 digits makes it easier for the user to place the call.
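The constraints on the called short number can be sketched as a small validation routine. This is a hedged illustration: `add_short_number` and `RESERVED_CODES` are hypothetical names, and the reserved set contains only the three codes named in the text, whereas a deployed gateway would need the complete list of service codes.

```python
RESERVED_CODES = {"110", "119", "120"}  # only the codes named in the text; assumed incomplete


def add_short_number(second_mapping, short, called_number):
    """Register short -> called_number if the short number satisfies the constraints."""
    if not (short.isascii() and short.isdigit() and 1 <= len(short) <= 3):
        raise ValueError("short number must be 1 to 3 digits")
    if short in RESERVED_CODES:
        raise ValueError("short number conflicts with a service code")
    second_mapping[short] = called_number
    return second_mapping


second_mapping = add_short_number({}, "01", "13012345678")
```

Rejecting invalid entries at configuration time keeps the lookup performed during a call a plain dictionary access.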

Step 120: the access gateway recognizes the called number according to the calling voice, and executes a voice call.

Specifically, this step obtains the called number through speech recognition and then executes a voice call to that number, so as to establish a voice call with the called party.

In the technical solution of this embodiment, the access gateway recognizes the user's calling voice and obtains the called number from it, and the voice call is carried out on the basis of the recognized number. Because the user does not need to press, one by one, the series of telephone keys corresponding to the called number, the voice calling process is simplified and the called number can be called quickly. At the same time, because key operation is avoided, the method can meet the needs of specific groups such as people with disabilities or elderly people with limited dexterity or mobility, improving user experience.
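Once the calling voice has been recognized as text, turning it into a called number for the three forms of calling voice could look like the following sketch. The function name and the lookup order are assumptions: the source does not say how the gateway distinguishes a short number from a full number, so this version consults the two mappings first and falls back to treating an all-digit result as the called number itself.

```python
def resolve_called_number(recognized, address_book, short_numbers):
    """Map recognized text to a called number, or None if it cannot be resolved."""
    if recognized in address_book:      # called name -> first mapping relation
        return address_book[recognized]
    if recognized in short_numbers:     # called short number -> second mapping relation
        return short_numbers[recognized]
    if recognized.isdigit():            # full called number, used as recognized
        return recognized
    return None


address_book = {"Huang Xiaoming": "13012345678"}
short_numbers = {"01": "13012345678"}
```

Returning `None` for an unresolved result leaves it to the caller to decide how to prompt the user, which the source does not specify for this step.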

Embodiment 2

Referring to Fig. 2, a flowchart of a voice calling method based on voice recognition provided by Embodiment 2 of the present invention is shown. On the basis of the above embodiment, this embodiment provides a preferred method of recognizing the called number according to the calling voice. The method of this embodiment applies when the calling voice is natural speech of the called number, and may be executed by the access gateway.

As shown in Fig. 2, the method comprises:

Step 210: the access gateway acquires a calling voice input by a user, the calling voice comprising called party information.

Step 220: the access gateway performs analog-to-digital conversion on the calling voice, and then performs speech preprocessing.

Specifically, before speech preprocessing the calling voice undergoes analog-to-digital conversion, that is, the user's natural speech is converted from an analog signal into a digital electrical signal; speech preprocessing is then performed.

Preferably, the speech preprocessing comprises: digital filtering, pre-emphasis, windowing and framing, and endpoint detection.

Digital filtering uses the characteristics of a discrete-time system to filter the waveform of the digital electrical signal, so as to reduce noise and flatten the spectrum of the output signal.

Those skilled in the art will understand that at very high frequencies (above the GHz range) the attenuation of the high-frequency signal during transmission is pronounced. To compensate for this attenuation, signal pre-emphasis may be applied: the high-frequency part of the transmitted signal is boosted so that the amplitudes of the received high-frequency and low-frequency components are consistent.
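The source does not give a pre-emphasis formula. A commonly used form in speech processing is the first-order filter y[n] = x[n] - a·x[n-1]; the sketch below assumes that form, and the default coefficient 0.97 is a conventional choice rather than a value from the source.

```python
def pre_emphasis(samples, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1]; the first sample is passed through unchanged."""
    if not samples:
        return []
    out = [samples[0]]
    for prev, cur in zip(samples, samples[1:]):
        out.append(cur - alpha * prev)
    return out
```

A constant (low-frequency) input is strongly attenuated after the first sample, while rapid sample-to-sample changes pass through, which is the high-frequency boost the paragraph describes.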

Windowing makes a speech signal that is not inherently periodic exhibit periodic characteristics within each frame, and also helps avoid the Gibbs effect. The Gibbs effect is the phenomenon observed when a waveform with discontinuities is represented as a sum of harmonic components.
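Windowing and framing can be sketched as follows. The Hamming window is an assumed choice, since the source does not name a window function, and the frame length and hop are parameters; for example, at a hypothetical 8 kHz sampling rate a 20 ms frame would be 160 samples.

```python
import math


def hamming(n):
    """Return a Hamming window of length n."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def frame_and_window(samples, frame_len, hop):
    """Split samples into (possibly overlapping) frames and apply a Hamming window to each."""
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        frames.append([s * w for s, w in zip(frame, window)])
    return frames
```

Tapering each frame toward its edges reduces the abrupt discontinuities at frame boundaries that would otherwise cause spectral ripple.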

Those skilled in the art will understand that speech endpoint detection is one of the key techniques determining whether speech recognition is correct; it can improve recognition accuracy and reduce recognition time. Common speech endpoint detection methods include the energy method, the zero-crossing-rate method and the correlation coefficient method.
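As a minimal illustration of the energy method, the sketch below marks the first and last frames whose short-time energy exceeds a threshold; a zero-crossing-rate helper is included for comparison. The function names and the single fixed threshold are assumptions, and practical detectors usually combine several measures with adaptive thresholds.

```python
def short_time_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(s * s for s in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / max(len(frame) - 1, 1)


def detect_endpoints(frames, energy_threshold):
    """Return (first, last) indices of frames above the energy threshold, or None."""
    active = [i for i, f in enumerate(frames) if short_time_energy(f) > energy_threshold]
    if not active:
        return None
    return active[0], active[-1]
```

Frames outside the returned range can be discarded before feature extraction, which is how endpoint detection reduces recognition time.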

Step 230: the access gateway extracts speech features from the preprocessed calling voice.

Speech features mainly refer to speech feature parameters, including linear prediction cepstral coefficients (LPCC), Mel-frequency cepstral coefficients (MFCC) and wavelet-analysis feature parameters. These parameters are in essence computed frame by frame. That is, using short-time Fourier analysis the preprocessed speech signal is divided into frames of 10-20 ms, or the speech features are obtained through wavelet analysis.

Step 240: the access gateway matches the speech features in a preset speech model library, and determines the speech template corresponding to the speech features.

The speech model library contains both speech templates and a speech lookup table; the speech lookup table contains the speech templates and their corresponding fields. In this embodiment the corresponding fields are digits.

In this step, the speech template with the highest matching degree is preferably taken as the speech template corresponding to the speech features.

Step 250: the access gateway, according to the speech template, uses the speech lookup table in the preset speech library to obtain the called number, and executes a voice call.

Specifically, this step takes the best-matching speech template, looks up the field corresponding to that template in the speech lookup table, and uses the field as the called number. This yields the recognition result for the calling voice input by the user; a voice call is then executed to establish a voice call with the called party.

For example, if the calling voice input by the user is natural speech of "13012345678", the method of this embodiment matches the best speech template and, by querying the speech lookup table, finds that the field corresponding to the best template is "13012345678"; the called number is thus obtained.
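Template matching followed by a table lookup can be sketched as below. Several points are assumptions not stated in the source: the utterance is taken to be already segmented into one feature vector per spoken digit, the matching degree is modeled as Euclidean distance (smaller is better), and the template identifiers such as `t1` are hypothetical.

```python
import math


def best_template(feature, templates):
    """Return the id of the template closest to the feature vector (Euclidean distance)."""
    return min(templates, key=lambda tid: math.dist(feature, templates[tid]))


def recognize_digits(features, templates, lookup_table):
    """Match each per-digit feature vector and join the looked-up fields into a number."""
    return "".join(lookup_table[best_template(f, templates)] for f in features)


# Toy two-dimensional templates; real feature vectors would be e.g. MFCC frames.
templates = {"t0": [0.0, 0.0], "t1": [1.0, 0.0], "t3": [0.0, 1.0]}
lookup_table = {"t0": "0", "t1": "1", "t3": "3"}
features = [[0.9, 0.1], [0.1, 0.8], [0.1, 0.1]]
recognized = recognize_digits(features, templates, lookup_table)
```

Keeping the templates and the lookup table as separate structures mirrors the description of the speech model library, where the same matching step can yield digits in one embodiment and other fields in another.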

This embodiment may use the offline recognition mode described above, in which the access gateway recognizes the calling voice using its own offline speech model library, or it may use an online recognition mode. The difference is that online recognition requires a connection between the access gateway and a server that provides a speech recognition function; the server recognizes the calling voice, and the access gateway then executes the voice call on the basis of the recognition result.

In the technical solution of this embodiment, after acquiring the user's calling voice the access gateway extracts features from the calling voice that has undergone analog-to-digital conversion and speech preprocessing, and performs feature matching. The user's calling voice is thereby recognized, the called number is obtained from it according to the matching result, and the voice call is carried out on the basis of the recognized number. Because the user does not need to press, one by one, the series of telephone keys corresponding to the called number, the voice calling process is simplified and the called number can be called quickly. At the same time, because key operation is avoided, the method can meet the needs of specific groups such as people with disabilities or elderly people with limited dexterity or mobility, improving user experience.

Embodiment 3

Referring to Fig. 3, a flowchart of a voice calling method based on voice recognition provided by Embodiment 3 of the present invention is shown. On the basis of the above embodiment, this embodiment provides a preferred scheme for obtaining the called number by using the speech lookup table in the preset speech library according to the speech template. The method of this embodiment applies when the calling voice is natural speech of the called name, and may be executed by the access gateway.

As shown in Fig. 3, the preferred method comprises:

Step 310: the access gateway, according to the speech template, uses the speech lookup table in the preset speech library to recognize the called name.

This embodiment differs from the above embodiment as follows. First, in this embodiment the calling voice input by the user is natural speech of the called name, whereas in the above embodiment it is natural speech of the called number. Second, in both embodiments the speech model library contains speech templates and a speech lookup table, and the speech lookup table contains the speech templates and their corresponding fields; in this embodiment the corresponding fields are Chinese characters, whereas in the above embodiment they are digits. Third, in this embodiment an electronic address book is configured in the access gateway in advance, and the electronic address book comprises the first mapping relation between called names and called numbers.

As an optional implementation, pre-configuring the electronic address book in the access gateway may specifically comprise: importing the electronic address book into the access gateway, where specifically the electronic address book in a smartphone may be imported; and the access gateway parsing the electronic address book to obtain the first mapping relation and storing the first mapping relation in a data area of the access gateway.

Further preferably, after the access gateway parses the electronic address book and obtains the first mapping relation, the method may further comprise: the access gateway sending the parsed electronic address book to the telephone in advance.

In other words, after going off-hook the calling user can browse the electronic address book shown on the telephone's display screen using the page-up and/or page-down keys to determine the called party, and then input the natural speech of the called name through the telephone handset.

Fourth, in this embodiment the recognition result of the access gateway is the called name corresponding to the calling voice, whereas in the above embodiment it is the called number corresponding to the calling voice.

Step 320: the access gateway uses the first mapping relation to obtain the called number, and executes a voice call.

Specifically, this step takes the best-matching speech template, looks up the field corresponding to that template in the speech lookup table, and uses the field as the called name. The first mapping relation stored in the access gateway is then used to obtain the called number, and a voice call is executed to establish a voice call with the called party.

It should be noted that the first mapping relation obtained by parsing is stored in the access gateway so that, after the access gateway recognizes the called name corresponding to the natural speech of the called name, it can obtain the called number corresponding to that name from the pre-stored first mapping relation and then execute the voice call.

Preferably, after the access gateway parses the electronic address book and obtains the first mapping relation, the method may further comprise: the access gateway sending the parsed electronic address book to the telephone in advance, so that after going off-hook the calling user can browse the electronic address book shown on the display screen using the page-up and/or page-down keys and thereby determine the called party.

Embodiment 4

Referring to Fig. 4, a flowchart of a voice calling method based on voice recognition provided by Embodiment 4 of the present invention is shown. On the basis of Embodiment 2, this embodiment provides a preferred scheme for obtaining the called number by using the speech lookup table in the preset speech library according to the speech template. The method of this embodiment applies when the calling voice is natural speech of the called short number, and may be executed by the access gateway.

As shown in Fig. 4, the preferred method comprises:

Step 410: the access gateway, according to the speech template, uses the speech lookup table in the preset speech library to recognize the called short number.

This embodiment differs from Embodiment 2 as follows. First, in this embodiment the calling voice input by the user is natural speech of the called short number, whereas in Embodiment 2 it is natural speech of the called number. Second, in this embodiment a second mapping relation between called numbers and called short numbers is preset in the access gateway, and the called short number may have 1 to 3 digits. Third, in this embodiment the recognition result of the access gateway is the called short number corresponding to the input calling voice, whereas in Embodiment 2 it is the called number corresponding to the input calling voice.

Step 420: the access gateway uses the second mapping relation to obtain the called number, and executes a voice call.

Specifically, this step takes the best-matching speech template, looks up the field corresponding to that template in the speech lookup table, and uses the field as the called short number. The second mapping relation stored in the access gateway is then used to obtain the called number, and a voice call is executed to establish a voice call with the called party.

It should be noted that the second mapping relation is preset in the access gateway so that, after the access gateway recognizes the called short number corresponding to the natural speech of the called short number, it can obtain the called number corresponding to that short number from the second mapping relation and then execute the voice call.

Preferably, the access gateway may send the second mapping relation to the telephone, that is, the second mapping relation is also preset in the telephone, so that after going off-hook the calling user can press the 1 to 3 keys of the short number on the telephone and place the call in the traditional key-dialing manner. In other words, calling by natural speech of the called short number can be used alongside traditional key dialing; the calling user can choose either according to habit and actual need, which simplifies the calling process and increases calling flexibility.

Embodiment 5

Referring to Fig. 5, a flowchart of a voice calling method based on voice recognition provided by Embodiment 5 of the present invention is shown. On the basis of the above embodiments, this embodiment provides a preferred scheme for acquiring the calling voice input by the user. As shown in Fig. 5, the preferred method comprises:

Step 510: the access gateway acquires a key value input by the user through the telephone keys.

Step 520: the access gateway matches the key value in a pre-configured speech recognition service key table and, if the match succeeds, triggers the operation of acquiring the calling voice input by the user.

The key value input by the user on the telephone is transmitted to the access gateway. If it matches the speech recognition service key table pre-configured in the access gateway, the access gateway triggers the process of acquiring and recognizing the user's calling voice, and treats the speech subsequently received from the telephone as the calling voice to be recognized. The access gateway may also explicitly send the telephone an instruction to start speech recognition, causing the telephone to prompt the user that voice input may begin.

The telephone is generally connected to the access gateway through a POTS port.

For example, if the speech recognition service key in the pre-configured speech recognition service key table of the access gateway is set to *#, the match succeeds when the user goes off-hook and presses * and # in sequence.

It should be noted that if the match fails, the matching result may be returned to the telephone to prompt the user, for example by playing a "please re-enter" voice prompt through the telephone handset, or by showing an "input failed" or "please re-enter" prompt on the telephone's display screen.
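The key-matching step can be sketched as follows. The action names `start_voice_capture` and `prompt_reenter` are hypothetical labels for the two outcomes described above, and the default service key table contains only the `*#` example from the text.

```python
def handle_key_input(pressed_keys, service_keys=("*#",)):
    """Match the keys pressed after off-hook against the service key table.

    Returns a hypothetical action name: start capturing the calling voice on a
    match, otherwise ask the telephone to prompt the user to re-enter.
    """
    key_value = "".join(pressed_keys)
    if key_value in service_keys:
        return "start_voice_capture"
    return "prompt_reenter"
```

For example, `handle_key_input(["*", "#"])` selects voice capture, while any other key sequence results in the re-enter prompt.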

Undertaken trigger action by the key value on phone except above-mentioned, can also be by triggering alternately between user and phone, for example interactive voice, or the mode that touches the setting regions of the demonstration frequency of phone triggers, and described setting regions can be redefined for voice and obtain region.

The technical scheme of the present embodiment, after obtaining the key value of user's input, by mate described key value in pre-configured speech recognition service key directory, and determine whether to trigger according to matching result and obtain the voice calls of user input and the operation of speech recognition.

On the basis of the present embodiment, the voice calls that IAD obtains user's input preferably includes:

IAD obtains the voice calls of user's input by default DigitMap (number figure) digit collecting rule or default DialPlan (call plan) digit collecting rule, and wherein said DigitMap digit collecting rule comprises: the duration of the first dialing timer, the duration of interdigit timer.

In other words, this preferred version specifically obtains user's voice calls by described DigitMap digit collecting rule or described DialPlan digit collecting rule.

Describe as an example of DigitMap digit collecting rule example.

The first dialing timer, interdigit timer are used in the different phase of user's off-hook to end of calling.Can there is particularly numerous embodiments, introduce wherein two kinds below.

Mode one, off-hook to the stage before incoming call voice by the first place Timer Controlling that dials, for example, if in the duration (15s) of first place dialing timer, user does not have incoming call voice, and IAD issues howler tone or busy tone prompting to phone.If in the duration of first place dialing timer, user starts incoming call voice, IAD is timer between enable bit, when user's voice calls dwell interval duration exceedes the duration (such as 5s) of interdigit timer, IAD carries out speech recognition to voice calls, also voice before are once identified, then carried out follow-up exhalation flow process.

Mode two, off-hook to the stage before incoming call voice by the first place Timer Controlling that dials, for example, if in the duration (15s) of first place dialing timer, user does not have incoming call voice, and IAD issues howler tone or busy tone prompting to phone.If in the duration of first place dialing timer, user starts incoming call voice, IAD carries out Real-time speech recognition, for example, user inputs a voice calls, and IAD just carries out a speech recognition, simultaneously timer between enable bit, when user's voice calls dwell interval duration exceedes the duration (such as 5s) of interdigit timer, carry out follow-up exhalation flow process.

Mode one differs from mode two in that the IAD performs speech recognition at a different point in time and a different number of times.

In other words, while the user places a voice call to the called party by means of the calling voice, the durations of the pre-configured timers serve as the basis for determining whether to stop collecting the user's calling voice and proceed to the recognition flow.

It should be noted that the duration of each timer can be set and changed through configuration.
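The two modes described above can be sketched as a small simulation. This is only an illustrative sketch under stated assumptions, not the implementation of the embodiment: the function name `collect_calling_voice`, the representation of the user's input as timestamped segments, and the caller-supplied `recognize` callback are all hypothetical. The default durations of 15 s and 5 s are the example values given in the text.

```python
from typing import Callable, List, Tuple

# (start time, end time, audio placeholder); times are seconds after off-hook.
Segment = Tuple[float, float, str]


def collect_calling_voice(
    segments: List[Segment],
    recognize: Callable[[List[str]], str],  # hypothetical recognizer callback
    first_digit_timeout: float = 15.0,
    inter_digit_timeout: float = 5.0,
    mode: int = 1,
) -> Tuple[str, str, int]:
    """Return (status, recognized text, number of recognition passes)."""
    # First-digit dialing timer: no voice input in time -> howler or busy tone.
    if not segments or segments[0][0] > first_digit_timeout:
        return ("howler_or_busy_tone", "", 0)

    collected: List[str] = []
    partial: List[str] = []
    passes = 0
    last_end = None
    for start, end, audio in segments:
        # Inter-digit timer: a pause longer than the timeout ends collection.
        if last_end is not None and start - last_end > inter_digit_timeout:
            break
        collected.append(audio)
        if mode == 2:
            # Mode two: recognize each segment as soon as it has been input.
            partial.append(recognize([audio]))
            passes += 1
        last_end = end

    if mode == 1:
        # Mode one: recognize everything once, after the inter-digit timer expires.
        return ("outgoing_call", recognize(collected), 1)
    return ("outgoing_call", "".join(partial), passes)
```

In this sketch both modes yield the same recognized text for the same input; they differ only in when `recognize` is invoked and how many times, which mirrors the difference between mode one and mode two stated above.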

Embodiment six

Refer to Fig. 6, which is a schematic structural diagram of an IAD based on speech recognition provided by embodiment six of the present invention. The IAD comprises: a calling voice acquisition module 610 and a voice calling module 620.

The calling voice acquisition module 610 is configured to obtain the calling voice input by the user, the calling voice comprising called party information; the voice calling module 620 is configured to recognize and obtain the called number according to the calling voice, and to execute voice calling.

In the technical scheme of the present embodiment, the user's calling voice is recognized so that the called number can be obtained from the calling voice, and voice calling is carried out based on the recognized called number. Because the user does not need to press, one by one, the series of keys on the phone corresponding to the called number, the voice calling flow is simplified and the called number can be called quickly. At the same time, because the user is spared from operating the keys, the scheme can meet the needs of specific groups such as the disabled or elderly people with limited mobility of the hands and feet, and the user experience is improved.

In the above scheme, the calling voice comprises at least one of the following: natural language of the called number, natural language of the called name, and natural language of the called short number.

Wherein, when the calling voice is natural language of the called name, an electronic address book is provided in the IAD in advance, the electronic address book comprising first mapping relations between called names and called numbers.

In the above scheme, the voice calling module 620 preferably comprises: a preprocessing unit, a voice feature acquisition unit, a voice feature matching unit and a called number acquisition unit.

The preprocessing unit is configured to perform analog-to-digital conversion on the calling voice and then perform voice preprocessing; the voice feature acquisition unit is configured to obtain voice features from the preprocessed calling voice; the voice feature matching unit is configured to match the voice features in a preset voice model library and determine the voice template corresponding to the voice features; the called number acquisition unit is configured to obtain the called number according to the voice template by using a voice lookup table in a preset voice library.
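The matching and lookup steps performed by the last two units can be illustrated roughly as follows. The text does not specify how voice features are compared with the voice model library, so nearest-template matching by Euclidean distance is assumed here purely for illustration; the names `match_template` and `lookup_called_number`, the toy two-dimensional feature vectors and the template identifiers are likewise hypothetical.

```python
import math
from typing import Dict, List, Sequence


def match_template(
    feature: Sequence[float],
    model_library: Dict[str, Sequence[float]],
) -> str:
    """Return the id of the template closest to `feature` (assumed metric: Euclidean)."""
    return min(model_library, key=lambda tid: math.dist(feature, model_library[tid]))


def lookup_called_number(
    features: List[Sequence[float]],
    model_library: Dict[str, Sequence[float]],
    lookup_table: Dict[str, str],
) -> str:
    """Match each feature vector to a template, then map templates to digits."""
    return "".join(lookup_table[match_template(f, model_library)] for f in features)


# Toy voice model library and voice lookup table (hypothetical values).
MODEL_LIBRARY = {"tpl_one": (1.0, 0.0), "tpl_two": (0.0, 1.0), "tpl_three": (1.0, 1.0)}
LOOKUP_TABLE = {"tpl_one": "1", "tpl_two": "2", "tpl_three": "3"}
```

Keeping the model library (features to template) separate from the lookup table (template to text) follows the two-stage description in the text, so either table could be replaced without touching the other.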

As one preferred implementation of the called number acquisition unit, the called number acquisition unit preferably comprises: a first recognition subunit and a first mapping subunit.

The first recognition subunit is configured to recognize and obtain the called name according to the voice template by using the voice lookup table in the preset voice library; the first mapping subunit is configured to obtain the called number by using the first mapping relations.

As another preferred implementation of the called number acquisition unit, the called number acquisition unit preferably comprises: a second recognition subunit and a second mapping subunit.

The second recognition subunit is configured to recognize and obtain the called short number according to the voice template by using the voice lookup table in the preset voice library; the second mapping subunit is configured to obtain the called number by using the second mapping relations.
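The two mapping steps can be sketched as simple table lookups. This is a minimal sketch under assumptions: the dictionaries `ADDRESS_BOOK` and `SHORT_NUMBERS`, their entries, the `kind` labels and the function `resolve_called_number` are hypothetical placeholders for the first and second mapping relations preset in the IAD.

```python
from typing import Dict, Optional

# First mapping relations: called name -> called number (hypothetical entries).
ADDRESS_BOOK: Dict[str, str] = {"Zhang San": "13800000001"}
# Second mapping relations: called short number -> called number (hypothetical entries).
SHORT_NUMBERS: Dict[str, str] = {"601": "13800000002"}


def resolve_called_number(recognized: str, kind: str) -> Optional[str]:
    """Map the recognition result to a called number; None if no mapping exists."""
    if kind == "number":
        return recognized  # a spoken number needs no further mapping
    if kind == "name":
        return ADDRESS_BOOK.get(recognized)
    if kind == "short_number":
        return SHORT_NUMBERS.get(recognized)
    raise ValueError(f"unknown kind: {kind}")
```

Returning `None` for an unknown name or short number leaves it to the caller to decide how to prompt the user; the text does not describe that case.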

In the above scheme, the voice preprocessing comprises: digital filtering, pre-emphasis, windowing and framing, and endpoint detection.
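Three of the preprocessing steps listed above can be sketched in plain Python as follows. The text names the steps but not their parameters, so everything concrete here is an assumption chosen for illustration: the pre-emphasis coefficient 0.97, the Hamming window, and endpoint detection by a short-time energy threshold. A separate digital filtering stage is omitted from the sketch, and the function names are hypothetical.

```python
import math
from typing import List, Optional, Sequence, Tuple


def pre_emphasis(samples: Sequence[float], alpha: float = 0.97) -> List[float]:
    """Boost high frequencies: y[n] = x[n] - alpha * x[n-1]."""
    if not samples:
        return []
    return [samples[0]] + [
        samples[i] - alpha * samples[i - 1] for i in range(1, len(samples))
    ]


def frame_and_window(
    samples: Sequence[float], frame_len: int, hop: int
) -> List[List[float]]:
    """Split the signal into frames and apply a Hamming window (frame_len > 1)."""
    window = [
        0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
        for n in range(frame_len)
    ]
    return [
        [samples[start + n] * window[n] for n in range(frame_len)]
        for start in range(0, len(samples) - frame_len + 1, hop)
    ]


def detect_endpoints(
    frames: Sequence[Sequence[float]], energy_threshold: float
) -> Optional[Tuple[int, int]]:
    """Return (first, last) index of frames whose energy exceeds the threshold."""
    active = [
        i for i, frame in enumerate(frames)
        if sum(x * x for x in frame) > energy_threshold
    ]
    return (active[0], active[-1]) if active else None
```

Only the frames between the detected endpoints would then be passed on to feature extraction, so silence before and after the calling voice is not processed further.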

As a preferred implementation of the present embodiment, the device may further comprise: a key value acquisition module and a trigger module.

The key value acquisition module is configured to obtain, before the calling voice input by the user is obtained, the key value input by the user through the phone keys; the trigger module is configured to match the key value in a pre-configured speech recognition service key table and, if the match succeeds, to trigger the operation of obtaining the calling voice input by the user.
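The trigger check amounts to a membership test against the configured key table. In the short sketch below, the table contents (`"*9#"` and `"**"`) and the function name `should_start_voice_collection` are hypothetical; the text does not say which key values are configured.

```python
from typing import AbstractSet

# Pre-configured speech recognition service key table (hypothetical entries).
SERVICE_KEY_TABLE = frozenset({"*9#", "**"})


def should_start_voice_collection(
    key_value: str, key_table: AbstractSet[str] = SERVICE_KEY_TABLE
) -> bool:
    """True if the dialed key value matches an entry and voice collection should start."""
    return key_value in key_table
```

A key value that does not match would presumably be handled as ordinary key dialing, although the text does not describe that branch.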

In the above scheme, the calling voice acquisition module 610 is specifically configured to:

obtain the calling voice input by the user according to a preset DigitMap digit-collection rule or a preset DialPlan digit-collection rule, wherein the DigitMap digit-collection rule comprises: the duration of the first-digit dialing timer and the duration of the inter-digit timer.

The IAD based on speech recognition provided by the embodiment of the present invention can carry out the voice calling method based on speech recognition provided by any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to that method.

A person of ordinary skill in the art will appreciate that all or part of the steps of the above method embodiments may be completed by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs the steps of the above method embodiments; the storage medium includes various media that can store program code, such as ROM, RAM, magnetic disks or optical discs.

Finally, it should be noted that the above embodiments are only intended to illustrate the technical scheme of the present invention and not to limit it; the preferred implementations in the embodiments are likewise not limiting. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims

1. A voice calling method based on speech recognition, characterized in that it comprises:

an IAD obtains the calling voice input by a user, the calling voice comprising called party information;

the IAD recognizes and obtains the called number according to the calling voice, and executes voice calling.

2. The method according to claim 1, characterized in that the calling voice comprises at least one of the following: natural language of the called number, natural language of the called name, and natural language of the called short number;

when the calling voice is natural language of the called short number, second mapping relations between called numbers and called short numbers are preset in the IAD.

3. The method according to claim 2, characterized in that the IAD recognizing and obtaining the called number according to the calling voice comprises:

the IAD performs analog-to-digital conversion on the calling voice and then performs voice preprocessing;

the IAD obtains voice features from the preprocessed calling voice;

the IAD matches the voice features in a preset voice model library and determines the voice template corresponding to the voice features;

the IAD obtains the called number according to the voice template by using a voice lookup table in a preset voice library.

4. The method according to claim 3, characterized in that the IAD obtaining the called number according to the voice template by using the voice lookup table in the preset voice library comprises:

the IAD recognizes and obtains the called name according to the voice template by using the voice lookup table in the preset voice library;

the IAD obtains the called number by using the first mapping relations; or

the IAD recognizes and obtains the called short number according to the voice template by using the voice lookup table in the preset voice library;

the IAD obtains the called number by using the second mapping relations.

5. The method according to claim 3 or 4, characterized in that the voice preprocessing comprises: digital filtering, pre-emphasis, windowing and framing, and endpoint detection.

6. The method according to claim 3 or 4, characterized in that, before the IAD obtains the calling voice input by the user, the method further comprises:

the IAD obtains the key value input by the user through the phone keys;

the IAD matches the key value in a pre-configured speech recognition service key table and, if the match succeeds, triggers the operation of obtaining the calling voice input by the user.

7. The method according to claim 3 or 4, characterized in that the IAD obtaining the calling voice input by the user comprises:

the IAD obtains the calling voice input by the user according to a preset digit map (DigitMap) digit-collection rule or a preset dial plan (DialPlan) digit-collection rule, wherein the DigitMap digit-collection rule comprises: the duration of the first-digit dialing timer and the duration of the inter-digit timer.

8. An IAD based on speech recognition, characterized in that it comprises:

9. The IAD according to claim 8, characterized in that the calling voice comprises at least one of the following: natural language of the called number, natural language of the called name, and natural language of the called short number;

10. The IAD according to claim 9, characterized in that the voice calling module comprises:

a preprocessing unit, configured to perform analog-to-digital conversion on the calling voice and then perform voice preprocessing;

a voice feature acquisition unit, configured to obtain voice features from the preprocessed calling voice;

a voice feature matching unit, configured to match the voice features in a preset voice model library and determine the voice template corresponding to the voice features;

a called number acquisition unit, configured to obtain the called number according to the voice template by using a voice lookup table in a preset voice library.