CN105957524A - Speech processing method and speech processing device - Google Patents
Speech processing method and speech processing device
- Publication number
- CN105957524A (application number CN201610264283.XA)
- Authority
- CN
- China
- Prior art keywords
- volume value
- entry
- speech information
- energy index
- parameter
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/1815—Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
Abstract
The invention discloses a speech processing method and a speech processing device. The method includes: receiving speech information input by a user, the speech information including first speech information corresponding to a pre-operation item and second speech information corresponding to item content; recognizing the speech information to obtain a first text corresponding to the pre-operation item and a second text corresponding to the item content; determining a first volume value corresponding to the first speech information and a second volume value corresponding to the second speech information; determining a first energy index parameter corresponding to the first volume value and a second energy index parameter corresponding to the second volume value; when the second energy index parameter is greater than the first energy index parameter and greater than a preset energy index parameter, searching a candidate item content database for target item content that matches the second text; and filling the target item content into the corresponding item content form. With this technical solution, the success rate and accuracy of semantic analysis are improved while the accuracy of speech processing is maintained, thereby improving the user experience.
Description
Technical field
The present invention relates to the technical field of speech recognition, and in particular to a speech processing method and device.
Background
In speech processing, semantic understanding depends on the quality of speech recognition. If the recognition result is poor, the semantic analysis suffers. For example, for the entry template of Fig. 1, the user wants to make an automatic selection by dictating "executing department, radiology department"; the selection control is shown in Fig. 2. From the perspective of the semantic analyzer, "executing department" is the pre-operation item and "radiology department" is the item content. The semantic analyzer can tell the two apart using a dictionary template and then convert them into an execution command. However, if even one character in the sentence is recognized poorly, the semantic analysis may fail. For example, the recognized text may drop one character of the phrase "executing department", so that the item name is incomplete. The whole recognition process then fails, the operation requested by the user's voice input cannot be performed, and the user experience suffers.
Summary of the invention
Embodiments of the present invention provide a speech processing method and device that improve the success rate and accuracy of semantic analysis while maintaining the accuracy of speech processing, thereby improving the user experience.
According to a first aspect of the embodiments of the present invention, a speech processing method is provided, including:
receiving speech information input by a user, where the speech information includes first speech information corresponding to a pre-operation item and second speech information corresponding to item content;
recognizing the speech information to obtain a first text corresponding to the pre-operation item and a second text corresponding to the item content;
determining a first volume value corresponding to the first speech information and a second volume value corresponding to the second speech information;
determining a first energy index parameter corresponding to the first volume value and a second energy index parameter corresponding to the second volume value;
when the second energy index parameter is greater than the first energy index parameter and greater than a preset energy index parameter, searching a candidate item content database for target item content that matches the second text;
filling the target item content into the corresponding item content form.
In this embodiment, after the speech information, which contains the first speech information corresponding to the pre-operation item and the second speech information corresponding to the item content, is recognized, the volume values of the two pieces of speech information are determined and the energy index parameters are derived from them. When the energy index parameter of the second speech information (the item content) is greater than that of the first speech information and also greater than the preset energy index parameter, the text corresponding to the item content is preferentially matched against the candidate item contents in the candidate item content database, and the matched target item content is filled into the corresponding item content form. In this way the user can fill in a form by voice input without manual selection. The user may also speak different words at different volumes, so that different energy index parameters are determined from the volume differences and used to decide whether to fill in the item content form. This avoids the situation in which the form cannot be filled in after a speech recognition error, improves the success rate and accuracy of semantic analysis while maintaining the accuracy of speech processing, and improves both the success rate of filling in forms by voice input and the user experience.
In one embodiment, determining the first energy index parameter corresponding to the first volume value and the second energy index parameter corresponding to the second volume value includes:
obtaining a correspondence between volume value intervals and energy index parameters, where the volume value intervals are positively correlated with the energy index parameters;
determining the first volume value interval to which the first volume value belongs and the second volume value interval to which the second volume value belongs;
determining, according to the correspondence between volume value intervals and energy index parameters, the first energy index parameter corresponding to the first volume value interval and the second energy index parameter corresponding to the second volume value interval.
In this embodiment, the correspondence between volume value intervals and energy index parameters can be preset, and the first energy index parameter corresponding to the first volume value interval and the second energy index parameter corresponding to the second volume value interval are then determined from that correspondence. Specifically, the volume value intervals may be positively correlated with the energy index parameters: the larger the values in a volume value interval, the larger the energy index parameter, and the smaller the values, the smaller the energy index parameter. The larger the energy index parameter, the more likely the text corresponding to the item content is to be matched preferentially. Adding the energy index parameter on top of existing speech recognition technology therefore improves the success rate and accuracy of semantic analysis while maintaining the accuracy of speech processing, and also improves the success rate of filling in forms by voice input and the user experience.
In one embodiment, searching the candidate item content database for the target item content that matches the second text includes:
calculating a first similarity between the second text and each candidate item content in the candidate item content database;
determining the candidate item content with the highest first similarity as the target item content.
In this embodiment, the first similarity between the second text and each candidate item content in the candidate item content database is calculated, and the candidate item content with the highest similarity is determined as the target item content. This maintains the accuracy of speech recognition, avoids the situation in which semantic analysis cannot be carried out because the speech recognition result contains an error, and improves the success rate and accuracy of semantic analysis.
In one embodiment, filling the target item content into the corresponding item content form includes:
determining the target operation item corresponding to the target item content;
calculating a second similarity between the target operation item and the first text corresponding to the pre-operation item;
when the second similarity is greater than or equal to a preset similarity, filling the target item content into the item content form corresponding to the target operation item.
In this embodiment, before the target item content is filled into the corresponding item content form, the target operation item corresponding to the target item content can first be determined, and the similarity between the target operation item and the first text corresponding to the pre-operation item is then calculated. If the similarity is greater than or equal to the preset similarity, the two match, that is, the item the user intends to operate on is the target operation item. This further ensures the accuracy of the semantic analysis result.
In one embodiment, before determining the first volume value corresponding to the first speech information and the second volume value corresponding to the second speech information, the method further includes:
calculating a confidence of the speech information;
judging whether the confidence is lower than a preset confidence;
when the confidence is lower than the preset confidence, performing the step of determining the first volume value corresponding to the first speech information and the second volume value corresponding to the second speech information.
In this embodiment, the confidence of the speech information can be calculated first. If the confidence is greater than or equal to the preset confidence, the confidence is high and semantic analysis can be carried out successfully, so the speech analysis schemes of the existing related art can be used. If the confidence is lower than the preset confidence, the confidence is low and semantic analysis may fail; in that case the volume-difference-based semantic analysis scheme of the present invention can be used.
According to a second aspect of the embodiments of the present invention, a speech processing device is provided, including:
a receiving module, configured to receive speech information input by a user, where the speech information includes first speech information corresponding to a pre-operation item and second speech information corresponding to item content;
a recognition module, configured to recognize the speech information to obtain a first text corresponding to the pre-operation item and a second text corresponding to the item content;
a first determining module, configured to determine a first volume value corresponding to the first speech information and a second volume value corresponding to the second speech information;
a second determining module, configured to determine a first energy index parameter corresponding to the first volume value and a second energy index parameter corresponding to the second volume value;
a search module, configured to search a candidate item content database for target item content that matches the second text when the second energy index parameter is greater than the first energy index parameter and greater than a preset energy index parameter;
a filling module, configured to fill the target item content into the corresponding item content form.
In one embodiment, the second determining module includes:
an obtaining submodule, configured to obtain a correspondence between volume value intervals and energy index parameters, where the volume value intervals are positively correlated with the energy index parameters;
a first determining submodule, configured to determine the first volume value interval to which the first volume value belongs and the second volume value interval to which the second volume value belongs;
a second determining submodule, configured to determine, according to the correspondence between volume value intervals and energy index parameters, the first energy index parameter corresponding to the first volume value interval and the second energy index parameter corresponding to the second volume value interval.
In one embodiment, the search module includes:
a first calculation submodule, configured to calculate a first similarity between the second text and each candidate item content in the candidate item content database;
a content determining submodule, configured to determine the candidate item content with the highest first similarity as the target item content.
In one embodiment, the filling module includes:
an item determining submodule, configured to determine the target operation item corresponding to the target item content;
a second calculation submodule, configured to calculate a second similarity between the target operation item and the first text corresponding to the pre-operation item;
a filling submodule, configured to fill the target item content into the item content form corresponding to the target operation item when the second similarity is greater than or equal to a preset similarity.
In one embodiment, the device further includes:
a calculation module, configured to calculate a confidence of the speech information before the first volume value corresponding to the first speech information and the second volume value corresponding to the second speech information are determined;
a judging module, configured to judge whether the confidence is lower than a preset confidence;
a trigger module, configured to trigger, when the confidence is lower than the preset confidence, the second determining module to determine the first volume value corresponding to the first speech information and the second volume value corresponding to the second speech information.
It should be understood that the above general description and the following detailed description are only exemplary and explanatory and do not limit the present invention.
Other features and advantages of the present invention will be set forth in the following description and, in part, will become apparent from the description or be understood by implementing the present invention. The objectives and other advantages of the present invention can be realized and obtained by the structures particularly pointed out in the written description, the claims, and the accompanying drawings.
The technical solution of the present invention is described in further detail below with reference to the drawings and embodiments.
Brief description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present invention and, together with the description, serve to explain the principles of the present invention.
Fig. 1 is a schematic diagram of an entry template in the related art.
Fig. 2 is a schematic diagram of item content options in the related art.
Fig. 3 is a flowchart of a speech processing method according to an exemplary embodiment.
Fig. 4 is a flowchart of step S304 of the speech processing method according to an exemplary embodiment.
Fig. 5 is a flowchart of step S305 of the speech processing method according to an exemplary embodiment.
Fig. 6 is a flowchart of step S306 of the speech processing method according to an exemplary embodiment.
Fig. 7 is a flowchart of another speech processing method according to an exemplary embodiment.
Fig. 8 is a block diagram of a speech processing device according to an exemplary embodiment.
Fig. 9 is a block diagram of the second determining module of a speech processing device according to an exemplary embodiment.
Fig. 10 is a block diagram of the search module of a speech processing device according to an exemplary embodiment.
Fig. 11 is a block diagram of the filling module of a speech processing device according to an exemplary embodiment.
Fig. 12 is a block diagram of another speech processing device according to an exemplary embodiment.
Detailed description
Exemplary embodiments are described in detail here, and examples of them are shown in the accompanying drawings. When the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present invention. Rather, they are merely examples of devices and methods consistent with some aspects of the present invention as detailed in the appended claims.
Fig. 3 is a flowchart of a speech processing method according to an exemplary embodiment. The speech processing method is applied in a terminal device, which may be any device with a voice control function, such as a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, fitness equipment, or a personal digital assistant. As shown in Fig. 3, the method includes steps S301-S306:
In step S301, speech information input by a user is received, where the speech information includes first speech information corresponding to a pre-operation item and second speech information corresponding to item content.
Some forms contain operation items and item contents. For example, a report card includes name, gender, score, and so on, which are operation items, while the concrete values Zhang San, female, and 90 points are the corresponding item contents. Likewise, in Fig. 1 the examination category, executing department, and execution time are operation items, and the corresponding plain film, radiology department, and March 8, 2011 are item contents. The operation item on which the user wants to perform a voice operation can be entered by voice; this operation item is the pre-operation item.
In step S302, the speech information is recognized to obtain a first text corresponding to the pre-operation item and a second text corresponding to the item content.
In step S303, a first volume value corresponding to the first speech information and a second volume value corresponding to the second speech information are determined.
To facilitate semantic analysis, the user can speak different words at different volumes and thereby emphasize what matters semantically. For example, if the user says "executing department, radiology department" and the words "radiology department" are spoken louder, radiology department is the focus of the semantic analysis.
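The text does not say how a volume value is measured; it only states later that volume can be characterized by a decibel value. As a minimal sketch under that assumption, the snippet below computes a root-mean-square level in decibels for two sample segments. The function name `segment_level_db`, the reference amplitude, and the sample values are hypothetical and chosen only for illustration.

```python
import math
from typing import Sequence


def segment_level_db(samples: Sequence[float], reference: float = 1.0) -> float:
    """Return the RMS level of a segment in decibels relative to `reference`.

    Assumption: `samples` is a non-empty sequence of linear amplitude values.
    """
    if not samples:
        raise ValueError("segment must contain at least one sample")
    mean_square = sum(s * s for s in samples) / len(samples)
    rms = math.sqrt(mean_square)
    if rms == 0.0:
        return float("-inf")  # digital silence has no finite level
    return 20.0 * math.log10(rms / reference)


# Hypothetical amplitudes: the item-content segment is spoken louder.
first_segment = [0.1, -0.1, 0.1, -0.1]    # pre-operation item
second_segment = [0.4, -0.4, 0.4, -0.4]   # item content

first_volume = segment_level_db(first_segment)
second_volume = segment_level_db(second_segment)
print(second_volume > first_volume)  # True
```

The levels here are relative to the chosen reference amplitude, so a calibrated reference would be needed before comparing them with absolute decibel intervals.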
In step S304, a first energy index parameter corresponding to the first volume value and a second energy index parameter corresponding to the second volume value are determined.
Different volume values may correspond to different energy index parameters, so whether to fill in the form is decided according to the energy index parameters.
In step S305, when the second energy index parameter is greater than the first energy index parameter and greater than a preset energy index parameter, a candidate item content database is searched for target item content that matches the second text.
In step S306, the target item content is filled into the corresponding item content form.
In this embodiment, after the speech information, which contains the first speech information corresponding to the pre-operation item and the second speech information corresponding to the item content, is recognized, the volume values of the two pieces of speech information are determined and the energy index parameters are derived from them. When the energy index parameter of the second speech information (the item content) is greater than that of the first speech information and also greater than the preset energy index parameter, the text corresponding to the item content is preferentially matched against the candidate item contents in the candidate item content database, and the matched target item content is filled into the corresponding item content form. In this way the user can fill in a form by voice input without manual selection. The user may also speak different words at different volumes, so that different energy index parameters are determined from the volume differences and used to decide whether to fill in the item content form. This avoids the situation in which the form cannot be filled in after a speech recognition error, improves the success rate and accuracy of semantic analysis while maintaining the accuracy of speech processing, and improves both the success rate of filling in forms by voice input and the user experience.
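The condition that guards step S305 can be written as a single predicate. The following is only an illustrative sketch; the function name `should_search_item_content` and the numeric values are hypothetical and do not come from the text.

```python
def should_search_item_content(first_index: int, second_index: int, preset_index: int) -> bool:
    """Return True when the second energy index parameter exceeds both
    the first energy index parameter and the preset energy index parameter."""
    return second_index > first_index and second_index > preset_index


# Hypothetical values: the item content is spoken louder than the item name.
print(should_search_item_content(first_index=2, second_index=4, preset_index=3))  # True
# Equal indices do not satisfy the strict "greater than" condition.
print(should_search_item_content(first_index=4, second_index=4, preset_index=3))  # False
```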
In one embodiment, as shown in Fig. 4, step S304 includes steps S401-S403:
In step S401, a correspondence between volume value intervals and energy index parameters is obtained, where the volume value intervals are positively correlated with the energy index parameters.
The correspondence between volume value intervals and energy index parameters can be preset, and the first energy index parameter corresponding to the first volume value interval and the second energy index parameter corresponding to the second volume value interval are then determined from that correspondence. Specifically, the volume value intervals may be positively correlated with the energy index parameters: the larger the values in a volume value interval, the larger the energy index parameter, and the smaller the values, the smaller the energy index parameter.
For example, the volume value is characterized by a decibel value. To improve the success rate of filling in forms by voice, the energy index parameter can be set to increase with the decibel value of the speech. In this example, the correspondence between volume value intervals and energy index parameters is shown in Table 1.
Table 1
Decibel value (dB) | Energy index parameter |
0~20 | 1 |
21~30 | 2 |
31~60 | 3 |
61~80 | 4 |
In step S402, the first volume value interval to which the first volume value belongs and the second volume value interval to which the second volume value belongs are determined.
In step S403, the first energy index parameter corresponding to the first volume value interval and the second energy index parameter corresponding to the second volume value interval are determined according to the correspondence between volume value intervals and energy index parameters.
In this embodiment, the larger the energy index parameter, the more likely the text corresponding to the item content is to be matched preferentially. Adding the energy index parameter on top of existing speech recognition technology therefore improves the success rate and accuracy of semantic analysis while maintaining the accuracy of speech processing, and also improves the success rate of filling in forms by voice input and the user experience.
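Steps S401-S403 amount to an interval lookup. The sketch below encodes the intervals of Table 1 and uses Python's standard `bisect` module to find the interval a decibel value falls into. The names `UPPER_BOUNDS`, `ENERGY_INDEX`, and `energy_index_for` are hypothetical, and the handling of out-of-range and fractional values is an assumption, since Table 1 lists integer boundaries only.

```python
from bisect import bisect_left

# Inclusive upper bounds (in decibels) of the intervals in Table 1
# and the energy index parameter assigned to each interval.
UPPER_BOUNDS = [20, 30, 60, 80]
ENERGY_INDEX = [1, 2, 3, 4]


def energy_index_for(volume_db: float) -> int:
    """Map a decibel value to its energy index parameter using Table 1.

    Assumptions: values outside 0~80 dB are rejected because Table 1 does
    not cover them, and a fractional value between two listed intervals
    (for example 20.5) is assigned to the higher interval.
    """
    if not 0 <= volume_db <= UPPER_BOUNDS[-1]:
        raise ValueError(f"volume {volume_db} dB is outside the range of Table 1")
    return ENERGY_INDEX[bisect_left(UPPER_BOUNDS, volume_db)]


# Hypothetical volumes for the pre-operation item and the item content.
print(energy_index_for(25), energy_index_for(65))  # 2 4
```

Keeping the bounds in a sorted list means a different preset correspondence can be swapped in without changing the lookup code.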
In one embodiment, as shown in Fig. 5, step S305 may include steps S501-S502:
In step S501, a first similarity between the second text and each candidate item content in the candidate item content database is calculated.
The candidate item content database stores candidate item contents. For a medical database, for example, the candidate item contents may include executing departments such as the radiology department, imaging department, general medicine department, gastroenterology department, and endocrinology department, and may also include doctors such as Zhang San, Li Si, and Wang Er. For other kinds of databases, such as a student grades database, they may include subjects such as politics, history, and geography.
In step S502, the candidate item content with the highest first similarity is determined as the target item content.
In this embodiment, the first similarity between the second text and each candidate item content in the candidate item content database is calculated, and the candidate item content with the highest similarity is determined as the target item content. This maintains the accuracy of speech recognition, avoids the situation in which semantic analysis cannot be carried out because the speech recognition result contains an error, and improves the success rate and accuracy of semantic analysis.
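The text does not specify which similarity measure is used. As a rough sketch of steps S501-S502, the snippet below substitutes the ratio computed by `difflib.SequenceMatcher` from the Python standard library and selects the highest-scoring candidate. The function names, the English candidate list, and the misrecognized input are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Iterable, Tuple


def first_similarity(second_text: str, candidate: str) -> float:
    """Similarity in [0, 1]; SequenceMatcher stands in for the unspecified measure."""
    return SequenceMatcher(None, second_text, candidate).ratio()


def find_target_item_content(second_text: str, candidates: Iterable[str]) -> Tuple[str, float]:
    """Return the candidate with the highest first similarity and its score."""
    scored = [(c, first_similarity(second_text, c)) for c in candidates]
    if not scored:
        raise ValueError("candidate item content database is empty")
    return max(scored, key=lambda pair: pair[1])


# Hypothetical candidate item contents of a medical database.
candidates = ["radiology", "imaging", "general medicine", "gastroenterology", "endocrinology"]

# The recognizer dropped one letter, yet the closest candidate is still found.
best, score = find_target_item_content("radiolgy", candidates)
print(best)  # radiology
```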
In one embodiment, as shown in Fig. 6, step S306 includes steps S601-S603:
In step S601, the target operation item corresponding to the target item content is determined.
In the candidate item content database, each candidate item content should be stored together with the operation item it belongs to; the target operation item can therefore be determined from the target item content.
In step S602, a second similarity between the target operation item and the first text corresponding to the pre-operation item is calculated.
In step S603, when the second similarity is greater than or equal to a preset similarity, the target item content is filled into the item content form corresponding to the target operation item.
In this embodiment, before the target item content is filled into the corresponding item content form, the target operation item corresponding to the target item content can first be determined, and the similarity between the target operation item and the first text corresponding to the pre-operation item is then calculated. If the similarity is greater than or equal to the preset similarity, the two match, that is, the item the user intends to operate on is the target operation item. This further ensures the accuracy of the semantic analysis result.
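Steps S601-S603 can be sketched as a lookup followed by a threshold check. In the snippet below, the mapping `ITEM_FOR_CONTENT`, the threshold `PRESET_SIMILARITY`, and the function `fill_item_content` are hypothetical, and `difflib.SequenceMatcher` again stands in for the unspecified similarity measure.

```python
from difflib import SequenceMatcher
from typing import Dict, Optional

# Hypothetical database rows: each candidate item content is stored
# together with the operation item it belongs to.
ITEM_FOR_CONTENT: Dict[str, str] = {
    "radiology": "executing department",
    "Zhang San": "doctor",
}
PRESET_SIMILARITY = 0.6  # hypothetical threshold


def fill_item_content(form: Dict[str, str], first_text: str, target_content: str) -> Optional[str]:
    """Fill `target_content` into `form` if its operation item matches `first_text`.

    Returns the operation item that was filled, or None if the check fails.
    """
    target_item = ITEM_FOR_CONTENT[target_content]                               # S601
    second_similarity = SequenceMatcher(None, target_item, first_text).ratio()   # S602
    if second_similarity >= PRESET_SIMILARITY:                                   # S603
        form[target_item] = target_content
        return target_item
    return None


form: Dict[str, str] = {}
# The item name was recognized imperfectly but is still close enough.
fill_item_content(form, "executing dept", "radiology")
print(form)  # {'executing department': 'radiology'}
```

If the recognized item name resembles a different operation item, the function returns `None` and leaves the form unchanged.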
In one embodiment, as shown in Fig. 7, before the first volume value corresponding to the first speech information and the second volume value corresponding to the second speech information are determined, the method further includes steps S701-S703:
In step S701, a confidence of the speech information is calculated.
The confidence takes a value in the range 0 to 1. Because confidence is used to assess the reliability of the speech recognition result, a higher confidence indicates a more accurate recognition result.
In step S702, it is judged whether the confidence is lower than a preset confidence. The preset confidence threshold also takes a value in the range 0 to 1.
In step S703, when the confidence is lower than the preset confidence, the step of determining the first volume value corresponding to the first speech information and the second volume value corresponding to the second speech information is performed.
In this embodiment, the confidence of the speech information can be calculated first. If the confidence is greater than or equal to the preset confidence, the confidence is high and semantic analysis can be carried out successfully, so the speech analysis schemes of the existing related art can be used. If the confidence is lower than the preset confidence, the confidence is low and semantic analysis may fail; in that case the volume-difference-based semantic analysis scheme of the present invention can be used.
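The confidence check of steps S701-S703 acts as a switch between two analysis paths. A minimal sketch follows; the function name `choose_analysis_scheme`, the returned labels, and the threshold value are hypothetical.

```python
def choose_analysis_scheme(confidence: float, preset_confidence: float) -> str:
    """Select the semantic analysis path from the recognition confidence.

    Both arguments are expected to lie in the range 0 to 1.
    """
    for value in (confidence, preset_confidence):
        if not 0.0 <= value <= 1.0:
            raise ValueError("confidence values must lie between 0 and 1")
    if confidence < preset_confidence:
        return "volume_based"   # continue with the volume and energy index steps
    return "conventional"       # recognition is reliable enough for existing analysis


PRESET_CONFIDENCE = 0.7  # hypothetical threshold
print(choose_analysis_scheme(0.4, PRESET_CONFIDENCE))  # volume_based
print(choose_analysis_scheme(0.9, PRESET_CONFIDENCE))  # conventional
```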
The following are device embodiments of the present invention, which may be used to carry out the method embodiments of the present invention.
Fig. 8 is a block diagram of a speech processing device according to an exemplary embodiment. The device may be implemented as part or all of a terminal device by software, hardware, or a combination of both. As shown in Fig. 8, the speech processing device includes:
a receiving module 81, configured to receive speech information input by a user, where the speech information includes first speech information corresponding to a pre-operation item and second speech information corresponding to item content;
a recognition module 82, configured to recognize the speech information to obtain a first text corresponding to the pre-operation item and a second text corresponding to the item content;
a first determining module 83, configured to determine a first volume value corresponding to the first speech information and a second volume value corresponding to the second speech information;
a second determining module 84, configured to determine a first energy index parameter corresponding to the first volume value and a second energy index parameter corresponding to the second volume value;
a search module 85, configured to search a candidate item content database for target item content that matches the second text when the second energy index parameter is greater than the first energy index parameter and greater than a preset energy index parameter;
a filling module 86, configured to fill the target item content into the corresponding item content form.
In this embodiment, after speech information containing first speech information corresponding to a pre-operation item and
second speech information corresponding to item content is recognized, the volume values of the two pieces of speech
information are determined, and the energy index parameters are then determined from those volume values. When the
energy index parameter of the second speech information (the item content) is greater than that of the first speech
information and also greater than the preset energy index parameter, the text corresponding to the item content is
preferentially matched against the candidate item content in the candidate item content database, and the matched
target item content is filled into the corresponding item content table. In this way, the user can fill in a table by
speech input without making manual selections. The user may also speak different words at different volumes, so that
different energy index parameters are determined from the volume differences, and the energy index parameters decide
whether the table-filling operation is performed. This avoids the situation in which the table cannot be filled in
after a speech recognition error. While ensuring the accuracy of speech processing, it improves the success rate and
accuracy of semantic analysis, as well as the success rate of filling in tables by speech input and the user experience.
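The condition under which the database search is triggered can be written as a single predicate. The sketch below is illustrative only; the function and parameter names are assumptions, and the energy index parameters are treated as plain numbers.

```python
def should_search_database(first_energy: float,
                           second_energy: float,
                           preset_energy: float) -> bool:
    """Return True when the item content should be matched against the database.

    Hypothetical helper: both conditions from the embodiment must hold, namely
    the second energy index parameter exceeds the first one and also exceeds
    the preset energy index parameter.
    """
    return second_energy > first_energy and second_energy > preset_energy
```

Both comparisons are strict, so an item content spoken at the same energy index as the pre-operation item does not trigger the search.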
In one embodiment, as shown in Fig. 9, the second determining module 84 includes:
an obtaining submodule 91, configured to obtain the correspondence between volume value intervals and energy index parameters,
wherein the volume value intervals are positively correlated with the energy index parameters;
a first determining submodule 92, configured to determine a first volume value interval to which the first volume value belongs and
a second volume value interval to which the second volume value belongs;
a second determining submodule 93, configured to determine, according to the correspondence between volume value intervals and
energy index parameters, a first energy index parameter corresponding to the first volume value interval and
a second energy index parameter corresponding to the second volume value interval.
In this embodiment, the larger the energy index parameter, the higher the probability that the text corresponding to
the item content is matched preferentially. Thus, by adding the energy index parameter on top of existing speech
recognition technology, the success rate and accuracy of semantic analysis can be improved while the accuracy of speech
processing is ensured, and the success rate of filling in tables by speech input and the user experience are also improved.
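A lookup from volume value interval to energy index parameter might look like the sketch below. The interval boundaries and index values are invented for illustration (the patent does not give concrete numbers); the only property taken from the text is that louder intervals map to larger energy index parameters.

```python
from bisect import bisect_right

# Hypothetical interval boundaries (for example, in dB) and the energy index
# parameter assigned to each interval. Intervals: <40, [40, 60), [60, 80), >=80.
VOLUME_BOUNDS = [40.0, 60.0, 80.0]
ENERGY_INDEX = [1, 2, 3, 4]  # increases with volume (positive correlation)


def energy_index_for(volume: float) -> int:
    """Map a volume value to the energy index parameter of its interval."""
    # bisect_right returns how many boundaries are <= volume, which is
    # exactly the position of the interval that contains the volume.
    return ENERGY_INDEX[bisect_right(VOLUME_BOUNDS, volume)]
```

Keeping the boundaries in a sorted list makes the lookup a binary search, and the positive correlation holds as long as `ENERGY_INDEX` is non-decreasing.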
As shown in Fig. 10, in one embodiment, the search module 85 includes:
a first calculation submodule 101, configured to calculate a first similarity between the second text and each candidate item
content in the candidate item content database;
a content determining submodule 102, configured to determine the candidate item content with the highest first similarity as
the target item content.
In this embodiment, the first similarity between the second text and each candidate item content in the candidate item
content database is calculated, and the candidate item content with the highest similarity is determined as the target
item content. This ensures the accuracy of speech recognition and also prevents semantic analysis from becoming
impossible because of an error in the speech recognition result, which improves the success rate and accuracy of
semantic analysis.
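The highest-similarity selection can be sketched as follows. The patent does not specify a similarity measure, so this example substitutes the ratio from Python's standard `difflib.SequenceMatcher` as a stand-in; the function names and the sample data are hypothetical.

```python
from difflib import SequenceMatcher


def first_similarity(second_text: str, candidate: str) -> float:
    """Similarity in [0, 1] between recognized text and a candidate (stand-in metric)."""
    return SequenceMatcher(None, second_text, candidate).ratio()


def find_target_content(second_text: str, candidates: list[str]) -> str:
    """Return the candidate item content with the highest first similarity."""
    if not candidates:
        raise ValueError("candidate item content database is empty")
    return max(candidates, key=lambda c: first_similarity(second_text, c))
```

For instance, a misrecognized `"Bejing"` is still closest to `"Beijing"` among `["Beijing", "Shanghai", "Shenzhen"]`, which illustrates how the match can survive a recognition error.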
As shown in Fig. 11, in one embodiment, the filling module 86 includes:
an item determining submodule 111, configured to determine a target operation item corresponding to the target item content;
a second calculation submodule 112, configured to calculate a second similarity between the target operation item and the
first text corresponding to the pre-operation item;
a filling submodule 113, configured to fill the target item content into the item content table corresponding to the
target operation item when the second similarity is greater than or equal to a preset similarity.
In this embodiment, before the target item content is filled into the corresponding item content table, the target
operation item corresponding to the target item content may first be determined, and the similarity between the target
operation item and the first text corresponding to the pre-operation item is then calculated. If the similarity is
greater than or equal to the preset similarity, the two match; that is, the item the user intends to operate on is the
target operation item. This further ensures the accuracy of the semantic analysis result.
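The check performed before filling can be sketched as below. As before, `difflib.SequenceMatcher` is only a stand-in for the unspecified similarity measure, the table is modeled as a plain dictionary, and the default preset similarity of 0.8 is an arbitrary illustrative value.

```python
from difflib import SequenceMatcher


def fill_if_matching(table: dict[str, str],
                     first_text: str,
                     target_item: str,
                     target_content: str,
                     preset_similarity: float = 0.8) -> bool:
    """Fill target_content under target_item only if the item matches first_text.

    Hypothetical helper: returns True when the table was updated.
    """
    # Second similarity: target operation item versus the recognized item text.
    second_similarity = SequenceMatcher(None, target_item, first_text).ratio()
    if second_similarity >= preset_similarity:
        table[target_item] = target_content
        return True
    return False
```

When the recognized item text does not resemble the target operation item, the table is left unchanged, so a content match alone is not enough to write a value.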
As shown in Fig. 12, in one embodiment, the above device further includes:
a calculation module 121, configured to calculate the confidence of the speech information before the first volume value
corresponding to the first speech information and the second volume value corresponding to the second speech information are determined;
a judging module 122, configured to determine whether the confidence is less than a preset confidence threshold;
a triggering module 123, configured to, when the confidence is less than the preset confidence threshold, trigger the second
determining module to determine the first volume value corresponding to the first speech information and the second volume
value corresponding to the second speech information.
In this embodiment, the confidence of the speech information may be calculated first. If the confidence is
greater than or equal to the preset confidence threshold, the confidence is high and semantic analysis can be
expected to succeed, so a speech analysis scheme from the existing related art may be used for the semantic
analysis. If the confidence is less than the preset confidence threshold, the confidence is low and semantic
analysis may fail; in that case, the scheme of the present invention, which performs semantic analysis according to differences in volume, may be used.
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system, or
a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an
entirely software embodiment, or an embodiment combining software and hardware. Moreover, the present invention may take
the form of a computer program product implemented on one or more computer-usable storage media (including but not
limited to disk memory and optical memory) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and
computer program products according to embodiments of the present invention. It should be understood that each flow
and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or
block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided
to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data
processing device to produce a machine, such that the instructions executed by the processor of the computer or other
programmable data processing device produce means for implementing the functions specified in one or more flows of the
flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or
other programmable data processing device to operate in a particular manner, such that the instructions stored in the
computer-readable memory produce an article of manufacture including instruction means that implement the functions
specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so
that a series of operational steps is performed on the computer or other programmable device to produce a
computer-implemented process. The instructions executed on the computer or other programmable device thus provide steps
for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block
diagrams.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing
from its spirit and scope. Thus, if these changes and modifications fall within the scope of the claims of the present
invention and their technical equivalents, the present invention is intended to include them as well.
Claims (10)
1. A speech processing method, characterized by comprising:
receiving speech information input by a user, wherein the speech information includes first speech information
corresponding to a pre-operation item and second speech information corresponding to item content;
recognizing the speech information to obtain a first text corresponding to the pre-operation item and a second text
corresponding to the item content;
determining a first volume value corresponding to the first speech information and a second volume value corresponding
to the second speech information;
determining a first energy index parameter corresponding to the first volume value and a second energy index parameter
corresponding to the second volume value;
when the second energy index parameter is greater than the first energy index parameter and the second energy index
parameter is greater than a preset energy index parameter, searching a candidate item content database for target item
content matching the second text;
filling the target item content into the corresponding item content table.
2. The method according to claim 1, characterized in that determining the first energy index parameter corresponding to
the first volume value and the second energy index parameter corresponding to the second volume value comprises:
obtaining the correspondence between volume value intervals and energy index parameters, wherein the volume value
intervals are positively correlated with the energy index parameters;
determining a first volume value interval to which the first volume value belongs and a second volume value interval to
which the second volume value belongs;
determining, according to the correspondence between volume value intervals and energy index parameters, the first energy
index parameter corresponding to the first volume value interval and the second energy index parameter corresponding to
the second volume value interval.
3. The method according to claim 1, characterized in that searching the candidate item content database for the target
item content matching the second text comprises:
calculating a first similarity between the second text and each candidate item content in the candidate item content
database;
determining the candidate item content with the highest first similarity as the target item content.
4. The method according to claim 1, characterized in that filling the target item content into the corresponding item
content table comprises:
determining a target operation item corresponding to the target item content;
calculating a second similarity between the target operation item and the first text corresponding to the pre-operation
item;
when the second similarity is greater than or equal to a preset similarity, filling the target item content into the
item content table corresponding to the target operation item.
5. The method according to any one of claims 1 to 4, characterized in that, before determining the first volume value
corresponding to the first speech information and the second volume value corresponding to the second speech information,
the method further comprises:
calculating the confidence of the speech information;
determining whether the confidence is less than a preset confidence threshold;
when the confidence is less than the preset confidence threshold, performing the step of determining the first volume
value corresponding to the first speech information and the second volume value corresponding to the second speech information.
6. A speech processing device, characterized by comprising:
a receiving module, configured to receive speech information input by a user, wherein the speech information includes
first speech information corresponding to a pre-operation item and second speech information corresponding to item content;
a recognition module, configured to recognize the speech information to obtain a first text corresponding to the
pre-operation item and a second text corresponding to the item content;
a first determining module, configured to determine a first volume value corresponding to the first speech information
and a second volume value corresponding to the second speech information;
a second determining module, configured to determine a first energy index parameter corresponding to the first volume
value and a second energy index parameter corresponding to the second volume value;
a search module, configured to search a candidate item content database for target item content matching the second text
when the second energy index parameter is greater than the first energy index parameter and the second energy index
parameter is greater than a preset energy index parameter;
a filling module, configured to fill the target item content into the corresponding item content table.
7. The device according to claim 6, characterized in that the second determining module comprises:
an obtaining submodule, configured to obtain the correspondence between volume value intervals and energy index
parameters, wherein the volume value intervals are positively correlated with the energy index parameters;
a first determining submodule, configured to determine a first volume value interval to which the first volume value
belongs and a second volume value interval to which the second volume value belongs;
a second determining submodule, configured to determine, according to the correspondence between volume value intervals
and energy index parameters, the first energy index parameter corresponding to the first volume value interval and the
second energy index parameter corresponding to the second volume value interval.
8. The device according to claim 6, characterized in that the search module comprises:
a first calculation submodule, configured to calculate a first similarity between the second text and each candidate item
content in the candidate item content database;
a content determining submodule, configured to determine the candidate item content with the highest first similarity as
the target item content.
9. The device according to claim 6, characterized in that the filling module comprises:
an item determining submodule, configured to determine a target operation item corresponding to the target item content;
a second calculation submodule, configured to calculate a second similarity between the target operation item and the
first text corresponding to the pre-operation item;
a filling submodule, configured to fill the target item content into the item content table corresponding to the target
operation item when the second similarity is greater than or equal to a preset similarity.
10. The device according to any one of claims 6 to 9, characterized in that the device further comprises:
a calculation module, configured to calculate the confidence of the speech information before the first volume value
corresponding to the first speech information and the second volume value corresponding to the second speech information
are determined;
a judging module, configured to determine whether the confidence is less than a preset confidence threshold;
a triggering module, configured to, when the confidence is less than the preset confidence threshold, trigger the second
determining module to determine the first volume value corresponding to the first speech information and the second
volume value corresponding to the second speech information.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610264283.XA CN105957524B (en) | 2016-04-25 | 2016-04-25 | Voice processing method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610264283.XA CN105957524B (en) | 2016-04-25 | 2016-04-25 | Voice processing method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105957524A (en) | 2016-09-21 |
CN105957524B (en) | 2020-03-31 |
Family
ID=56915661
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610264283.XA Active CN105957524B (en) | 2016-04-25 | 2016-04-25 | Voice processing method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105957524B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107895579A (en) * | 2018-01-02 | 2018-04-10 | 联想(北京)有限公司 | A kind of audio recognition method and system |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1453766A (en) * | 2002-04-24 | 2003-11-05 | 株式会社东芝 | Sound identification method and sound identification apparatus |
CN101211504A (en) * | 2006-12-31 | 2008-07-02 | 康佳集团股份有限公司 | Method, system and apparatus for remote control for TV through voice |
CN101604521A (en) * | 2008-06-12 | 2009-12-16 | Lg电子株式会社 | Portable terminal and the method that is used to discern its voice |
CN105161104A (en) * | 2015-07-31 | 2015-12-16 | 北京云知声信息技术有限公司 | Voice processing method and device |
CN105183081A (en) * | 2015-09-07 | 2015-12-23 | 北京君正集成电路股份有限公司 | Voice control method of intelligent glasses and intelligent glasses |
Also Published As
Publication number | Publication date |
---|---|
CN105957524B (en) | 2020-03-31 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CP03 | Change of name, title or address | Address after: Room 101, 1st floor, building 1, Xisanqi building materials City, Haidian District, Beijing 100096; Patentee after: Yunzhisheng Intelligent Technology Co.,Ltd.; Address before: 100028, Beijing, Haidian District Chaoyang District Sun Palace Road No. 16, building 1, AOC building, 12 floor; Patentee before: BEIJING UNISOUND INFORMATION TECHNOLOGY Co.,Ltd. |