CN104933048A - Voice message processing method and device, and electronic device - Google Patents


Info

Publication number
CN104933048A
Authority
CN
China
Prior art keywords
time
label information
voice information
time parameter
voice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410098994.5A
Other languages
Chinese (zh)
Other versions
CN104933048B (en)
Inventor
彭刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lenovo Beijing Ltd
Original Assignee
Lenovo Beijing Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lenovo Beijing Ltd filed Critical Lenovo Beijing Ltd
Priority to CN201410098994.5A priority Critical patent/CN104933048B/en
Publication of CN104933048A publication Critical patent/CN104933048A/en
Application granted granted Critical
Publication of CN104933048B publication Critical patent/CN104933048B/en
Legal status: Active


Abstract

The invention provides a voice information processing method applied to an electronic device equipped with a voice acquisition unit. The method comprises: obtaining voice information collected by the voice acquisition unit together with a time parameter of the voice information; obtaining a group of label information items for that time parameter, each item corresponding to at least one time point within the time parameter; segmenting the voice information according to the time parameter of each label information item in the group and the time parameter of the voice information, yielding as many voice information fragments as there are label information items; and recognizing each fragment to obtain its corresponding pronunciation. Because each fragment corresponds to exactly one pronunciation, every sound in the user's speech carries a label and each sound corresponds to one character, so the characters are recognized one by one. Even when the user runs sounds together (liaison), leaving no pause between them, misrecognition is avoided. The method therefore achieves high speech-recognition accuracy and improves the user experience.

Description

Voice information processing method, apparatus, and electronic device
Technical field
The invention belongs to the field of speech recognition, and in particular relates to a voice information processing method, apparatus, and electronic device.
Background
With the progress of electronic technology, speech recognition has developed rapidly, and users can employ it for many operations, such as voice dialing, voice navigation, control of indoor equipment, voice document retrieval, and simple dictation input.
Existing speech recognition technology treats the pauses between a user's pronunciations as the boundaries between distinct pronunciations, which are then mapped to the corresponding characters during recognition. Because this approach relies on detecting those pauses, misrecognition is likely whenever the pauses between the user's pronunciations are short or the user runs sounds together (liaison). The accuracy of current speech recognition therefore needs further improvement.
Summary of the invention
In view of this, an object of the present invention is to provide a voice information processing method that solves the prior-art problem of two or more pronunciations being recognized as a single character, so that the recognition result contains errors.
A voice information processing method, applied to an electronic device comprising a voice acquisition unit, the method comprising:
obtaining voice information collected by the voice acquisition unit and a time parameter of the voice information;
obtaining a label information group for the time parameter, the group comprising at least one item of label information and corresponding to at least one time point in the time parameter of the voice information;
segmenting the voice information according to the time parameter of each item of label information in the group and the time parameter of the voice information, to obtain voice information fragments equal in number to the items of label information;
recognizing each voice information fragment separately, to obtain the pronunciation corresponding to each fragment.
In the above method, preferably, obtaining the label information group for the time parameter comprises:
obtaining the label information group according to prompt information;
obtaining the time parameter of each label in the label information group according to the generation time of the prompt information, the prompt information serving to prompt the user to pronounce.
In the above method, preferably, obtaining the time parameter of each label in the label information group according to the generation time of the prompt information comprises:
receiving a first prompt signal, the first prompt signal characterizing an operating action performed by the user on a preset area of the electronic device while uttering speech;
recording the generation time of the first prompt signal as the time of the label information, to obtain the time parameter of the label information.
In the above method, preferably, the electronic device is further provided with a sensor, and receiving the first prompt signal comprises:
obtaining detection data from the sensor, the detection data being the pressure value generated by the operating action tapping a preset area of the housing of the electronic device;
comparing the pressure value with a preset pressure threshold;
when the pressure value exceeds the pressure threshold, the tapping action satisfies the preset operating-action condition and the tap event is recorded, the tap event comprising the tapping action and the tap time;
otherwise, the tapping action does not satisfy the preset operating-action condition and is not recorded.
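The threshold test above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function and variable names (`detect_tap`, `PRESSURE_THRESHOLD`), the event dictionary shape, and the sensor units are all assumptions.

```python
# Hypothetical sketch of the claimed tap detection: a pressure sample from the
# housing sensor is compared against a preset threshold; only taps exceeding it
# are recorded as tap events (action + tap time). Names/units are illustrative.

PRESSURE_THRESHOLD = 2.0  # preset threshold, in arbitrary sensor units

def detect_tap(pressure_value, tap_time, event_log):
    """Record a tap event only if the pressure exceeds the preset threshold."""
    if pressure_value > PRESSURE_THRESHOLD:
        event_log.append({"action": "tap", "time": tap_time})
        return True
    return False  # below threshold: the action is ignored, nothing is recorded

log = []
detect_tap(3.1, 1.0, log)   # firm tap at t = 1 s -> recorded
detect_tap(0.4, 1.5, log)   # light brush       -> ignored
detect_tap(2.7, 3.0, log)   # firm tap at t = 3 s -> recorded
```

Only the recorded tap times (here 1 s and 3 s) go on to become label time parameters; sub-threshold contact leaves no trace.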
In the above method, preferably, receiving the first prompt signal comprises:
detecting the electrical signal produced when a preset key of the electronic device is pressed;
when the electrical signal is detected, recording the key-press event, the key-press event comprising the key action and the key-press time.
In the above method, preferably, the electronic device comprises a touch screen, and receiving the first prompt signal comprises:
detecting the electrical signal generated when the user clicks the touch screen;
when the electrical signal is detected, recording the click event, the click event comprising the click action and the click time.
In the above method, preferably, when a click-event region is set on the touch screen, the method further comprises, after detecting the electrical signal and before recording the click event:
obtaining the coordinate values of the click on the touch screen;
judging from the coordinate values whether the click position lies within the click-event region;
if the click position lies within the click-event region, performing the step of recording the click event.
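A sketch of this coordinate check, under stated assumptions: the region representation (a left/top/right/bottom rectangle), the screen dimensions, and the function names are illustrative, not from the patent.

```python
# Illustrative hit test: a click is recorded only when its coordinates fall
# inside the preset click-event region of the touch screen.

def in_click_region(x, y, region):
    """region = (left, top, right, bottom) in screen coordinates."""
    left, top, right, bottom = region
    return left <= x <= right and top <= y <= bottom

REGION = (0, 800, 480, 1000)  # e.g. a strip at the bottom of a 480x1000 screen

events = []
def on_touch(x, y, t):
    if in_click_region(x, y, REGION):       # the judging step
        events.append({"action": "click", "time": t})  # the recording step

on_touch(100, 900, 0.5)  # inside the region  -> recorded
on_touch(100, 200, 0.9)  # outside the region -> ignored
```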
In the above method, preferably, the electronic device is further provided with a timer, and obtaining the time parameter of each label in the label information group according to the generation time of the prompt information comprises:
obtaining the generation time of a preset second prompt signal, the second prompt signal being generated when the timing value of the timer reaches a preset value;
recording the generation time of the second prompt signal as the time of the label information, to obtain the time parameter of the label information.
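The timer variant can be sketched as below: a second prompt signal fires each time the timer's value reaches the preset interval, and those generation times become the label time parameters. The interval, duration, and function name are assumptions for illustration.

```python
# Sketch of timer-driven prompting: periodic second prompt signals within the
# recording, whose generation times are recorded as label time parameters.

def prompt_times(interval, total_duration):
    """Return the generation times of periodic prompt signals (in seconds)."""
    times = []
    t = interval
    while t <= total_duration:
        times.append(t)   # a second prompt signal is generated at time t
        t += interval
    return times

label_times = prompt_times(2.0, 7.0)  # prompts at 2 s, 4 s, 6 s of a 7 s recording
```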
In the above method, preferably, segmenting the voice information according to the time parameter of each item of label information in the label information group and the time parameter of the voice information, to obtain voice information fragments equal in number to the items of label information, comprises:
establishing a time axis for the voice information according to its time parameter;
adding each item of label information to the time axis according to its time parameter;
obtaining, according to a preset interception time range, the voice information fragment corresponding to each label on the time axis.
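The timeline segmentation described above can be sketched as follows: labels are placed on the voice time axis and a fragment is cut within a preset interception window around each. This is a minimal illustration; the window length (0.5 s on each side), the sample-list audio representation, and the function name are assumptions.

```python
# One fragment is intercepted per label, spanning a preset time range
# around the label's position on the voice information's time axis.

def segment(samples, sample_rate, label_times, half_window=0.5):
    """Cut one audio fragment (list slice) per label time."""
    fragments = []
    for t in label_times:
        start = max(0, int((t - half_window) * sample_rate))
        end = min(len(samples), int((t + half_window) * sample_rate))
        fragments.append(samples[start:end])
    return fragments

audio = list(range(40))              # stand-in for 4 s of audio at 10 Hz
frags = segment(audio, 10, [1.0, 3.0])
# two labels -> two fragments, each one interception window wide
```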
A voice information processing apparatus, applied to an electronic device, the apparatus comprising:
a first acquisition module, configured to obtain voice information collected by a voice acquisition unit of the electronic device and a time parameter of the voice information;
a second acquisition module, configured to obtain a label information group for the time parameter, the group comprising at least one item of label information and corresponding to at least one time point in the time parameter of the voice information;
a segmentation module, configured to segment the voice information according to the time parameter of each item of label information in the group and the time parameter of the voice information, to obtain voice information fragments equal in number to the items of label information;
a recognition module, configured to recognize each voice information fragment separately, to obtain the pronunciation corresponding to each fragment.
In the above apparatus, preferably, the second acquisition module comprises:
a label unit, configured to obtain the label information group according to prompt information;
a time parameter unit, configured to obtain the time parameter of each label in the group according to the generation time of the prompt information, the prompt information serving to prompt the user to pronounce.
In the above apparatus, preferably, the time parameter unit comprises:
a receiving subunit, configured to receive a first prompt signal, the first prompt signal characterizing an operating action performed by the user on a preset area of the electronic device while uttering speech;
a first recording subunit, configured to record the generation time of the first prompt signal as the time of the label information, to obtain the time parameter of the label information.
In the above apparatus, preferably, the electronic device is further provided with a sensor, and the receiving subunit comprises:
a first obtaining subunit, configured to obtain detection data from the sensor, the detection data being the pressure value generated by the operating action tapping a preset area of the housing of the electronic device;
a first judging subunit, configured to compare the pressure value with a preset pressure threshold; when the pressure value exceeds the threshold, the tapping action satisfies the preset operating-action condition and the first recording subunit is triggered to record the tap event, the tap event comprising the tapping action and the tap time; otherwise, the tapping action does not satisfy the condition and is not recorded.
In the above apparatus, preferably, the receiving subunit comprises:
a first detection subunit, configured to detect the electrical signal produced when a preset key of the electronic device is pressed; when the electrical signal is detected, the first recording subunit is triggered to record the key-press event, the key-press event comprising the key action and the key-press time.
In the above apparatus, preferably, the electronic device comprises a touch screen, and the receiving subunit comprises:
a second detection subunit, configured to detect the electrical signal generated when the user clicks the touch screen; when the electrical signal is detected, the first recording subunit is triggered to record the click event, the click event comprising the click action and the click time.
In the above apparatus, preferably, when a click-event region is set on the touch screen, the receiving subunit further comprises:
a second obtaining subunit, configured to obtain the coordinate values of the click on the touch screen;
a second judging subunit, configured to judge from the coordinate values whether the click position lies within the click-event region, and to trigger the first recording subunit if it does.
In the above apparatus, preferably, the electronic device is further provided with a timer, and the time parameter unit comprises:
a third obtaining subunit, configured to obtain the generation time of a preset second prompt signal, the second prompt signal being generated when the timing value of the timer reaches a preset value;
a second recording subunit, configured to record the generation time of the second prompt signal as the time of the label information, to obtain the time parameter of the label information.
In the above apparatus, preferably, the segmentation module comprises:
a time-axis unit, configured to establish a time axis for the voice information according to its time parameter;
a label-adding unit, configured to add each item of label information to the time axis according to its time parameter;
an interception unit, configured to obtain, according to a preset interception time range, the voice information fragment corresponding to each label on the time axis.
An electronic device, comprising a voice acquisition unit and a voice information processing apparatus as described in any of the above.
In the voice information processing method provided by the invention, applied to an electronic device provided with a voice acquisition unit, the device obtains voice information collected by the voice acquisition unit and a time parameter of the voice information; obtains a label information group for that time parameter, the group comprising at least one item of label information and corresponding to at least one time point in the time parameter of the voice information; segments the voice information according to the time parameter of each item of label information and the time parameter of the voice information, obtaining fragments equal in number to the items of label information; and recognizes each fragment separately, obtaining the pronunciation corresponding to each. Because the voice information is segmented according to the labels and their times, and each fragment corresponds to one pronunciation, a label is attached to every pronunciation in the user's speech, improving the accuracy with which each pronunciation is recognized. The method therefore achieves high speech-recognition accuracy and improves the user experience.
Brief description of the drawings
To illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the invention; those of ordinary skill in the art may derive other drawings from them without creative effort.
Fig. 1 is a flowchart of embodiment 1 of a voice information processing method provided by the present application;
Fig. 2 is a flowchart of embodiment 2 of the voice information processing method provided by the present application;
Fig. 3 is a flowchart of embodiment 3 of the voice information processing method provided by the present application;
Fig. 4 is a flowchart of embodiment 4 of the voice information processing method provided by the present application;
Fig. 5 is a flowchart of embodiment 5 of the voice information processing method provided by the present application;
Fig. 6 is a flowchart of embodiment 6 of the voice information processing method provided by the present application;
Fig. 7 is a flowchart of embodiment 7 of the voice information processing method provided by the present application;
Fig. 8 is a flowchart of embodiment 8 of the voice information processing method provided by the present application;
Fig. 9 is a flowchart of embodiment 9 of the voice information processing method provided by the present application;
Fig. 10 is a schematic structural diagram of embodiment 1 of a voice information processing apparatus provided by the present application;
Fig. 11 is a schematic structural diagram of embodiment 2 of the voice information processing apparatus provided by the present application;
Fig. 12 is a schematic structural diagram of embodiment 3 of the voice information processing apparatus provided by the present application;
Fig. 13 is a schematic structural diagram of embodiment 4 of the voice information processing apparatus provided by the present application;
Fig. 14 is a schematic structural diagram of embodiment 5 of the voice information processing apparatus provided by the present application;
Fig. 15 is a schematic structural diagram of embodiment 6 of the voice information processing apparatus provided by the present application;
Fig. 16 is a schematic structural diagram of embodiment 7 of the voice information processing apparatus provided by the present application;
Fig. 17 is a schematic structural diagram of embodiment 8 of the voice information processing apparatus provided by the present application;
Fig. 18 is a schematic structural diagram of embodiment 9 of the voice information processing apparatus provided by the present application.
Detailed description of the embodiments
To make the objects, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the invention.
To emphasize the independence of the implementations more specifically, this specification refers to a number of modules or units. For example, a module or unit may be realized as a hardware circuit comprising custom VLSI circuits or gate arrays, such as logic chips or transistors, or other components. A module or unit may also be realized in programmable hardware, such as a field-programmable gate array, a programmable logic array, or a programmable logic device.
A module or unit may also be realized in software executed by various forms of processors. For example, an executable code module may comprise one or more physical or logical blocks of computer instructions, which may be organized as, for example, an object, a procedure, or a function. Nevertheless, the executable parts of a module or unit need not be physically located together; they may comprise disparate instructions stored in different locations that, when joined logically, constitute the module or unit and achieve its stated purpose.
Indeed, an executable code module or unit may be a single instruction or many instructions, and may even be distributed over several different code segments, among different programs, and across several memory devices. Similarly, operational data may be identified and illustrated herein within modules or units, may be embodied in any suitable form, and may be organized within any suitable type of data structure. The operational data may be collected as a single data set or distributed over different locations, including over different storage devices, and may exist, at least partially, merely as electronic signals on a system or network.
Reference throughout this specification to "an embodiment" or similar language means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, appearances of the phrases "in one embodiment", "in an embodiment", and similar language in this specification may, but do not necessarily, all refer to the same embodiment.
Furthermore, the described features, structures, or characteristics of the invention may be combined in any suitable manner in one or more embodiments. The following description supplies many specific details, such as examples of programming, software modules, user selections, network transactions, database queries, database structures, hardware modules, hardware circuits, and hardware chips, to provide an understanding of embodiments of the invention. Those skilled in the relevant art will recognize, however, that the invention may be practiced without one or more of these specific details, or with other methods, components, materials, and so forth. In other instances, well-known structures, materials, or operations are not described in detail to avoid obscuring the invention.
Fig. 1 shows a flowchart of embodiment 1 of a voice information processing method provided by the present application. The method may be applied to an electronic device such as a desktop computer, notebook, tablet, mobile phone, smart television, smart watch, or other wearable device. The electronic device is provided with a voice acquisition unit for collecting voice information from the device's surroundings; in this application, the voice information of interest is, in particular, the speech uttered by the user of the device.
The method may be implemented by the following steps:
Step S101: obtain voice information collected by the voice acquisition unit and the time parameter of the voice information.
Here, the voice information is the speech uttered by the user.
The voice acquisition unit collects voice information in real time.
The voice information may be in any language, such as Chinese, English, or French, and may also mix two or more languages.
The time parameter arranges the sounds in the voice information in chronological order.
Specifically, the time parameter may be accurate to the second or even the microsecond; its precision may be set according to actual conditions and is not limited in this embodiment.
Step S102: obtain a label information group for the time parameter.
The label information group comprises at least one item of label information and corresponds to at least one time point in the time parameter of the voice information.
The time parameter of the voice information contains at least one time point, and each time point may be the time of one sound in the user's input. For example, if the user pronounces "zhang shan", there are two pronunciations and therefore two time points in the time parameter of the voice information.
Correspondingly, each item of label information in the group corresponds to one time point in the time parameter; in the example above, the two pronunciations "zhang" and "shan" each correspond to one item of label information.
Step S103: segment the voice information according to the time parameter of each item of label information in the label information group and the time parameter of the voice information, to obtain voice information fragments equal in number to the items of label information.
Each item of label information in the group has its own corresponding time parameter.
Specifically, the voice information is segmented according to the time parameter of the label information and the time parameter of the voice information, dividing it into as many fragments as there are items of label information.
Each voice information fragment corresponds to one item of label information.
It should be noted that, when the fragments are intercepted, the time parameter of the label information may coincide exactly with the time parameter of the voice information, or the two may differ by a bounded amount, that is, by an error within a preset error range.
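The bounded-error matching just described can be sketched as follows. This is an illustrative alignment step under stated assumptions: the patent does not specify how label times are matched to the audio, so the nearest-onset strategy, the tolerance value, and all names here are hypothetical.

```python
# Hypothetical alignment: snap each label time to the nearest utterance onset
# in the audio, accepting the match only when the difference stays within a
# preset error range (tolerance); otherwise the label is left unmatched.

def align_labels(label_times, onset_times, tolerance=0.2):
    """Pair each label with the nearest onset if within tolerance, else None."""
    aligned = []
    for lt in label_times:
        nearest = min(onset_times, key=lambda o: abs(o - lt))
        aligned.append(nearest if abs(nearest - lt) <= tolerance else None)
    return aligned

# labels recorded at 1.05 s and 3.3 s; detected utterance onsets at 1.0 s, 3.0 s
result = align_labels([1.05, 3.3], [1.0, 3.0])
# the first label matches (0.05 s off); the second exceeds the 0.2 s tolerance
```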
Step S104: recognize each voice information fragment separately, to obtain the pronunciation corresponding to each fragment.
A speech recognition engine recognizes the fragments one by one, obtaining the pronunciation corresponding to each.
In an actual implementation, the recognized pronunciations may be processed further to obtain the corresponding characters or numerical information, and the content corresponding to the voice information may then be displayed.
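The per-fragment recognition of step S104 amounts to a loop over the fragments. The sketch below uses a stub in place of a real speech recognition engine; the engine interface, the string-valued fragments, and the fragment-to-pronunciation table are all assumptions for illustration.

```python
# Each fragment is passed to the recognition engine independently, yielding
# exactly one pronunciation per fragment (and hence per label).

def recognize_fragments(fragments, engine):
    """Run the engine on each fragment separately."""
    return [engine(frag) for frag in fragments]

# Stub engine: maps a fragment (here just a placeholder string) to a
# pronunciation, standing in for a real speech-recognition engine.
stub_engine = {"frag1": "zhang", "frag2": "shan"}.get

pronunciations = recognize_fragments(["frag1", "frag2"], stub_engine)
# the pronunciations can then be mapped on to characters for display
```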
In the voice information processing method provided by this embodiment, the voice information is segmented according to the labels and their times, and each fragment corresponds to one pronunciation, so a label is attached to every pronunciation in the user's speech, improving the accuracy with which each pronunciation is recognized. With this method, speech-recognition accuracy is high and the user experience is improved.
Fig. 2 shows a flowchart of embodiment 2 of a voice information processing method provided by the present application. The method may be implemented by the following steps:
Step S201: obtain voice information collected by the voice acquisition unit and the time parameter of the voice information.
Step S201 is identical to step S101 in embodiment 1 and is not repeated here.
Step S202: obtain a label information group according to prompt information.
The label information group comprises at least one item of label information and corresponds to at least one time point in the time parameter of the voice information.
The prompt information serves to prompt the user to pronounce.
Specifically, the prompt information comprises information generated by a prompting device provided on the electronic device itself, or information received from outside the electronic device.
The prompting device provided on the electronic device may comprise a flash unit, a vibration unit, or the like; the prompt information it generates is, correspondingly, a flash, a vibration, and so on.
It should be noted that there may be multiple items of prompt information. Each item of prompt information yields one corresponding item of label information, so a label information group is obtained from the prompt information, with as many labels as there are items of prompt information.
Step S203: obtain the time parameter of each label in the label information group according to the generation time of the prompt information.
Each label in the group is generated from an item of prompt information; that is, each label has its corresponding prompt information, and the time parameter of the label is the generation time of that prompt information.
For example, if prompt information is generated at 1' and 3', then the time parameters of the two labels obtained from it are, correspondingly, 1' and 3'.
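The example above, prompts at 1' and 3' yielding labels with those time parameters, can be sketched directly. The label dictionary shape and function name are assumptions for illustration.

```python
# One label per item of prompt information; the label's time parameter is the
# generation time of the corresponding prompt.

def labels_from_prompts(prompt_times):
    """Build a label information group from prompt generation times."""
    return [{"label": i, "time": t} for i, t in enumerate(prompt_times)]

labels = labels_from_prompts([1, 3])  # prompts generated at 1' and 3'
# -> two labels, with time parameters 1 and 3 respectively
```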
Step S204: segment the voice information according to the time parameter of each item of label information in the group and the time parameter of the voice information, to obtain voice information fragments equal in number to the items of label information.
Step S205: recognize each voice information fragment separately, to obtain the pronunciation corresponding to each fragment.
Steps S204-205 are identical to steps S103-104 in embodiment 1 and are not repeated here.
See the process flow diagram of a kind of voice information processing method embodiment 3 that the application shown in Fig. 3 provides, the method realizes by following steps:
Step S301: obtain the voice messaging of described voice collecting unit collection and the time parameter of described voice messaging;
Step S302: obtain label information group according to information;
Wherein, step S301-302 is consistent with the step S201-202 in embodiment 2, repeats no more in the present embodiment.
Step S303: receive the first cue, at the operational motion of described electronic equipment predeterminable area when described first cue characterizing consumer sends voice;
Wherein, this first cue is the signal that user generates in the outside of this electronic equipment, concrete for user send voice time, at the operational motion of the predeterminable area of this electronic equipment.
Wherein, this first cue and this user send the time consistency of voice, and such as, user was pronunciation " zhang " in first second, and user is in the executable operations action of the predeterminable area of electronic equipment simultaneously; User was pronunciation " shan " in the 3rd second, and user is in the executable operations action of the predeterminable area of electronic equipment simultaneously, and this operational motion can for clicking this predeterminable area or the executable operational motion of other users.
Step S304: record the generation time of the first cue as the time of the label information, obtaining the time parameter of the label information;
The first cues correspond one-to-one with the pieces of label information.
Upon receiving a first cue, its generation time is recorded as the time of the corresponding label information.
It should be noted that the purpose of the operational action the user performs in the preset area of the electronic device is to add a corresponding mark synchronously to the speech the user utters; therefore, while uttering speech, the user synchronously performs the preset operational action in the preset area of the electronic device.
The starting point of the generation time of the first cue coincides with the starting point at which the voice collecting unit begins to collect voice information; that is, the first cue and the voice collection share the same time axis.
For example, the user pronounces "zhang" in the first second while performing the operational action in the preset area, and pronounces "shan" in the third second while performing the action again; the first second and the third second are then recorded as the time parameters of the label information.
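The bookkeeping of steps S303-S304 can be sketched as follows. This is a minimal illustration, assuming the cue timestamps and the capture start time are read from the same clock; the function and variable names are hypothetical, not part of the patent:

```python
# Sketch: record the generation time of each first cue as a label time
# parameter, on the same time axis as the voice capture (capture start = 0).

def record_label_times(cue_times, capture_start):
    """Convert absolute cue timestamps into label time parameters
    relative to the moment the voice collecting unit started."""
    return [t - capture_start for t in cue_times]

# The user taps the preset area while pronouncing "zhang" (1 s) and "shan" (3 s).
capture_start = 100.0            # absolute time when collection began
cue_times = [101.0, 103.0]       # absolute times of the two taps
labels = record_label_times(cue_times, capture_start)
print(labels)  # [1.0, 3.0] -> the 1st and 3rd second, as in the example above
```

Sharing one time axis this way is what later allows each label to be matched directly against the voice information's own time parameter.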
Step S305: segment and intercept the voice information according to the time parameter of each piece of label information in the label information group and the time parameter of the voice information, obtaining a number of voice information fragments equal to the number of pieces of label information;
Step S306: identify each voice information fragment separately, obtaining the pronunciation corresponding to each fragment.
Steps S305-S306 are consistent with steps S204-S205 in embodiment 2 and are not repeated in this embodiment.
A sensor is also provided in the electronic device to detect the user's operation in the preset area of the electronic device; in this embodiment, the operation is a knock on the housing of the electronic device.
Fig. 4 is a flowchart of embodiment 4 of the voice information processing method provided by this application. The method is implemented by the following steps:
Step S401: obtain the voice information collected by the voice collecting unit and the time parameter of the voice information;
Step S402: obtain the label information group according to prompt information;
Steps S401-S402 are consistent with steps S301-S302 in embodiment 3 and are not repeated in this embodiment.
Step S403: obtain detection data from the sensor, where the detection data is the pressure value generated by the operational action knocking the preset area of the housing of the electronic device;
The sensor detects the pressure value generated when the user knocks the preset area of the housing.
The preset area of the housing may be any region of the housing of the electronic device, such as the back cover or a side of the device.
It should be noted that the rhythm with which the user knocks the housing matches the rhythm of the user's speech; that is, the user knocks the housing once, in sequence, for each utterance.
Step S404: compare the pressure value with a preset pressure threshold;
The preset pressure threshold is the pressure required for a knock to qualify.
Specifically, the collected pressure value is compared with the preset pressure threshold. If the pressure value is greater than the threshold, the knock satisfies the preset operational action condition, indicating that the user performed the knock intentionally; step S405 is then executed and the knock event is recorded. If the pressure value is less than the threshold, the knock does not satisfy the preset operational action condition and is treated as an accidental operation; it is ignored and not recorded.
Step S405: when the pressure value is greater than the pressure threshold, the knock satisfies the preset operational action condition; record the knock event, which comprises the knock action and the knock time, the knock time being the time parameter of the label information;
When the pressure value detected by the sensor exceeds the preset pressure threshold, the knock satisfies the preset operational action condition, indicating that the user intended the knock, and the knock event is recorded.
The knock event comprises the knock action and the knock time; recording the knock action may simply mean recording one knock, and the knock time is recorded as the time parameter of the label information.
It should be noted that, to ensure the electronic device keeps running normally, the user should not knock the housing too hard; therefore, in a concrete implementation, the sensor's detection data may also be required to be below a second preset threshold, i.e., the pressure generated by the user's knock on the preset area of the housing should fall within a preset pressure interval.
It should be noted that the user synchronously knocks the preset position of the housing once per utterance; that is, each time the user utters a sound, the user synchronously knocks the preset position of the housing once.
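The pressure-interval filtering of steps S403-S405 can be sketched as follows. The threshold values are purely illustrative assumptions; the patent leaves them to the concrete implementation:

```python
# Sketch of steps S403-S405: accept a knock only when its pressure value
# lies inside a preset interval -- above the tap threshold, and below a
# second preset threshold that protects the device housing.

LOWER_THRESHOLD = 2.0   # minimum pressure for an intentional knock (illustrative)
UPPER_THRESHOLD = 15.0  # second preset threshold protecting the housing (illustrative)

def filter_knocks(samples):
    """samples: list of (time, pressure) pairs from the sensor.
    Returns the knock times that satisfy the preset operational
    action condition; each surviving time becomes a label time parameter."""
    return [t for t, p in samples if LOWER_THRESHOLD < p < UPPER_THRESHOLD]

samples = [(1.0, 5.0), (2.2, 0.4), (3.0, 6.1), (4.5, 20.0)]
print(filter_knocks(samples))  # [1.0, 3.0] -- weak and excessive knocks ignored
```

The weak knock at 2.2 s is dismissed as an accidental operation, and the excessive knock at 4.5 s is rejected by the second threshold, matching the two notes above.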
Step S406: segment and intercept the voice information according to the time parameter of each piece of label information in the label information group and the time parameter of the voice information, obtaining a number of voice information fragments equal to the number of pieces of label information;
Step S407: identify each voice information fragment separately, obtaining the pronunciation corresponding to each fragment.
Steps S406-S407 are consistent with steps S305-S306 in embodiment 3 and are not repeated in this embodiment.
Fig. 5 is a flowchart of embodiment 5 of the voice information processing method provided by this application. The method is implemented by the following steps:
Step S501: obtain the voice information collected by the voice collecting unit and the time parameter of the voice information;
Step S502: obtain the label information group according to prompt information;
Steps S501-S502 are consistent with steps S301-S302 in embodiment 3 and are not repeated in this embodiment.
Step S503: detect the electric signal generated when a programmable button on the electronic device is pressed;
The electric signal of the programmable button on the electronic device is monitored.
When the programmable button is not pressed, its electric signal is a first value; when it is pressed, its electric signal is a second value greater than the first. For example, the first value may be 0 and the second value 0.5 V.
It should be noted that the programmable button may be a dedicated button or a button multiplexed with other functions.
When the electric signal indicating that the programmable button is pressed is detected, the user is deemed to have pressed the preset button; step S504 is then executed and the key-press event is recorded.
Step S504: when the electric signal is detected, record the key-press event, which comprises the key action and the key press time;
When the electric signal indicating a press of the programmable button is detected, the user is deemed to have pressed the preset button, and the key-press event is recorded.
The key-press event comprises the key action and the key press time; recording the key action may simply mean recording one press, and the key press time is recorded as the time parameter of the label information.
It should be noted that, to ensure the press is intentional, the user's pressing action may additionally be verified.
Specifically, a sensor may be provided at the location of the programmable button to sense the pressure with which the user presses it; if the pressure falls within a preset range, the user is judged to have pressed the button intentionally, otherwise the press is judged an accidental operation.
In a concrete implementation, the event may be recorded only when the pressure detected by the sensor satisfies the preset range and the electric signal is detected.
It should be noted that the user synchronously presses the programmable button once per utterance; that is, each time the user utters a sound, the user synchronously presses the programmable button once.
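The combined check of steps S503-S504 can be sketched as follows. The 0 V / 0.5 V signal values follow the example in the text; the pressure range is a hypothetical assumption standing in for the "preset range" mentioned above:

```python
# Sketch of steps S503-S504: a press is recorded only when the button's
# electric signal reaches the pressed value AND the optional pressure
# check passes (guarding against accidental operation).

PRESSED_SIGNAL = 0.5          # volts, per the example above
PRESSURE_RANGE = (1.0, 10.0)  # hypothetical valid press pressures

def record_press(signal, pressure):
    """Return True (record a key-press event) only for an intentional press."""
    lo, hi = PRESSURE_RANGE
    return signal >= PRESSED_SIGNAL and lo <= pressure <= hi

print(record_press(0.5, 3.0))   # True  -> key-press event recorded
print(record_press(0.0, 3.0))   # False -> button not pressed
print(record_press(0.5, 0.2))   # False -> judged an accidental operation
```

The press time of each recorded event then becomes the time parameter of one piece of label information.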
Step S505: segment and intercept the voice information according to the time parameter of each piece of label information in the label information group and the time parameter of the voice information, obtaining a number of voice information fragments equal to the number of pieces of label information;
Step S506: identify each voice information fragment separately, obtaining the pronunciation corresponding to each fragment.
Steps S505-S506 are consistent with steps S305-S306 in embodiment 3 and are not repeated in this embodiment.
Fig. 6 is a flowchart of embodiment 6 of the voice information processing method provided by this application. The method is implemented by the following steps:
Step S601: obtain the voice information collected by the voice collecting unit and the time parameter of the voice information;
Step S602: obtain the label information group according to prompt information;
Steps S601-S602 are consistent with steps S301-S302 in embodiment 3 and are not repeated in this embodiment.
Step S603: detect the electric signal generated when the user clicks the touch-screen;
The user clicks on the touch-screen of the electronic device to input the first cue; each click generates one first cue.
Specifically, when the user is not touching the screen, the electric signal in the touch-screen has one value; when the user clicks the screen, the signal changes to a second value. How the signal changes depends on the structure of the touch-screen and is not limited in this embodiment.
Specifically, the electric signals of the various regions of the touch-screen are monitored; when the user clicks somewhere on the screen, the electric signal changes, and detecting this change indicates that some position on the screen has been clicked.
Step S604: when the electric signal is detected, record the click event, which comprises the click action and the click time;
When the electric signal is detected, the user is deemed to have performed a click on the touch-screen, and the click event is recorded.
The click event comprises the click action and the click time; recording the click action may simply mean recording one click, and the click time is recorded as the time parameter of the label information.
It should be noted that the user synchronously clicks the touch-screen once per utterance; that is, each time the user utters a sound, the user synchronously clicks the touch-screen once.
It should be noted that, because the user clicks the touch-screen with a finger and the contact area of a finger is small, the change in the electric signal falls within a range; when the signal change falls within this range, a touch-screen click event is judged to have occurred.
Step S605: segment and intercept the voice information according to the time parameter of each piece of label information in the label information group and the time parameter of the voice information, obtaining a number of voice information fragments equal to the number of pieces of label information;
Step S606: identify each voice information fragment separately, obtaining the pronunciation corresponding to each fragment.
Steps S605-S606 are consistent with steps S305-S306 in embodiment 3 and are not repeated in this embodiment.
A click event area may be set at a preset position of the touch-screen of the electronic device, dedicated to receiving click operations.
Fig. 7 is a flowchart of embodiment 7 of the voice information processing method provided by this application. The method is implemented by the following steps:
Step S701: obtain the voice information collected by the voice collecting unit and the time parameter of the voice information;
Step S702: obtain the label information group according to prompt information;
Step S703: detect the electric signal generated when the user clicks the touch-screen;
Steps S701-S703 are consistent with steps S601-S603 in embodiment 6 and are not repeated in this embodiment.
Step S704: when the electric signal is detected, obtain the coordinate value of the click on the touch-screen;
A click event region is provided in the touch-screen of the electronic device; only clicks inside this region are received as the first cue, while clicks elsewhere may be ignored or handled according to the function of the clicked area.
The touch-screen of the electronic device may use an x-y coordinate system; for example, the lower-left corner of the screen may be (0, 0) and the upper-right corner (5, 10).
Specifically, the click may land anywhere on the screen, e.g., in a region centered at (2, 3).
Step S705: judge from the coordinate value whether the click location is inside the click event region;
The click event region may be any preset region. In practice, for the user's convenience, it may be placed on the center-left of the display so that the user can easily click it with a thumb while holding the device, e.g., a circle of radius 1 centimeter centered at (3, 2).
Specifically, the location where the click occurred is compared with the preset region. If, for example, the click lies in a region centered at (2, 3), which does not fall inside the preset region, the click location is outside the click event region and the click is not recorded.
When the location where the click occurred falls inside the preset region, step S706 is executed and the click event is recorded.
It should be noted that the preset region may be multiplexed with other functional areas of the electronic device. While voice information is being collected, the click has priority over the other multiplexed functions; when the user clicks the region during collection, only the click event is recorded and the other multiplexed functions are not triggered.
It should be noted that, while voice information is being collected, if the user clicks a region of the touch-screen outside the click event area, whether the corresponding click event is responded to can be configured according to the actual situation.
Specifically, response priorities may be assigned in advance to click events in different regions; for example, the area corresponding to the "exit" function may have the highest priority.
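The coordinate test of steps S704-S705 can be sketched as follows, using the circular region from the example above (radius 1 centered at (3, 2)); the region shape and the function names are illustrative:

```python
# Sketch of steps S704-S705: decide whether a click's coordinates fall
# inside the click event region -- here, a circle of radius 1 centered
# at (3, 2) on the touch-screen's x-y axes, per the example in the text.

import math

REGION_CENTER = (3.0, 2.0)
REGION_RADIUS = 1.0

def in_click_event_region(x, y):
    """True if the click at (x, y) lands inside the click event region."""
    cx, cy = REGION_CENTER
    return math.hypot(x - cx, y - cy) <= REGION_RADIUS

print(in_click_event_region(3.2, 2.1))  # True  -> record the click event
print(in_click_event_region(2.0, 3.0))  # False -> outside, not recorded
```

Only clicks for which this test is true are recorded as first cues; the rest are dispatched (or ignored) according to the configured region priorities.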
Step S706: if the click location is inside the click event region, record the click event, which comprises the click action and the click time;
Step S707: segment and intercept the voice information according to the time parameter of each piece of label information in the label information group and the time parameter of the voice information, obtaining a number of voice information fragments equal to the number of pieces of label information;
Step S708: identify each voice information fragment separately, obtaining the pronunciation corresponding to each fragment.
Steps S707-S708 are consistent with steps S604-S606 in embodiment 6 and are not repeated in this embodiment.
A timer may also be provided in the electronic device; a cue is generated by the timer, the user utters speech following the timer's cue, and the voice collecting unit collects the voice information.
Fig. 8 is a flowchart of embodiment 8 of the voice information processing method provided by this application. The method is implemented by the following steps:
Step S801: obtain the voice information collected by the voice collecting unit and the time parameter of the voice information;
Step S802: obtain the label information group according to prompt information;
Steps S801-S802 are consistent with steps S201-S202 in embodiment 2 and are not repeated in this embodiment.
Step S803: obtain the generation time of a preset second cue, where the second cue is generated when the timing value of the timer satisfies a preset value;
A timer in the electronic device times the generation of the second cue; whenever its timing value reaches the preset value, a second cue is generated — for example, one second cue every 3 seconds.
Specifically, the second cue may be a vibration, a flash, or the like.
In a concrete implementation, the user pronounces following each second cue: for example, when the electronic device emits the first flash the user pronounces "zhang", and when it emits the second flash the user pronounces "shan"; the voice information finally collected by the voice collecting unit is "zhang" "shan".
Step S804: record the generation time of the second cue as the time of the label information, obtaining the time parameter of the label information;
The second cues correspond one-to-one with the pieces of label information.
When a second cue is generated, its generation time is recorded as the time of the corresponding label information.
It should be noted that the purpose of generating the second cue is to prompt the user to utter speech; the electronic device prompts the user by generating the second cue, and the user speaks following the prompt.
The starting point of the generation time of the second cue coincides with the starting point at which the voice collecting unit begins to collect voice information; that is, the second cue and the voice collection share the same time axis.
Thus the generation times of the second cues correspond to the times of the voice information collected by the voice collecting unit.
For example, when the electronic device flashes in the 1st second, the user pronounces "zhang" following the flash; when it flashes in the 3rd second, the user pronounces "shan"; the 1st and 3rd seconds are recorded as the time parameters of the label information.
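The timer-driven generation of steps S803-S804 can be sketched as follows. This is a schematic under the assumption of a fixed interval (the "every 3 seconds" example); in a real device the cues would be fired by a hardware or OS timer:

```python
# Sketch of steps S803-S804: a timer emits a second cue each time its
# timing value reaches the preset interval; the generation times are
# recorded directly as the label time parameters.

def cue_times(interval, count, start=0.0):
    """Generation times of `count` second cues spaced `interval` seconds
    apart, on the same time axis as the voice capture (start = capture start)."""
    return [start + interval * (i + 1) for i in range(count)]

# Two cues at a 3-second interval: the user pronounces "zhang" after the
# first flash and "shan" after the second.
print(cue_times(3.0, 2))  # [3.0, 6.0]
```

Because the user reacts to each cue rather than acting simultaneously with it, embodiment 9 below offsets the interception window from these label times.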
Step S805: segment and intercept the voice information according to the time parameter of each piece of label information in the label information group and the time parameter of the voice information, obtaining a number of voice information fragments equal to the number of pieces of label information;
Step S806: identify each voice information fragment separately, obtaining the pronunciation corresponding to each fragment.
Steps S805-S806 are consistent with steps S204-S205 in embodiment 2 and are not repeated in this embodiment.
Fig. 9 is a flowchart of embodiment 9 of the voice information processing method provided by this application. The method is implemented specifically by the following steps:
Step S901: obtain the voice information collected by the voice collecting unit and the time parameter of the voice information;
Step S902: obtain the label information group for the time parameter;
Steps S901-S902 are consistent with steps S101-S102 in embodiment 1 and are not repeated in this embodiment.
Step S903: establish the time axis of the voice information according to the time parameter of the voice information;
The time axis of the voice information is established from the time parameter of the voice information.
The starting point of the time axis is the time at which collection of voice information begins after the voice collecting unit is activated, and the end point of the time axis is the time at which collection of voice information stops.
Specifically, the time at which collection stops may be when the voice collecting unit exits the activated state, or when the voice collecting unit stops collecting speech.
It should be noted that the pronunciations in the voice information are arranged in chronological order along the time axis.
Step S904: add the label information to the time axis according to the time parameter of each piece of label information in the time parameter group of the label information;
The time parameter group of the label information contains the time parameters of multiple pieces of label information, with one time parameter per piece of label information.
Specifically, each piece of label information is added to the time axis one by one by its time parameter, so that the time axis carries two corresponding parts: the label information and the voice information.
It should be noted that when the label information is generated from the first cue, the user's pronunciation times in the voice information and the label information times are close to each other. When the label information is generated from the second cue, the voice collecting unit collects the user's pronunciations, which are uttered in response to the second cue; since there is a time difference between the user perceiving the second cue and pronouncing, there is likewise a time difference between the pronunciation times in the voice information and the times of the label information.
Step S905: obtain the voice information fragment corresponding to each label on the time axis according to a preset intercepting time range;
The voice information is intercepted at each label on the time axis, yielding multiple voice information fragments.
The preset intercepting time range is determined according to the actual conditions, including the type of cue and/or the user's speaking rate.
When the label information is generated from the first cue, a single intercepting time range may be set. For example, the range may be 2 seconds long centered on the label information; that is, the voice information within a 2-second window centered on the time of each label on the time axis is intercepted, yielding multiple voice information fragments.
When the label information is generated from the second cue, because there is a time difference between the time of the label information and the time at which the voice information is collected, the intercepting time range should be offset from the time parameter of the label information when intercepting the fragment corresponding to that label. For example, the range may start 0.5 seconds after the time parameter of the label information and be 2 seconds long; that is, the voice information within a 2-second window starting 0.5 seconds after the time of each label on the time axis is intercepted, yielding multiple voice information fragments.
It should be noted that the length of the offset may be set according to the user's reaction time; a default value may be 0.5 seconds, but it is not limited to this value.
It should be noted that the length of the intercepting time range is set to 2 seconds in this embodiment but is not limited to this; in actual implementation it may be set according to conditions such as the user's speaking rate.
It should be noted that when, during interception, the time ranges of two adjacent pieces of label information overlap, the overlapping part may be intercepted into both of the two voice information fragments, so that both fragments contain the overlapping part.
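The interception logic of steps S903-S905 can be sketched as follows. This is a minimal illustration of the windowing arithmetic only, assuming label times on the capture time axis; the 2-second window and 0.5-second offset follow the examples above, and overlapping windows are simply both kept, as the note above allows:

```python
# Sketch of steps S903-S905: one intercepting (start, end) range per
# label on the time axis. offset=0.0 centers the window on the label
# (first cue); a positive offset starts the window after the label
# (second cue), allowing for the user's reaction time.

def intercept_windows(label_times, window=2.0, offset=0.0, axis_end=None):
    """Return one (start, end) intercepting range per label time."""
    windows = []
    for t in label_times:
        start = t - window / 2.0 if offset == 0.0 else t + offset
        end = start + window
        if axis_end is not None:
            end = min(end, axis_end)       # clamp to the end of the time axis
        windows.append((max(start, 0.0), end))
    return windows

# First-cue labels at 1 s and 3 s, centered 2-second windows:
print(intercept_windows([1.0, 3.0]))              # [(0.0, 2.0), (2.0, 4.0)]
# Second-cue labels, windows offset by 0.5 s:
print(intercept_windows([1.0, 3.0], offset=0.5))  # [(1.5, 3.5), (3.5, 5.5)]
```

Each returned range is then cut out of the recorded audio, producing exactly one voice information fragment per label, ready for step S906.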
Step S906: identify each voice information fragment separately, obtaining the pronunciation corresponding to each fragment.
Step S906 is consistent with step S104 in embodiment 1 and is not repeated in this embodiment.
Corresponding to the voice information processing method embodiments provided above, this application also provides embodiments of a voice information processing apparatus.
Fig. 10 is a structural schematic diagram of embodiment 1 of the voice information processing apparatus provided by this application. The apparatus may be applied to an electronic device such as a desktop computer, notebook, tablet computer, mobile phone, smart television, smart watch, or other wearable device. A voice collecting unit is provided in the electronic device to collect voice information from the external environment of the electronic device; in this application, the voice information in the external environment is specifically the speech uttered by the user of the electronic device.
The apparatus may comprise: a first acquisition module 1001, a second acquisition module 1002, a segmentation module 1003, and an identification module 1004.
The first acquisition module 1001 is configured to obtain the voice information collected by the voice collecting unit and the time parameter of the voice information.
The voice information is the speech uttered by the user.
The voice collecting unit collects voice information in real time.
The voice information may be in any language, such as Chinese, English, or French, and may even mix two or more languages.
The time parameter of the voice information records, in chronological order, the times at which the speech in the voice information occurs; the first acquisition module 1001 obtains the voice information and its time parameter.
Specifically, the time parameter may be accurate to seconds or even microseconds; its precision can be set according to the actual conditions and is not limited in this embodiment.
The second acquisition module 1002 is configured to obtain the label information group for the time parameter, where the label information group comprises at least one piece of label information and corresponds to at least one time point in the time parameter of the voice information.
The second acquisition module 1002 obtains the label information group, which comprises at least one piece of label information corresponding to at least one time point in the time parameter of the voice information.
The time parameter of the voice information contains at least one time point; each time point may be the time corresponding to one sound uttered by the user. For example, if the user pronounces "zhang shan", there are two pronunciations, so the time parameter of the voice information contains two time points.
Correspondingly, each piece of label information in the label information group corresponds to one time point in the time parameter; for example, the user's two pronunciations "zhang" and "shan" above each correspond to one piece of label information.
The segmentation module 1003 is configured to segment and intercept the voice information according to the time parameter of each piece of label information in the label information group and the time parameter of the voice information, obtaining a number of voice information fragments equal to the number of pieces of label information.
Each piece of label information in the label information group has its corresponding time parameter.
Specifically, the segmentation module 1003 segments and intercepts the voice information according to the time parameter of the label information and the time parameter of the voice information, dividing the voice information into as many voice information fragments as there are pieces of label information.
Each voice information fragment corresponds to one piece of label information.
It should be noted that, during the interception of the voice information fragments, the time parameter of the label information may be identical to the time parameter of the voice information, may differ from it by a bounded amount, or may have an error within a preset error range.
The identification module 1004 is configured to identify each voice information fragment separately, obtaining the pronunciation corresponding to each fragment.
The identification module 1004 uses a speech recognition engine to recognize the voice information fragments one by one, obtaining the pronunciation corresponding to each fragment.
In actual implementation, the electronic device may further process the recognized pronunciations to obtain the corresponding characters, numerical information, or other content, and display the content corresponding to the voice information.
In the voice information recognition apparatus provided by this embodiment, the voice information is segmented and intercepted according to the label information and its times. Because each voice information fragment corresponds to one pronunciation, a label is added to each pronunciation in the user's speech, improving the accuracy with which each pronunciation is recognized. With this approach, speech recognition accuracy is high and the user experience is improved.
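The cooperation of the four modules of Fig. 10 can be sketched as follows. The "engine" here is a stand-in dictionary, not a real speech recognition engine, and all names and times are illustrative, not part of the patent:

```python
# Sketch of the apparatus of Fig. 10: the segmentation and identification
# modules wired into one pipeline, driven by label times obtained by the
# acquisition modules. Times are seconds on the capture time axis.

def segment(label_times, window=2.0):
    """Segmentation module 1003: one (start, end) range per label,
    a `window`-second window centered on each label time."""
    return [(max(t - window / 2, 0.0), t + window / 2) for t in label_times]

def identify(fragments, engine):
    """Identification module 1004: recognize each fragment separately."""
    return [engine[frag] for frag in fragments]

label_times = [1.0, 3.0]                            # from the second acquisition module
fragments = segment(label_times)                    # [(0.0, 2.0), (2.0, 4.0)]
engine = {(0.0, 2.0): "zhang", (2.0, 4.0): "shan"}  # stand-in recognizer
print(identify(fragments, engine))                  # ['zhang', 'shan']
```

Recognizing one fragment per label is what keeps liaison between adjacent sounds from corrupting the result: even with no audible gap, each sound is identified inside its own window.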
Fig. 11 is a structural schematic diagram of embodiment 2 of the voice information processing apparatus provided by this application. The structure comprises: a first acquisition module 1101, a second acquisition module 1102, a segmentation module 1103, and an identification module 1104, where the second acquisition module 1102 comprises a tag unit 1105 and a time parameter unit 1106.
In this embodiment, the first acquisition module 1101, segmentation module 1103, and identification module 1104 are consistent in function with the corresponding structures in embodiment 1 and are not described again.
The tag unit 1105 is configured to obtain a label information group according to prompt information.
The tag unit 1105 obtains the label information group according to the prompt information; the label information group comprises at least one piece of label information and corresponds to at least one time point in the time parameter of the voice information.
The prompt information is used to prompt the user to pronounce.
Specifically, the prompt information comprises: information generated by a prompting device provided in the electronic device itself, or information received from outside the electronic device.
The prompting device provided in the electronic device itself may comprise a flash unit, a vibration unit or the like; correspondingly, the information it generates comprises a flash, a vibration or the like.
It should be noted that there may be multiple pieces of prompt information, each of which yields one corresponding piece of label information; the label information group is thus obtained from the prompt information, and the number of pieces of label information in the group is the same as the number of pieces of prompt information.
The time parameter unit 1106 is configured to obtain the time parameter of each label in the label information group according to the generation time of the prompt information, the prompt information being used to prompt the user to pronounce.
Each label in the label information group is generated according to a piece of prompt information, i.e. each label has prompt information corresponding to it, and the time parameter of the label is the generation time of the corresponding prompt information.
For example, if the generation times of the prompt information are 1 s and 3 s, the time parameters of the two labels in the label information group obtained from that prompt information are, correspondingly, 1 s and 3 s.
Specifically, the time parameter unit 1106 records the generation time of the prompt information as the time parameter of each label in the label information group.
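The mapping described above — one label per piece of prompt information, with the prompt's generation time as the label's time parameter — can be sketched as follows. This is a minimal illustration; the function name and the dictionary layout are assumptions for clarity, not part of the patent.

```python
def build_label_group(prompt_times_s):
    """Build a label information group: one label per prompt,
    whose time parameter is that prompt's generation time."""
    return [{"label_id": i, "time_s": t}
            for i, t in enumerate(sorted(prompt_times_s))]

# Prompts generated at 1 s and 3 s yield two labels with those time parameters.
labels = build_label_group([1.0, 3.0])
```

The group therefore always contains exactly as many labels as there were prompts, matching the one-to-one correspondence the text requires.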
Figure 12 shows a schematic structural diagram of the time parameter unit in embodiment 3 of a voice information processing apparatus provided by the present application. The time parameter unit comprises: a receiving subunit 1201 and a first recording subunit 1202.
In this embodiment, the functions of the first acquisition module, the segmentation module, the identification module and the tag unit are consistent with those of the corresponding structures in embodiment 2 and are not repeated here.
The receiving subunit 1201 is configured to receive a first cue signal, the first cue signal characterizing an operational action performed by the user on a preset area of the electronic device while the user utters a voice.
The first cue signal is a signal generated by the user from outside the electronic device, specifically by the operational action performed on the preset area of the electronic device while the user utters a voice; the receiving subunit 1201 receives this first cue signal.
The first cue signal is consistent in time with the user's utterance. For example, the user pronounces "zhang" in the first second and simultaneously performs an operational action on the preset area of the electronic device; the user pronounces "shan" in the third second and again simultaneously performs the operational action. The operational action may be clicking the preset area, or another action the user can perform.
The first recording subunit 1202 is configured to record the generation time of the first cue signal as the time of the label information, obtaining the time parameter of the label information.
The first cue signals correspond one-to-one with the label information.
While the receiving subunit 1201 receives the first cue signal, the first recording subunit 1202 records the generation time of the first cue signal as the time of the label information.
It should be noted that the purpose of the operational action the user performs on the preset area of the electronic device is precisely to add a corresponding mark synchronously to the voice the user utters; therefore, when uttering a voice, the user synchronously performs the preset operational action on the preset area of the electronic device.
The starting point of the generation time of the first cue signal is consistent with the starting point at which the voice acquisition unit starts acquiring voice information, i.e. the first cue signal and the voice acquired by the voice acquisition unit use the same time axis.
For example, the user pronounces "zhang" in the first second while performing the operational action on the preset area of the electronic device, and pronounces "shan" in the third second while again performing the operational action; the first second and the third second are recorded as the time parameters of the label information.
A sensor is also provided in the electronic device to detect the user's operation on the preset area; in this embodiment, the operation is knocking the casing of the electronic device.
Figure 13 shows a schematic structural diagram of the tag unit in embodiment 4 of a voice information processing apparatus provided by the present application, comprising: a receiving subunit 1301 and a first recording subunit 1302, wherein the receiving subunit 1301 comprises a first acquisition subunit 1303 and a first judgment subunit 1304.
In this embodiment, the functions of the first acquisition module, the segmentation module and the identification module are consistent with those of the corresponding structures in embodiment 3 and are not repeated here.
The first acquisition subunit 1303 is configured to obtain detection data from the sensor, the detection data being the pressure value generated by the operational action knocking the preset area of the casing of the electronic device.
The sensor detects the pressure value generated when the user knocks the preset area of the casing of the electronic device, and the first acquisition subunit 1303 receives this pressure value.
The preset area of the casing may be any region of the casing of the electronic device, such as the back cover or a side of the device.
It should be noted that the rhythm with which the user knocks the casing of the electronic device is consistent with the rhythm with which the user utters the voice, i.e. the user knocks the casing once for each utterance, in sequence.
The first judgment subunit 1304 is configured to compare the pressure value with a preset pressure threshold.
The preset pressure threshold is the pressure required for a knock to qualify.
Specifically, the first judgment subunit 1304 compares the acquired pressure value with the preset pressure threshold. When the pressure value is greater than the preset pressure threshold, the knocking action satisfies the preset operational action condition, characterizing that the user intended to perform the knock; the first judgment subunit then triggers the first recording subunit 1302 to record the knocking event. When the pressure value is less than the preset pressure threshold, the knocking action does not satisfy the preset operational action condition, characterizing a misoperation by the user; the knocking action is then ignored and not recorded.
The first recording subunit 1302 is configured to, when the pressure value is greater than the pressure threshold so that the knocking action satisfies the preset operational action condition, record the knocking event, the knocking event comprising the knocking action and the knocking time, the knocking time being the time parameter of the label information.
When the pressure value detected by the sensor is greater than the preset pressure threshold, the knocking action satisfies the preset operational action condition, characterizing that the user intended to perform the knock; the first recording subunit 1302 then records the knocking event.
The knocking event comprises the knocking action and the knocking time; recording the knocking action may specifically mean recording one knock, and the knocking time is recorded as the time parameter of the label information.
It should be noted that, to ensure the electronic device runs normally, the user should not knock the preset area of the casing with excessive force. Therefore, in a specific implementation, the detection data of the sensor should also be less than a second preset threshold, i.e. the pressure value generated by the user knocking the preset area of the casing should fall within a preset pressure interval.
It should be noted that when the user pronounces, the user synchronously knocks the preset position of the casing once, i.e. each time the user utters a voice, the user synchronously knocks the preset position of the casing once.
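The two-threshold check described above — a knock counts only when its pressure exceeds the first threshold (not a misoperation) but stays below the second (not harmful force) — can be sketched as follows. The threshold values and function names are illustrative assumptions, not values given in the patent.

```python
def is_valid_knock(pressure, low_threshold=2.0, high_threshold=10.0):
    """A knock is recorded only when its pressure value lies within the
    preset pressure interval: below is a misoperation, above risks the device."""
    return low_threshold < pressure < high_threshold

def record_knocks(samples):
    """samples: (time_s, pressure) pairs from the sensor.
    Returns the knock times to be used as label time parameters."""
    return [t for t, p in samples if is_valid_knock(p)]

# A light touch at 0.5 s is ignored; valid knocks at 1 s and 3 s become label times.
times = record_knocks([(0.5, 1.0), (1.0, 5.0), (3.0, 6.5)])
```

Each surviving knock time then becomes the time parameter of one piece of label information, matching the one-knock-per-utterance convention.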
Figure 14 shows a schematic structural diagram of the tag unit in embodiment 5 of a voice information processing apparatus provided by the present application, comprising: a receiving subunit 1401 and a first recording subunit 1402, wherein the receiving subunit 1401 comprises a first detection subunit 1403.
In this embodiment, the functions of the first acquisition module, the segmentation module and the identification module are consistent with those of the corresponding structures in embodiment 3 and are not repeated here.
The first detection subunit 1403 is configured to detect the electric signal produced when a preset button in the electronic device is pressed.
That is, the electric signal of the preset button in the electronic device is detected.
When the preset button is not pressed, its electric signal is a first value; when it is pressed, its electric signal is a second value greater than the first value. For example, the first value may be 0 and the second value 0.5 V.
It should be noted that the preset button may be a dedicated button, or a button multiplexed with other functions.
When the first detection subunit 1403 detects the electric signal indicating that the preset button is pressed, characterizing that the user has pressed the preset button, it triggers the first recording subunit 1402 to record the key-press event.
The first recording subunit 1402 is configured to record the key-press event when the electric signal is detected, the key-press event comprising the key action and the key-press time.
When the first detection subunit 1403 detects the electric signal indicating that the preset button is pressed, characterizing that the user has pressed the preset button, the first recording subunit 1402 records the key-press event.
The key-press event comprises the key action and the key-press time; recording the key action may specifically mean recording one press, and the key-press time is recorded as the time parameter of the label information.
It should be noted that, to ensure the press is intentional, the user's pressing action may also be judged.
Specifically, a sensor is provided at the position of the preset button; the sensor senses the pressure value when the user presses the button. When the pressure value falls within a preset pressure range, the user is judged to have intended to press the button; otherwise, the press is judged a misoperation.
In a specific implementation, the first recording subunit 1402 records the event only when the pressure value detected by the sensor falls within the preset pressure range and the electric signal is detected.
It should be noted that when the user pronounces, the user synchronously presses the preset button once, i.e. each time the user utters a voice, the user synchronously presses the preset button once.
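The combined check described above — record the key-press event only when the pressed-level electric signal is detected and the press pressure falls within the preset range — might look like the following sketch. The signal levels echo the example values in the text; the pressure range is an illustrative assumption.

```python
FIRST_VALUE = 0.0   # electric signal when the button is not pressed
SECOND_VALUE = 0.5  # electric signal (V) when the button is pressed

def record_key_press(signal_v, pressure, pressure_range=(1.0, 8.0)):
    """Record a key-press event only for an intentional press:
    the signal is at the pressed level AND the pressure is in range."""
    pressed = signal_v >= SECOND_VALUE
    intentional = pressure_range[0] <= pressure <= pressure_range[1]
    return pressed and intentional

# A pressed signal with in-range pressure is recorded; a graze or stray signal is not.
```

When `record_key_press` returns true, the key-press time would be recorded as the time parameter of the label information.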
Figure 15 shows a schematic structural diagram of the tag unit in embodiment 6 of a voice information processing apparatus provided by the present application, comprising: a receiving subunit 1501 and a first recording subunit 1502, wherein the receiving subunit 1501 comprises a second detection subunit 1503.
In this embodiment, the functions of the first acquisition module, the segmentation module and the identification module are consistent with those of the corresponding structures in embodiment 3 and are not repeated here.
The second detection subunit 1503 is configured to detect the electric signal generated when the user clicks the touch screen.
The user clicks on the touch screen of the electronic device to input the first cue signal; each click generates one first cue signal.
Specifically, when the user does not click the touch screen, the electric signal in the touch screen is one value; when the user clicks it, the electric signal changes to a second value. How it changes depends on the structure of the touch screen and is not limited in this embodiment.
Specifically, the electric signals of the various regions of the touch screen are detected. When the user clicks on the touch screen, the electric signal in the touch screen changes; the second detection subunit 1503 records this electric signal, and when it detects the change, it can judge that some position on the touch screen has been clicked.
The first recording subunit 1502 is configured to record the click event when the electric signal is detected, the click event comprising the click action and the click time.
When the second detection subunit 1503 detects the electric signal, indicating that the user has performed a click action on the touch screen, the first recording subunit 1502 records the click event.
The click event comprises the click action and the click time; recording the click action specifically means recording one click, and the click time is recorded as the time parameter of the label information.
It should be noted that when the user pronounces, the user synchronously clicks the touch screen, i.e. each time the user utters a voice, the user synchronously clicks the touch screen once.
It should be noted that because the user clicks the touch screen with a finger, and the area a finger touches is small, the change in the electric signal falls within a range; when the electric signal falls within this range, a touch-screen click event is judged to have occurred.
A click event area may be provided at a preset position of the touch screen of the electronic device, dedicated to receiving click operations.
Figure 16 shows a schematic structural diagram of the tag unit in embodiment 7 of a voice information processing apparatus provided by the present application, comprising: a receiving subunit 1601 and a first recording subunit 1602, wherein the receiving subunit 1601 comprises a second detection subunit 1603, a second acquisition subunit 1604 and a second judgment subunit 1605.
In this embodiment, the functions of the first acquisition module, the segmentation module, the identification module and the second detection subunit are consistent with those of the corresponding structures in embodiment 6 and are not repeated here.
The second acquisition subunit 1604 is configured to obtain the coordinate value of the click on the touch screen.
A click event area is specifically provided in the touch screen of the electronic device; only clicks within this click event area are received as the first cue signal, otherwise they are ignored or responded to according to the function of the clicked area.
The touch screen of the electronic device may use an xy coordinate system; for example, the lower-left corner of the touch screen is (0,0) and the upper-right corner is (5,10).
Specifically, the click operation may occur at any position of the touch screen, e.g. in a region centered on (2,3); the second acquisition subunit 1604 obtains the coordinate value of the click operation on the touch screen, e.g. (2,3).
The second judgment subunit 1605 is configured to judge, according to the coordinate value, whether the click position is within the click event area, and to trigger the first recording subunit if it is.
The click event area may be any preset region. In practice, for the user's convenience, it may be placed left of center of the display screen so that the user can conveniently click it with a thumb while holding the electronic device, e.g. the region of radius 1 centimeter centered on (3,2).
Specifically, the second judgment subunit 1605 compares the position where the click occurred, e.g. a region centered on (2,3); if this position does not fall within the preset region, the click position is not within the preset click event area, and the click is not recorded.
When the position where the click occurred falls within the preset region, the first recording subunit 1602 is triggered to record the click event.
It should be noted that the preset region may be multiplexed with other functional areas of the electronic device. During voice information acquisition, the click has a higher priority than the other multiplexed functions, so when the user clicks this region during acquisition, only the click event is recorded and the other multiplexed functions are not responded to.
It should be noted that when the user clicks the touch screen during voice information acquisition, clicks in regions other than the click event area may or may not trigger the corresponding click events, as configured for the actual situation.
Specifically, the response priorities of the click events in different regions can be set in advance, e.g. the region corresponding to the "exit" function has the highest priority.
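The region test described above — a click becomes a first cue signal only when its coordinates fall inside the preset click event area, here the circle of radius 1 centered on (3,2) from the text's example — can be sketched as follows. The function name is an assumption; the coordinates are the illustrative values from the text.

```python
import math

def in_click_event_area(x, y, center=(3.0, 2.0), radius=1.0):
    """Judge whether the click position lies within the preset
    circular click event area on the touch screen."""
    return math.hypot(x - center[0], y - center[1]) <= radius

# A click at (2, 3) lies outside the area and is ignored;
# a click at (3.2, 2.1) lies inside and would trigger the recording subunit.
```

Only clicks for which this test succeeds would have their click time recorded as a label time parameter; other clicks fall through to whatever multiplexed function owns that region.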
A timer may also be provided in the electronic device; in that case the cue signal is generated by the timer, the user utters the voice according to the timer's cue signal, and the voice acquisition unit acquires the voice information.
Figure 17 shows a schematic structural diagram of the time parameter unit in embodiment 8 of a voice information processing apparatus provided by the present application, wherein the time parameter unit comprises a third acquisition subunit 1701 and a second recording subunit 1702.
In this embodiment, the functions of the first acquisition module, the second acquisition module, the segmentation module, the identification module and the tag unit of the second acquisition module are consistent with those of the corresponding structures in embodiment 2 and are not repeated here.
The third acquisition subunit 1701 is configured to obtain the generation time of a preset second cue signal, the second cue signal being generated when the timing value of the timer satisfies a preset value.
A timer is provided in the electronic device to time the generation of the second cue signal. When the timing value of the timer satisfies a preset value, a second cue signal is generated; for example, a second cue signal is generated every 3 seconds. The third acquisition subunit 1701 obtains the generation time of this second cue signal.
Specifically, the second cue signal may be a vibration, a flash or the like.
In a specific implementation, when the second cue signal is generated, the user pronounces according to it. For example, when the electronic device emits the first flash, the user pronounces "zhang"; when it emits the second flash, the user pronounces "shan"; the voice information finally acquired by the voice acquisition unit is "zhang" "shan".
The second recording subunit 1702 is configured to record the generation time of the second cue signal as the time of the label information, obtaining the time parameter of the label information.
The second cue signals correspond one-to-one with the label information.
While the second cue signal is generated, the second recording subunit 1702 records its generation time as the time of the label information.
It should be noted that the purpose of generating the second cue signal is precisely to prompt the user to utter a voice; therefore, when the electronic device generates the second cue signal to prompt the user, the user utters the voice according to the prompt.
The starting point of the generation time of the second cue signal is consistent with the starting point at which the voice acquisition unit starts acquiring voice information, i.e. the second cue signal and the voice acquired by the voice acquisition unit use the same time axis.
Thus there is a correspondence between the generation time of the second cue signal and the time of the voice information acquired by the voice acquisition unit.
For example, when the electronic device flashes in the 1st second, the user pronounces "zhang" according to the flash; when it flashes in the 3rd second, the user pronounces "shan"; the 1st second and the 3rd second are recorded as the time parameters of the label information.
Figure 18 shows a schematic structural diagram of embodiment 9 of a voice information processing apparatus provided by the present application, comprising: a first acquisition module 1801, a second acquisition module 1802, a segmentation module 1803 and an identification module 1804, wherein the segmentation module 1803 comprises: a time axis unit 1805, a label-adding unit 1806 and an interception unit 1807.
In this embodiment, the functions of the first acquisition module 1801, the second acquisition module 1802 and the identification module 1804 are consistent with those of the corresponding structures in embodiment 1 and are not repeated here.
The time axis unit 1805 is configured to establish the time axis of the voice information according to the time parameter of the voice information.
That is, the time axis unit 1805 establishes the time axis of the voice information according to its time parameter.
The starting point of the time axis is the time at which the voice acquisition unit, once turned on, starts acquiring the voice information, and the end point of the time axis is the time at which acquisition of the voice information stops.
Specifically, the time at which voice information acquisition stops may comprise: the voice acquisition unit exiting the activated state, or the voice acquisition unit stopping acquiring voice.
It should be noted that each pronunciation in the voice information is arranged in sequence according to the chronological order of the time axis.
The label-adding unit 1806 is configured to add the label information to the time axis according to the time parameter of each piece of label information in the time parameter group of the label information.
The time parameter group of the label information contains the time parameters of multiple pieces of label information, each piece of label information corresponding to one time parameter.
Specifically, the label-adding unit 1806 adds the label information to the time axis one by one according to the time parameter of each piece of label information, so that the time axis carries two corresponding parts: the label information and the voice information.
It should be noted that when the label information is generated according to the first cue signal, the time of the user's pronunciation in the voice information and the time of the label information are close to each other. When the label information is generated according to the second cue signal, the voice information is acquired by the voice acquisition unit from the user's pronunciation, which is uttered in response to the second cue signal; since there is a time difference between the user perceiving the second cue signal and pronouncing, there is also a time difference between the time of the user's pronunciation in the voice information and the time of the label information.
The interception unit 1807 is configured to obtain, according to a preset interception time range, the voice information fragment corresponding to each label on the time axis.
That is, the interception unit 1807 intercepts the voice information according to each label on the time axis, obtaining multiple voice information fragments.
The preset interception time range is determined according to the actual situation, including: the type of the cue signal and/or the user's speech rate, etc.
When the label information is generated according to the first cue signal, one interception time range may be set. For example, the interception time range may be a time range 2 seconds long centered on the label information; that is, the voice information corresponding to a 2-second time range centered on the time corresponding to each label on the time axis is intercepted, obtaining multiple voice information fragments.
When the label information is generated according to the second cue signal, because there is a certain time difference between the time of the label information and the time of voice information acquisition, the interception time range set when intercepting the voice information fragment corresponding to the label information should have a certain offset from the time parameter of the label information. For example, the interception time range may be a time range 2 seconds long starting 0.5 seconds after the time parameter of the label information; that is, the voice information corresponding to a 2-second time range starting 0.5 seconds after the time corresponding to each label on the time axis is intercepted, obtaining multiple voice information fragments.
It should be noted that the length of this offset can be set according to the user's reaction time; the default value may be 0.5 seconds, but it is not limited to this value.
It should be noted that although the length of the interception time range is set to 2 seconds in this embodiment, it is not limited to this; in actual implementation, the length of the interception time range can be set according to conditions such as the user's speech rate.
It should be noted that when, during interception, the time ranges corresponding to two adjacent pieces of label information overlap, the overlapping part may be intercepted into both voice information fragments, so that both fragments contain the overlapping part.
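The interception logic of this embodiment — a 2-second window centered on each label for first-cue labels, or starting 0.5 seconds after the label for second-cue labels, with fragments clamped to the recording and allowed to overlap — can be sketched as follows. The window length and offset are the default values named in the text; the function name and shapes are assumptions.

```python
def intercept_fragments(label_times_s, total_s, window_s=2.0, offset_s=0.0):
    """For each label time, intercept one voice information fragment.
    First-cue labels: window centered on the label (offset_s == 0).
    Second-cue labels: window starting offset_s after the label.
    Ranges are clamped to [0, total_s]; adjacent ranges may overlap,
    in which case both fragments keep the overlapping part."""
    fragments = []
    for t in label_times_s:
        if offset_s > 0.0:
            start = t + offset_s            # shift for the user's reaction time
        else:
            start = t - window_s / 2.0      # centered window
        end = start + window_s
        fragments.append((max(0.0, start), min(total_s, end)))
    return fragments

# Labels at 1 s and 3 s in a 5 s recording, first-cue style (centered windows):
frags = intercept_fragments([1.0, 3.0], total_s=5.0)
```

Each resulting (start, end) range would then be cut out of the acquired audio and passed to the speech recognition engine one fragment at a time.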
Corresponding to the voice information processing apparatus embodiments provided above, the present application also provides an electronic device. The electronic device comprises: a voice acquisition unit and the voice information processing apparatus provided in embodiment 1 above, the apparatus comprising: a first acquisition module, a second acquisition module, a segmentation module and an identification module. The functions of all modules of the voice information processing apparatus are consistent with those of the corresponding structures in the voice information processing apparatus embodiments above and are not repeated here.
Preferably, the second acquisition module comprises: a tag unit and a time parameter unit; all modules and units of the voice information processing apparatus are consistent with the corresponding structures in the voice information processing apparatus embodiments above and are not repeated here.
Preferably, the time parameter unit comprises: a receiving subunit and a first recording subunit; all modules and units of the voice information processing apparatus are consistent with the corresponding structures in the voice information processing apparatus embodiments above and are not repeated here.
Preferably, a sensor is also provided in the electronic device, and the receiving subunit comprises: a first acquisition subunit and a first judgment subunit; all modules, units and subunits of the voice information processing apparatus are consistent with the corresponding structures in the voice information processing apparatus embodiments above and are not repeated here.
Preferably, the receiving subunit comprises a first detection subunit; all modules, units and subunits of the voice information processing apparatus are consistent with the corresponding structures in the voice information processing apparatus embodiments above and are not repeated here.
Preferably, a touch screen is also provided in the electronic device, and the receiving subunit comprises a second detection subunit; all modules, units and subunits of the voice information processing apparatus are consistent with the corresponding structures in the voice information processing apparatus embodiments above and are not repeated here.
Preferably, when a click event area is provided in the touch screen, the receiving subunit further comprises: a second acquisition subunit and a second judgment subunit; all modules, units and subunits of the voice information processing apparatus are consistent with the corresponding structures in the voice information processing apparatus embodiments above and are not repeated here.
Preferably, a timer is also provided in the electronic device, and the time parameter unit comprises a third acquisition subunit and a second recording subunit; all modules, units and subunits of the voice information processing apparatus are consistent with the corresponding structures in the voice information processing apparatus embodiments above and are not repeated here.
Preferably, the segmentation module comprises a time axis unit, a label-adding unit and an interception unit; all modules and units of the voice information processing apparatus are consistent with the corresponding structures in the voice information processing apparatus embodiments above and are not repeated here.
The above are only preferred embodiments of the present invention. It should be pointed out that those skilled in the art can also make improvements and modifications without departing from the principles of the invention, and these improvements and modifications should also be considered within the protection scope of the invention.

Claims (19)

1. A voice information processing method, characterized in that the method is applied to an electronic device, the electronic device comprising a voice acquisition unit, the method comprising:
obtaining the voice information acquired by the voice acquisition unit and the time parameter of the voice information;
obtaining a label information group for the time parameter, the label information group comprising at least one piece of label information and corresponding to at least one time point in the time parameter of the voice information;
segmenting and intercepting the voice information according to the time parameter of each piece of label information in the label information group and the time parameter of the voice information, obtaining voice information fragments equal in number to the pieces of label information;
identifying each voice information fragment respectively, obtaining the pronunciation corresponding to the voice information fragment.
2. The method according to claim 1, characterized in that obtaining a label information group for the time parameter comprises:
obtaining the label information group according to prompt information;
obtaining the time parameter of each label in the label information group according to the generation time of the prompt information, the prompt information being used to prompt the user to pronounce.
3. The method according to claim 2, characterized in that obtaining the time parameter of each label in the label information group according to the generation time of the prompt information comprises:
receiving a first cue signal, the first cue signal characterizing an operation action performed by the user in a preset area of the electronic device while uttering speech; and
recording the generation time of the first cue signal as the time of the label information, to obtain the time parameter of the label information.
4. The method according to claim 3, characterized in that the electronic device is further provided with a sensor, and receiving the first cue signal comprises:
obtaining detection data of the sensor, the detection data being a pressure value generated by the operation action knocking a preset area of the electronic device housing;
comparing the pressure value with a preset pressure threshold;
when the pressure value is greater than the pressure threshold, determining that the knocking action meets a preset operation action condition and recording the knock event, the knock event comprising the knocking action and the knocking time; and
otherwise, determining that the knocking action does not meet the preset operation action condition and making no record.
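The threshold test of claim 4 amounts to a simple comparison followed by conditional recording. A minimal sketch under assumed names (`handle_tap`, `events`; the event fields mirror the "knocking action and knocking time" of the claim):

```python
def handle_tap(pressure, threshold, timestamp, events):
    """Record a knock event only when the sensed pressure exceeds the preset threshold.

    Returns True if the event was recorded, False if the tap was ignored.
    """
    if pressure > threshold:
        # The knock event keeps both the action and the time it occurred,
        # so the time can later serve as a label on the voice time axis.
        events.append({"action": "knock", "time": timestamp})
        return True
    return False
```

In this scheme an accidental brush against the housing produces a pressure below the threshold and leaves no record, while a deliberate knock yields a timestamped label.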
5. The method according to claim 3, characterized in that receiving the first cue signal comprises:
detecting an electrical signal generated when a programmable button of the electronic device is pressed; and
when the electrical signal is detected, recording the key-press event, the key-press event comprising the key action and the key-press time.
6. The method according to claim 3, characterized in that the electronic device comprises a touch screen, and receiving the first cue signal comprises:
detecting an electrical signal generated when the user clicks the touch screen; and
when the electrical signal is detected, recording the click event, the click event comprising the click action and the click time.
7. The method according to claim 6, characterized in that, when a click event area is set in the touch screen, after the electrical signal is detected and before the click event is recorded, the method further comprises:
obtaining the coordinate value of the click on the touch screen;
judging, according to the coordinate value, whether the click position is within the click event area; and
if the click position is within the click event area, performing the step of recording the click event.
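The region check of claim 7 is a bounds test on the click coordinates. A minimal sketch, assuming the click event area is a rectangle given as `(left, top, right, bottom)` in screen coordinates (this representation is an illustrative assumption, not claimed):

```python
def click_in_region(x, y, region):
    """Return True if click coordinates (x, y) fall inside the preset
    click event area; only such clicks get recorded as label events."""
    left, top, right, bottom = region
    return left <= x <= right and top <= y <= bottom
```

Only clicks passing this test proceed to the recording step, so stray touches outside the designated area never become labels.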
8. The method according to claim 2, characterized in that the electronic device is further provided with a timer, and obtaining the time parameter of each label in the label information group according to the generation time of the prompt information comprises:
obtaining a generation time of a preset second cue signal, the second cue signal being generated when the timing value of the timer meets a preset value; and
recording the generation time of the second cue signal as the time of the label information, to obtain the time parameter of the label information.
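In the timer variant of claim 8, labels are generated automatically whenever the timer value reaches a preset interval, with no user action. A minimal sketch (names and the fixed-interval interpretation are assumptions for illustration):

```python
def timer_cue_times(interval, duration):
    """Generate second-cue times at every multiple of `interval` (seconds)
    up to `duration`; each time becomes the time parameter of one label."""
    times, t = [], interval
    while t <= duration:
        times.append(t)
        t += interval
    return times
```

A prompt such as a beep could be played at each of these times, and the same timestamps recorded as the label information group for segmenting the recording.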
9. The method according to claim 1, characterized in that segmenting and intercepting the voice information according to the time parameter of each piece of label information in the label information group and the time parameter of the voice information, to obtain voice information fragments equal in number to the pieces of label information, comprises:
establishing a time axis of the voice information according to the time parameter of the voice information;
adding the label information to the time axis according to the time parameter of each piece of label information in the time parameter group of the label information; and
obtaining, according to a preset interception time range, the voice information fragment corresponding to each label on the time axis.
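Claim 9's "preset interception time range" suggests taking a fixed-length window of audio starting at each label on the time axis. A minimal sketch under that assumption (function and parameter names are illustrative):

```python
def intercept_windows(samples, sample_rate, label_times, window):
    """For each label on the time axis, intercept a fragment of fixed
    length `window` seconds starting at the label's time parameter."""
    n = int(window * sample_rate)
    frags = []
    for t in label_times:
        start = int(t * sample_rate)  # label position on the time axis
        frags.append(samples[start:start + n])
    return frags
```

This yields one fixed-length fragment per label, matching the one-fragment-per-label requirement of claim 1, with the window length chosen to cover a single uttered sound.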
10. A voice information processing apparatus, applied to an electronic device, characterized in that the apparatus comprises:
a first acquisition module, configured to obtain voice information acquired by the voice acquisition unit and a time parameter of the voice information;
a second acquisition module, configured to obtain a label information group for the time parameter, the label information group comprising at least one piece of label information, the label information group corresponding to at least one time point in the time parameter of the voice information;
a segmentation module, configured to segment and intercept the voice information according to the time parameter of each piece of label information in the label information group and the time parameter of the voice information, to obtain voice information fragments equal in number to the pieces of label information; and
a recognition module, configured to recognize each voice information fragment respectively, to obtain a pronunciation corresponding to that voice information fragment.
11. The apparatus according to claim 10, characterized in that the second acquisition module comprises:
a label unit, configured to obtain the label information group according to prompt information; and
a time parameter unit, configured to obtain the time parameter of each label in the label information group according to the generation time of the prompt information, the prompt information being used to prompt the user to pronounce.
12. The apparatus according to claim 11, characterized in that the time parameter unit comprises:
a receiving subunit, configured to receive a first cue signal, the first cue signal characterizing an operation action performed by the user in a preset area of the electronic device while uttering speech; and
a first recording subunit, configured to record the generation time of the first cue signal as the time of the label information, to obtain the time parameter of the label information.
13. The apparatus according to claim 12, characterized in that the electronic device is further provided with a sensor, and the receiving subunit comprises:
a first obtaining subunit, configured to obtain detection data of the sensor, the detection data being a pressure value generated by the operation action knocking a preset area of the electronic device housing; and
a first judging subunit, configured to compare the pressure value with a preset pressure threshold; when the pressure value is greater than the pressure threshold, determine that the knocking action meets a preset operation action condition and trigger the first recording subunit to record the knock event, the knock event comprising the knocking action and the knocking time; and otherwise, determine that the knocking action does not meet the preset operation action condition and make no record.
14. The apparatus according to claim 12, characterized in that the receiving subunit comprises:
a first detection subunit, configured to detect an electrical signal generated when a programmable button of the electronic device is pressed, and, when the electrical signal is detected, trigger the first recording subunit to record the key-press event, the key-press event comprising the key action and the key-press time.
15. The apparatus according to claim 12, characterized in that the electronic device comprises a touch screen, and the receiving subunit comprises:
a second detection subunit, configured to detect an electrical signal generated when the user clicks the touch screen, and, when the electrical signal is detected, trigger the first recording subunit to record the click event, the click event comprising the click action and the click time.
16. The apparatus according to claim 15, characterized in that, when a click event area is set in the touch screen, the receiving subunit further comprises:
a second obtaining subunit, configured to obtain the coordinate value of the click on the touch screen; and
a second judging subunit, configured to judge, according to the coordinate value, whether the click position is within the click event area, and, if the click position is within the click event area, trigger the first recording subunit.
17. The apparatus according to claim 11, characterized in that the electronic device is further provided with a timer, and the time parameter unit comprises:
a third obtaining subunit, configured to obtain the generation time of a preset second cue signal, the second cue signal being generated when the timing value of the timer meets a preset value; and
a second recording subunit, configured to record the generation time of the second cue signal as the time of the label information, to obtain the time parameter of the label information.
18. The apparatus according to claim 10, characterized in that the segmentation module comprises:
a time-axis unit, configured to establish a time axis of the voice information according to the time parameter of the voice information;
a label-adding unit, configured to add the label information to the time axis according to the time parameter of each piece of label information in the time parameter group of the label information; and
an interception unit, configured to obtain, according to a preset interception time range, the voice information fragment corresponding to each label on the time axis.
19. An electronic device, characterized by comprising: a voice acquisition unit and the voice information processing apparatus according to any one of claims 10 to 18.
CN201410098994.5A 2014-03-17 2014-03-17 Voice information processing method, apparatus, and electronic device Active CN104933048B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410098994.5A CN104933048B (en) 2014-03-17 2014-03-17 Voice information processing method, apparatus, and electronic device

Publications (2)

Publication Number Publication Date
CN104933048A true CN104933048A (en) 2015-09-23
CN104933048B CN104933048B (en) 2018-08-31

Family

ID=54120217

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410098994.5A Active CN104933048B (en) 2014-03-17 2014-03-17 Voice information processing method, apparatus, and electronic device

Country Status (1)

Country Link
CN (1) CN104933048B (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106782508A (en) * 2016-12-20 2017-05-31 美的集团股份有限公司 The cutting method of speech audio and the cutting device of speech audio
CN107844470A (en) * 2016-09-18 2018-03-27 腾讯科技(深圳)有限公司 A kind of voice data processing method and its equipment
CN109983432A (en) * 2016-11-22 2019-07-05 微软技术许可有限责任公司 Control for dictated text navigation
EP3671412A4 (en) * 2017-09-11 2020-08-05 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Touch operation response method and device
CN112216275A (en) * 2019-07-10 2021-01-12 阿里巴巴集团控股有限公司 Voice information processing method and device and electronic equipment
US10901553B2 (en) 2017-09-11 2021-01-26 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Method for responding to touch operation and electronic device
US11061558B2 (en) 2017-09-11 2021-07-13 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Touch operation response method and device
US11194425B2 (en) 2017-09-11 2021-12-07 Shenzhen Heytap Technology Corp., Ltd. Method for responding to touch operation, mobile terminal, and storage medium
CN113936697A (en) * 2020-07-10 2022-01-14 北京搜狗智能科技有限公司 Voice processing method and device for voice processing

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020141597A1 (en) * 2001-01-29 2002-10-03 Hewlett-Packard Company Audio user interface with selectively-mutable synthesised sound sources
CN1760974A (en) * 2004-10-15 2006-04-19 微软公司 Hidden conditional random field models for phonetic classification and speech recognition
US20100287161A1 (en) * 2007-04-05 2010-11-11 Waseem Naqvi System and related techniques for detecting and classifying features within data
CN102968493A (en) * 2012-11-27 2013-03-13 上海量明科技发展有限公司 Method, client and system for executing voice search by input method tool
CN103065625A (en) * 2012-12-25 2013-04-24 广东欧珀移动通信有限公司 Method and device for adding digital voice tag
CN103457834A (en) * 2013-08-18 2013-12-18 苏州量跃信息科技有限公司 Method and client terminal for triggering ITEM voice searching in instant messaging
CN103559880A (en) * 2013-11-08 2014-02-05 百度在线网络技术(北京)有限公司 Voice input system and voice input method

Also Published As

Publication number Publication date
CN104933048B (en) 2018-08-31

Similar Documents

Publication Publication Date Title
CN104933048A (en) Voice message processing method and device, and electronic device
CN104750378B (en) The input pattern automatic switching method and device of input method
CN103513910A (en) Information processing method and device and electronic equipment
EP2960761A1 (en) Method, system and device for inputting text by consecutive slide
WO2015164116A1 (en) Learning language models from scratch based on crowd-sourced user text input
CN110534109B (en) Voice recognition method and device, electronic equipment and storage medium
CN105869635B (en) Voice recognition method and system
CN107544684B (en) Candidate word display method and device
CN104917904A (en) Voice information processing method and device and electronic device
CN104850222A (en) Instruction recognition method and electronic terminal
CN103226436A (en) Man-machine interaction method and system of intelligent terminal
CN105528130A (en) A control method and device and an electronic apparatus
CN102830924A (en) Method and device for adjusting input method keyboards
CN105788597A (en) Voice recognition-based screen reading application instruction input method and device
CN105204621A (en) Information transfer method and smart watch
CN105260369A (en) Reading assisting method and electronic equipment
CN105183217A (en) Touch display device and touch display method
CN105760084A (en) Voice input control method and device
CN105549882A (en) Time setting method and mobile terminal
CN102395941A (en) Motion sensing input method, motion sensing device, wireless handheld device and motion sensing system
CN107132927A (en) Input recognition methods and device and the device for identified input character of character
CN105824429A (en) Screen reading application instruction input method and device based on infrared sensor
CN107797676B (en) Single character input method and device
CN105867811A (en) Message reply method and terminal
CN104407763A (en) Content input method and system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant