CN109754788A - Voice control method, apparatus, device and storage medium - Google Patents
- Publication number
- CN109754788A (application number CN201910101100.6A)
- Authority
- CN
- China
- Prior art keywords
- keyword
- voice messaging
- text information
- wake
- voice
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Telephonic Communication Services (AREA)
- Telephone Function (AREA)
Abstract
An embodiment of the present invention provides a voice control method, apparatus, device and storage medium. The method includes: detecting whether the first several syllables of voice information contain a wake-up keyword; if so, performing speech recognition on the voice information to obtain the corresponding text information; and executing a corresponding operation for the text information. The embodiment can simplify the interaction flow with a voice interaction device and improve the user experience.
Description
Technical field
The present invention relates to the technical field of voice interaction, and in particular to a voice control method, apparatus, device and storage medium.
Background art
In existing voice interaction technology, a user of a voice interaction device must first say a fixed wake-up word to wake the device, wait for the device to announce that it has been woken successfully, and only then say the voice instruction that expresses the actual request.
For example, suppose the wake-up word of a voice interaction device is "Little A, Little A". When the user wants the device to play music, the user first says "Little A, Little A" and waits for the device to play the voice prompt "I'm here". The user then says "I want to listen to music"; the device performs speech recognition on the received voice information, obtains the corresponding text information, and executes the corresponding operation for that text information.
Thus, in the existing voice interaction mode, every instruction the user issues requires two rounds of interaction. The user must also remember the wake-up word, notice when the device has been woken successfully, and wait for the wake-up prompt before saying the instruction. This is time-consuming and laborious and results in a poor user experience.
Summary of the invention
Embodiments of the present invention provide a voice control method and apparatus to solve at least the above technical problems in the prior art.
In a first aspect, an embodiment of the present invention provides a voice control method, comprising:
detecting whether the first several syllables of voice information contain a wake-up keyword;
if so, performing speech recognition on the voice information to obtain text information corresponding to the voice information; and
executing a corresponding operation for the text information.
In one embodiment, detecting whether the first several syllables of the voice information contain a wake-up keyword comprises:
using a preset voice wake-up model for multiple wake-up keywords to detect whether the first several syllables of the voice information contain any one of the multiple wake-up keywords, and if so, determining that the first several syllables of the voice information contain a wake-up keyword.
In one embodiment, executing the corresponding operation for the text information comprises:
judging whether the text information is instruction information; and
if so, executing the corresponding operation for the text information.
In one embodiment, judging whether the text information is instruction information comprises:
obtaining, according to a preset correspondence between wake-up keywords and instruction judgment strategies, the instruction judgment strategy corresponding to the wake-up keyword contained in the voice information; and
using the obtained instruction judgment strategy to judge whether the text information is instruction information.
In one embodiment, the voice wake-up model is set on the local device.
In one embodiment, performing speech recognition on the voice information comprises: performing speech recognition on the voice information using a speech recognition model set on the local device; or sending the voice information to a cloud server and performing speech recognition on the voice information using a speech recognition model set on the cloud server.
In a second aspect, an embodiment of the present invention further provides a voice control apparatus, comprising:
a detection module, configured to detect whether the first several syllables of voice information contain a wake-up keyword and, if so, to instruct a recognition module to perform recognition;
the recognition module, configured to perform speech recognition on the voice information as instructed by the detection module, to obtain text information corresponding to the voice information; and
an operation module, configured to execute a corresponding operation for the text information.
In one embodiment, the detection module is configured to use a preset voice wake-up model for multiple wake-up keywords to detect whether the first several syllables of the voice information contain any one of the multiple wake-up keywords, and if so, to determine that the first several syllables of the voice information contain a wake-up keyword.
In one embodiment, the operation module includes:
a judging submodule, configured to judge whether the text information is instruction information and, if so, to instruct an execution submodule to execute; and
the execution submodule, configured to execute the corresponding operation for the text information as instructed by the judging submodule.
In one embodiment, the judging submodule is configured to obtain, according to a preset correspondence between wake-up keywords and instruction judgment strategies, the instruction judgment strategy corresponding to the wake-up keyword contained in the voice information, and to use the obtained instruction judgment strategy to judge whether the text information is instruction information.
In one embodiment, the detection module is configured to perform the detection using a voice wake-up model set on the local device.
In one embodiment, the recognition module is configured to perform speech recognition on the voice information using a speech recognition model set on the local device; or to send the voice information to a cloud server and perform speech recognition on the voice information using a speech recognition model set on the cloud server.
In a third aspect, an embodiment of the present invention provides a voice control device. The functions of the device may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above functions.
In one possible design, the structure of the device includes a processor and a memory. The memory is configured to store a program that supports the device in executing the above voice control method, and the processor is configured to execute the program stored in the memory. The device may further include a communication interface for communicating with other devices or communication networks.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium for storing computer software instructions used by the voice control device, including a program for executing the above voice control method.
One of the above technical solutions has the following advantages or beneficial effects:
The voice control method and apparatus proposed by the embodiments of the present invention detect whether the first several syllables of the voice information issued by the user contain a wake-up keyword; if so, speech recognition is performed directly on the full content of the voice information, and a corresponding operation is executed for the recognition result. Thus, with the embodiments of the present invention, the user does not need to say a separate wake-up word before the voice information that serves as the instruction, nor wait for a successful wake-up; the user can say the voice information directly. The embodiments of the present invention can therefore simplify the interaction flow with a voice interaction device and improve the user experience.
The above summary is for the purpose of description only and is not intended to be limiting in any way. In addition to the illustrative aspects, embodiments and features described above, further aspects, embodiments and features of the present invention will be readily apparent by reference to the drawings and the following detailed description.
Brief description of the drawings
In the drawings, unless otherwise specified, the same reference numerals denote the same or similar components or elements throughout the several drawings. The drawings are not necessarily drawn to scale. It should be understood that the drawings depict only some embodiments disclosed according to the present invention and should not be regarded as limiting the scope of the present invention.
Fig. 1 is a flowchart of a voice control method according to an embodiment of the present invention;
Fig. 2 is a flowchart of step S13 in a voice control method according to an embodiment of the present invention;
Fig. 3 is a flowchart of application example one of a voice control method according to an embodiment of the present invention;
Fig. 4 is a flowchart of application example two of a voice control method according to an embodiment of the present invention;
Fig. 5 is a flowchart of application example three of a voice control method according to an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a voice control apparatus according to an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a voice control apparatus according to an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of a voice control device according to an embodiment of the present invention.
Detailed description of the embodiments
Only certain exemplary embodiments are briefly described below. As those skilled in the art will recognize, the described embodiments may be modified in various ways without departing from the spirit or scope of the present invention. Accordingly, the drawings and description are to be regarded as illustrative in nature and not restrictive.
The embodiments of the present invention mainly provide a voice control method and apparatus; the technical solutions are described in detail through the following embodiments.
The voice control method proposed by the embodiments of the present invention can be applied to a voice interaction device, and the device may be in any state before performing the method. When the voice interaction device receives voice information issued by a user, it performs the voice control method proposed by the embodiments of the present invention.
Fig. 1 is a flowchart of a voice control method according to an embodiment of the present invention. The method includes:
S11: detecting whether the first several syllables of voice information contain a wake-up keyword; if so, performing step S12;
S12: performing speech recognition on the voice information to obtain text information corresponding to the voice information;
S13: executing a corresponding operation for the text information.
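Steps S11 to S13 can be sketched as a small pipeline. This is only an illustrative sketch, not the patented implementation: the function name `handle_voice_information` and its three callable parameters are hypothetical, and the stand-in detector and recognizer at the bottom simply treat the audio bytes as text.

```python
from typing import Callable, Optional


def handle_voice_information(
    audio: bytes,
    detect_wake_keyword: Callable[[bytes], Optional[str]],
    recognize_speech: Callable[[bytes], str],
    execute_operation: Callable[[str], None],
) -> Optional[str]:
    """Run S11 -> S12 -> S13 on one utterance; return the text, or None."""
    # S11: look for a wake-up keyword in the leading part of the utterance.
    keyword = detect_wake_keyword(audio)
    if keyword is None:
        return None  # no wake-up keyword, so the utterance is not processed
    # S12: recognize the full utterance, wake-up keyword included.
    text = recognize_speech(audio)
    # S13: act on the recognized text.
    execute_operation(text)
    return text


# Stand-in components for demonstration only (hypothetical).
executed = []
result = handle_voice_information(
    b"navigate to the children's hospital",
    detect_wake_keyword=lambda a: "navigate to" if a.startswith(b"navigate to") else None,
    recognize_speech=lambda a: a.decode("utf-8"),
    execute_operation=executed.append,
)
```

Passing the detector, recognizer and executor in as parameters keeps the sketch independent of any particular wake-up model or speech recognition engine.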
In one possible embodiment, the wake-up keywords can be set for different application scenarios, based on the words with which users habitually begin voice instructions in each scenario. For example, for a navigation scenario, the wake-up keyword can be set to "navigate to" or "I want to go"; for an audio playback scenario, the wake-up keyword can be set to "I want to hear"; and for a phone call scenario, the wake-up keyword can be set to "make a call to".
After the voice interaction device receives voice information, if it detects that the first several syllables of the voice information contain a wake-up keyword, it goes on to perform speech recognition on the full content of the voice information to obtain the corresponding text information. The wake-up keyword thus serves to wake the voice interaction device. However, the wake-up keyword in the embodiments of the present invention differs from the wake-up word of the prior art: the user does not need to know the specific content of the wake-up keywords when issuing voice information, and only needs to speak according to ordinary speech habits. Moreover, after being woken, the voice interaction device does not need to play a prompt indicating that it has been woken, but recognizes the voice information directly. Because of these features, the user does not perceive the process of the device being woken.
In one possible embodiment, detecting in step S11 whether the first several syllables of the voice information contain a wake-up keyword includes:
using a preset voice wake-up model for multiple wake-up keywords to detect whether the first several syllables of the voice information contain any one of the multiple wake-up keywords, and if so, determining that the first several syllables of the voice information contain a wake-up keyword. Here, the first several syllables may refer to the first several syllables of a natural sentence or utterance; the specific length can be determined by the length of the wake-up keywords.
For example, four wake-up keywords are preset: "navigate to", "I want to go", "I want to hear" and "make a call to". A voice wake-up model is set up for these four wake-up keywords. When voice information is received, the voice wake-up model is used to detect whether the first several syllables of the voice information contain any one of the four wake-up keywords. If so, it is determined that the voice information contains a wake-up keyword, and speech recognition continues on the voice information. In this embodiment, the number of leading syllables examined should be equal to or greater than the syllable length of the longest of the four wake-up keywords. In Chinese, the pronunciation of one character is generally one syllable, and the longest of the four wake-up keywords contains four Chinese characters. Therefore, the first four syllables of the voice information, starting from the first syllable, can be checked for any one of the four wake-up keywords.
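The window-length rule above can be illustrated on syllable sequences rather than audio. The sketch below is an assumption-laden stand-in for the voice wake-up model: the pinyin syllables are an assumed romanization of the four keywords (the text gives them only in translation), and the names `WAKE_KEYWORDS`, `WINDOW` and `detect_in_leading_syllables` are hypothetical.

```python
from typing import Optional, Sequence

# Assumed pinyin syllables for the four example keywords (illustrative only).
WAKE_KEYWORDS = {
    "navigate to": ("dao", "hang", "dao"),
    "I want to go": ("wo", "yao", "qu"),
    "I want to hear": ("wo", "yao", "ting"),
    "make a call to": ("da", "dian", "hua", "gei"),
}

# The detection window must cover the longest keyword, measured in syllables.
WINDOW = max(len(syllables) for syllables in WAKE_KEYWORDS.values())


def detect_in_leading_syllables(syllables: Sequence[str]) -> Optional[str]:
    """Return the keyword that starts the utterance, or None if there is none."""
    head = tuple(syllables[:WINDOW])  # only the leading window is examined
    for name, keyword in WAKE_KEYWORDS.items():
        # The keyword must begin at the first syllable of the utterance.
        if head[: len(keyword)] == keyword:
            return name
    return None
```

Anchoring the match at the first syllable is this sketch's reading of "starting from the first syllable"; a real wake-up model would work on acoustic features rather than exact syllable labels.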
In one possible embodiment, the voice wake-up model can be obtained by training on multiple speech samples. For example, for "navigate to", speech samples with different tones, different dialects and similar pronunciations are collected, and the voice wake-up model is trained on the collected samples so that it can detect whether received voice information contains the wake-up keyword "navigate to". The voice wake-up model is trained in the same way for the other wake-up keywords.
In addition, for each wake-up keyword, speech samples of the keyword in different tones and dialects can be stored in the voice wake-up model in advance. During detection, the similarity between the content of the received voice information and each speech sample is calculated; as long as the similarity with any one speech sample is greater than a preset threshold, the voice information is determined to contain that wake-up keyword.
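The sample-comparison approach can be sketched as follows. The text does not specify a feature representation or similarity measure, so this sketch assumes fixed-length feature vectors and cosine similarity; the names `cosine_similarity`, `match_keyword` and `SAMPLES`, the toy vectors, and the 0.9 threshold are all hypothetical.

```python
import math
from typing import Dict, List, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_keyword(
    features: Sequence[float],
    samples: Dict[str, List[Sequence[float]]],
    threshold: float = 0.9,
) -> Optional[str]:
    """Return the first keyword with any stored sample above the threshold."""
    for keyword, stored in samples.items():
        # One sufficiently similar sample (any tone or dialect) is enough.
        if any(cosine_similarity(features, s) > threshold for s in stored):
            return keyword
    return None


# Toy stored samples: several variants per keyword (made-up vectors).
SAMPLES = {
    "navigate to": [[1.0, 0.0, 0.0], [0.9, 0.2, 0.0]],
    "I want to hear": [[0.0, 1.0, 0.0]],
}
```

Because a single stored sample above the threshold is sufficient, adding more tone and dialect variants for a keyword widens what is accepted without retraining anything.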
As shown in Fig. 2, in one possible embodiment, step S13 may include:
S131: judging whether the text information is instruction information; if so, performing step S132; otherwise, ending the current flow without processing the text information.
S132: executing a corresponding operation for the text information.
The manner of judging whether the text information is instruction information may include:
obtaining, according to a preset correspondence between wake-up keywords and instruction judgment strategies, the instruction judgment strategy corresponding to the wake-up keyword contained in the voice information; and using the obtained instruction judgment strategy to judge whether the text information is instruction information.
Each wake-up keyword corresponds to a particular application scenario, and the instruction information in each application scenario has its own characteristics. Therefore, instruction judgment strategies can be set for the different wake-up keywords according to these characteristics, and the correspondence between wake-up keywords and instruction judgment strategies can be stored.
Table 1 below shows the correspondence among application scenarios, wake-up keywords and common instruction information.
Table 1
Take the correspondence shown in Table 1 as an example. For the navigation scenario, two wake-up keywords are set: "I want to go" and "navigate to". In the navigation scenario, common instruction information includes a place name or location type, and the place name or location type is related to the current position of the voice interaction device; in general, it lies within the navigable range of the current position. A corresponding instruction judgment strategy can be set according to these characteristics. For example, the instruction judgment strategy for the wake-up keywords "I want to go" and "navigate to" is set as follows: when the text information contains a place name or location type, and that place name or location type is within the navigable range, the text information is determined to be instruction information.
For the audio playback scenario, one wake-up keyword is set: "I want to hear". In the audio playback scenario, common instruction information includes information such as a performer name, song title, album name, program name or audio type. A corresponding instruction judgment strategy can be set according to these characteristics. For example, the instruction judgment strategy for the wake-up keyword "I want to hear" is set as follows: when the text information contains a performer name, song title, album name, program name or audio type, the text information is determined to be instruction information.
For the phone call scenario, one wake-up keyword is set: "make a call to". In the phone call scenario, common instruction information includes information such as a contact name, telephone number or Yellow Pages content. A corresponding instruction judgment strategy can be set according to these characteristics. For example, the instruction judgment strategy for the wake-up keyword "make a call to" is set as follows: when the text information contains a contact name, telephone number or Yellow Pages content, the text information is determined to be instruction information.
The above describes the wake-up keywords for different application scenarios and the instruction judgment strategies corresponding to those wake-up keywords. This content is given only as an example: the application scenarios to which the embodiments of the present invention apply are not limited to the three above, and the wake-up keywords for each application scenario and their corresponding instruction judgment strategies are not limited to the above content. The embodiments of the present invention can also update the wake-up keywords and the corresponding instruction judgment strategies as required.
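The stored correspondence between wake-up keywords and instruction judgment strategies can be pictured as a lookup table of predicates. The sketch below is a simplification under stated assumptions: the vocabularies `NAVIGABLE_PLACES`, `MEDIA_TERMS` and `CONTACTS` are made-up placeholders for real place, media and contact data, and substring matching stands in for whatever entity detection an actual device would use.

```python
from typing import Callable, Dict, Set

# Hypothetical vocabularies standing in for real place, media and contact data.
NAVIGABLE_PLACES: Set[str] = {"children's hospital", "airport"}
MEDIA_TERMS: Set[str] = {"FM100", "jazz"}
CONTACTS: Set[str] = {"Zhang San", "10086"}


def contains_any(text: str, terms: Set[str]) -> bool:
    """True if any term occurs in the text (case-insensitive substring test)."""
    lowered = text.lower()
    return any(term.lower() in lowered for term in terms)


# Preset correspondence: wake-up keyword -> instruction judgment strategy.
STRATEGIES: Dict[str, Callable[[str], bool]] = {
    "I want to go": lambda text: contains_any(text, NAVIGABLE_PLACES),
    "navigate to": lambda text: contains_any(text, NAVIGABLE_PLACES),
    "I want to hear": lambda text: contains_any(text, MEDIA_TERMS),
    "make a call to": lambda text: contains_any(text, CONTACTS),
}


def is_instruction(keyword: str, text: str) -> bool:
    """Apply the strategy registered for the keyword to the recognized text."""
    strategy = STRATEGIES.get(keyword)
    return strategy is not None and strategy(text)
```

Keeping the strategies in a dictionary mirrors the point that keywords and strategies can be updated as required: adding a scenario means adding an entry rather than changing the judgment logic.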
Specific application examples of the voice control method of the embodiments of the present invention are described in detail below. Fig. 3 is a flowchart of application example one, which includes:
S31: the voice interaction device is in the audio playback state.
S32: voice information issued by a user is received; the voice information is "I want to go traveling".
S33: using the voice wake-up model, it is recognized that the first several syllables of the voice information contain the wake-up keyword "I want to go", and the voice interaction device is woken.
S34: speech recognition is performed on the voice information issued by the user, and the corresponding text information "I want to go traveling" is obtained.
S35: the instruction judgment strategy corresponding to the wake-up keyword "I want to go" is obtained, and the text information "I want to go traveling" is judged according to that strategy. The text information is found to contain no place name or location type, so it is not instruction information in the navigation scenario corresponding to "I want to go". The voice interaction device therefore ignores the text information and remains in the audio playback state.
Fig. 4 is a flowchart of application example two, which includes:
S41: the voice interaction device is in the audio playback state.
S42: voice information issued by a user is received; the voice information is "navigate to the children's hospital".
S43: using the voice wake-up model, it is recognized that the first several syllables of the voice information contain the wake-up keyword "navigate to", and the voice interaction device is woken.
S44: speech recognition is performed on the voice information issued by the user, and the corresponding text information "navigate to the children's hospital" is obtained.
S45: the instruction judgment strategy corresponding to the wake-up keyword "navigate to" is obtained, and the text information "navigate to the children's hospital" is judged according to that strategy. The text information is found to be instruction information in the navigation scenario corresponding to "navigate to".
S46: the voice interaction device stops audio playback. If the voice interaction device has a display screen, the text information "navigate to the children's hospital" can be shown on the screen. The voice interaction device also switches to the navigation application and sets the children's hospital as the navigation destination.
Fig. 5 is a flowchart of application example three, which includes:
S51: the voice interaction device is in standby, in the audio playback state, or playing a voice prompt.
S52: voice information issued by a user is received; the voice information is "I want to hear FM100".
S53: using the voice wake-up model, it is recognized that the first several syllables of the voice information contain the wake-up keyword "I want to hear", and the voice interaction device is woken.
S54: speech recognition is performed on the voice information issued by the user, and the corresponding text information "I want to hear FM100" is obtained.
S55: the instruction judgment strategy corresponding to the wake-up keyword "I want to hear" is obtained, and the text information "I want to hear FM100" is judged according to that strategy. The text information is found to contain an audio type, so it is instruction information in the audio playback scenario corresponding to "I want to hear".
S56: the voice interaction device plays the voice prompt "Opening FM100 for you" and turns on the radio. If the voice interaction device has a display screen, the text information "I want to hear FM100" can be shown on the screen.
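The device-side outcome of the three application examples can be summarized as a small state sketch. It assumes the wake-up keyword has already been mapped to a scenario and the instruction judgment has already been made; the class `VoiceDevice`, its fields and the state names are hypothetical labels, not terms from the embodiments.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VoiceDevice:
    state: str = "audio playback"
    screen: List[str] = field(default_factory=list)

    def handle(self, scene: str, text: str, is_instruction: bool) -> None:
        """Apply the outcome of the instruction judgment to the device state."""
        if not is_instruction:
            return  # example one: ignore the text and keep the current state
        self.screen.append(text)  # a device without a screen would skip this
        if scene == "navigation":
            self.state = "navigation"  # example two: stop audio, start navigating
        elif scene == "audio playback":
            self.state = "radio"  # example three: open the requested station


device = VoiceDevice()
device.handle("navigation", "I want to go traveling", is_instruction=False)
state_after_example_one = device.state
device.handle("navigation", "navigate to the children's hospital", is_instruction=True)
state_after_example_two = device.state
```

The point the sketch makes is the one in example one: a matched wake-up keyword alone changes nothing visible to the user unless the recognized text also passes the instruction judgment.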
The embodiments of the present invention can be applied to an in-vehicle voice interaction device. In one possible embodiment, the embodiments of the present invention support offline wake-up and online wake-up, as well as offline recognition and online recognition. The embodiments of the present invention support the voice interaction device uploading telephone communication records and locally stored audio information to a cloud server.
In one possible embodiment, the voice wake-up model can be set on the local device to support online wake-up.
In addition, in one possible embodiment, the speech recognition performed on the voice information in step S12 can take two forms, offline recognition and online recognition, specifically:
performing speech recognition on the voice information using a speech recognition model set on the local device;
or, sending the voice information to a cloud server, and performing speech recognition on the voice information using a speech recognition model set on the cloud server.
Offline recognition can support the phone call and audio playback scenarios of the above embodiments, and online recognition can support the phone call, audio playback and navigation scenarios of the above embodiments.
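The scenario coverage of the two recognition modes can be sketched as a simple selection rule. The function name `choose_recognizer`, the scenario labels, and the policy of preferring the cloud recognizer whenever a network is available are assumptions of this sketch; the text only states which scenarios each mode can support.

```python
from typing import Optional

# Scenario coverage as stated above.
OFFLINE_SCENES = {"phone call", "audio playback"}
ONLINE_SCENES = {"phone call", "audio playback", "navigation"}


def choose_recognizer(scene: str, network_available: bool) -> Optional[str]:
    """Pick "cloud" or "local" recognition, or None if neither can serve."""
    # Assumed policy: use the cloud model when the network is reachable.
    if network_available and scene in ONLINE_SCENES:
        return "cloud"
    # Otherwise fall back to the local model where the scenario allows it.
    if scene in OFFLINE_SCENES:
        return "local"
    return None
```

Under this rule a navigation request without a network connection cannot be served, which follows from navigation being listed only under online recognition.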
An embodiment of the present invention further provides a voice control apparatus. Referring to Fig. 6, which is a schematic structural diagram of a voice control apparatus according to an embodiment of the present invention, the apparatus includes:
a detection module 610, configured to detect whether the first several syllables of voice information contain a wake-up keyword and, if so, to instruct a recognition module 620 to perform recognition;
the recognition module 620, configured to perform speech recognition on the voice information as instructed by the detection module 610, to obtain text information corresponding to the voice information; and
an operation module 630, configured to execute a corresponding operation for the text information.
An embodiment of the present invention further provides a voice control apparatus. Referring to Fig. 7, which is a schematic structural diagram of a voice control apparatus according to an embodiment of the present invention, the apparatus includes:
a detection module 610, a recognition module 620 and an operation module 630. These three modules are the same as the corresponding modules in the above embodiment and are not described again.
In one possible embodiment, the detection module 610 is configured to use a preset voice wake-up model for multiple wake-up keywords to detect whether the first several syllables of the voice information contain any one of the multiple wake-up keywords, and if so, to determine that the first several syllables of the voice information contain a wake-up keyword.
In one possible embodiment, the operation module 630 includes:
a judging submodule 631, configured to judge whether the text information is instruction information and, if so, to instruct an execution submodule 632 to execute; and
the execution submodule 632, configured to execute the corresponding operation for the text information as instructed by the judging submodule 631.
In one possible embodiment, the judging submodule 631 is configured to obtain, according to a preset correspondence between wake-up keywords and instruction judgment strategies, the instruction judgment strategy corresponding to the wake-up keyword contained in the voice information, and to use the obtained instruction judgment strategy to judge whether the text information is instruction information.
In one possible embodiment, the detection module 610 is configured to perform the detection using a voice wake-up model set on the local device.
In one possible embodiment, the recognition module 620 is configured to perform speech recognition on the voice information using a speech recognition model set on the local device; or to send the voice information to a cloud server and perform speech recognition on the voice information using a speech recognition model set on the cloud server.
For the functions of the modules in the apparatuses of the embodiments of the present invention, refer to the corresponding descriptions in the above method; they are not repeated here.
An embodiment of the present invention further provides a voice control device. Fig. 8 is a schematic structural diagram of the voice control device according to an embodiment of the present invention, which includes:
a memory 11 and a processor 12, the memory 11 storing a computer program that can run on the processor 12. The processor 12 implements the voice control method of the above embodiments when executing the computer program. There may be one or more memories 11 and one or more processors 12.
The device may further include:
a communication interface 13, configured to communicate with external devices and exchange data.
The memory 11 may include high-speed RAM, and may also include non-volatile memory, for example at least one magnetic disk memory.
If the memory 11, the processor 12 and the communication interface 13 are implemented independently, they can be connected to one another by a bus and communicate with one another over it. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus can be divided into an address bus, a data bus, a control bus and so on. For ease of illustration, only one thick line is shown in Fig. 8, but this does not mean that there is only one bus or one type of bus.
Optionally, in a specific implementation, if the memory 11, the processor 12 and the communication interface 13 are integrated on a single chip, they can communicate with one another through internal interfaces.
In the description of this specification, a reference to the terms "one embodiment", "some embodiments", "example", "specific example" or "some examples" means that a specific feature, structure, material or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. Moreover, the specific features, structures, materials or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not conflict with one another, those skilled in the art may combine the different embodiments or examples described in this specification and the features of those different embodiments or examples.
In addition, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of technical features referred to. Thus, a feature defined by "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "multiple" means two or more, unless otherwise specifically defined.
Any process or method description in a flowchart, or otherwise described herein, may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing specific logical functions or steps of the process. The scope of the preferred embodiments of the present invention includes other implementations in which functions may be performed out of the order shown or discussed, including substantially concurrently or in the reverse order, depending on the functions involved. This should be understood by those skilled in the art to which the embodiments of the present invention belong.
The logic and/or steps represented in a flowchart or otherwise described herein, for example an ordered list of executable instructions for implementing logical functions, may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction execution system, apparatus, or device and execute them). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate, or transport a program for use by, or in connection with, an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of computer-readable media include: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program is printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that the parts of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software or firmware that is stored in a memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, they may be implemented by any one or a combination of the following techniques well known in the art: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.
Those of ordinary skill in the art will understand that all or part of the steps of the methods in the above embodiments may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one of the steps of the method embodiments or a combination thereof.
In addition, the functional units in the embodiments of the present invention may be integrated in one processing module, or each unit may exist physically on its own, or two or more units may be integrated in one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and is sold or used as an independent product, it may also be stored in a computer-readable storage medium. The storage medium may be a read-only memory, a magnetic disk, an optical disc, or the like.
In summary, with the voice control method and device proposed by the embodiments of the present invention, when voice information uttered by a user is received, the first several syllables of the voice information are checked for a wake-up keyword. If one is found, speech recognition is performed directly on the full content of the voice information, and the corresponding operation is executed for the recognition result. With the embodiments of the present invention, the user therefore does not need to say a separate wake-up word before issuing the voice information that serves as the instruction, nor wait for the wake-up to succeed, but can say the voice information directly. The embodiments of the present invention can thus simplify the interaction flow and improve the user experience. To reduce the likelihood of false wake-ups, the embodiments of the present invention may also judge whether the text information obtained by speech recognition is instruction information, and execute the corresponding operation only if it is.
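The flow summarized above can be sketched roughly as follows. This is a minimal illustration and not the patented implementation: the utterance is modelled as a list of syllable strings, and `find_wake_keyword`, `handle_utterance`, the `window` parameter, and the `recognize`, `is_command` and `execute` callbacks are all hypothetical names introduced here. A real system would run a voice wake-up model and a speech recognition model on audio rather than compare strings.

```python
from typing import Callable, Iterable, Optional, Sequence, Tuple

Keyword = Tuple[str, ...]  # a wake-up keyword as a tuple of syllables (assumption)


def find_wake_keyword(
    syllables: Sequence[str],
    wake_keywords: Iterable[Keyword],
    window: int = 4,  # hypothetical size of "the first several syllables"
) -> Optional[Keyword]:
    """Return the first wake-up keyword found in the leading `window` syllables."""
    head = tuple(syllables[:window])
    for keyword in wake_keywords:
        keyword = tuple(keyword)
        n = len(keyword)
        if n == 0:
            continue  # ignore empty keywords
        for start in range(len(head) - n + 1):
            if head[start:start + n] == keyword:
                return keyword
    return None


def handle_utterance(
    syllables: Sequence[str],
    wake_keywords: Iterable[Keyword],
    recognize: Callable[[Sequence[str]], str],
    is_command: Callable[[str], bool],
    execute: Callable[[str], None],
    window: int = 4,
) -> bool:
    """Run detection, recognition, command judgment and execution in order.

    Returns True only if an operation was executed.
    """
    # Step 1: look for a wake-up keyword in the first several syllables only.
    if find_wake_keyword(syllables, wake_keywords, window) is None:
        return False
    # Step 2: recognize the full utterance, including the keyword itself.
    text = recognize(syllables)
    # Step 3: optional guard against false wake-ups.
    if not is_command(text):
        return False
    # Step 4: execute the operation for the recognized text.
    execute(text)
    return True


if __name__ == "__main__":
    executed = []
    handled = handle_utterance(
        ["bo", "fang", "yin", "yue"],           # toy utterance
        wake_keywords=[("bo", "fang")],         # toy wake-up keyword
        recognize=lambda s: " ".join(s),        # stand-in for speech recognition
        is_command=lambda t: t.startswith("bo fang "),
        execute=executed.append,
    )
    print(handled, executed)
```

Restricting the search to the leading window is what distinguishes this from scanning the whole utterance: a keyword that appears only later in the sentence does not trigger recognition.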
The above are merely specific embodiments of the present invention, but the scope of protection of the present invention is not limited to them. Any variation or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be determined by the scope of protection of the claims.
Claims (14)
1. A voice control method, characterized by comprising:
detecting whether the first several syllables of voice information contain a wake-up keyword;
if so, performing speech recognition on the voice information to obtain text information corresponding to the voice information;
executing a corresponding operation for the text information.
2. The method according to claim 1, characterized in that detecting whether the first several syllables of the voice information contain a wake-up keyword comprises:
using a preset voice wake-up model for a plurality of wake-up keywords, detecting whether the first several syllables of the voice information contain any one of the plurality of wake-up keywords, and if so, determining that the first several syllables of the voice information contain a wake-up keyword.
3. The method according to claim 1 or 2, characterized in that executing the corresponding operation for the text information comprises:
judging whether the text information is instruction information;
if so, executing the corresponding operation for the text information.
4. The method according to claim 3, characterized in that judging whether the text information is instruction information comprises:
obtaining, according to a preset correspondence between wake-up keywords and instruction judgment strategies, the instruction judgment strategy corresponding to the wake-up keyword contained in the voice information;
using the obtained instruction judgment strategy to judge whether the text information is instruction information.
5. The method according to claim 2, characterized in that the voice wake-up model is deployed on a local device.
6. The method according to claim 1, characterized in that performing speech recognition on the voice information comprises:
performing speech recognition on the voice information using a speech recognition model deployed on a local device; or, sending the voice information to a cloud server, and performing speech recognition on the voice information using a speech recognition model deployed on the cloud server.
7. A voice control device, characterized by comprising:
a detection module, configured to detect whether the first several syllables of voice information contain a wake-up keyword, and if so, to instruct an identification module to perform recognition;
the identification module, configured to perform speech recognition on the voice information according to the instruction of the detection module, to obtain text information corresponding to the voice information;
an operation module, configured to execute a corresponding operation for the text information.
8. The device according to claim 7, characterized in that the detection module is configured to use a preset voice wake-up model for a plurality of wake-up keywords to detect whether the first several syllables of the voice information contain any one of the plurality of wake-up keywords, and if so, to determine that the first several syllables of the voice information contain a wake-up keyword.
9. The device according to claim 7 or 8, characterized in that the operation module comprises:
a judging submodule, configured to judge whether the text information is instruction information, and if so, to instruct an execution submodule to execute;
the execution submodule, configured to execute the corresponding operation for the text information according to the instruction of the judging submodule.
10. The device according to claim 8, characterized in that the judging submodule is configured to obtain, according to a preset correspondence between wake-up keywords and instruction judgment strategies, the instruction judgment strategy corresponding to the wake-up keyword contained in the voice information, and to use the obtained instruction judgment strategy to judge whether the text information is instruction information.
11. The device according to claim 8, characterized in that the detection module is configured to perform the detection using a voice wake-up model deployed on a local device.
12. The device according to claim 7, characterized in that the identification module is configured to perform speech recognition on the voice information using a speech recognition model deployed on a local device; or, to send the voice information to a cloud server, where speech recognition is performed on the voice information using a speech recognition model deployed on the cloud server.
13. Voice control equipment, characterized in that the equipment comprises:
one or more processors;
a storage device, configured to store one or more programs;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1 to 6.
14. A computer-readable storage medium storing a computer program, characterized in that the program, when executed by a processor, implements the method according to any one of claims 1 to 6.
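As an illustration only, the keyword-to-strategy correspondence of claim 4 and the local-or-cloud recognition alternative of claim 6 might be sketched as below. Nothing here comes from the patent text beyond those two ideas: the `STRATEGIES` table, the `requires_object` rule, the English keywords, and the `local` and `cloud` recognizer callables are hypothetical, and preferring the cloud recognizer whenever one is supplied is an assumption made for the sketch.

```python
from typing import Callable, Dict, Optional

Strategy = Callable[[str], bool]      # instruction judgment strategy (assumption)
Recognizer = Callable[[bytes], str]   # maps audio bytes to text (assumption)


def requires_object(keyword: str) -> Strategy:
    """Build a toy strategy: the text must start with `keyword` and say more."""
    def strategy(text: str) -> bool:
        if not text.startswith(keyword):
            return False
        return bool(text[len(keyword):].strip())
    return strategy


# Hypothetical preset correspondence between wake-up keywords and strategies.
STRATEGIES: Dict[str, Strategy] = {
    "play": requires_object("play"),
    "navigate to": requires_object("navigate to"),
}


def is_instruction(
    wake_keyword: str,
    text: str,
    strategies: Optional[Dict[str, Strategy]] = None,
) -> bool:
    """Look up the strategy for the detected keyword and apply it to the text."""
    table = STRATEGIES if strategies is None else strategies
    strategy = table.get(wake_keyword)
    return strategy is not None and strategy(text)


def recognize(
    audio: bytes,
    local: Recognizer,
    cloud: Optional[Recognizer] = None,
) -> str:
    """Use the cloud recognizer when one is supplied, otherwise the local one."""
    return cloud(audio) if cloud is not None else local(audio)
```

Keeping the strategies in a lookup table keyed by wake-up keyword means a new keyword can be given its own judgment rule without touching the detection or recognition steps.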
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910101100.6A CN109754788B (en) | 2019-01-31 | 2019-01-31 | Voice control method, device, equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910101100.6A CN109754788B (en) | 2019-01-31 | 2019-01-31 | Voice control method, device, equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109754788A (en) | 2019-05-14 |
CN109754788B (en) | 2020-08-28 |
Family
ID=66407130
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910101100.6A Active CN109754788B (en) | 2019-01-31 | 2019-01-31 | Voice control method, device, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109754788B (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111081217A (en) * | 2019-12-03 | 2020-04-28 | 珠海格力电器股份有限公司 | Voice wake-up method and device, electronic equipment and storage medium |
CN111768783A (en) * | 2020-06-30 | 2020-10-13 | 北京百度网讯科技有限公司 | Voice interaction control method, device, electronic equipment, storage medium and system |
CN112420044A (en) * | 2020-12-03 | 2021-02-26 | 深圳市欧瑞博科技股份有限公司 | Voice recognition method, voice recognition device and electronic equipment |
WO2021051403A1 (en) * | 2019-09-20 | 2021-03-25 | 深圳市汇顶科技股份有限公司 | Voice control method and apparatus, chip, earphones, and system |
CN112837680A (en) * | 2019-11-25 | 2021-05-25 | 马上消费金融股份有限公司 | Audio keyword retrieval method, intelligent outbound method and related device |
CN113643711A (en) * | 2021-08-03 | 2021-11-12 | 常州匠心独具智能家居股份有限公司 | Voice system based on offline mode and online mode for intelligent furniture |
CN115512700A (en) * | 2022-09-07 | 2022-12-23 | 广州小鹏汽车科技有限公司 | Voice interaction method, voice interaction device, vehicle and readable storage medium |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104219388A (en) * | 2014-08-28 | 2014-12-17 | 小米科技有限责任公司 | Voice control method and device |
CN105654943A (en) * | 2015-10-26 | 2016-06-08 | 乐视致新电子科技(天津)有限公司 | Voice wakeup method, apparatus and system thereof |
CN105976813A (en) * | 2015-03-13 | 2016-09-28 | 三星电子株式会社 | Speech recognition system and speech recognition method thereof |
CN107886944A (en) * | 2017-11-16 | 2018-04-06 | 出门问问信息科技有限公司 | A kind of audio recognition method, device, equipment and storage medium |
CN107895578A (en) * | 2017-11-15 | 2018-04-10 | 百度在线网络技术(北京)有限公司 | Voice interactive method and device |
CN107992587A (en) * | 2017-12-08 | 2018-05-04 | 北京百度网讯科技有限公司 | A kind of voice interactive method of browser, device, terminal and storage medium |
WO2018131775A1 (en) * | 2017-01-13 | 2018-07-19 | 삼성전자주식회사 | Electronic device and method of operation thereof |
US20180357998A1 (en) * | 2017-06-13 | 2018-12-13 | Intel IP Corporation | Wake-on-voice keyword detection with integrated language identification |
CN109065044A (en) * | 2018-08-30 | 2018-12-21 | 出门问问信息科技有限公司 | Wake up word recognition method, device, electronic equipment and computer readable storage medium |
2019-01-31: CN application CN201910101100.6A (patent CN109754788B), status Active
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104219388A (en) * | 2014-08-28 | 2014-12-17 | 小米科技有限责任公司 | Voice control method and device |
CN105976813A (en) * | 2015-03-13 | 2016-09-28 | 三星电子株式会社 | Speech recognition system and speech recognition method thereof |
CN105654943A (en) * | 2015-10-26 | 2016-06-08 | 乐视致新电子科技(天津)有限公司 | Voice wakeup method, apparatus and system thereof |
WO2018131775A1 (en) * | 2017-01-13 | 2018-07-19 | 삼성전자주식회사 | Electronic device and method of operation thereof |
US20180357998A1 (en) * | 2017-06-13 | 2018-12-13 | Intel IP Corporation | Wake-on-voice keyword detection with integrated language identification |
CN107895578A (en) * | 2017-11-15 | 2018-04-10 | 百度在线网络技术(北京)有限公司 | Voice interactive method and device |
CN107886944A (en) * | 2017-11-16 | 2018-04-06 | 出门问问信息科技有限公司 | A kind of audio recognition method, device, equipment and storage medium |
CN107992587A (en) * | 2017-12-08 | 2018-05-04 | 北京百度网讯科技有限公司 | A kind of voice interactive method of browser, device, terminal and storage medium |
CN109065044A (en) * | 2018-08-30 | 2018-12-21 | 出门问问信息科技有限公司 | Wake up word recognition method, device, electronic equipment and computer readable storage medium |
Non-Patent Citations (1)
Title |
---|
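Zuo Xiang (左祥) et al.: "Custom wake-up word detection based on deep neural networks and identity authentication vectors", 13th National Conference on Man-Machine Speech Communication (NCMMSC2015) * |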
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021051403A1 (en) * | 2019-09-20 | 2021-03-25 | 深圳市汇顶科技股份有限公司 | Voice control method and apparatus, chip, earphones, and system |
CN113039601A (en) * | 2019-09-20 | 2021-06-25 | 深圳市汇顶科技股份有限公司 | Voice control method, device, chip, earphone and system |
CN113039601B (en) * | 2019-09-20 | 2024-08-27 | 深圳市汇顶科技股份有限公司 | Voice control method, device, chip, earphone and system |
CN112837680A (en) * | 2019-11-25 | 2021-05-25 | 马上消费金融股份有限公司 | Audio keyword retrieval method, intelligent outbound method and related device |
CN111081217A (en) * | 2019-12-03 | 2020-04-28 | 珠海格力电器股份有限公司 | Voice wake-up method and device, electronic equipment and storage medium |
CN111081217B (en) * | 2019-12-03 | 2021-06-04 | 珠海格力电器股份有限公司 | Voice wake-up method and device, electronic equipment and storage medium |
CN111768783A (en) * | 2020-06-30 | 2020-10-13 | 北京百度网讯科技有限公司 | Voice interaction control method, device, electronic equipment, storage medium and system |
CN111768783B (en) * | 2020-06-30 | 2024-04-02 | 北京百度网讯科技有限公司 | Voice interaction control method, device, electronic equipment, storage medium and system |
CN112420044A (en) * | 2020-12-03 | 2021-02-26 | 深圳市欧瑞博科技股份有限公司 | Voice recognition method, voice recognition device and electronic equipment |
CN113643711A (en) * | 2021-08-03 | 2021-11-12 | 常州匠心独具智能家居股份有限公司 | Voice system based on offline mode and online mode for intelligent furniture |
CN113643711B (en) * | 2021-08-03 | 2024-04-19 | 常州匠心独具智能家居股份有限公司 | Voice system based on offline mode and online mode for intelligent furniture |
CN115512700A (en) * | 2022-09-07 | 2022-12-23 | 广州小鹏汽车科技有限公司 | Voice interaction method, voice interaction device, vehicle and readable storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN109754788B (en) | 2020-08-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109754788A (en) | A kind of sound control method, device, equipment and storage medium | |
CN108962217B (en) | Speech synthesis method and related equipment | |
JP6463825B2 (en) | Multi-speaker speech recognition correction system | |
US6775651B1 (en) | Method of transcribing text from computer voice mail | |
US20140036023A1 (en) | Conversational video experience | |
CN110085261A (en) | A kind of pronunciation correction method, apparatus, equipment and computer readable storage medium | |
CN111653265B (en) | Speech synthesis method, device, storage medium and electronic equipment | |
CN109003602A (en) | Test method, device, equipment and the computer-readable medium of speech production | |
CN109036396A (en) | A kind of exchange method and system of third-party application | |
CN107610702A (en) | Terminal device standby wakeup method, apparatus and computer equipment | |
Crowdy | Spoken corpus transcription | |
US20100178956A1 (en) | Method and apparatus for mobile voice recognition training | |
CN109545194A (en) | Wake up word pre-training method, apparatus, equipment and storage medium | |
CN107481720A (en) | A kind of explicit method for recognizing sound-groove and device | |
JP2000501847A (en) | Method and apparatus for obtaining complex information from speech signals of adaptive dialogue in education and testing | |
KR20120038000A (en) | Method and system for determining the topic of a conversation and obtaining and presenting related content | |
JP7158217B2 (en) | Speech recognition method, device and server | |
CN109543021B (en) | Intelligent robot-oriented story data processing method and system | |
CN112017650B (en) | Voice control method and device of electronic equipment, computer equipment and storage medium | |
CN108091324A (en) | Tone recognition methods, device, electronic equipment and computer readable storage medium | |
TWI270052B (en) | System for selecting audio content by using speech recognition and method therefor | |
CN109346057A (en) | A kind of speech processing system of intelligence toy for children | |
KR20230021556A (en) | Create interactive audio tracks from visual content | |
CN111079423A (en) | Method for generating dictation, reading and reporting audio, electronic equipment and storage medium | |
CN109697981A (en) | A kind of voice interactive method, device, equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||