CN107910003A - Voice interaction method and voice control system for a smart device - Google Patents

Voice interaction method and voice control system for a smart device Download PDF

Info

Publication number
CN107910003A
CN107910003A (application CN201711407315.8A)
Authority
CN
China
Prior art keywords
sound
scene
voice
source
task
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711407315.8A
Other languages
Chinese (zh)
Inventor
Lin Shuhong (林树宏)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chi Tong (xiamen) Technology Co Ltd
Original Assignee
Chi Tong (xiamen) Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chi Tong (xiamen) Technology Co Ltd filed Critical Chi Tong (xiamen) Technology Co Ltd
Priority to CN201711407315.8A
Publication of CN107910003A
Pending legal-status Critical Current

Links

Classifications

    • G — PHYSICS
    • G10 — MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L — SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 — Speech recognition
    • G10L15/22 — Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 — Execution procedure of a spoken command
    • G10L25/00 — Speech or voice analysis techniques not restricted to a single one of groups G10L15/00–G10L21/00
    • G10L25/03 — Speech or voice analysis techniques characterised by the type of extracted parameters

Abstract

The invention discloses a voice interaction method and a voice control system for a smart device. By collecting the task scene the device is in and determining whether the user's speech constitutes a gain sound source, the system decides whether the device should execute the voice instruction the user intends to issue. Under this scheme, the user is spared the step of first saying a specific wake-up word: by simply speaking the relevant content in the relevant scene when a command is needed, the user can make the smart device execute the voice command directly. This constitutes an intelligent and effective voice interaction mode.

Description

Voice interaction method and voice control system for a smart device
Technical field
The present invention relates to the field of voice control for smart devices, and more particularly to a voice interaction method and a voice control system for a smart device.
Background art
Voice control technology is widely deployed on many kinds of intelligent terminals. At present, the voice interaction between a user and a device is mostly a two-stage interaction comprising a wake-up interaction and a content interaction. Taking "Siri" on the iPhone as an example, the user must first say the preset wake-up word "Hey, Siri!" to the phone's microphone; the system then enters the Siri interactive interface and listens for the user's voice instruction.
Such an interaction mode has the following problems: (1) the user must first say the wake-up word of the corresponding voice control system and wait for the system to enter the content-interaction mode, so the command has to be delivered to the voice control system in two utterances separated by an interval of time, which is not intelligent enough; (2) different classes of devices on the market use a variety of wake-up words (for example, the wake-up word of Android phones is "OK, Google!"), which aggravates interface fragmentation in the voice control field, increases the user's learning cost, and hinders standardization and integration; (3) in a noisy or multi-speaker environment, the system has difficulty distinguishing whether the user has said the wake-up word, so the voice system may fail to wake up or may wake up falsely.
Summary of the invention
It is an object of the present invention to provide an intelligent and effective voice control scheme under which the user is spared the step of first saying a specific wake-up word, thereby solving the above technical problems.
To achieve the above object, a first aspect of the present invention provides a voice interaction method for a smart device, comprising the following steps:
Step S1: receive a voice input and recognize the voice content of the voice input;
Step S2: extract the acoustic characteristic parameters of the voice input, and determine from them whether this input voice constitutes a gain sound source; if it is determined that a gain sound source is constituted, perform step S3;
Step S3: directly execute the voice instruction corresponding to the voice content.
In one embodiment, step A1 is also performed while step S1 is performed: collect the task scene the device is in;
After steps S1 and A1 are performed and before step S3 is performed, the following steps are also performed:
Step A2: determine whether the voice content matches the above task scene;
If the determination results of both step S2 and step A2 are affirmative, perform step S3.
In one embodiment, step A2 is performed before step S2; if the determination result of step A2 is affirmative, step S2 is performed.
In one embodiment, step S2 includes the following steps:
Step S21: build a sound-source characteristic parameter library, which contains the preset effective ranges of the acoustic characteristic parameters that can constitute a gain sound source;
Step S22: extract the vocal segments from the voice input, and extract their acoustic characteristic parameters from them;
Step S23: compare whether the extracted acoustic characteristic parameters fall within the effective ranges of the above characteristic parameter library; if they do, determine that this input voice constitutes a gain sound source; otherwise, determine that it does not.
In one embodiment, the gain sound source includes a volume-gain sound source and/or an angle-gain sound source;
When the gain sound source is an angle-gain sound source, the corresponding acoustic characteristic parameter is the input angle of the sound source relative to the device's voice input apparatus;
When the gain sound source is a volume-gain sound source, the corresponding acoustic characteristic parameter is the volume of the sound source.
In one embodiment, the voice input apparatus of the device is a microphone;
The device is equipped with multiple microphones forming a microphone array. When the microphone array receives a voice input, the input angle of the input sound source relative to the device's microphone array is obtained through processes such as sampling, processing, and computing on the voice.
In one embodiment, the task scene in step A1 corresponds to a task the device needs to process;
Step A1 includes the following steps:
Step A11: assign a corresponding scene identifier to each task the device needs to process, and build a scene identifier library;
Step A12: when the device starts a certain task, output the scene identifier of that task;
Step A13: recognize the scene identifier.
In one embodiment, step A2 includes the following steps:
Step A21: build a voice instruction set, which is the set of available voice instructions under each task scene;
Step A22: convert the voice input into voice content in a device-readable form, and convert that voice content into a pseudo voice instruction in the same format as the above voice instructions;
Step A23: extract all the available voice instructions under the task scene recognized in step A1, and compare the pseudo voice instruction obtained in step A22 with the above available voice instructions one by one;
Step A24: if the pseudo voice instruction matches an available voice instruction under a certain task scene, end the comparison and determine that the voice content matches the task scene; otherwise, determine that they do not match.
To achieve the above object, a second aspect of the present invention provides a voice control system for a smart device, comprising a voice input apparatus and a microprocessor;
The microprocessor has a built-in gain sound source determination unit, an instruction execution unit, and a content recognition unit; the content recognition unit is connected to the voice input apparatus to recognize the content of the voice input;
The gain sound source determination unit is connected to the voice input apparatus and can extract the acoustic characteristic parameters of the voice input, to determine whether the input voice constitutes a gain sound source;
The instruction execution unit is connected to the content recognition unit and to the gain sound source determination unit; when the determination result of the gain sound source determination unit is affirmative, the instruction execution unit executes the voice instruction corresponding to the voice content.
In one embodiment, the system further includes a storage device, which stores a characteristic parameter library containing the preset effective ranges of the acoustic characteristic parameters that can constitute a gain sound source;
The gain sound source determination unit is connected to the characteristic parameter library, and compares whether the extracted acoustic characteristic parameters fall within the effective ranges of the above characteristic parameter library.
In one embodiment, the gain sound source includes a volume-gain sound source and/or an angle-gain sound source; when the gain sound source is an angle-gain sound source, the corresponding acoustic characteristic parameter is the input angle of the sound source relative to the device's voice input apparatus; when the gain sound source is a volume-gain sound source, the corresponding acoustic characteristic parameter is the volume of the sound source.
In one embodiment, the voice input apparatus is multiple microphones forming a microphone array;
When the microphone array receives a voice input, the input angle of the input sound source relative to the device's microphone array is obtained through processes such as sampling, processing, and computing on the voice, and is output to the gain sound source determination unit;
The gain sound source determination unit also includes a volume detection unit to detect the volume of the voice input.
In one embodiment, the microprocessor also has a built-in scene matching unit, which is connected to the content recognition unit to determine whether the voice content matches the task scene the device is in;
The instruction execution unit is also connected to the scene matching unit; when the determination results of both the scene matching unit and the gain sound source determination unit are affirmative, the instruction execution unit executes the voice instruction corresponding to the voice content.
In one embodiment, the system further includes a storage device, which stores a scene identifier library and a voice instruction set;
The scene identifier library contains the scene identifiers assigned to the tasks the device needs to process; the voice instruction set is the set of available voice instructions corresponding to each scene identifier.
In one embodiment, the microprocessor also includes a task processing unit for processing each task of the device, which is connected to the scene identifier library of the storage device and to the scene matching unit; when the device starts a certain task, the task processing unit outputs the scene identifier of that task to the scene matching unit.
In one embodiment, the scene matching unit is connected to the voice instruction set of the storage device; after the scene matching unit receives the scene identifier, it extracts all the available voice instructions under the corresponding task scene according to the scene identifier;
The content recognition unit converts the voice input into a pseudo voice instruction in the same format as the above voice instructions and outputs it to the scene matching unit; the scene matching unit compares the pseudo voice instruction with the above available voice instructions one by one, to determine whether the voice content matches the task scene the device is in.
Compared with the prior art, the present invention has the following advantages:
In the voice interaction method and voice control system provided by the invention, the task scene the device is in serves as one condition for whether to execute, and in addition the sound source of the voice input must constitute a gain sound source. When both conditions are satisfied, it means the user has said a suitable voice instruction under that scene and is speaking to the device in that specific scene; the device should then process the instruction corresponding to what the user has just said. The device thus directly executes the content of the user's voice input, and the user no longer needs to say a wake-up word.
In this way, an intelligent and efficient voice interaction mode is realized, the situations of failing to wake up or waking up falsely are avoided, and the user's learning and usage costs are reduced; moreover, this voice interaction mode can be generalized uniformly, which is conducive to resource integration in the voice control industry.
Brief description of the drawings
Fig. 1 shows the flow chart of the voice interaction method in embodiment one;
Fig. 2 shows the flow chart of the voice interaction method in embodiment two;
Fig. 3 shows the flow chart of the voice interaction method in embodiment three;
Fig. 4 shows the system composition diagram of the voice control system in embodiment four;
Fig. 5 shows the system composition diagram of the voice control system in embodiment five;
Fig. 6 shows the system composition diagram of the voice control system in embodiment six.
Detailed description of the embodiments
The present invention provides a voice interaction method and a voice control system for a smart device, which are further described below with reference to the drawings and embodiments. It should be noted that the smart device of the present invention may be a terminal device such as a mobile phone, tablet computer, computer, or intelligent robot, but is not limited thereto.
Referring to Fig. 1, which shows the flow chart of the voice interaction method in embodiment one, the method includes the following steps:
Step S1: receive a voice input and recognize the voice content of the voice input;
Step S2: extract the acoustic characteristic parameters of the voice input, and determine from them whether this input voice constitutes a gain sound source; if it is determined that a gain sound source is constituted, perform step S3;
Step S3: directly execute the voice instruction corresponding to the voice content.
The gain sound source embodies the user's degree of attention to the device. When the user raises his attention to the device, his voice input to the device constitutes a gain sound source, which means the user wishes the device to listen to what he says and to process it; the device should then process the instruction corresponding to what the user has just said. The embodiment of a gain sound source representing the user's attention may be reflected in the angle-gain sound source and the volume-gain sound source described below.
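As an illustration only (the patent prescribes no code; the function names and the shape of the parameter library are assumptions), the one-stage flow of steps S1 to S3 can be sketched as:

```python
def is_gain_source(acoustic_params, effective_ranges):
    """Step S2: the input constitutes a gain sound source only if every
    extracted acoustic parameter falls inside its preset effective range."""
    return all(lo <= acoustic_params[name] <= hi
               for name, (lo, hi) in effective_ranges.items())

def handle_voice_input(voice_content, acoustic_params, effective_ranges, execute):
    """Steps S1-S3: the content has already been recognized (step S1);
    test the gain-source condition, then execute the instruction directly."""
    if is_gain_source(acoustic_params, effective_ranges):
        execute(voice_content)   # step S3: no wake-up word was needed
        return True
    return False                 # input is ignored: not addressed to the device
```

The effective ranges here (a volume window and an angle window) follow the volume-gain and angle-gain parameters introduced below.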
Next refer to Fig. 2, which shows the flow chart of the voice interaction method in embodiment two. Embodiment two differs from embodiment one in that:
Step A1 is also performed while step S1 is performed: collect the task scene the device is in;
After steps S1 and A1 are performed and before step S3 is performed, the following steps are also performed:
Step A2: determine whether the voice content matches the above task scene;
If the determination results of both step S2 and step A2 are affirmative, perform step S3.
By adding steps A1 and A2, the task scene the device is in is added as one of the criteria for whether to execute the voice command. When both the gain sound source determination and the scene matching satisfy their conditions, it means the user has said a suitable voice instruction to the device under a suitable task scene. Relative to embodiment one, this optimizes the system's processing and analysis flow and eliminates unnecessary redundant processing steps, so the system can run more efficiently.
Next refer to Fig. 3, which shows the flow chart of the voice interaction method in embodiment three. Embodiment three differs from embodiment two in that in embodiment two the gain sound source determination and the scene matching determination are carried out in parallel, whereas in this embodiment the scene matching determination is carried out first, and the gain sound source determination is carried out only when the result of the scene matching determination is a match. Making the scene matching determination precede the gain sound source determination in this way simplifies the system's determination flow and further optimizes it.
That is: step A2 is performed before step S2, and if the determination result of step A2 is affirmative, step S2 is performed.
In this embodiment, the scene matching determination is given priority over the gain sound source determination mainly because the actual matching rate of the scene matching determination is higher; in other embodiments, the order may be reversed so that the gain sound source determination precedes the scene matching determination.
The above embodiments supply an interaction mode that eliminates the wake-up step of the prior art and adopts a one-stage interaction. It is mainly based on the task scene the device is in as one condition for whether to execute; in addition, the sound source of the voice input must constitute a gain sound source. When both conditions are satisfied, it means the user is speaking to the device in that specific scene, so the device directly executes the content of the user's voice input and the user no longer needs to say a wake-up word.
Specifically, as for how step S2 determines whether the input voice constitutes a gain sound source representing a rise in the user's attention to the device, it includes the following steps:
Step S21: build a sound-source characteristic parameter library, which contains the preset effective ranges of the acoustic characteristic parameters that can constitute a gain sound source;
Step S22: extract the vocal segments from the voice input, and extract their acoustic characteristic parameters from them;
Step S23: compare whether the extracted acoustic characteristic parameters fall within the effective ranges of the above characteristic parameter library; if they do, determine that this input voice constitutes a gain sound source; otherwise, determine that it does not.
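A minimal sketch of steps S21 to S23 for the volume parameter alone, in which a plain frame-energy threshold stands in for vocal-segment extraction (a real system would use a proper voice activity detector; the threshold, frame size, and library layout are all illustrative assumptions):

```python
import numpy as np

def vocal_segments(signal, frame=160, threshold=0.02):
    """Step S22 (sketch): keep frames whose RMS energy exceeds a floor,
    treating them as voiced."""
    frames = [signal[i:i + frame] for i in range(0, len(signal) - frame + 1, frame)]
    return [f for f in frames if np.sqrt(np.mean(f ** 2)) > threshold]

def constitutes_gain_source(signal, param_library):
    """Steps S21/S23: compare the extracted volume against the preset
    effective range stored in the characteristic parameter library."""
    voiced = vocal_segments(signal)
    if not voiced:
        return False                         # nothing voice-like to measure
    rms = np.sqrt(np.mean(np.concatenate(voiced) ** 2))
    volume_db = 20 * np.log10(rms + 1e-12)   # volume as dB full scale
    lo, hi = param_library["volume_db"]
    return lo <= volume_db <= hi
```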
Preferably, even in a multi-speaker or noisy environment, when the user wishes to issue a voice instruction to the device, his attention to the device rises naturally, which is reflected in the volume at which he speaks to the device and the angle at which he faces it. In a concrete scheme, the gain sound source includes a volume-gain sound source and/or an angle-gain sound source. When the gain sound source is an angle-gain sound source, the corresponding acoustic characteristic parameter is the input angle of the sound source relative to the device's voice input apparatus; when the gain sound source is a volume-gain sound source, the corresponding acoustic characteristic parameter is the volume of the sound source. In this embodiment, the determination of a gain sound source preferably requires satisfying both the input angle and the volume, but in other embodiments only one of the two may be required.
Preferably, the voice input apparatus of the device is a microphone; the device is equipped with multiple microphones forming a microphone array. When the microphone array receives a voice input, the input angle of the input sound source relative to the device's microphone array is obtained through processes such as sampling, processing, and computing on the voice.
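One common way to obtain such an input angle from a two-microphone array (the patent does not specify the algorithm) is the time difference of arrival given by the cross-correlation peak of the two channels. The sketch below assumes a 16 kHz sample rate and 10 cm microphone spacing, neither of which the patent fixes:

```python
import numpy as np

FS = 16000            # sample rate in Hz (assumed)
MIC_SPACING = 0.1     # distance between the two microphones in metres (assumed)
SPEED_OF_SOUND = 343.0

def doa_angle(left, right):
    """Estimate the input angle of the sound source from the inter-channel
    delay at the cross-correlation peak. Returns degrees, with 0 degrees
    meaning the source is directly in front of the array (broadside)."""
    corr = np.correlate(left, right, mode="full")
    lag = np.argmax(corr) - (len(right) - 1)   # delay of `left` in samples
    tau = lag / FS                             # delay in seconds
    sin_theta = np.clip(tau * SPEED_OF_SOUND / MIC_SPACING, -1.0, 1.0)
    return float(np.degrees(np.arcsin(sin_theta)))
```

The gain sound source determination unit would then compare this angle against the effective angle range in the characteristic parameter library.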
In addition, the method for collecting the task scene in step A1 specifically includes the following steps:
Step A11: assign a corresponding scene identifier to each task the device needs to process, and build a scene identifier library;
Step A12: when the device starts a certain task, output the scene identifier of that task;
Step A13: recognize the scene identifier.
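Steps A11 to A13 amount to maintaining a registry of scene identifiers; a minimal sketch, with all task and identifier names hypothetical:

```python
class SceneRegistry:
    """Sketch of the scene identifier library of steps A11-A13."""

    def __init__(self):
        self._scene_ids = {}           # task name -> scene identifier (step A11)

    def register_task(self, task, scene_id):
        self._scene_ids[task] = scene_id

    def on_task_start(self, task):
        """Step A12: the device outputs the scene identifier of the task."""
        return self._scene_ids[task]

    def recognize(self, scene_id):
        """Step A13: check that the identifier exists in the library."""
        return scene_id in self._scene_ids.values()
```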
As for how step A2 determines whether the voice matches the scene, this is realized by the following steps:
Step A21: build a voice instruction set, which is the set of available voice instructions under each task scene;
Step A22: convert the voice input into voice content in a device-readable form, and convert that voice content into a pseudo voice instruction in the same format as the above voice instructions;
Step A23: extract all the available voice instructions under the task scene recognized in step A1, and compare the pseudo voice instruction obtained in step A22 with the above available voice instructions one by one;
Step A24: if the pseudo voice instruction matches an available voice instruction under a certain task scene, end the comparison and determine that the voice content matches the task scene; otherwise, determine that they do not match.
For example, when the device is in the multimedia task scene and the user says "next", the voice control system determines that the user has said a suitable, correctly matched voice instruction under that task scene, and directly executes the "next" command.
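The "next" example can be traced through a sketch of steps A21 to A24, where the instruction sets, the scene identifiers, and the simple string normalization standing in for step A22 are all illustrative assumptions:

```python
# Step A21: the voice instruction set, keyed by task scene identifier
INSTRUCTION_SET = {
    "SCENE_MEDIA": {"play", "pause", "next", "previous"},
    "SCENE_CALL":  {"answer", "hang up"},
}

def matches_scene(voice_content, scene_id, instruction_set=INSTRUCTION_SET):
    """Steps A22-A24: normalize the recognized content into a pseudo
    instruction, then compare it one by one against the instructions
    available under the current task scene."""
    pseudo = voice_content.strip().lower()               # step A22 (simplified)
    for candidate in instruction_set.get(scene_id, ()):  # step A23
        if pseudo == candidate:                          # step A24: a hit ends the scan
            return True
    return False
```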
Referring now to Figs. 4 and 5, another aspect of the present invention further provides a voice control system. Fig. 4 shows the voice control system in embodiment four, which includes a voice input apparatus and a microprocessor.
The microprocessor has a built-in gain sound source determination unit, an instruction execution unit, and a content recognition unit; the content recognition unit is connected to the voice input apparatus to recognize the content of the voice input;
The gain sound source determination unit is connected to the voice input apparatus and can extract the acoustic characteristic parameters of the voice input, to determine whether the input voice constitutes a gain sound source;
The instruction execution unit is connected to the content recognition unit and to the gain sound source determination unit; when the determination result of the gain sound source determination unit is affirmative, the instruction execution unit executes the voice instruction corresponding to the voice content.
With the above structure, a voice control system based on the voice interaction method of embodiment one is constructed, providing hardware support for it. By loading this system in a device, the user's voice interaction with the device becomes more intelligent and more efficient.
Preferably, in embodiment five shown in Fig. 5, the voice control system further includes a storage device, which stores a characteristic parameter library containing the preset effective ranges of the acoustic characteristic parameters that can constitute a gain sound source.
The gain sound source determination unit is connected to the characteristic parameter library, and compares whether the extracted acoustic characteristic parameters fall within the effective ranges of the above characteristic parameter library.
Specifically, the gain sound source includes a volume-gain sound source and/or an angle-gain sound source; when the gain sound source is an angle-gain sound source, the corresponding acoustic characteristic parameter is the input angle of the sound source relative to the device's voice input apparatus; when the gain sound source is a volume-gain sound source, the corresponding acoustic characteristic parameter is the volume of the sound source.
Preferably, the voice input apparatus is multiple microphones forming a microphone array. When the microphone array receives a voice input, the input angle of the input sound source relative to the device's microphone array is obtained through processes such as sampling, processing, and computing on the voice, and is output to the gain sound source determination unit.
Further, the gain sound source determination unit also includes a volume detection unit to detect the volume of the voice input.
Finally, refer to Fig. 6, which shows the voice control system in embodiment six; the system of embodiment six corresponds to the voice interaction method of embodiment two or embodiment three. Compared with embodiment five, in this embodiment the microprocessor also has a built-in scene matching unit, which is connected to the content recognition unit to determine whether the voice content matches the task scene the device is in.
In addition, the instruction execution unit is also connected to the scene matching unit; when the determination results of both the scene matching unit and the gain sound source determination unit are affirmative, the instruction execution unit executes the voice instruction corresponding to the voice content.
In this embodiment, the storage device stores the scene identifier library, the voice instruction set, and the characteristic parameter library described in embodiment five. The scene identifier library contains the scene identifiers assigned to the tasks the device needs to process; the voice instruction set is the set of available voice instructions corresponding to each scene identifier.
In a concrete structure, the microprocessor also includes a task processing unit for processing each task of the device, which is connected to the scene identifier library of the storage device and to the scene matching unit. When the device starts a certain task, the task processing unit outputs the scene identifier of that task to the scene matching unit.
The scene matching unit is connected to the voice instruction set of the storage device; after the scene matching unit receives the scene identifier, it extracts all the available voice instructions under the corresponding task scene according to the scene identifier. In addition, the content recognition unit converts the voice input into a pseudo voice instruction in the same format as the above voice instructions and outputs it to the scene matching unit, which compares the pseudo voice instruction with the above available voice instructions one by one, to determine whether the voice content matches the task scene the device is in.
In this way, through the cooperation of the scene identifier library, the voice instruction set, the task processing unit, and the content recognition unit, the scene matching unit can determine whether the content of the voice input matches the task scene the device is in. Combined with the gain sound source determination unit, the instruction execution unit then chooses whether to execute the user's command according to the determination results of the scene matching unit and the gain sound source determination unit.
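Putting the units of embodiment six together, a behavioral sketch (the unit boundaries follow the description above, while the internals, thresholds, and identifiers are assumptions made for illustration):

```python
class VoiceControlSystem:
    """Sketch of embodiment six: the instruction execution unit fires only
    when BOTH the scene matching unit and the gain sound source
    determination unit return an affirmative result."""

    def __init__(self, instruction_set, volume_range):
        self.instruction_set = instruction_set   # scene id -> available instructions
        self.volume_range = volume_range         # effective volume range (dB)
        self.current_scene = None
        self.executed = []

    def on_task_start(self, scene_id):
        """The task processing unit outputs the scene identifier."""
        self.current_scene = scene_id

    def scene_matches(self, content):
        """Scene matching unit: compare against the current scene's set."""
        return content in self.instruction_set.get(self.current_scene, ())

    def is_gain_source(self, volume_db):
        """Gain sound source determination unit (volume parameter only)."""
        lo, hi = self.volume_range
        return lo <= volume_db <= hi

    def on_voice_input(self, content, volume_db):
        """Instruction execution unit: execute only on a double affirmative."""
        if self.scene_matches(content) and self.is_gain_source(volume_db):
            self.executed.append(content)
            return True
        return False
```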
The above are merely preferred embodiments of the present invention and do not thereby limit its scope of claims; every equivalent structural transformation made using the contents of the specification and drawings of the present invention, whether applied directly or indirectly in other related technical fields, is likewise included within the protection scope of the present invention.

Claims (16)

1. A voice interaction method for a smart device, characterized by comprising the following steps:
Step S1: receive a voice input and recognize the voice content of the voice input;
Step S2: extract the acoustic characteristic parameters of the voice input, and determine from them whether this input voice constitutes a gain sound source; if it is determined that a gain sound source is constituted, perform step S3;
Step S3: directly execute the voice instruction corresponding to the voice content.
2. The voice interaction method for a smart device according to claim 1, characterized in that step A1 is also performed while step S1 is performed: collect the task scene the device is in;
After steps S1 and A1 are performed and before step S3 is performed, the following steps are also performed:
Step A2: determine whether the voice content matches the above task scene;
If the determination results of both step S2 and step A2 are affirmative, perform step S3.
3. The voice interaction method for a smart device according to claim 2, characterized in that step A2 is performed before step S2, and if the determination result of step A2 is affirmative, step S2 is performed.
4. The voice interaction method for a smart device according to claim 1, characterized in that step S2 includes the following steps:
Step S21: build a sound-source characteristic parameter library, which contains the preset effective ranges of the acoustic characteristic parameters that can constitute a gain sound source;
Step S22: extract the vocal segments from the voice input, and extract their acoustic characteristic parameters from them;
Step S23: compare whether the extracted acoustic characteristic parameters fall within the effective ranges of the above characteristic parameter library; if they do, determine that this input voice constitutes a gain sound source; otherwise, determine that it does not.
5. The voice interaction method for a smart device according to claim 4, characterized in that: the gain sound source includes a volume-gain sound source and/or an angle-gain sound source;
When the gain sound source is an angle-gain sound source, the corresponding acoustic feature parameter is the input angle of the sound source relative to the voice input device of the device;
When the gain sound source is a volume-gain sound source, the corresponding acoustic feature parameter is the volume of the sound source.
6. The voice interaction method for a smart device according to claim 5, characterized in that: the voice input device of the device is a microphone;
The device is equipped with multiple microphones forming a microphone array; when the microphone array receives a voice input, the voice is sampled, processed and computed to obtain the input angle of the input sound source relative to the microphone array of the device.
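The claim does not specify how the array computes the angle; a common approach (assumed here, not stated in the patent) is far-field direction-of-arrival estimation from the inter-microphone time delay, where for a two-microphone array the angle from broadside is arcsin(c·τ/d):

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 degrees C

def arrival_angle(delay_s, mic_spacing_m):
    """Far-field direction of arrival for a two-microphone array.
    delay_s: time difference of arrival between the two microphones.
    mic_spacing_m: distance between the microphones.
    Returns the angle in degrees from broadside (0 = straight ahead)."""
    ratio = SPEED_OF_SOUND * delay_s / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # clamp numerical overshoot before asin
    return math.degrees(math.asin(ratio))
```

A zero delay means the source is dead ahead; a delay equal to spacing/c puts it at 90 degrees, fully to one side. The resulting angle is exactly the parameter claim 5 feeds to the angle-gain check.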
7. The voice interaction method for a smart device according to claim 2, characterized in that the task scenes in step A1 correspond to the tasks the device needs to process;
Step A1 comprises the following steps:
Step A11: assigning a corresponding scene identifier to each task the device needs to process, and building a scene identifier library;
Step A12: when the device starts a task, outputting the scene identifier corresponding to that task;
Step A13: recognizing the scene identifier.
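Steps A11–A13 describe a simple bidirectional mapping between tasks and scene identifiers. A sketch under illustrative assumptions (the task names and identifier strings are invented for the example):

```python
# Step A11: assign a scene identifier to each task the device can process,
# forming the scene identifier library. Entries here are illustrative.
SCENE_LIBRARY = {
    "music_playback": "SCENE_01",
    "navigation":     "SCENE_02",
    "phone_call":     "SCENE_03",
}
# Reverse index used in step A13 to recognize an emitted identifier.
SCENE_TO_TASK = {sid: task for task, sid in SCENE_LIBRARY.items()}

def start_task(task):
    """Step A12: when the device starts a task, emit its scene identifier."""
    return SCENE_LIBRARY[task]

def identify_scene(scene_id):
    """Step A13: map a received identifier back to its task scene."""
    return SCENE_TO_TASK[scene_id]
```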
8. The voice interaction method for a smart device according to claim 2, characterized in that step A2 comprises the following steps:
Step A21: building a voice instruction set, which is the set of available voice instructions under each task scene;
Step A22: converting the voice input into voice content in a device-readable form, and converting that readable voice content into a pseudo voice instruction in the same format as the above voice instructions;
Step A23: extracting all available voice instructions under the task scene identified in step A1, and comparing the pseudo voice instruction obtained in step A22 against the available voice instructions one by one;
Step A24: if the pseudo voice instruction matches an available voice instruction under the task scene, ending the comparison and determining that the voice content matches the task scene; otherwise, determining a mismatch.
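Steps A21–A24 can be sketched as a per-scene instruction lookup. The scene identifiers, instruction strings, and the normalization rule are all illustrative assumptions; the patent only requires that the pseudo instruction share the stored instructions' format:

```python
# Step A21: instruction set keyed by scene identifier; each value is the set
# of available voice instructions under that task scene (entries invented).
INSTRUCTION_SET = {
    "SCENE_01": {"play", "pause", "next track"},
    "SCENE_02": {"zoom in", "zoom out", "reroute"},
}

def to_pseudo_instruction(voice_content):
    """Step A22: convert recognized content into a pseudo voice instruction
    in the same format as the stored instructions (here: lowercased, trimmed)."""
    return voice_content.strip().lower()

def matches_scene(voice_content, scene_id):
    """Steps A23-A24: compare the pseudo instruction one by one against the
    available instructions of the identified scene; True means a match."""
    pseudo = to_pseudo_instruction(voice_content)
    return any(pseudo == instr for instr in INSTRUCTION_SET.get(scene_id, ()))
```

So "Pause" spoken while the device plays music (SCENE_01) matches, while the same utterance during navigation (SCENE_02) is rejected as out of scene.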
9. A voice control system for a smart device, characterized in that it comprises: a voice input device and a microprocessor;
The microprocessor has a built-in gain-sound-source determination unit, an instruction execution unit and a content recognition unit; the content recognition unit is connected to the voice input device to recognize the content of the voice input;
The gain-sound-source determination unit is connected to the voice input device and can extract the acoustic feature parameters of the voice input, to determine whether the input voice constitutes a gain sound source;
The instruction execution unit is connected to the content recognition unit and the gain-sound-source determination unit respectively; when the determination result of the gain-sound-source determination unit is affirmative, the instruction execution unit executes the voice instruction corresponding to the voice content.
10. The voice control system for a smart device according to claim 9, characterized in that: it further comprises a storage device, the storage device storing a feature parameter library, the feature parameter library containing preset valid ranges of the acoustic feature parameters that can constitute a gain sound source;
The gain-sound-source determination unit is connected to the feature parameter library, and compares whether the extracted acoustic feature parameters fall within the valid ranges of the feature parameter library.
11. The voice control system for a smart device according to claim 10, characterized in that: the gain sound source includes a volume-gain sound source and/or an angle-gain sound source; when the gain sound source is an angle-gain sound source, the corresponding acoustic feature parameter is the input angle of the sound source relative to the voice input device of the device; when the gain sound source is a volume-gain sound source, the corresponding acoustic feature parameter is the volume of the sound source.
12. The voice control system for a smart device according to claim 11, characterized in that: the voice input device comprises multiple microphones forming a microphone array;
When the microphone array receives a voice input, the voice is sampled, processed and computed to obtain the input angle of the input sound source relative to the microphone array of the device, and the angle is output to the gain-sound-source determination unit;
The gain-sound-source determination unit further comprises a volume detection unit for detecting the volume of the voice input.
13. The voice control system for a smart device according to claim 9, characterized in that:
The microprocessor also has a built-in scene matching unit, the scene matching unit being connected to the content recognition unit to determine whether the voice content matches the task scene in which the device is operating;
The instruction execution unit is also connected to the scene matching unit; when the determination results of both the scene matching unit and the gain-sound-source determination unit are affirmative, the instruction execution unit executes the voice instruction corresponding to the voice content.
14. The voice control system for a smart device according to claim 13, characterized in that: it further comprises a storage device, the storage device storing a scene identifier library and a voice instruction set;
The scene identifier library contains the scene identifiers assigned to the tasks the device needs to process; the voice instruction set is the set of available voice instructions corresponding to each scene identifier.
15. The voice control system for a smart device according to claim 14, characterized in that: the microprocessor further comprises a task processing unit for processing each task of the device, connected to the scene identifier library of the storage device and to the scene matching unit; when the device starts a task, the task processing unit outputs the scene identifier of the corresponding task to the scene matching unit.
16. The voice control system for a smart device according to claim 15, characterized in that: the scene matching unit is connected to the voice instruction set of the storage device; after the scene matching unit receives a scene identifier, it extracts all available voice instructions under the corresponding task scene according to that scene identifier;
The content recognition unit converts the voice input into a pseudo voice instruction in the same format as the above voice instructions and outputs it to the scene matching unit; the scene matching unit compares the pseudo voice instruction against the available voice instructions one by one, to determine whether the voice content matches the task scene in which the device is operating.
CN201711407315.8A 2017-12-22 2017-12-22 A kind of voice interactive method and speech control system for smart machine Pending CN107910003A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711407315.8A CN107910003A (en) 2017-12-22 2017-12-22 A kind of voice interactive method and speech control system for smart machine


Publications (1)

Publication Number Publication Date
CN107910003A true CN107910003A (en) 2018-04-13

Family

ID=61870713

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711407315.8A Pending CN107910003A (en) 2017-12-22 2017-12-22 A kind of voice interactive method and speech control system for smart machine

Country Status (1)

Country Link
CN (1) CN107910003A (en)


Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1372071B1 (en) * 2002-04-08 2008-01-23 Matsushita Electric Industrial Co., Ltd. Management of software components in an image processing system
US20120253803A1 (en) * 2011-03-30 2012-10-04 Motonobu Sugiura Voice recognition device and voice recognition method
CN104967726A (en) * 2015-04-30 2015-10-07 努比亚技术有限公司 Voice instruction processing method, voice instruction processing device and mobile terminal
CN105094807A (en) * 2015-06-25 2015-11-25 三星电子(中国)研发中心 Method and device for implementing voice control
CN106157955A (en) * 2015-03-30 2016-11-23 阿里巴巴集团控股有限公司 A kind of sound control method and device
CN106254612A (en) * 2015-06-15 2016-12-21 中兴通讯股份有限公司 A kind of sound control method and device
CN107146622A (en) * 2017-06-16 2017-09-08 合肥美的智能科技有限公司 Refrigerator, voice interactive system, method, computer equipment, readable storage medium storing program for executing
CN107316641A (en) * 2017-06-30 2017-11-03 联想(北京)有限公司 A kind of sound control method and electronic equipment


Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108877791A (en) * 2018-05-23 2018-11-23 百度在线网络技术(北京)有限公司 Voice interactive method, device, server, terminal and medium based on view
CN108877791B (en) * 2018-05-23 2021-10-08 百度在线网络技术(北京)有限公司 Voice interaction method, device, server, terminal and medium based on view
US11727927B2 (en) 2018-05-23 2023-08-15 Baidu Online Network Technology (Beijing) Co., Ltd. View-based voice interaction method, apparatus, server, terminal and medium
CN108831455A (en) * 2018-05-25 2018-11-16 四川斐讯全智信息技术有限公司 A kind of method and system of intelligent sound box streaming interaction
CN109637531A (en) * 2018-12-06 2019-04-16 珠海格力电器股份有限公司 A kind of sound control method, device, storage medium and air-conditioning
CN109637531B (en) * 2018-12-06 2020-09-15 珠海格力电器股份有限公司 Voice control method and device, storage medium and air conditioner
CN110706707A (en) * 2019-11-13 2020-01-17 百度在线网络技术(北京)有限公司 Method, apparatus, device and computer-readable storage medium for voice interaction
US11393490B2 (en) 2019-11-13 2022-07-19 Baidu Online Network Technology (Beijing) Co., Ltd. Method, apparatus, device and computer-readable storage medium for voice interaction
CN112787899A (en) * 2021-01-08 2021-05-11 青岛海尔特种电冰箱有限公司 Equipment voice interaction method, computer readable storage medium and refrigerator
CN112787899B (en) * 2021-01-08 2022-10-28 青岛海尔特种电冰箱有限公司 Equipment voice interaction method, computer readable storage medium and refrigerator
CN112882394A (en) * 2021-01-12 2021-06-01 北京小米松果电子有限公司 Device control method, control apparatus, and readable storage medium

Similar Documents

Publication Publication Date Title
CN107910003A (en) A kind of voice interactive method and speech control system for smart machine
WO2021093449A1 (en) Wakeup word detection method and apparatus employing artificial intelligence, device, and medium
US9117449B2 (en) Embedded system for construction of small footprint speech recognition with user-definable constraints
US9741343B1 (en) Voice interaction application selection
US10811005B2 (en) Adapting voice input processing based on voice input characteristics
US20160019886A1 (en) Method and apparatus for recognizing whisper
WO2017012511A1 (en) Voice control method and device, and projector apparatus
US20140379334A1 (en) Natural language understanding automatic speech recognition post processing
JP2016502829A (en) Terminal voice control method, apparatus, terminal, and program
US20130289996A1 (en) Multipass asr controlling multiple applications
CN101923857A (en) Extensible audio recognition method based on man-machine interaction
EP3608906A1 (en) System for processing user voice utterance and method for operating same
KR102563817B1 (en) Method for processing user voice input and electronic device supporting the same
US20170110131A1 (en) Terminal control method and device, voice control device and terminal
CN110223687B (en) Instruction execution method and device, storage medium and electronic equipment
CN110706707B (en) Method, apparatus, device and computer-readable storage medium for voice interaction
CN109712623A (en) Sound control method, device and computer readable storage medium
US11437022B2 (en) Performing speaker change detection and speaker recognition on a trigger phrase
CN109859752A (en) A kind of sound control method, device, storage medium and voice joint control system
CN109979446A (en) Sound control method, storage medium and device
US11620996B2 (en) Electronic apparatus, and method of controlling to execute function according to voice command thereof
CN113421573B (en) Identity recognition model training method, identity recognition method and device
CN106584486A (en) Voice recognition based industrial robot control system and method
CN114999496A (en) Audio transmission method, control equipment and terminal equipment
CN101299333A (en) Built-in speech recognition system and inner core technique thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180413