CN109360563B - Voice control method and device, storage medium and air conditioner - Google Patents


Info

Publication number
CN109360563B
Authority
CN
China
Prior art keywords
voice
instruction
user
analysis
identity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811505078.3A
Other languages
Chinese (zh)
Other versions
CN109360563A (en)
Inventor
韩雪
王慧君
张新
毛跃辉
陶梦春
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Gree Electric Appliances Inc of Zhuhai
Original Assignee
Gree Electric Appliances Inc of Zhuhai
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Gree Electric Appliances Inc of Zhuhai
Priority to CN201811505078.3A
Publication of CN109360563A
Application granted
Publication of CN109360563B
Legal status: Active

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 15/08 Speech classification or search
    • G10L 15/18 Speech classification or search using natural language modelling
    • G10L 15/1822 Parsing for meaning understanding
    • G10L 2015/088 Word spotting
    • G10L 2015/223 Execution procedure of a spoken command

Abstract

The invention discloses a voice control method, a voice control device, a storage medium and an air conditioner. The method comprises the following steps: acquiring a voice instruction for controlling a device to be controlled; determining whether the identity of the user issuing the voice instruction belongs to a set identity range; if the user identity belongs to the set identity range, invoking a habitual voice system matched with the user identity to perform semantic parsing on the voice instruction; or, if the user identity does not belong to the set identity range, invoking a set default voice system to perform semantic parsing on the voice instruction. The scheme of the invention solves the problem that the speech-recognition success rate is low when the language used by the user does not match the selected language system and the user's speech therefore cannot be recognized, thereby improving the success rate of speech recognition.

Description

Voice control method and device, storage medium and air conditioner
Technical Field
The invention belongs to the technical field of voice control, and particularly relates to a voice control method and device, a storage medium and an air conditioner; more particularly, it relates to a method and device for realizing a voice-controlled air conditioner capable of automatically switching its language system, a storage medium and an AI air conditioner.
Background
Household air conditioners are becoming increasingly intelligent, and adding voice functions has become a popular trend. While this brings convenience to daily life, it also creates difficulties for users whose speech is poorly recognized, such as the elderly, or Hong Kong and Macau residents accustomed to speaking Cantonese. Dialect-recognition technology does exist on the market: both a dialect system and a Mandarin system are loaded into the voice system, and before speech is parsed the device must first decide whether to use the dialect system or the Mandarin system, then parse the user's speech with the chosen system. This approach has a drawback: if the language used by the user does not match the selected language system, the user's speech cannot be recognized. Moreover, in a present-day household the elderly, adults and children may speak different languages, or a single user may alternate between dialect and Mandarin; if the voice device offers only one language system at a time, the recognition rate for the user's speech will be low.
Disclosure of Invention
The present invention aims to provide a voice control method, a voice control device, a storage medium and an air conditioner, to solve the prior-art problem that, when the user's speech is parsed according to a single pre-determined language system (a dialect system or the Mandarin system), the user's speech cannot be recognized if the language actually used does not match the selected system, so that the speech-recognition success rate is low; the invention thereby achieves the effect of improving the success rate of speech recognition.
The invention provides a voice control method, which comprises the following steps: acquiring a voice instruction for controlling a device to be controlled; determining whether the identity of the user issuing the voice instruction belongs to a set identity range; if the user identity belongs to the set identity range, invoking a habitual voice system matched with the user identity to perform semantic parsing on the voice instruction; or, if the user identity does not belong to the set identity range, invoking a set default voice system to perform semantic parsing on the voice instruction.
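As an illustration only (not the patent's own implementation), the dispatch described above can be sketched as follows; the `KNOWN_USERS` table, the `choose_voice_system` helper and the system names are hypothetical:

```python
# Hypothetical sketch of the claimed dispatch: names and data are illustrative.
KNOWN_USERS = {          # voiceprint identity -> habitual voice system
    "user_a": "cantonese",
    "user_b": "mandarin",
}
DEFAULT_SYSTEM = "mandarin"

def choose_voice_system(voiceprint_id):
    """Return the habitual system for a known user, else the set default."""
    if voiceprint_id in KNOWN_USERS:          # identity within the set range
        return KNOWN_USERS[voiceprint_id]     # habitual voice system
    return DEFAULT_SYSTEM                     # fall back to the default system
```

Under these assumptions, a known Cantonese speaker is routed to the Cantonese system, while an unrecognized voiceprint falls back to Mandarin.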
Optionally, acquiring the voice instruction for controlling the device to be controlled comprises: acquiring, via a voice acquisition module, the voice instruction for controlling the device to be controlled; the voice acquisition module is arranged at any one of: the device to be controlled itself, the environment to which the device belongs, and a client terminal; and/or the voice acquisition module comprises a microphone. And/or, the set identity range comprises a set voiceprint range; determining whether the identity of the user issuing the voice instruction belongs to the set identity range then comprises: recognizing the voiceprint information contained in the voice instruction; determining whether the voiceprint information is within the set voiceprint range; if it is, determining that the user identity belongs to the set identity range; or, if it is not, determining that the user identity does not belong to the set identity range.
Optionally, invoking the habitual voice system matched with the user identity to perform semantic parsing on the voice instruction comprises: according to a set correspondence between identities and voice systems, determining the set voice system corresponding to the set identity that is the same as the user identity as the habitual voice system matched with that user; and performing semantic parsing on the voice instruction against the habitual semantic library of that habitual voice system, to obtain the semantic keywords matched with the voice instruction as determined by the habitual voice system. And/or, invoking the set default voice system to perform semantic parsing on the voice instruction comprises: performing semantic parsing on the voice instruction against the default semantic library of the default voice system, to obtain the semantic keywords matched with the voice instruction as determined by the default voice system.
Optionally, the method further comprises: when invoking the habitual voice system matched with the user identity fails to parse the voice instruction, determining whether the number of failed parsing attempts is greater than or equal to a first set number and/or whether the duration of the parsing failures is greater than or equal to a first set duration; if so, invoking one of the other set voice systems, other than the habitual voice system, to perform the semantic parsing; or, if not, continuing to use the habitual voice system to parse the voice instruction. Or, when invoking the set default voice system fails to parse the voice instruction, determining whether the number of failed parsing attempts is greater than or equal to a second set number and/or whether the duration of the parsing failures is greater than or equal to a second set duration; if so, invoking one of the other set voice systems, other than the default voice system, to perform the semantic parsing; or, if not, continuing to use the default voice system to parse the voice instruction.
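A minimal sketch of that retry-then-switch rule, assuming a parse function that returns None on failure; the function name, the attempt limit and the system list are all hypothetical, and the duration-based threshold is omitted for brevity:

```python
# Hypothetical sketch: retry the preferred voice system, then fall back
# to the remaining set voice systems once the failure count is reached.
def parse_with_fallback(instruction, preferred, all_systems, parse, max_failures=3):
    """Try `preferred` up to `max_failures` times, then try each other system."""
    for _ in range(max_failures):
        result = parse(preferred, instruction)
        if result is not None:
            return preferred, result
    for system in all_systems:               # switch to the remaining systems
        if system == preferred:
            continue
        result = parse(system, instruction)
        if result is not None:
            return system, result
    return None, None                        # no system could parse the command
```

Retrying before switching matches the stated motivation: a single spurious misrecognition should not immediately force a change of language system.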
Optionally, the method further comprises: when semantic parsing of the voice instruction succeeds, whether by invoking the set default voice system, by invoking one of the other set voice systems other than the default voice system, or by continuing to use the default voice system, determining the voice system that successfully parsed the voice instruction of a user identity not belonging to the set identity range as the habitual voice system of that user identity, and storing the user identity into the set identity range.
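Under the same hypothetical data model as above, this learning step amounts to recording the first voice system that succeeds for an unknown voiceprint:

```python
# Hypothetical sketch: remember which voice system first succeeded for a
# previously unknown user, so later instructions go straight to it.
def remember_habitual_system(known_users, voiceprint_id, successful_system):
    """Register an unknown user with the system that parsed their speech."""
    if voiceprint_id not in known_users:      # user was outside the set range
        known_users[voiceprint_id] = successful_system
    return known_users
```

An already-registered user keeps their existing habitual system; only new identities are added to the set identity range.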
Optionally, the voice instruction comprises any of: a Mandarin voice instruction, a foreign-language voice instruction, and a dialect voice instruction corresponding to the native region of an intended user of the device to be controlled. And/or, the set voice systems comprise: a Mandarin voice system, a foreign-language voice system, and a dialect voice system corresponding to the native region of an intended user of the device; the default voice system comprises the Mandarin voice system; and the habitual voice system comprises any one of the Mandarin voice system, the foreign-language voice system, and the dialect voice system corresponding to the native region of the intended user of the device to be controlled.
Optionally, the method further comprises: storing the user identity of an intended user of the device to be controlled, and establishing the correspondence between that user identity and the habitually used voice system; and/or controlling the device to be controlled to execute the control instruction corresponding to the voice instruction, according to the semantic keywords obtained by invoking the habitual voice system matched with the user identity to parse the voice instruction, or the semantic keywords obtained by invoking the set default voice system to parse it. Controlling the device to execute the control instruction corresponding to the voice instruction comprises: if the semantic keywords of the voice instruction include translation semantics between Chinese and a foreign language, translating the control instruction corresponding to the voice instruction and then executing the translated instruction.
In accordance with the above method, another aspect of the present invention provides a voice control apparatus, comprising: an acquiring unit, configured to acquire a voice instruction for controlling a device to be controlled; and a control unit, configured to determine whether the identity of the user issuing the voice instruction belongs to a set identity range. The control unit is further configured to invoke a habitual voice system matched with the user identity to perform semantic parsing on the voice instruction if the user identity belongs to the set identity range; or, the control unit is further configured to invoke a set default voice system to perform semantic parsing on the voice instruction if the user identity does not belong to the set identity range.
Optionally, the acquiring unit acquiring the voice instruction for controlling the device to be controlled comprises: acquiring, via a voice acquisition module, the voice instruction for controlling the device to be controlled; the voice acquisition module is arranged at any one of: the device to be controlled itself, the environment to which the device belongs, and a client terminal; and/or the voice acquisition module comprises a microphone. And/or, the set identity range comprises a set voiceprint range; the control unit determining whether the identity of the user issuing the voice instruction belongs to the set identity range then comprises: recognizing the voiceprint information contained in the voice instruction; determining whether the voiceprint information is within the set voiceprint range; if it is, determining that the user identity belongs to the set identity range; or, if it is not, determining that the user identity does not belong to the set identity range.
Optionally, the control unit invoking the habitual voice system matched with the user identity to perform semantic parsing on the voice instruction comprises: according to the set correspondence between identities and voice systems, determining the set voice system corresponding to the set identity that is the same as the user identity as the habitual voice system matched with that user; and performing semantic parsing on the voice instruction against the habitual semantic library of that habitual voice system, to obtain the semantic keywords matched with the voice instruction as determined by the habitual voice system. And/or, the control unit invoking the set default voice system to perform semantic parsing on the voice instruction comprises: performing semantic parsing on the voice instruction against the default semantic library of the default voice system, to obtain the semantic keywords matched with the voice instruction as determined by the default voice system.
Optionally, the method further comprises: the control unit is further configured to, when invoking the habitual voice system matched with the user identity fails to parse the voice instruction, determine whether the number of failed parsing attempts is greater than or equal to a first set number and/or whether the duration of the parsing failures is greater than or equal to a first set duration; the control unit is further configured to invoke one of the other set voice systems, other than the habitual voice system, to perform the semantic parsing if the number of failed attempts is greater than or equal to the first set number and/or the failure duration is greater than or equal to the first set duration; or, the control unit is further configured to continue using the habitual voice system to parse the voice instruction if the number of failed attempts is less than the first set number and/or the failure duration is less than the first set duration. Or, the control unit is further configured to, when invoking the set default voice system fails to parse the voice instruction, determine whether the number of failed parsing attempts is greater than or equal to a second set number and/or whether the duration of the parsing failures is greater than or equal to a second set duration; the control unit is further configured to invoke one of the other set voice systems, other than the default voice system, to perform the semantic parsing if the number of failed attempts is greater than or equal to the second set number and/or the failure duration is greater than or equal to the second set duration; or, the control unit is further configured to continue using the default voice system to parse the voice instruction if the number of failed attempts is less than the second set number and/or the failure duration is less than the second set duration.
Optionally, the method further comprises: the control unit is further configured to, when semantic parsing of the voice instruction succeeds, whether by invoking the set default voice system, by invoking one of the other set voice systems other than the default voice system, or by continuing to use the default voice system, determine the voice system that successfully parsed the voice instruction of a user identity not belonging to the set identity range as the habitual voice system of that user identity, and store the user identity into the set identity range.
Optionally, the voice instruction comprises any of: a Mandarin voice instruction, a foreign-language voice instruction, and a dialect voice instruction corresponding to the native region of an intended user of the device to be controlled. And/or, the set voice systems comprise: a Mandarin voice system, a foreign-language voice system, and a dialect voice system corresponding to the native region of an intended user of the device; the default voice system comprises the Mandarin voice system; and the habitual voice system comprises any one of the Mandarin voice system, the foreign-language voice system, and the dialect voice system corresponding to the native region of the intended user of the device to be controlled.
Optionally, the method further comprises: the control unit is further configured to store the user identity of an intended user of the device to be controlled and to establish the correspondence between that user identity and the habitually used voice system; and/or the control unit is further configured to control the device to be controlled to execute the control instruction corresponding to the voice instruction, according to the semantic keywords obtained by invoking the habitual voice system matched with the user identity to parse the voice instruction, or the semantic keywords obtained by invoking the set default voice system to parse it. The control unit controlling the device to execute the control instruction corresponding to the voice instruction comprises: if the semantic keywords of the voice instruction include translation semantics between Chinese and a foreign language, translating the control instruction corresponding to the voice instruction and then executing the translated instruction.
In accordance with another aspect of the present invention, there is provided an air conditioner including: the voice control device described above.
In accordance with the above method, a further aspect of the present invention provides a storage medium, wherein the storage medium stores a plurality of instructions adapted to be loaded by a processor so as to execute the voice control method described above.
In accordance with the above method, another aspect of the present invention provides an air conditioner, comprising: a processor configured to execute a plurality of instructions; and a memory configured to store the plurality of instructions; wherein the instructions are stored in the memory, and are loaded by the processor to execute the voice control method described above.
According to the scheme of the invention, if the air conditioner's voice system cannot recognize a user instruction during use, it automatically switches to the language system matching the user's native language, so that the instruction is recognized and the success rate of speech recognition is improved.
Further, according to the scheme of the invention, if the air conditioner's voice system cannot recognize a user instruction during use, it automatically switches to the language system matching the user's native language, so that the instruction is recognized and the convenience of use is improved.
Further, according to the scheme of the invention, if the air conditioner's voice system cannot recognize a user instruction during use, it automatically switches to the language system matching the user's native language, so that the instruction is recognized and the user experience is improved.
Furthermore, according to the scheme of the invention, when the voice system cannot recognize a user instruction, recognition is attempted several times before the voice system is automatically switched, so that a single misrecognition does not disrupt the user's use.
Furthermore, according to the scheme of the invention, the system automatically switches to the language system matching the user's native language, so that the user instruction is recognized; this improves the success rate of speech recognition and further improves the convenience and humanized experience for the user.
Therefore, according to the scheme of the invention, the user identity is recognized from the user instruction, and the habitually used voice system is then invoked according to that identity to recognize the instruction. This solves the prior-art problem that, when the user's speech is parsed according to a single pre-determined language system (a dialect system or the Mandarin system), the speech cannot be recognized if the language used does not match the selected system and the recognition success rate is low. The defects of the prior art, namely a low recognition success rate, a narrow application range and poor user experience, are thereby overcome, and the beneficial effects of a high recognition success rate, a wide application range and good user experience are achieved.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
FIG. 1 is a flow chart illustrating a voice control method according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating an embodiment of determining whether the identity of the user issuing the voice command belongs to a set identity range according to the method of the present invention;
FIG. 3 is a flowchart illustrating an embodiment of semantically parsing the voice instruction by invoking the habitual voice system matched with the user identity in the method of the present invention;
FIG. 4 is a flowchart illustrating an embodiment of continued processing when parsing the voice instruction by invoking the habitual voice system fails in the method of the present invention;
FIG. 5 is a flowchart illustrating an embodiment of continuing processing in the case of a failed parsing of the voice command by invoking a default voice system in the method of the present invention;
FIG. 6 is a schematic structural diagram of a voice control apparatus according to an embodiment of the present invention;
FIG. 7 is a flowchart illustrating an automatic language switching system according to an embodiment of the air conditioner of the present invention.
The reference numbers used in the embodiments of the present invention, in combination with the accompanying drawings, are as follows:
102 - acquiring unit; 104 - control unit.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the specific embodiments of the present invention and the accompanying drawings. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
According to an embodiment of the present invention, a voice control method is provided, as shown in FIG. 1, which is a schematic flowchart of an embodiment of the method of the present invention. The voice control method may include steps S110 to S140.
At step S110, a voice instruction usable for controlling the device to be controlled is acquired. For example: a voice instruction usable for controlling the device to be controlled is acquired in the environment to which the device belongs.
The voice instruction may include any of: a Mandarin voice instruction, a foreign-language voice instruction, and a dialect voice instruction corresponding to the native region of an intended user of the device to be controlled.
Supporting voice instructions in these various forms improves the universality and convenience of the voice service for the user.
Optionally, acquiring the voice instruction for controlling the device to be controlled in step S110 may include: acquiring, via a voice acquisition module, a voice instruction usable for controlling the device to be controlled.
The voice acquisition module is arranged at any one of: the device to be controlled itself, the environment to which the device belongs, and a client terminal; and/or the voice acquisition module may include a microphone.
Allowing the voice instruction to be issued in these various ways improves the convenience and flexibility with which the user can control the device to be controlled by voice.
At step S120, it is determined whether the identity of the user issuing the voice instruction belongs to a set identity range.
Optionally, the set identity range may include a set voiceprint range.
With reference to FIG. 2, a schematic flowchart of an embodiment of determining whether the identity of the user issuing the voice instruction belongs to the set identity range, the specific process of step S120 is further described; it may include steps S210 to S240.
Step S210, the voiceprint information contained in the voice instruction is recognized.
For example: during use, the microphone of the voice air conditioner picks up the user's voice instruction, and the voiceprint of the speech is recognized first.
Step S220, it is determined whether the voiceprint information is within the set voiceprint range.
Step S230, if the voiceprint information is within the set voiceprint range, it is determined that the user identity belongs to the set identity range.
For example: if the voiceprint is found in the user-identity storage module, the habitual language module corresponding to that voiceprint is preferentially invoked to parse and match the user's instruction.
Or, step S240, if the voiceprint information is not within the set voiceprint range, it is determined that the user identity does not belong to the set identity range.
For example: if the voiceprint identity cannot be recognized, the voice module corresponding to the default language system is invoked to parse and match the voice instruction.
Thus, determining the user identity by recognizing the voiceprint information contained in the voice instruction makes identity determination convenient and reliable.
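As a hedged illustration of steps S210 to S240 only: the similarity measure, the threshold and the enrolled data below are invented for the example, and a real voiceprint system would use a trained speaker-embedding model rather than raw vectors:

```python
# Hypothetical sketch: decide whether a voiceprint falls in the enrolled
# (set) voiceprint range by nearest-neighbour cosine similarity.
import math

ENROLLED = {"user_a": [0.9, 0.1, 0.3]}        # illustrative voiceprint vectors
THRESHOLD = 0.95                              # illustrative acceptance bound

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def identify(voiceprint):
    """Return the enrolled user id, or None if outside the set voiceprint range."""
    best = max(ENROLLED, key=lambda uid: cosine(ENROLLED[uid], voiceprint), default=None)
    if best is not None and cosine(ENROLLED[best], voiceprint) >= THRESHOLD:
        return best
    return None
```

A None result corresponds to step S240, where the default voice system is used instead of a habitual one.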
In step S130, if the user identity belongs to the set identity range, the habitual voice system matched with the user identity is invoked to perform semantic parsing on the voice instruction.
Optionally, with reference to FIG. 3, a schematic flowchart of an embodiment of invoking the habitual voice system matched with the user identity to perform semantic parsing on the voice instruction, the specific process of step S130 is further described; it may include steps S310 and S320.
Step S310, according to the corresponding relation between the set identity and the set voice system, the set voice system corresponding to the set identity which is the same as the user identity in the corresponding relation is determined as the inertial voice system matched with the user identity.
Step S320, performing semantic analysis on the voice command according to the customary semantic library of the inertial voice system to obtain semantic keywords which are determined based on the inertial voice system and are matched with the voice command.
Therefore, the voice command of the user is semantically analyzed by calling the inertial voice system matched with the user identity according to the user identity and directly utilizing the inertial voice system used by the user, semantic keywords in the voice command can be quickly and accurately determined, and then the control command corresponding to the voice command is executed according to the semantic keywords, so that the reliability is high, and the user experience is good.
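A minimal sketch of steps S310 and S320, with the habitual semantic library reduced to a toy keyword table (all names and entries are illustrative assumptions, not the patent's actual semantic libraries):

```python
# One toy "semantic library" per set voice system: phrase -> semantic keyword.
SEMANTIC_LIBRARIES = {
    "mandarin":  {"制冷": "COOL", "制热": "HEAT"},
    "cantonese": {"冻啲": "COOL", "暖啲": "HEAT"},
}

def parse_instruction(text, system):
    """Step S320: match the instruction text against the semantic library of
    the voice system chosen in step S310; return the matched keywords."""
    library = SEMANTIC_LIBRARIES.get(system, {})
    return [keyword for phrase, keyword in library.items() if phrase in text]
```

An instruction containing "制冷" parsed with the "mandarin" system yields the keyword `COOL`; an empty result models a parse failure.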
Alternatively, in step S140, if the user identity does not belong to the set identity range, a set default voice system is invoked to perform semantic parsing on the voice command.
For example: during use, if the air-conditioner voice system cannot recognize the user instruction, the voice system can automatically switch, according to the user's native place, to a language system consistent with the user's native language and recognize the user instruction with it; the voice control process becomes more flexible, user experience is improved, and the accuracy of voice recognition is improved.
Therefore, by directly invoking the habitual voice system matched with the user identity to semantically parse the voice instruction when the identity of the user who issued the instruction belongs to the set identity range, and invoking the default voice system set by the device to be controlled when that identity does not belong to the set identity range, semantic parsing can be performed on the voice instruction according to the user identity, and the success rate and efficiency of semantic recognition are improved.
Optionally, invoking the set default voice system in step S140 to perform semantic parsing on the voice command may include: and performing semantic analysis on the voice instruction according to a default semantic library of the default voice system to obtain a semantic keyword which is determined based on the default voice system and is matched with the voice instruction.
Therefore, when the user identity corresponding to the voice instruction is not in the set identity range, the default voice system set by the equipment to be controlled is utilized to carry out semantic analysis on the voice instruction, the user whose identity is not in the set identity range can conveniently use the voice instruction to control the equipment to be controlled, and the convenience and the reliability of control are better.
The set voice systems may include: a Mandarin voice system, a foreign-language voice system, and a dialect voice system corresponding to the native place of the user to be used of the device to be controlled. The default voice system may include: the Mandarin voice system. The habitual voice system may include: any one of the Mandarin voice system, the foreign-language voice system, and the dialect voice system corresponding to the native place of the user to be used of the device to be controlled.
Therefore, convenience and universality of use of the user can be improved through various voice systems.
In an alternative embodiment, the method may further include: continuing processing in either of the following control scenarios.
The first control scenario: continuing processing in the case that invoking the habitual voice system to semantically parse the voice instruction fails.
With reference to fig. 4, which shows a flowchart of an embodiment of continuing processing when invoking the habitual voice system to semantically parse the voice instruction fails in the method of the present invention, the specific process of continuing processing in this case is further described; it may include steps S410 to S430.
Step S410, after the habitual voice system matched with the user identity is invoked to semantically parse the voice instruction, in the case that the parsing fails, it is determined whether the first parse count (the number of failed parsing attempts) is greater than or equal to a first set count, and/or whether the first parse duration (the time spent on failed parsing) is greater than or equal to a first set duration.
Step S420, if the first parse count is greater than or equal to the first set count, and/or the first parse duration is greater than or equal to the first set duration, another voice system among the set voice systems other than the habitual voice system is invoked to perform the semantic parsing.
Or, in step S430, if the first parse count is less than the first set count, and/or the first parse duration is less than the first set duration, the habitual voice system continues to be used to semantically parse the voice instruction.
For example: the habitual language module corresponding to the voiceprint is invoked to parse and compare the user instruction, and if the voice instruction cannot be parsed into a sentence with a clear meaning after three attempts, the second language module is automatically invoked to recognize the voice.
Therefore, in the process of semantically parsing the voice instruction with the habitual voice system, if the parsing fails, whether to switch to another voice system or continue with the habitual voice system is controlled according to the number and duration of the failed attempts; this helps retry the parsing, or parse with another voice system, as far as possible when voice parsing fails, and can improve reliability for the user.
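The retry-then-fallback control of steps S410 to S430 (and its mirror, steps S510 to S530) can be sketched as follows; the three-attempt threshold follows the example above, while the parser callable and the ordering of the system list are assumptions:

```python
def parse_with_fallback(instruction, systems, parse, set_count=3):
    """`systems` is ordered with the habitual (or default) voice system first.
    Retry each system up to `set_count` failed attempts (step S430), then
    switch to the next set voice system (step S420)."""
    for system in systems:
        for _ in range(set_count):
            keywords = parse(instruction, system)
            if keywords:                 # parse produced a clear result
                return system, keywords
        # failure count reached the set count -> fall through to next system
    return None, None                    # every set voice system failed
```

With a parser that only succeeds for the second system, the first system is tried three times before the switch occurs.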
The second control scenario: continuing processing in the case that invoking the default voice system to semantically parse the voice instruction fails.
With reference to fig. 5, which shows a flowchart of an embodiment of continuing processing when invoking the default voice system to semantically parse the voice instruction fails in the method of the present invention, the specific process of continuing processing in this case is further described; it may include steps S510 to S530.
Step S510, after the set default voice system is invoked to semantically parse the voice instruction, in the case that the parsing fails, it is determined whether the second parse count (the number of failed parsing attempts) is greater than or equal to a second set count, and/or whether the second parse duration is greater than or equal to a second set duration.
Step S520, if the second parse count is greater than or equal to the second set count, and/or the second parse duration is greater than or equal to the second set duration, another voice system among the set voice systems other than the default voice system is invoked to perform the semantic parsing.
Or, in step S530, if the second parse count is less than the second set count, and/or the second parse duration is less than the second set duration, the default voice system continues to be used to semantically parse the voice instruction.
For example: during use, if the air-conditioner voice system cannot recognize the user instruction, the voice system recognizes the voice instruction repeatedly, but no more than three times; if all three recognition attempts fail, the voice system automatically switches, according to the user's native place, to a language system consistent with the user's native language and recognizes the user instruction with it.
For example: the voice module corresponding to the default language system is invoked to parse and compare the voice instruction. During parsing and recognition, if the voice instruction cannot be parsed into a sentence with a clear meaning after three attempts, the voice module corresponding to the second language system is automatically invoked to parse and compare the user's voice instruction.
Therefore, in the process of semantically parsing the voice instruction with the default voice system, if the parsing fails, whether to switch to another voice system or continue with the default voice system is controlled according to the number and duration of the failed attempts; this helps retry the parsing, or parse with another voice system, as far as possible when voice parsing fails, and can improve reliability for the user.
In an alternative embodiment, the method may further include: in the case that invoking the set default voice system to semantically parse the voice instruction succeeds, or invoking another voice system among the set voice systems other than the default voice system succeeds, or continuing to use the default voice system to semantically parse the voice instruction succeeds, determining the current voice system that successfully parsed the voice instruction of the user identity not belonging to the set identity range as the habitual voice system of that user identity, and storing the user identity into the set identity range.
For example: the voice module corresponding to the second language system is automatically invoked to parse and compare the user's voice instruction. After a successful parse, the number of times each language module invoked for the user's voice has parsed successfully is recorded; the language module invoked most often is taken as the user's habitual language system and recorded in the user identity storage module corresponding to the user's voiceprint, and that language module is invoked preferentially the next time the voice is recognized.
Therefore, after the voice instruction of a user whose identity does not belong to the set identity range is successfully parsed, the user identity is stored, and a habitual voice system is matched to that identity according to the current voice system that parsed successfully; the next time the user uses the voice service, that habitual voice system can be used for semantic parsing quickly and accurately, improving the efficiency and convenience of the voice service.
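The learning step described above, counting successful parses per language module and promoting the most-used one to habitual, might look like the following sketch (the storage layout and all names are hypothetical):

```python
from collections import Counter

# Per-voiceprint success counts, standing in for the user identity
# storage module; purely illustrative.
success_counts = {}

def record_success(voiceprint, system):
    """Record that `system` successfully parsed this user's instruction."""
    success_counts.setdefault(voiceprint, Counter())[system] += 1

def habitual_system(voiceprint, default="mandarin"):
    """Return the most frequently successful system for this voiceprint,
    to be invoked preferentially next time; fall back to the default."""
    counts = success_counts.get(voiceprint)
    return counts.most_common(1)[0][0] if counts else default
```

After two successes for the dialect module and one for Mandarin, the dialect module becomes the user's habitual system.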
In an alternative embodiment, at least one of the following processing scenarios may also be included.
The first processing scenario: the process of pre-storing the user identity, the habitual voice system, and the correspondence between them may specifically be as follows:
before determining whether the user identity that issued the voice instruction belongs to the set identity range, the user identity of the user to be used of the device to be controlled is stored, and a correspondence between the user identity and the habitual voice system is established.
For example: before use, the user needs to select and add the local dialect he or she habitually uses; the default language system of the voice air conditioner is Mandarin, the second language system is the dialect selected by the user, and the user can adjust the order of the two according to his or her own habits.
Therefore, by pre-storing the user identity, the habitual voice system, and the correspondence between them, the habitual voice system can be invoked directly according to the user identity when the user uses the voice service, improving the efficiency and accuracy of semantic parsing.
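The setup step described above, where the user selects a dialect as the second language system and may reorder it ahead of Mandarin, can be sketched as follows (the registry structure and parameter names are assumptions):

```python
def register_user(registry, voiceprint, dialect, prefer_dialect=False):
    """Establish the correspondence between a user identity (voiceprint) and
    an ordered list of voice systems: Mandarin is the default system, the
    selected dialect is the second, and the user may swap the order."""
    order = [dialect, "mandarin"] if prefer_dialect else ["mandarin", dialect]
    registry[voiceprint] = order
    return order
```

The first entry of the stored list is the system tried first during recognition.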
The second processing scenario: the process of executing the control instruction corresponding to the voice instruction according to the semantic keywords obtained by semantic parsing may specifically be as follows:
after the habitual voice system matched with the user identity, or the set default voice system, is invoked to semantically parse the voice instruction, the device to be controlled is controlled to execute the control instruction corresponding to the voice instruction according to the semantic keywords obtained by that parsing.
Controlling the device to be controlled to execute the control instruction corresponding to the voice instruction may include: if the semantic keywords of the voice instruction involve translation between Chinese and a foreign language, translating the control instruction corresponding to the voice instruction and then executing the translated control instruction.
For example: this function supports automatically translating Chinese into English, for the case where a user needs to speak a singer's English name, a song title, and the like during use, but English pronunciation is difficult for some users. For example: a user wants to listen to the English song "my heart will go on" but does not want to speak English; the user can directly issue the instruction using the song's Chinese title, "I want to listen to the English song 我心永恒", and the voice air conditioner automatically translates "我心永恒" into "my heart will go on" and executes the instruction.
Therefore, by executing the control instruction corresponding to the voice instruction according to the semantic keywords obtained by semantic parsing, voice control of the device to be controlled by the user is realized, and translation can be performed for different language requirements; this greatly facilitates the user, with a high degree of intelligence and good user-friendliness.
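A minimal sketch of the translate-then-execute step: a Chinese phrase standing in for a foreign-language title is substituted before the control instruction is executed. The title table here is a stand-in assumption for a real translation service:

```python
# Hypothetical Chinese-title -> English-title table; a real device would
# call a translation service instead of a fixed table.
TITLE_TRANSLATIONS = {
    "我心永恒": "my heart will go on",
}

def resolve_instruction(text):
    """If the instruction contains a Chinese phrase with a known English
    equivalent, substitute it before executing the control instruction."""
    for zh, en in TITLE_TRANSLATIONS.items():
        if zh in text:
            text = text.replace(zh, en)
    return text
```

An instruction that carries the Chinese title is rewritten to carry the English one; instructions without a match pass through unchanged.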
A large number of tests show that, with the technical solution of this embodiment, if during use the air-conditioner voice system cannot recognize the user instruction, it automatically switches, according to the user's native place, to a language system consistent with the user's native language and recognizes the user instruction with it, improving the success rate of voice recognition.
According to the embodiment of the invention, a voice control device corresponding to the voice control method is also provided. Referring to fig. 6, a schematic diagram of an embodiment of the apparatus of the present invention is shown. The voice control apparatus may include: an acquisition unit 102 and a control unit 104.
In an alternative example, the obtaining unit 102 may be configured to obtain a voice instruction that may be used to control a device to be controlled. For example: and acquiring a voice instruction which can be used for controlling the equipment to be controlled in the environment to which the equipment to be controlled belongs. The specific functions and processes of the acquiring unit 102 are referred to in step S110.
The voice instruction may include: a Mandarin voice instruction, a foreign-language voice instruction, and a dialect voice instruction corresponding to the native place of the user to be used of the device to be controlled.
Therefore, through the voice instructions in various forms, the universality and the convenience of using the voice service by the user are improved.
Optionally, the obtaining, by the obtaining unit 102, of a voice instruction that may be used to control the device to be controlled may include: the obtaining unit 102 may be further specifically configured to obtain a voice instruction, collected by a voice collecting module, that may be used to control the device to be controlled.
The voice acquisition module is arranged on any one of a side of the equipment to be controlled, the environment to which the equipment to be controlled belongs and a client side; and/or, the voice acquisition module may include: a microphone.
Therefore, the convenience and flexibility of the user for controlling the equipment to be controlled by using the voice instruction are improved by the voice instruction sending mode in various forms.
In an alternative example, the control unit 104 may be configured to determine whether the identity of the user issuing the voice command belongs to a set identity range. The specific function and processing of the control unit 104 are referred to in step S120.
Optionally, the set identity range may include: a set voiceprint range.
The determining, by the control unit 104, whether the identity of the user who issues the voice instruction belongs to a set identity range may include:
the control unit 104 may be further specifically configured to recognize voiceprint information included in the voice command. The specific functions and processes of the control unit 104 are also referred to in step S210.
For example: in the using process, the voice air-conditioning microphone collects a voice instruction of a user, and firstly, the voiceprint of the voice is recognized.
The control unit 104 may be further configured to determine whether the voiceprint information is within the set voiceprint range. The specific functions and processes of the control unit 104 are also referred to in step S220.
The control unit 104 may be further configured to determine that the user identity belongs to the set identity range if the voiceprint information is within the set voiceprint range. The specific function and processing of the control unit 104 are also referred to in step S230.
For example: if the voiceprint is found in the user identity storage module, the habitual language module corresponding to the voiceprint is preferentially invoked to parse and compare the user instruction.
Or, the control unit 104 may be further specifically configured to determine that the user identity does not belong to the set identity range if the voiceprint information is not within the set voiceprint range. The specific function and processing of the control unit 104 are also referred to in step S240.
For example: if the voiceprint identity cannot be recognized, the voice module corresponding to the default language system is invoked to parse and compare the voice instruction.
Therefore, the user identity is determined by recognizing the voiceprint information carried in the voice instruction, so that the user identity is determined conveniently and reliably.
In an optional example, the control unit 104 may be further configured to invoke a habitual voice system matched with the user identity to perform semantic parsing on the voice instruction if the user identity belongs to the set identity range. The specific function and processing of the control unit 104 are also referred to in step S130.
Optionally, the invoking, by the control unit 104, of a habitual voice system matched with the user identity to perform semantic parsing on the voice instruction may include:
the control unit 104 may be further configured to determine, according to a correspondence between set identities and set voice systems, the set voice system corresponding to the set identity that is the same as the user identity in the correspondence as the habitual voice system matched with the user identity. The specific functions and processes of the control unit 104 are also referred to in step S310.
The control unit 104 may be further configured to perform semantic parsing on the voice instruction according to the habitual semantic library of the habitual voice system, so as to obtain semantic keywords that are determined based on the habitual voice system and matched with the voice instruction. The specific functions and processes of the control unit 104 are also referred to in step S320.
Therefore, by invoking, according to the user identity, the habitual voice system matched with that identity and directly using the voice system the user habitually uses to semantically parse the voice instruction, the semantic keywords in the voice instruction can be determined quickly and accurately, and the control instruction corresponding to the voice instruction can then be executed according to those keywords; the reliability is high and the user experience is good.
Or, in an optional example, the control unit 104 may be further configured to invoke a set default voice system to perform semantic parsing on the voice command if the user identity does not belong to the set identity range. The specific function and processing of the control unit 104 are also referred to in step S140.
For example: during use, if the air-conditioner voice system cannot recognize the user instruction, the voice system can automatically switch, according to the user's native place, to a language system consistent with the user's native language and recognize the user instruction with it; the voice control process becomes more flexible, user experience is improved, and the accuracy of voice recognition is improved.
Therefore, by directly invoking the habitual voice system matched with the user identity to semantically parse the voice instruction when the identity of the user who issued the instruction belongs to the set identity range, and invoking the default voice system set by the device to be controlled when that identity does not belong to the set identity range, semantic parsing can be performed on the voice instruction according to the user identity, and the success rate and efficiency of semantic recognition are improved.
Optionally, the invoking, by the control unit 104, a set default speech system to perform semantic parsing on the speech instruction may include: the control unit 104 may be further configured to perform semantic parsing on the voice instruction according to the default semantic library of the default voice system, so as to obtain a semantic keyword matched with the voice instruction and determined based on the default voice system.
Therefore, when the user identity corresponding to the voice instruction is not in the set identity range, the default voice system set by the equipment to be controlled is utilized to carry out semantic analysis on the voice instruction, the user whose identity is not in the set identity range can conveniently use the voice instruction to control the equipment to be controlled, and the convenience and the reliability of control are better.
The set voice systems may include: a Mandarin voice system, a foreign-language voice system, and a dialect voice system corresponding to the native place of the user to be used of the device to be controlled. The default voice system may include: the Mandarin voice system. The habitual voice system may include: any one of the Mandarin voice system, the foreign-language voice system, and the dialect voice system corresponding to the native place of the user to be used of the device to be controlled.
Therefore, convenience and universality of use of the user can be improved through various voice systems.
In an alternative embodiment, the apparatus may further include: the control unit 104 continuing processing in either of the following control scenarios.
The first control scenario: the process of continuing processing in the case that invoking the habitual voice system to semantically parse the voice instruction fails is as follows:
the control unit 104 may be further configured to, after the habitual voice system matched with the user identity is invoked to semantically parse the voice instruction, determine, in the case that the parsing fails, whether the first parse count (the number of failed parsing attempts) is greater than or equal to a first set count, and/or whether the first parse duration is greater than or equal to a first set duration. The specific functions and processes of the control unit 104 are also referred to in step S410.
The control unit 104 may be further configured to invoke another voice system among the set voice systems other than the habitual voice system to perform the semantic parsing if the first parse count is greater than or equal to the first set count, and/or the first parse duration is greater than or equal to the first set duration. The specific function and processing of the control unit 104 are also referred to in step S420.
Or, the control unit 104 may be further configured to continue to use the habitual voice system to semantically parse the voice instruction if the first parse count is less than the first set count, and/or the first parse duration is less than the first set duration. The specific functions and processes of the control unit 104 are also referred to in step S430.
For example: the habitual language module corresponding to the voiceprint is invoked to parse and compare the user instruction, and if the voice instruction cannot be parsed into a sentence with a clear meaning after three attempts, the second language module is automatically invoked to recognize the voice.
Therefore, in the process of semantically parsing the voice instruction with the habitual voice system, if the parsing fails, whether to switch to another voice system or continue with the habitual voice system is controlled according to the number and duration of the failed attempts; this helps retry the parsing, or parse with another voice system, as far as possible when voice parsing fails, and can improve reliability for the user.
The second control scenario: the process of continuing processing in the case that invoking the default voice system to semantically parse the voice instruction fails is specifically as follows:
the control unit 104 may be further configured to, after the set default voice system is invoked to semantically parse the voice instruction, determine, in the case that the parsing fails, whether the second parse count (the number of failed parsing attempts) is greater than or equal to a second set count, and/or whether the second parse duration is greater than or equal to a second set duration. The specific functions and processes of the control unit 104 are also referred to in step S510.
The control unit 104 may be further configured to invoke another voice system among the set voice systems other than the default voice system to perform the semantic parsing if the second parse count is greater than or equal to the second set count, and/or the second parse duration is greater than or equal to the second set duration. The specific functions and processes of the control unit 104 are also referred to in step S520.
Or, the control unit 104 may be further configured to continue to use the default voice system to semantically parse the voice instruction if the second parse count is less than the second set count, and/or the second parse duration is less than the second set duration. The specific functions and processes of the control unit 104 are also referred to in step S530.
For example: during use, if the air-conditioner voice system cannot recognize the user instruction, the voice system recognizes the voice instruction repeatedly, but no more than three times; if all three recognition attempts fail, the voice system automatically switches, according to the user's native place, to a language system consistent with the user's native language and recognizes the user instruction with it.
For example: the voice module corresponding to the default language system is invoked to parse and compare the voice instruction. During parsing and recognition, if the voice instruction cannot be parsed into a sentence with a clear meaning after three attempts, the voice module corresponding to the second language system is automatically invoked to parse and compare the user's voice instruction.
Therefore, in the process of semantically parsing the voice instruction with the default voice system, if the parsing fails, whether to switch to another voice system or continue with the default voice system is controlled according to the number and duration of the failed attempts; this helps retry the parsing, or parse with another voice system, as far as possible when voice parsing fails, and can improve reliability for the user.
In an alternative embodiment, the control unit 104 may be further configured to, in the case that invoking the set default voice system to semantically parse the voice instruction succeeds, or invoking another voice system among the set voice systems other than the default voice system succeeds, or continuing to use the default voice system to semantically parse the voice instruction succeeds, determine the current voice system that successfully parsed the voice instruction of the user identity not belonging to the set identity range as the habitual voice system of that user identity, and store the user identity into the set identity range.
For example: the voice module corresponding to the second language system is automatically invoked to parse and compare the user's voice instruction. After a successful parse, the number of times each language module invoked for the user's voice has parsed successfully is recorded; the language module invoked most often is taken as the user's habitual language system and recorded in the user identity storage module corresponding to the user's voiceprint, and that language module is invoked preferentially the next time the voice is recognized.
Therefore, after the voice instruction of a user whose identity does not belong to the set identity range is successfully parsed, the user identity is stored, and a habitual voice system is matched to that identity according to the current voice system that parsed successfully; the next time the user uses the voice service, that habitual voice system can be used for semantic parsing quickly and accurately, improving the efficiency and convenience of the voice service.
In an alternative embodiment, at least one of the following processing scenarios may also be included.
The first processing scenario: the user identity, the habitual voice system, and their correspondence are stored in advance, as follows:
the control unit 104 may be further configured to store the user identity of a user to be used of the device to be controlled, and to establish the correspondence between the user identity and the habitual voice system, before determining whether the user identity that issued the voice instruction belongs to the set identity range.
For example: before using the system, the user selects and adds the local dialect that he or she habitually uses. The default language system of the voice air conditioner is Mandarin and the second language system is the dialect selected by the user, and the user can adjust the order of the two according to his or her own habits.
Therefore, by storing the user identity, the habitual voice system, and their correspondence in advance, the habitual voice system can be called directly according to the user identity when the user speaks, which improves the efficiency and accuracy of semantic analysis.
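Such pre-registration can be sketched as a simple per-voiceprint preference list; the function name and data layout below are hypothetical:

```python
def register_user(profiles, voiceprint, dialect, dialect_first=False):
    """Store a user's language preference order: Mandarin is the default
    system, the chosen dialect is the second system, and the user may
    swap the order to match his or her habit."""
    order = ["mandarin", dialect]
    if dialect_first:
        order.reverse()
    profiles[voiceprint] = order
    return order

profiles = {}
register_user(profiles, "vp-grandma", "cantonese", dialect_first=True)
register_user(profiles, "vp-child", "cantonese")
print(profiles["vp-grandma"])  # ['cantonese', 'mandarin']
print(profiles["vp-child"])    # ['mandarin', 'cantonese']
```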
The second processing scenario: the control instruction corresponding to the voice instruction is executed according to the semantic keywords obtained by semantic analysis, as follows:
the control unit 104 may be further configured to, after calling the habitual voice system matched with the user identity, or the set default voice system, to perform semantic analysis on the voice instruction, control the device to be controlled to execute the control instruction corresponding to the voice instruction according to the semantic keywords obtained by that analysis.
The control unit 104 controlling the device to be controlled to execute the control instruction corresponding to the voice instruction may include: if the semantic keywords of the voice instruction include translation semantics between Chinese and a foreign language, translating the control instruction corresponding to the voice instruction and then executing the translated control instruction.
For example: this function supports automatic translation of Chinese into English, for situations in which the user needs to say an English singer name, song title, or the like, but finds English pronunciation difficult. For example: the user wants to listen to the English song "my heart will go on" but does not want to speak English; the user can directly issue the instruction using the song's Chinese title, and the voice air conditioner automatically translates that title into "my heart will go on" and executes the instruction.
Therefore, executing the user's voice instruction according to the semantic keywords obtained by semantic analysis realizes voice control of the device to be controlled, and translation can be performed for different language requirements, which greatly facilitates use, with a high degree of intelligence and good user friendliness.
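The translation step can be sketched as a lookup against the music translation module's table; the table entry and keyword layout below are illustrative assumptions:

```python
# Hypothetical mapping from Chinese song titles to English titles, of the kind
# the music translation module would store for popular English songs.
SONG_TITLES = {"我心永恒": "my heart will go on"}

def translate_keywords(keywords):
    """If the parsed keywords request an English song by its Chinese title,
    substitute the English title before the play command is executed."""
    title = keywords.get("title")
    if keywords.get("language") == "english" and title in SONG_TITLES:
        keywords = dict(keywords, title=SONG_TITLES[title])
    return keywords

cmd = translate_keywords({"action": "play", "language": "english", "title": "我心永恒"})
print(cmd["title"])  # my heart will go on
```

Keywords that do not request an English song pass through unchanged.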
Since the processes and functions implemented by the apparatus of this embodiment substantially correspond to the embodiments, principles and examples of the method shown in fig. 1 to 5, the description of this embodiment is not detailed, and reference may be made to the related descriptions in the foregoing embodiments, which are not repeated herein.
Extensive testing shows that with the technical scheme of the invention, if the air-conditioner voice system cannot recognize a user instruction during use, it automatically switches to the language system consistent with the user's native language and recognizes the instruction, improving convenience for the user.
According to the embodiment of the invention, an air conditioner corresponding to the voice control device is also provided. The air conditioner may include: the voice control device described above.
In an optional implementation mode, aiming at the problem that the success rate of voice recognition is low when the language used by the user does not match the selected language system, the invention provides a voice air conditioner capable of automatically switching its language system. The voice control process thereby becomes more flexible, user experience is improved, and the accuracy of voice recognition is increased.
The unrecognizable condition may include: the voice system cannot understand the user's words, i.e., the user semantics parsed by the voice recognition technology do not conform to language logic, so the device cannot obtain the user's intention. One cause of the failure may be a language mismatch, in which case the system switches to another language system.
For example: when the voice system cannot recognize a user instruction, recognition may be attempted several times, and the voice system is switched automatically only if the instruction still cannot be recognized. For instance: during use, if the air-conditioner voice system cannot recognize the user instruction, it may retry recognition of the voice instruction, with no more than three attempts; if all three attempts fail, the voice system automatically switches to the language system consistent with the user's native language and recognizes the user instruction.
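The retry-then-switch policy can be sketched as follows; `parse` stands in for the real semantic-analysis call, and the three-attempt limit mirrors the example above:

```python
def recognize(instruction, systems, parse, max_attempts=3):
    """Try each language system in order, up to max_attempts times each,
    and return the first system that yields a clearly parsed result."""
    for system in systems:
        for _ in range(max_attempts):
            result = parse(system, instruction)
            if result is not None:  # parsed into a sentence with clear meaning
                return system, result
    return None, None  # no configured system could parse the instruction

# Toy parser: only the dialect system understands this instruction.
parse = lambda system, text: "set_temp_26" if system == "dialect" else None
print(recognize("turn it up to 26 degrees", ["mandarin", "dialect"], parse))
# ('dialect', 'set_temp_26')
```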
For example: the user's native place or household registration can be obtained from the user's identity-card information, from the user's active input, from analysis of the user's dialect, and so on.
In an optional example, the voice air conditioner capable of automatically switching its language system may: switch automatically among the Mandarin, local dialect, and English language systems; bind a user identity to the user's habitual language; and support automatic translation of Chinese song titles into English song titles.
In an alternative embodiment, reference may be made to the example shown in fig. 7 to illustrate a specific implementation process of the scheme of the present invention.
In an alternative embodiment, the system may include: the device comprises a voice storage module, a user identity storage module and a music translation module.
Optionally, the voice storage module may be used to store a Mandarin voice module and dialect voice modules.
Optionally, the user identity storage module may be used to bind and store user voiceprints and their habitual languages.
Optionally, the music translation module may be used to store popular English song titles and their corresponding Chinese titles, as well as the English names of European and American singers and their corresponding Chinese names.
In an alternative specific example, referring to the example shown in fig. 7, a specific implementation process of the scheme of the present invention may include:
Step 1: before using the system, the user selects and adds the local dialect that he or she habitually uses. The default language system of the voice air conditioner is Mandarin and the second language system is the dialect selected by the user, and the user can adjust the order of the two according to his or her own habits.
Step 2: during use, the microphone of the voice air conditioner collects the user's voice instruction and first identifies the voiceprint of the voice:
If the voiceprint is found in the user identity storage module, the habitual language module corresponding to the voiceprint is preferentially called to parse and compare the user instruction; if three parses fail to resolve the voice instruction into a sentence with clear meaning, the second language module is automatically called to recognize the voice.
If the voiceprint identity cannot be recognized, the voice module corresponding to the default language system is called to parse and compare the voice instruction. During parsing and recognition, if three parses fail to resolve the voice instruction into a sentence with clear meaning, the voice module corresponding to the second language system is automatically called to parse and compare the user's voice instruction. After a parse succeeds, the number of times each language module has successfully parsed the user's voice is recorded; the language module called successfully most often is taken as the user's habitual language system, recorded in the user identity storage module against the user's voiceprint, and called preferentially the next time the voice is recognized.
This solves the problem, in multi-user households, of the elderly being used to speaking dialect while the children speak only Mandarin. When several users in a home control the air conditioner, voice commands are often not issued by a single person, so the language type keeps switching. For example, a mischievous child sets the air conditioner to 16 degrees, and the next second a grandparent, worried that the child will catch cold, sets it back to 26 degrees; the child issues the instruction in Mandarin while the grandparent issues it in dialect.
In addition, the music playing function of the voice air conditioner supports automatic translation of Chinese into English, for situations in which the user needs to say an English singer name, song title, or the like, but finds English pronunciation difficult. For example: the user wants to listen to the English song "my heart will go on" but does not want to speak English; the user can directly issue the instruction using the song's Chinese title, and the voice air conditioner automatically translates that title into "my heart will go on" and executes the instruction.
Since the processing and functions of the air conditioner of this embodiment are basically corresponding to the embodiments, principles and examples of the apparatus shown in fig. 6, the description of this embodiment is not given in detail, and reference may be made to the related descriptions in the embodiments, which are not described herein again.
Extensive testing shows that with the technical scheme of the invention, if the air-conditioner voice system cannot recognize a user instruction during use, it automatically switches to the language system consistent with the user's native language and recognizes the instruction, improving the user experience.
According to an embodiment of the present invention, there is also provided a storage medium corresponding to the voice control method. The storage medium has a plurality of instructions stored therein; the instructions are loaded by a processor to execute the voice control method described above.
Since the processing and functions implemented by the storage medium of this embodiment substantially correspond to the embodiments, principles, and examples of the methods shown in fig. 1 to fig. 5, details are not described in the description of this embodiment, and reference may be made to the related descriptions in the foregoing embodiments, which are not described herein again.
Extensive testing shows that with the technical scheme of the invention, recognition is attempted several times when the voice system cannot recognize the user instruction, and the voice system is switched automatically if the instruction still cannot be recognized, so that misrecognition does not interfere with the user's use.
According to an embodiment of the invention, an air conditioner corresponding to the voice control method is also provided. The air conditioner may include: a processor for executing a plurality of instructions; and a memory for storing the plurality of instructions; wherein the instructions are stored in the memory, and are loaded by the processor to execute the voice control method described above.
Since the processing and functions of the air conditioner of this embodiment are basically corresponding to the embodiments, principles and examples of the methods shown in fig. 1 to 5, the description of this embodiment is not detailed, and reference may be made to the related descriptions in the foregoing embodiments, which are not described herein again.
Extensive testing and verification show that with the technical scheme of the invention, the language system consistent with the language of the user's native place is switched to automatically according to the user's native place, so that the user instruction is recognized, the success rate of voice recognition is improved, and the convenience and user friendliness of the experience are further improved.
In summary, it is readily understood by those skilled in the art that the advantageous modes described above can be freely combined and superimposed without conflict.
The above description is only an example of the present invention, and is not intended to limit the present invention, and it is obvious to those skilled in the art that various modifications and variations can be made in the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the claims of the present invention.

Claims (19)

1. A voice control method, comprising:
acquiring a voice instruction for controlling equipment to be controlled;
determining whether the identity of the user sending the voice instruction belongs to a set identity range;
if the user identity belongs to the set identity range, calling an inertial voice system matched with the user identity to carry out semantic analysis on the voice command; if the user identity does not belong to the set identity range, calling a set default voice system to carry out semantic analysis on the voice instruction; identifying a user identity according to a user instruction, and calling an inertial voice system according to the user identity to identify the user instruction;
further comprising:
under the condition that the semantic analysis of the voice command by calling the inertial voice system matched with the user identity fails, determining whether the first analysis frequency of the analysis failure is greater than or equal to a first set frequency and/or whether the first analysis duration of the analysis failure is greater than or equal to a first set duration;
if the first analysis times are larger than or equal to the first set times and/or the first analysis duration is larger than or equal to the first set duration, calling other voice systems in the set voice system except the inertial voice system to perform semantic analysis; if the first analysis times are less than the first set times and/or the first analysis duration is less than the first set duration, continuing to use the inertial voice system to perform semantic analysis on the voice instruction;
alternatively,
under the condition that the analysis of calling a set default voice system to carry out semantic analysis on the voice command fails, determining whether the second analysis frequency of the analysis failure is greater than or equal to a second set frequency and/or whether the second analysis duration of the analysis failure is greater than or equal to a second set duration;
if the second analysis times are greater than or equal to the second set times and/or the second analysis duration is greater than or equal to the second set duration, calling other voice systems except the default voice system in the set voice system to perform semantic analysis; and if the second analysis times are less than the second set times and/or the second analysis duration is less than the second set duration, continuing to use the default voice system to perform semantic analysis on the voice command.
2. The method of claim 1, wherein,
acquiring a voice instruction for controlling a device to be controlled, comprising:
acquiring a voice instruction which is acquired by a voice acquisition module and used for controlling equipment to be controlled;
the voice acquisition module is arranged on any one of a side of the equipment to be controlled, the environment to which the equipment to be controlled belongs and a client side; and/or, the voice acquisition module comprises: a microphone;
and/or,
the setting of the identity scope comprises: setting a voiceprint range;
wherein, determining whether the identity of the user who sends the voice command belongs to a set identity range comprises:
recognizing voice print information contained in the voice command;
determining whether the voiceprint information is within the set voiceprint range;
if the voiceprint information is in the set voiceprint range, determining that the user identity belongs to the set identity range;
or if the voiceprint information is not in the set voiceprint range, determining that the user identity does not belong to the set identity range.
3. The method according to claim 1 or 2, wherein,
calling an inertial voice system matched with the user identity to carry out semantic analysis on the voice instruction, wherein the semantic analysis comprises the following steps:
according to the corresponding relation between the set identity and the set voice system, determining the set voice system corresponding to the set identity which is the same as the user identity in the corresponding relation as an inertial voice system matched with the user identity;
performing semantic analysis on the voice command according to a customary semantic library of the inertial voice system to obtain semantic keywords which are determined based on the inertial voice system and matched with the voice command;
and/or,
calling a set default voice system to carry out semantic analysis on the voice instruction, wherein the semantic analysis comprises the following steps:
and performing semantic analysis on the voice instruction according to a default semantic library of the default voice system to obtain a semantic keyword which is determined based on the default voice system and is matched with the voice instruction.
4. The method of claim 1, further comprising:
and under the condition that semantic analysis of the voice instruction by calling the set default voice system succeeds, or semantic analysis by calling another voice system in the set voice system other than the default voice system succeeds, or semantic analysis of the voice instruction by continuing to use the default voice system succeeds, determining the current voice system that successfully performed semantic analysis of the voice instruction of the user identity not belonging to the set identity range as the inertial voice system of the user identity, and storing the user identity into the set identity range.
5. The method of claim 1 or 4, wherein,
the voice instruction includes: a Mandarin voice instruction, a foreign language voice instruction, or a dialect voice instruction corresponding to the native place of the user to be used of the device to be controlled;
and/or,
the set voice system includes: a Mandarin voice system, a foreign language voice system, and a dialect voice system corresponding to the native place of the user to be used of the device to be controlled;
the default voice system includes: the Mandarin voice system;
the inertial voice system includes: any one of the Mandarin voice system, the foreign language voice system, and the dialect voice system corresponding to the native place of the user to be used of the device to be controlled.
6. The method of any one of claims 1, 2, and 4, further comprising:
storing the user identity of a user to be used of the equipment to be controlled, and establishing a corresponding relation between the user identity and the voice system for inertial use;
and/or,
controlling equipment to be controlled to execute a control instruction corresponding to the voice instruction according to a semantic keyword obtained by calling an inertial voice system matched with the user identity to perform semantic analysis on the voice instruction or a semantic keyword obtained by calling a set default voice system to perform semantic analysis on the voice instruction;
wherein,
controlling the equipment to be controlled to execute a control instruction corresponding to the voice instruction, comprising:
if the semantic keywords of the voice instruction comprise the translation semantics between the Chinese language and the foreign language, translating the control instruction corresponding to the voice instruction and then executing the translated control instruction.
7. The method of claim 3, further comprising:
storing the user identity of a user to be used of the equipment to be controlled, and establishing a corresponding relation between the user identity and the voice system for inertial use;
and/or,
controlling equipment to be controlled to execute a control instruction corresponding to the voice instruction according to a semantic keyword obtained by calling an inertial voice system matched with the user identity to perform semantic analysis on the voice instruction or a semantic keyword obtained by calling a set default voice system to perform semantic analysis on the voice instruction;
wherein,
controlling the equipment to be controlled to execute a control instruction corresponding to the voice instruction, comprising:
if the semantic keywords of the voice instruction comprise the translation semantics between the Chinese language and the foreign language, translating the control instruction corresponding to the voice instruction and then executing the translated control instruction.
8. The method of claim 5, further comprising:
storing the user identity of a user to be used of the equipment to be controlled, and establishing a corresponding relation between the user identity and the voice system for inertial use;
and/or,
controlling equipment to be controlled to execute a control instruction corresponding to the voice instruction according to a semantic keyword obtained by calling an inertial voice system matched with the user identity to perform semantic analysis on the voice instruction or a semantic keyword obtained by calling a set default voice system to perform semantic analysis on the voice instruction;
wherein,
controlling the equipment to be controlled to execute a control instruction corresponding to the voice instruction, comprising:
if the semantic keywords of the voice instruction comprise the translation semantics between the Chinese language and the foreign language, translating the control instruction corresponding to the voice instruction and then executing the translated control instruction.
9. A voice control apparatus, comprising:
the device comprises an acquisition unit and a control unit, wherein the acquisition unit is used for acquiring a voice instruction for controlling the device to be controlled;
the control unit is used for determining whether the identity of the user sending the voice command belongs to a set identity range or not;
the control unit is further used for calling an inertial voice system matched with the user identity to perform semantic analysis on the voice instruction if the user identity belongs to the set identity range; the control unit is also used for calling a set default voice system to carry out semantic analysis on the voice instruction if the user identity does not belong to the set identity range; identifying a user identity according to a user instruction, and calling an inertial voice system according to the user identity to identify the user instruction;
further comprising:
the control unit is further configured to determine whether a first analysis frequency of the analysis failure is greater than or equal to a first set frequency and/or whether a first analysis duration of the analysis failure is greater than or equal to a first set duration under the condition that the analysis of calling the speech instruction to perform semantic analysis by the speech system matched with the user identity fails;
the control unit is further configured to call other voice systems in the set voice system except the inertial voice system to perform semantic analysis if the first analysis frequency is greater than or equal to the first set frequency and/or the first analysis duration is greater than or equal to the first set duration; the control unit is further configured to continue to use the inertial voice system to perform semantic analysis on the voice instruction if the first analysis frequency is less than the first set frequency and/or the first analysis duration is less than the first set duration;
alternatively,
the control unit is further configured to determine whether a second analysis frequency of the analysis failure is greater than or equal to a second set frequency and/or whether a second analysis duration of the analysis failure is greater than or equal to a second set duration under the condition that the analysis of calling a set default voice system to perform semantic analysis on the voice instruction fails;
the control unit is further configured to call other voice systems except the default voice system in the set voice system to perform semantic analysis if the second analysis frequency is greater than or equal to the second set frequency and/or the second analysis duration is greater than or equal to the second set duration; the control unit is further configured to continue to use the default speech system to perform semantic analysis on the speech instruction if the second analysis frequency is less than the second set frequency and/or the second analysis duration is less than the second set duration.
10. The apparatus of claim 9, wherein,
the acquiring unit acquires a voice instruction for controlling the device to be controlled, and the acquiring unit comprises:
acquiring a voice instruction which is acquired by a voice acquisition module and used for controlling equipment to be controlled;
the voice acquisition module is arranged on any one of a side of the equipment to be controlled, the environment to which the equipment to be controlled belongs and a client side; and/or, the voice acquisition module comprises: a microphone;
and/or,
the setting of the identity scope comprises: setting a voiceprint range;
wherein, the control unit determines whether the identity of the user who sends the voice command belongs to a set identity range, and the method comprises the following steps:
recognizing voice print information contained in the voice command;
determining whether the voiceprint information is within the set voiceprint range;
if the voiceprint information is in the set voiceprint range, determining that the user identity belongs to the set identity range;
or if the voiceprint information is not in the set voiceprint range, determining that the user identity does not belong to the set identity range.
11. The apparatus of claim 9 or 10, wherein,
the control unit calls an inertial voice system matched with the user identity to carry out semantic analysis on the voice command, and the semantic analysis comprises the following steps:
according to the corresponding relation between the set identity and the set voice system, determining the set voice system corresponding to the set identity which is the same as the user identity in the corresponding relation as an inertial voice system matched with the user identity;
performing semantic analysis on the voice command according to a customary semantic library of the inertial voice system to obtain semantic keywords which are determined based on the inertial voice system and matched with the voice command;
and/or,
the control unit calls a set default voice system to carry out semantic analysis on the voice command, and the semantic analysis comprises the following steps:
and performing semantic analysis on the voice instruction according to a default semantic library of the default voice system to obtain a semantic keyword which is determined based on the default voice system and is matched with the voice instruction.
12. The apparatus of claim 9, further comprising:
the control unit is further configured to, when semantic analysis of the voice instruction by calling the set default voice system succeeds, or semantic analysis by calling another voice system in the set voice system other than the default voice system succeeds, or semantic analysis of the voice instruction by continuing to use the default voice system succeeds, determine the current voice system that successfully parsed the voice instruction of the user identity not belonging to the set identity range as the inertial voice system of the user identity, and store the user identity into the set identity range.
13. The apparatus of claim 9 or 12, wherein,
the voice instruction includes: a Mandarin voice instruction, a foreign language voice instruction, or a dialect voice instruction corresponding to the native place of the user to be used of the device to be controlled;
and/or,
the set voice system includes: a Mandarin voice system, a foreign language voice system, and a dialect voice system corresponding to the native place of the user to be used of the device to be controlled;
the default voice system includes: the Mandarin voice system;
the inertial voice system includes: any one of the Mandarin voice system, the foreign language voice system, and the dialect voice system corresponding to the native place of the user to be used of the device to be controlled.
14. The apparatus of any one of claims 9, 10, and 12, further comprising:
the control unit is further configured to store the user identity of the user of the device to be controlled, and to establish a correspondence between the user identity and the habitual voice system;
and/or,
the control unit is further configured to control the device to be controlled to execute a control instruction corresponding to the voice instruction, according to a semantic keyword obtained by calling the habitual voice system matched to the user identity to perform semantic analysis on the voice instruction, or a semantic keyword obtained by calling the set default voice system to perform semantic analysis on the voice instruction;
wherein controlling the device to be controlled to execute the control instruction corresponding to the voice instruction comprises:
if the semantic keywords of the voice instruction include translation semantics between Chinese and a foreign language, translating the control instruction corresponding to the voice instruction and then executing the translated control instruction.
15. The apparatus of claim 11, further comprising:
the control unit is further configured to store the user identity of the user of the device to be controlled, and to establish a correspondence between the user identity and the habitual voice system;
and/or,
the control unit is further configured to control the device to be controlled to execute a control instruction corresponding to the voice instruction, according to a semantic keyword obtained by calling the habitual voice system matched to the user identity to perform semantic analysis on the voice instruction, or a semantic keyword obtained by calling the set default voice system to perform semantic analysis on the voice instruction;
wherein controlling the device to be controlled to execute the control instruction corresponding to the voice instruction comprises:
if the semantic keywords of the voice instruction include translation semantics between Chinese and a foreign language, translating the control instruction corresponding to the voice instruction and then executing the translated control instruction.
16. The apparatus of claim 13, further comprising:
the control unit is further configured to store the user identity of the user of the device to be controlled, and to establish a correspondence between the user identity and the habitual voice system;
and/or,
the control unit is further configured to control the device to be controlled to execute a control instruction corresponding to the voice instruction, according to a semantic keyword obtained by calling the habitual voice system matched to the user identity to perform semantic analysis on the voice instruction, or a semantic keyword obtained by calling the set default voice system to perform semantic analysis on the voice instruction;
wherein controlling the device to be controlled to execute the control instruction corresponding to the voice instruction comprises:
if the semantic keywords of the voice instruction include translation semantics between Chinese and a foreign language, translating the control instruction corresponding to the voice instruction and then executing the translated control instruction.
17. An air conditioner, comprising: a voice controlled device as claimed in any one of claims 9 to 16.
18. A storage medium having a plurality of instructions stored therein, the instructions being loaded by a processor to perform the voice control method of any one of claims 1 to 8.
19. An air conditioner, comprising:
a processor configured to execute a plurality of instructions; and
a memory configured to store the plurality of instructions;
wherein the instructions are stored by the memory, and loaded and executed by the processor to perform the voice control method of any one of claims 1 to 8.
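The dispatch behaviour described in claims 12 and 14 can be sketched roughly as follows: for a user identity already in the set identity range, the controller first calls that user's habitual voice system and falls back to the default system; for an unknown identity it tries the default system and then the other set systems, and records whichever system succeeds as the habitual one. This is an illustrative reconstruction, not the patent's implementation; all names (`VoiceController`, `DEFAULT_SYSTEM`, the analyzer callables) are assumed.

```python
# Hypothetical sketch of the voice-system dispatch in claims 12 and 14.
DEFAULT_SYSTEM = "mandarin"

class VoiceController:
    def __init__(self, systems):
        # systems: mapping of system name -> callable(instruction) -> keyword or None
        self.systems = systems
        self.habitual = {}        # user identity -> habitual voice system
        self.known_users = set()  # the "set identity range"

    def analyze(self, user_id, instruction):
        if user_id in self.known_users:
            # Known identity: habitual system first, default as fallback.
            order = [self.habitual[user_id], DEFAULT_SYSTEM]
        else:
            # Unknown identity: default system first, then the other set systems.
            order = [DEFAULT_SYSTEM] + [s for s in self.systems if s != DEFAULT_SYSTEM]
        for name in order:
            keyword = self.systems[name](instruction)
            if keyword is not None:
                if user_id not in self.known_users:
                    # On first success, record this system as habitual and
                    # store the identity into the set identity range (claim 12).
                    self.habitual[user_id] = name
                    self.known_users.add(user_id)
                return name, keyword
        return None, None
```

A controller built this way only pays the cost of trying every set voice system on a user's first instruction; later instructions from the same identity go straight to the habitual system.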
CN201811505078.3A 2018-12-10 2018-12-10 Voice control method and device, storage medium and air conditioner Active CN109360563B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811505078.3A CN109360563B (en) 2018-12-10 2018-12-10 Voice control method and device, storage medium and air conditioner


Publications (2)

Publication Number Publication Date
CN109360563A CN109360563A (en) 2019-02-19
CN109360563B true CN109360563B (en) 2021-03-02

Family

ID=65331901

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811505078.3A Active CN109360563B (en) 2018-12-10 2018-12-10 Voice control method and device, storage medium and air conditioner

Country Status (1)

Country Link
CN (1) CN109360563B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109979455A (en) * 2019-04-03 2019-07-05 深圳市尚可饰科技有限公司 A kind of dialect phonetic AI control method, device and terminal
CN110047467B (en) * 2019-05-08 2021-09-03 广州小鹏汽车科技有限公司 Voice recognition method, device, storage medium and control terminal
CN110931018A (en) * 2019-12-03 2020-03-27 珠海格力电器股份有限公司 Intelligent voice interaction method and device and computer readable storage medium
CN111128125A (en) * 2019-12-30 2020-05-08 深圳市优必选科技股份有限公司 Voice service configuration system and voice service configuration method and device thereof
CN111327935B (en) * 2020-03-02 2021-12-24 彩迅工业(深圳)有限公司 Information interaction platform based on artificial intelligence TV set
CN111312214B (en) * 2020-03-31 2022-12-16 广东美的制冷设备有限公司 Voice recognition method and device for air conditioner, air conditioner and readable storage medium
CN111540353B (en) * 2020-04-16 2022-11-15 重庆农村商业银行股份有限公司 Semantic understanding method, device, equipment and storage medium
CN114783437A (en) * 2022-06-15 2022-07-22 湖南正宇软件技术开发有限公司 Man-machine voice interaction realization method and system and electronic equipment

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104778946A (en) * 2014-01-10 2015-07-15 中国电信股份有限公司 Voice control method and system
CN106773742B (en) * 2015-11-23 2019-10-25 宏碁股份有限公司 Sound control method and speech control system
CN107945792B (en) * 2017-11-06 2021-05-28 百度在线网络技术(北京)有限公司 Voice processing method and device



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant